Tuesday, 30 November 2010

Coding antipatterns: excessive nesting

With apologies to Dave Neary for stealing his excellent word, 'antipatterns', from his talk at MeeGo Conference.

This is an opinion piece. Feel free to skip it if you already know what you're doing when writing code.

We all have things we dislike in code, sometimes it's indentation, sometimes it's a bit less trivial. One of my current pet hatreds is excessive nesting.

For those of you that don't know what I mean, if you see something like this:
void foo(bool bar, bool lol, bool hax, bool meep)
{
    if (bar) {
        if (lol || hax) {
            if (meep) {
                // do something
            }
        }
    }
}

Then you're probably a victim of excessive nesting.

If you're writing code like this then all I can say is you're probably doing it wrong.

That above example might be written better like so:
void foo(bool bar, bool lol, bool hax, bool meep)
{
    if (!bar)
        return;

    if (!lol && !hax)
        return;

    if (!meep)
        return;

    // do something
}

Generally speaking, you want your code to be like a river: to flow from one point to another, and branch off gracefully, but still continue flowing nonetheless.

In more technical terms, this makes it a lot easier to take appropriate actions at any of those junctions (if, say someone wants logging added for all of those returns in future) and makes your code a lot easier to read.

Please stop and think about what you're doing, and refactor if necessary rather than blindly adding another conditional. It might be easier to add another level of nesting, but you'll suffer in the long term. Think about it.

As a general rule of thumb, I personally think if you have more than three levels of nesting, you might need to think whether you need a new method, or whether you need to rethink your code's flow.

23 comments:

  1. On a related note, I don't like when code returns early from a function, so here's what I'd do in this case:

    void foo(bool bar, bool lol, bool hax, bool meep)
    {
    if (bar && (lol || hax) && meep {
    // do something
    }
    }

    Btw, I agree with the overall point behind the post.

    ReplyDelete
  2. And worst most of the time this 'if' test are use to test value where a exception managment is more efficient.

    Simple example in python :

    values = (10,9,8,7,6,5,4,3,2,1,0)
    for value in values:
    try:
    print unicode(10/value)
    except ZeroDivisionError,err
    print u'Infinite'

    Here you didn't loose time to test value when there isn't any needs

    ReplyDelete
  3. credit where due: "antipatterns" was coined by Andrew Koenig in 1995. http://en.wikipedia.org/wiki/Anti-pattern

    :)

    ReplyDelete
  4. FYI "antipattern" word wasn't invented by Dave Neary ;-) It existed long ago.

    ReplyDelete
  5. I like functions that return in the middle of code, but some of my coworkers follow the "one return in a function" rule. For me, the nesting seems to happen when you are reducing the returns or you end up putting a guard value to avoid the nesting, e.g.:

    bool processed;

    if (bob)
    {
    processed = true;
    }

    if (!processed && gary)
    {
    processed = true;
    }

    return processed;

    And I find that hard to read and debug. But, different strokes for different folks.

    ReplyDelete
  6. Pretty much the only nice thing about 8-character-wide tab indentation is that excessive nesting is really, really visually jarring, which generally snaps people out of it.

    ReplyDelete
  7. Personally, I can accept a range of returns at the top of a function, but otherwise I tend to lean on MISRA rule 82 (which is only an advice and not a requirement, but still):

    "82 (adv): A function should have a single exit point"

    ReplyDelete
  8. Full ack!

    hehe, and for as many pro arguments you can make, someone will invent a counter argument :p

    ReplyDelete
  9. @Aaron, Arkadiusz: I wasn't aware of that, though I did suspect it was likely that it had been coined earlier. I'll update the text later. Thanks!

    ReplyDelete
  10. I don't think you should dismiss nesting so easily. Excessive nesting is not a good sign, true, but to consider 3 levels excessive seems.... well excessive :)

    I would definitely prefer:

    if (step1()) {
    if (step2()) {
    if (step3()) {
    finalize();
    }
    }
    }

    to:

    bool ok = step1();
    if (ok) {
    ok = step2();
    }
    if (ok) {
    ok = step3();
    }
    if (ok) {
    finalize();
    }

    because the first conveys the "image" of choice while in the second example I need to read the code to make sure I'm not missing anything.

    ReplyDelete
  11. Hmmm, nice, the preview showed indentation but the final result didn't. Sorry, this way my example loses much of the "image" I was talking about ;)

    ReplyDelete
  12. I understand your point regarding code clarity, but you forgot to consider the hardware.

    Regarding branching, there's a rule which stands for almost all CPU architectures: static prediction is implemented so that forward-pointing branches will not be taken. That's in direct contradiction with what you consider being better. The first time your code is executed, it generates a branch misprediction implying several cycles to recover, and then a fetch in the Instruction Unit of the unlikely branch instructions from L1 cache, L2 cache or main storage, implying more wasted cycles.

    I think that's more of a tradeoff between performance and readability, and it depends on the projects you work on.

    ReplyDelete
  13. So whats the better way, loicm? I have gone back and forth on early return or single return. Can you give an example?

    ReplyDelete
  14. No apologies necessary! And as Aaron says, the phrase antipattern is older than my granddad's old socks.

    You forgot the related "other extreme" antipattern, which is "excessive branching"... If you have 10 returns in a function, that's a whole bunch of extra code paths to debug. I've seen programs in Java that caught & "handled" exceptions (by ignoring them) that resulted in bugs that took ages to find, because code which was assumed to have been run wasn't getting run - the exception was returning from the function before the key part of the function was reached.

    I think Joel Spolsky wrote about this at one point. Yes, he did. Here it is: http://www.joelonsoftware.com/items/2003/10/13.html

    See "Too many exit points".

    Dave.

    ReplyDelete
  15. @Dave

    I fully agree that a function shouldn't have too many exit points, but I don't think this applies here. The three additional exit points in the example can be considered one (and will probably even be reduced to one by the compiler). They are early exits for sometimes common conditions where it doesn't make sense to do anything. In Qt this pattern is used all over the place.

    I guess ideally this "early exit" part of a function should be inlined, so that an actual function call can be avoided altogether. But often this isn't possible when you also want to keep binary compatibility, and it would make the code uglier.

    ReplyDelete
  16. The metric that was invented to defeat stuff like your antipattern here is called cyclomatic complexity.

    So those who discuss about structured programming principles (single exit point vs function guards) are perhaps missing the point.

    ReplyDelete
  17. Also, regarding branch prediction: Consider using switch statements over conditionals.

    I see this
    if (ev->type() == Blah) {
    ...
    } else if (ev->type() == Meh) {
    ...
    }

    all over the place in Qt's source code. In event handling code. Like, code that is run thousands of times in every normal app ...

    ReplyDelete
  18. mikhas4711:

    "all over the place in Qt's source code. In event handling code. Like, code that is run thousands of times in every normal app ..."

    This is because switch statements only work with integral types. For non-integral types, all you have is if - else if

    ReplyDelete
  19. @Venemo: testing against flags?

    ReplyDelete
  20. Overall, Brendan's condition handling is more logical.

    ReplyDelete
  21. In regards to the multiple exit points: I would not only have no objection to it, but would actually *prefer* it... if only C++ had a "finally" clause à la Java! As it is, it's too easy to write a function with a bunch of exit points, then later add something near the top which requires cleanup - and forget that you might never reach the cleanup code.

    ReplyDelete
  22. This article has one solution that works well for me. It turns some forms of nesting into a simple switch.

    http://www.codeproject.com/KB/cpp/switch_flags.aspx

    ReplyDelete
  23. @eleccham: in C++, this is usually done through RAII. Take a look at QScopedPointer, for instance.

    {
    QScopedPointer p(new Foo);

    // the Foo instance will be deleted no matter how the function is exited, once p goes out of scope
    }

    ReplyDelete