To assert or not to assert

(or rather: how & when to assert). Assertion is probably one of my favourite programming tools, however there are still few areas that I’m not entirely sure how to solve in an optimal way. First, let’s start with things that I’m rather convinced about, few simple guidelines for my ideal assert macro (funnily enough, during my career I’ve never seen implementation respecting all those rules at once):

  • when it breaks into debugger make sure it stops in the same line assertion actually failed, not two hierarchy levels down (so every time it’s triggered one has to rewind first to see what’s wrong exactly)! Simple rule, yet I’d say it’s only 50% chance in a typical codebase to be implemented this way (I think the record I’ve seen is 3 callstack levels away, that’s also how default MSVC assert behaves). Easiest way to do it:

    // (assertion macro)
    do
    {
        if (!(expr))
        {
            if (!AssertionFailureHandler(#expr, __FILE__, __LINE__))
                BREAK_TO_DEBUGGER;
        }
    } while (false)
    All your fancy code (logging etc) goes into handler, then return false (true/code/whatever you like) to indicate we should break to debugger.

  • Use per-module/per-level assertions. The way I like to do it is to have a ASSERT_ALWAYS macro for generic assert implementation. Then, global ASSERT macro that can be disabled in some builds/enabled in other, ASSERT_PEDANTIC that’s only enabled in ‘very debug’ builds, then per-module assertions, which can be enabled/disabled independently from global settings. This way I can quickly build, say, only math library with assertions enabled (all those: ASSERT(x >= 0) in sqrt wrappers etc) while hunting for obscure calculation bugs etc.

  • Assertion overhead should be as little as possible. Sure, it is a debugging tool, but at the same time we should try to avoid bringing whole application to crawl whenever we enable it. Usually, it’s platform specific, so refer to documentation/tutorials. For X360, there’s publicly available Gamefest 2008 presentation by Bruce Dawson himself - ‘Xbox360 Compiler and PgoLite Update. Making your code faster, with little effort’. Microsoft tends to change document locations all the time, so I don’t dare to put links here, Google should help.

Now, assuming we have our perfect assertion system, there are still some problems left to solve. Things get more complicated here, personally, I don’t have definite answers for the following issues:

  • when should assertions be enabled? There are various ‘schools’ here. I’ve seen the following solutions so far: only in debug builds, in debug + QA builds and always. Problem with the first one is obvious - usually only programmers run debug builds, so only they would benefit from assertions. Actually, there’s another problem - it’s quite often that true, 100% debug builds (with no optimizations and all assertions enabled) run so slow they are virtually unplayable with bigger levels. The ‘solution’ usually is to compile some libraries (third-party or rarely modified and throughly tested) with optimizations enabled/asserts disabled (depending on policy). Having at least some assertions enabled in QA builds (builds that are used by the team during development) is way to go, IMHO. I know there are some titles shipped with (selected) assertions enabled in retail builds, but that goes against basic idea that final program should work without assertions in exactly same way as with them.

  • assuming we do have assertions enabled in builds that are distributed amongst the designers/graphicians, people running it without debuggers, next question is: what should assertion handler do? Should assertion be treated as fatal error, report, close the game, exit or should it be skippable? In an ideal world, assertions are used to handle code errors only, so terminating sounds like a sane idea. After all, we found ourselves in a situation that’s most probably not recoverable. Sadly, there are places, especially in the lower level libraries where you just can’t guarantee that input won’t change depending on the data. Basic example - out of range indexing in your vector class:

    T& operator[](int i) const
    {
        ASSERT(i < Size());
        return data[i];
    Imagine we use this structure to store vertices, then index it using triangle data loaded from disk. If mesh is corrupted, we get out-of-bounds access & fatal error. This also means that 100 people can’t work now, because level crashes at the start. That’s simplified example, but you get the picture. That’s another subject on its own actually - should game care about invalid data?

  • Should we have special code paths for final builds dealing with fatal errors? Consider the following snippet:

    Object* obj = ObjectForId(m_idOpponent);
    ASSERT(obj);
    if (obj) // should this test be here or not?
        obj->DoSomething();
    Here, I’m certain: there should be no special test. If opponent ID can be invalid, remove the assert. If you’re sure it’s always valid - put the assert, remove the test, let it crash. That’s actually one of places where I hate to see people ‘fixing’ problems like that with cheap workarounds. Too many times I have seen a situation when programmer ‘fixed’ the crash by adding if (ptr) and happily moved to other tasks. NEVER DO THIS, that’s the worst thing you can do. You don’t fix problem this way, you merely sweep the dirt under the carpet and make the real root of the bug even harder to track down (as now, it has a big chance to crash many calls later). If you are absolutely sure ptr can’t be NULL there - don’t try to work around it, it’s a fatal error and should be treated as such. Sure, you may have some higher level architecture preventing application from a hard crash (like SEH), but log it, do NOT exit quietly. Of course, try to find why ptr was NULL in the first place. Is it a bug or a valid code path? Generally, I’d say we should try to minimize number of functions that care about NULL pointers, do tests at the highest level possible. I treat if (ptr) explosion that’s sometimes visible in later stages of the project as sign of laziness and lack of understanding of program architecture. ‘The best’ of this category that I’ve actually seen:
    if (ptr == (Foo*)0xcdcdcdcd || ptr == (Foo*)0xbaadf00d || ptr == (Foo*)0xfeeefeee)
        return; // YAY, I'm so smart!

My (rough) idea of good assertion/warning/error system:

  • distinguish between tests of internal program state (assertions) & general error handling (file not found, networking, user input). There are games out there that use assertions as error handling mechanisms. Let me tell you, it’s not pretty. Game without bugs should work with the first system disabled (second is still there in final build). That’s another way to look at it - ‘proper’ error handling for situations that can happen in retail builds, asserts for those that can not.

  • deal with missing resources by using generated placeholders (pink cubes etc), not needed in final builds.

  • game can assume that provided data format is always valid, no special error handling here. It’s tool/exporter/data cooker responsibility to guarantee it. Invalid data does not get exported, period. In practice it is impossible to keep it that way 100% of the time and the game will crash on invalid data from time to time, but in my experience it’s pretty rare and usually happens more often at the beginning of the production process. With mature tools it’s a reasonable goal. Assertions still may be a good idea, to provide more information in case of crash. Nothing too fancy, though, just basic stuff, like ranges etc, no tests if shadow geometry is valid. That’s exporter responsibility, again, here’s it’s a different story, tests should be as strict as possible. It’s probably lesser evil to have two artists unable to export their mesh because limits are too strict than to have 70 people stuck because invalid data has been exported.

  • asserts are fatal, game terminates, every failed assert should be logged, at least to local file (that’s later attached to every crash report). I’d definitelly would like to experiment with central assert database (see Mick West’s article). Later it can searched for recent asserts, see which fail most often etc. There is no good way to work around failed assert (it’s too general & too low level, how do you handle out-of-range indexing for example?).

  • if there are situations that can cause problems, but it’s possible to continue (usually data/input related) - make it a warning, not an assert. Ideally, there should be only few warning checks in game and hundreds in tools. Do not spam screen with warnings or people will stop caring. See Scott Bilas’ ‘Optimizing the Development Pipeline’ presentation for good ideas. I especially like the big warning counter. Warnings could go to database as well, this way producers could see who’s responsible for the oldest active ones and make victim life miserable until he fixes it.

  • auto-test integration. Every time a new build is one, build machine tries to run the game for few minutes with random/recorded input. Any assert/crash causes the build to be considered broken, so that people don’t sync to it. This should deal with 90% of simple data problems/oversensitive assert conditions.

Your thoughts?

Old comments

Liam 2010-01-06 22:32:21

Whilst I agree with most of what you write and as someone (Martin Fowler?) said fail early. There is a place to use assertions and also places where they should not be used, for example in Unit Test code. When TDD’ing there is generally (no place) for asserts yet not everything in games can be unit tested (well not easily), asserts are that last thing you want to hit when writing a unit test for legacy or code written without tests.

Alexey Orlov 2010-01-07 03:54:26

to Liam
not necceseraly, you can define assert as
if( !cond && HandleAssert(…) )
HaltExecution();
and in debug\test build safely check for assert firing.
Actually, assert is the most useful “debug” feature that should be enabled till gold-master - the info is valuable – you cant get it anyway else.

Maciej Miasik 2010-01-07 09:59:20

Maciek, tell why what you propose as the end of your post seems to be a common knowledge among developers worth their salt, yet we both haven’t seen that in action when we worked together? This is not my complaint towards you, but rather more general question… And I still don’t see this in effect now, unfortunately, but that’s probably because the experience of our programming team has significantly lowered in recent time.
Anyway, I think that asserts should stop execution - this is the strongest incentive to get the problem fixed as soon as possible otherwise the common answer from programmers would be: “Oh, just click ‘Continue’ every damn time you see this asset message box and stop whining.”
If the data is corrupt and assert catches it then it’s probably good idea to fix the data immediately instead of keeping 100 people running the game with corrupt data and then receive hundreds crash reports of unknown origin.
The good error logging is crucial - not only assert condition, but other pertinent info is always nice. Instead of using generic assert macros which usually (or not) dump assert condition and its location (file/line), use extended assert macros with additional information e.g. function name, parameters, etc. At least print data filename if this assert concerns corrupt data for God’s sake - some artists are really capable of figuring out what’s wrong when they see informative message about the problem.
I’m a bit ambivalent about warnings. They seems to be generally completely ignored by people who should be concerned about them. I still remember The Witcher running on our team’s monitor and flooding them with hundreds warning messages (right justified for some unknown reason). I took me some time to persuade someone to turn on-screen logging off by default seeing that nobody cared about the warnings and they only spammed the view. We ignored those warning till beta when we managed to force people to look into the logs and clean the problems (and I’m not sure if we managed to get totally clean even then).
If there’s reason for warning there’s also reason to look into it: fix the problem, provide placeholder, etc. Maybe they should have a kind of “best before” date after which they turn into permanent errors?

Liam 2010-01-07 11:37:08

Alexey
if( !cond && HandleAssert(รข??) )
HaltExecution();
and in debug\test build safely check for assert firing.
That piece of code just makes me ask many questions :)
But how do you check that assert has fired? I mean the programme has terminated and we have no result data that tells us the exact problem of which test failed using what input? All we know if that an assert was fired and possibly the line number and file.
Does HandleAssert rely on global state that you set when Unit Testing?
Is there more than one path of execution?

Lee Winder 2010-01-07 12:57:10

Its good to know that the assert system we use at work meets all three of your ideal assert points :)
Regarding the ability (or lack) of unit tests to use asserts. A good solution I’ve used (especially where you want to test that asserts have fired when dealing with invalid data) is to have the ability to disable, but record, assert triggers.
So for example, have a debug function DisableAssertBreak() which turns off the ‘BREAK_TO_DEBUGGER’ call, but within the assert function keep a record of the asserts fired regardless. Before the function in the unit test call ResetAssertCount() and then after call GetAssertCount() which returns how many asserts were triggered. Then test this against the expected value as required.
Since unit tests are there to test the public face of the code, it doesn’t get more public than your function blocking execution in an expected manner.
Obviously if you have assert modules or groups, or if the code is multi-threaded, then this could be extended (and be more complicated), but it’s only a simple example to prove the point that asserts can be used in Unit Tests and can be tested if required.

admin 2010-01-07 13:58:42

Great input everyone, thanks.
@Maciej - without going into internal details (which we can discuss over the beer) - it’s my way of looking at things, not the way (I actually did fix assert macro in W2, so that it breaks in correct place, though :p). Second, more important reason is that it’s much easier if you start with such guidelines from the very beginning. For example, try switching from non-crashing to crashing asserts in the middle of big project, people will eat you alive (unless there’s a strong support of higher management for such move, but that’s another story). It also depends what discipline ‘runs’ the studio, it’s much harder in groups where artists/designers dominate, usually coders end up writing work-arounds for invalid/old data, which is the worst you can do, IMHO, but again - that’s just me.

Noel 2010-01-07 17:54:32

It doesn’t do everything you want, but the assert implementation we used at Power of Two Games was the best I’ve seen so far. Charles wrote all the details (plus MIT-licensed code) here: http://cnicholson.net/2009/02/stupid-c-tricks-adventures-in-assert

admin 2010-01-07 20:46:28

Yes, I should have probably mentioned Charles’ article, it’s one of the best on this subject. I stolen shut-up-the-compiler-in-release-build-but-don’t-generate-any-code trick from it (sizeof(expr)).

ed 2010-01-09 14:09:58

I disagree that assert should stop execution. If a programmer foresees a problem in a particular place he should deal with it instead of “i’ll-leave-it-for-later-maybe-it-will-not-happen-crossing-my-fingers”.
That’s why assert should behave like exceptions. You catch the problem, return from function or execute alternative path. I’ve seen that in one game (ASSERT_RETURN and ASSERT_ELSE).
Stoping the game/tool on assert doesn’t do anything good. It may be good for a 1-2 programmers and some pedantic manager. But it is never good for an user.
If you really, really, really want to stop the game on the ASSERT, do this ONLY during initialization. Stoping the game/tool in the middle of the session is a really bad thing and a complete waste of time for an user/tester/artist/designer etc. ESPECIALLY in the tool in which single assert may waste 2-3hrs of work.
Of course that is why also you need a lead/manager who will catch a problem ASAP and find anyone whos responsible for this.

admin 2010-01-09 16:52:34

I’m not convinced. I mean, if you foresee a problem, it’s not a place for assert. Asserts are for conditions that have to be satisfied in order for the program to work correctly. Once assertion fails, all bets are off. How would you suggest dealing with out of range indices in [] operator? Returning dummy values? What to do when we’re writing to array?
Asserts crashing tool/game can be a problem, but in my experience - not that critical. I worked in studios that used skippable asserts (or no asserts in release builds), worked in those that had asserts terminating the game (teams of 100+, not 1-2 programmers :), the latter seemed to work better (in my opinion). But yeah, that’s one of the things I’m not entirely sure about, just a hunch.

Maciej Miasik 2010-01-09 18:17:22

Well, if there’s an alternative route/solution for the assert condition it only means that this isn’t something assert are for. This is just error handling.
Also, such approach diminishes feedback - instead of fixing the problem, it is only swept under a rug only to resurface later and/or elsewhere. It’s very difficult to judge any productivity gains from extended assert handling to those delayed loses which are result from using improper data or alternative solutions.
Anyone who loses 2-3 hours of work due to tool assert/crash should be educated how to use unstable, WIP tools first. Save often. Backup you work. Be prepared.
More frequently the problems that are indicated by asserts are fixed the better for end user.

ed 2010-01-10 02:04:47

My guess is that you didn’t get my point :)
There are only two kinds of problems: those you foresee and those you don’t. Those you foresee are always “marked” with assert. If you place an assert in []operator then it means that you actually foresee a potential problem here: “index out bound = bad”.
Placing an exit() means “I don’t have time know, and I’ll deal with it when it happens”. And that is actually “sweeping under a rug” because this will never save your time. You will eventually spend that extra time to fix the problem. Question is why anyone besides you must waste his time ?
My point is: if program closes, I don’t care if that was SIGSEGV crash or an assert with message “Doh!”. This is always the worst thing may happen to user. You will loose lots of time (even 5min of work), you will waste your time on start/loading again and you will waste time waiting till problem gets fixed. That sumes up to a lot of wasted time.
You want feedback then track problems often. Database (as you mentioned), instant messaging, anything is ok. Placing exit() will not improve feedback. People (and I mean artists :) ) usually will get tool/game crashed 50 times before they’ll start whining about it. And running to a programmer, every time program crashes is another waste of time.
So, never let the tool/game crash. Check the list of the known issues often, and be user friendly.
About [] operator. In general solution, sure, you may return static variable (protect against read/write) but it would be much better if you place an assert before you use [] operator. There aren’t many places with [] usage. You can always place an assert at function entry. It will be easier for you to decide what to do if “index > size()” and skip plenty of code.
Yes, assert is “just” error handling (during a production time). But definitely it shouldn’t be yet another fancy way to crash a program.

admin 2010-01-10 12:46:58

OK, I see. I think we’re talking about same stuff, roughly. I mentioned separating error handling from asserts. Asserts are like last resort option that triggers if higher-level code didn’t deal with invalid input (like invalid index for example). With automated test system & automated error reporting (so that artists don’t have to run to you every time it crashes) it’s less painful than it may sound.
One problem I see with being too tolerant for data errors is that you end up with code filled with if (this) or if (that) that’s only needed during development (and probably shouldn’t be needed atl all with a little bit of effort and better tools).
Whole matter should also probably be treated in a little bit different way in tools & in game. Tools should be much more forgiving, I agree.

Mateusz Loskot 2010-02-03 12:08:45

@Liam Regarding assertions in TDD. This comment lacks of details about context to judge use of assertions this way as a good or bad practice. I’m not saying I don’t agree with Fowler. Though, it’s nothing wrong to decide to use assertions as checkpoints in test cases. I don’t discuss if it is convenient or not, but from testing point of view, they fit it well - assertion checks if assumed condition is fulfilled or not.

Pavel Shevaev 2010-05-15 07:35:12

… and it looks like using exceptions for errors which should not be asserted is not an option in gamedev, right ? ;)

admin 2010-05-15 19:58:07

Pavel: Not really. It’s not even about the performance, they’re simply not always fully supported on some platforms.