A little bit of fun

This recent tweet:

from Ryg reminded me of another old story related to bitfields in C++. Consider the following piece of code:

typedef char BOOL;
...
// Some other file
struct SomeStruct
{
    BOOL m_something : 1;
    BOOL m_isVisible : 1;
    BOOL m_somethingElse : 1;
    BOOL m_someOtherStuff : 5;
    ... even more stuff
};
...
void SomeFunction(SomeStruct* s)
{
    if (s->m_isVisible == TRUE)
    {
        // Do something
        ...

That’s a very rough copy of fragment of codebase I worked with few years ago. Initially, all BOOLs in SomeStruct were individual member variables (not bitfields) and everything worked perfectly, world was beautiful. Then, somewhen in the final stage of the project, we were trying to cut down the memory usage and someone converted all those BOOLs to bitfield. W00t! Instant savings, 1 byte instead of 4 (actual savings were bigger, but that’s not the point)! The only problem was, not long time later we started getting bug reports, some of our objects were rendering incorrectly. Can you spot a problem here?

So, I started debugging this issue and to my surprise found out that block of code inside ‘if’ in SomeFunction was never executed, even for objects with isVisible set to true! The root of the problem originated probably few more years back, when somebody else decided to introduce BOOL type. That was quite typical in code that was initially created in C, as it had no native bool. Even if it was later ported to C++, people sometimes didn’t bother with converting types (it worked, after all). Now, in MSVC char is signed by default (you have to use /J to change it). This means that even if this particular bit representing isVisible can be only 0 or 1 (it’s still just a bit, after all), after reading/converting compiler adds sign information, so it now becomes 0 or -1 -> the actual conversion looks like this:

mov         al,byte ptr [f]
shl         al,6
sar         al,7
movsx       ecx,al   ; ECX contains 'full' representation of isVisible

SomeFunction would still have worked correctly with just if (m_isVisible) test, but for unknown reasons it was testing explicitly against TRUE (1) (some people like to type too much, I guess). In this case, though, it actually turned out to be a good thing, as it allowed us to find it quickly…

Just for kicks, here’s how it looks for unsigned 1-byte types (like proper C++ bool or unsigned char):

mov         al,byte ptr [f]
shr         al,1
and         al,1
movzx       ecx,al

It was one of those situations that arise when various people modify code over the span of few years (and original author isn’t there anymore), I guess we just assumed that BOOL must be unsigned.

Old comments

dmytro 2011-04-03 07:38:55

Beatiful :)

Reg 2011-04-13 06:45:45

This is another argument for using an unsigned type, wherever we don’t need negative numbers, not signed int/short/char, as some programmers do by default.