A little bit of fun
2/Apr 2011
This recent tweet:
from Ryg reminded me of another old story related to bitfields in C++. Consider the following piece of code:"SomeEnum blah : 8;" Added new enum elements so max value is now >127. GUESS WHAT HAPPENS.
— Fabian Giesen (@rygorous) April 1, 2011
typedef char BOOL;
...
// Some other file
struct SomeStruct
{
BOOL m_something : 1;
BOOL m_isVisible : 1;
BOOL m_somethingElse : 1;
BOOL m_someOtherStuff : 5;
... even more stuff
};
...
void SomeFunction(SomeStruct* s)
{
if (s->m_isVisible == TRUE)
{
// Do something
...
That’s a very rough copy of fragment of codebase I worked with few years ago. Initially, all BOOLs in SomeStruct were individual member variables (not bitfields) and everything worked perfectly, world was beautiful. Then, somewhen in the final stage of the project, we were trying to cut down the memory usage and someone converted all those BOOLs to bitfield. W00t! Instant savings, 1 byte instead of 4 (actual savings were bigger, but that’s not the point)! The only problem was, not long time later we started getting bug reports, some of our objects were rendering incorrectly. Can you spot a problem here?
So, I started debugging this issue and to my surprise found out that block of code inside ‘if’ in SomeFunction was never executed, even for objects with isVisible set to true! The root of the problem originated probably few more years back, when somebody else decided to introduce BOOL type. That was quite typical in code that was initially created in C, as it had no native bool. Even if it was later ported to C++, people sometimes didn’t bother with converting types (it worked, after all). Now, in MSVC char is signed by default (you have to use /J to change it). This means that even if this particular bit representing isVisible can be only 0 or 1 (it’s still just a bit, after all), after reading/converting compiler adds sign information, so it now becomes 0 or -1 -> the actual conversion looks like this:
mov al,byte ptr [f]
shl al,6
sar al,7
movsx ecx,al ; ECX contains 'full' representation of isVisible
SomeFunction would still have worked correctly with just if (m_isVisible) test, but for unknown reasons it was testing explicitly against TRUE (1) (some people like to type too much, I guess). In this case, though, it actually turned out to be a good thing, as it allowed us to find it quickly…
Just for kicks, here’s how it looks for unsigned 1-byte types (like proper C++ bool or unsigned char):
mov al,byte ptr [f]
shr al,1
and al,1
movzx ecx,al
It was one of those situations that arise when various people modify code over the span of few years (and original author isn’t there anymore), I guess we just assumed that BOOL must be unsigned.
Old comments
dmytro 2011-04-03 07:38:55
Beatiful :)
Reg 2011-04-13 06:45:45
This is another argument for using an unsigned type, wherever we don’t need negative numbers, not signed int/short/char, as some programmers do by default.