NaN memories
12/Mar 2015
A nice thing about Twitter is that single tweet can bring so many memories. It all started when Fiora posted this:
It reminded me of an old bug I encountered few years ago when working on a multi-platform (PC/X360/PS3) title. One day we started getting strange bug reports. Apparently, if you jumped down the roof at a very specific position, player would start to slide across the map. To make things more interesting, this was only happening on consoles, PC version was fine. After few minutes of investigation I narrowed it down to a fragment of code that looked roughly like this (don’t have access to the original code anymore, trying to recreate it from memory, that was player movement module):This reminds me of how HLSL very carefully defines "saturate" in a way that makes NaNs turn into 0: https://t.co/0tWzaa6H2K @rygorous
— Fiora (@FioraAeterna) March 11, 2015
float x = y / sfactor;
float vx = fmax(0.0001f, fmin(x, MAX_RESPONSE));
Looks innocent, but it’d break completely for y==0.0. The whole block has been added recently and the ‘sfactor’ property was exposed to the editor and controlled by designers. As it turned out, it’s been set to 0.0 as well. If y == 0.0 and sfactor == 0.0, then x is NaN. fmax and fmin are, as you probably have guessed, floating-point versions of the min/max functions. For PC we were using a naive/reference implementation, ie:
float fmax(float a, b) { return a > b ? a : b; }
float fmin(float a, b) { return a < b ? a : b; }
Let’s see what happens in our case. fmin(NaN, MAX_RESPONSE) returns MAX_RESPONSE, as any comparison against NaN returns false. It’ll be fed to fmax and since it’s greater than 0.0001f, vx = MAX_RESPONSE.
Things are a little bit more interesting on PPC, though. If you coded for PowerPC, you’re aware of two facts:
it hates branches,
it has a ‘fsel’ instruction that allows for branchless floating-point code. Basically it ‘selects’ one of two given values based on another value (depending if it’s greater or less than zero).
The usual way of implementing fmin & fmax using fsel would be:
float fmax(float a, float b) { return fsel(a - b, a, b); }
float fmin(float a, float b) { return fsel(a - b, b, a); }
You can probably spot a problem already. If not, refer to this bit of fsel description (+my notes): “If the value in FRA [a - b in our case] is less than zero or is a NaN, floating point register FRT is set to the contents of floating-point register FRB [3rd argument]”. Think about what happens as our NaN flows through the fmin & fmax functions in this case:
fmin(x, MAX_RESPONSE): a - b is still NaN, so ‘a’ (x) is returned,
fmax(0.0001f, x): a - b is NaN, b (x) is returned.
In this case, NaN would just go through both functions and we ended up with an invalid vx. In this particular case it was an unfortunate coincidence that the original author used fmin(x, MAX_RESPONSE) – fmin(MAX_RESPONSE, x) would have been “fine”, but at least it helped us find the actual problem which was invalid value of the ‘sfactor’ property (..and incompatibilities between fmin/fmax on different platforms).