A Halloween Story - get_temporary_buffer
1/Nov 2017
A little bit late, as Halloween was yesterday, but I think std::get_temporary_buffer is scary enough to qualify. A co-worker called me today to show me an ‘interesting’ crash. It was deep in guts of the STL, more specifically in the std::_Merge method.
Corresponding C++ line seemed innocent:
*_Dest++ = _Move(*_First1++);
Nothing very interesting, not our code and yet it was crashing with: Exception thrown at 0x01293C05 in foo.exe: 0xC0000005: Access violation reading location 0xFFFFFFFF
Let’s open the disassembly window and see what instruction is causing problems exactly:
01293C05 movaps xmmword ptr [ebx],xmm0
EBX=0x001d0638 here, not 0xFFFFFFFF, though, so what’s going on? It can be a little bit confusing at first, but if you worked with SSE/MSVC before you probably know you can’t always trust the debugger here. It’s true that we hit an access violation, but it’s not because our memory region is invalid, it’s simply a misaligned access (notice the movaps instruction that expects data to be aligned at a 16-byte boundary 0x001d0638 is not). OK, progress, let’s backtrack and see what our code is doing. Again, fairly simple:
void SomeFunc(SIMDStruct* begin, SIMDStruct* end)
{
std::stable_sort(begin, end);
}
(SIMDStruct is some struct containing __m128 elements inside). I thought it’d be easy, it’s probably just not aligned properly, but we inspected both structure declaration and our container allocates and they were handling it properly. begin was nicely aligned. Hmph, OK, let’s dig deeper and see where does EBX even come from… At first I assumed it must have been an element of given array, but the address (0x001d0638) was not in the [begin, end) range. Luckily, I had some vague recollection of stable_sort using merge sort and as we know, merge sort requires an extra buffer. Just to confirm, let’s look at our callstack and inspect function signatures:
// We crash inside this one when writing to _Dest
_OutIt _Merge(_InIt1 _First1, _InIt1 _Last1,
_InIt2 _First2, _InIt2 _Last2,
_OutIt _Dest, _Pr _Pred, bool _In_place = false);
void _Chunked_merge(_BidIt _First, _BidIt _Last, _OutIt _Dest,
_Diff _Chunk, _Diff _Count, _Pr _Pred);
As you can see _Dest is passed down the callstack, but if we keep going up we eventually hit _Buffered_merge_sort that does:
_Chunked_merge(_First, _Last, _Tempbuf._Init(), _Chunk, _Count, _Pred);
- try to somehow replace get_temporary_buffer with own version
- use own stable sort
- do not mix SIMD types and std::stable_sort