Debugging heap corruptions, part 2
9/Oct 2010
A few years ago I wrote a short article about dealing with heap corruptions (buffer over/underruns). I mentioned that even after applying the described techniques, the biggest problem with overruns is that they’re detected when the memory block is freed, not when the actual corruption happens. Those two moments can be very far apart, which complicates debugging. I described some workarounds, but none of them was what I’d call a satisfying solution. Not long afterwards I found a much better way, but somehow forgot to update the article. Let’s fix that now; better late than never, right?
Actually, I (partially) mentioned the solution in the old post as well: gflags. That option is not practical for any bigger application, but we can always implement something similar and keep some control over the memory overhead. The general idea: every allocated block is followed by a non-accessible page of memory. As soon as the application tries to read from or write to that region, the system reports an error.
The worst-case memory overhead of my implementation is just over two pages (8192 bytes plus one stored pointer, assuming 4k pages). Yes, that’s a lot, and it definitely doesn’t make sense to plug this system into a global memory allocator. [EDIT: As pointed out by Arseny in the comments, it’s actually even worse, as VirtualAlloc has a granularity of 64k (well, SYSTEM_INFO::dwAllocationGranularity, but it has always amounted to 64k in my experience). I still didn’t have memory problems when applying it only locally, on a per-object basis.]
Luckily, memory problems are quite rare and usually we can easily narrow down the possible culprits (i.e. we know which block is getting trashed, we just don’t know when). Nothing is easier than overriding the allocation routines for just this particular object type or execution point. Sample implementation (again, assuming Win32 + 4k pages):
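#include <windows.h>

// Allocates 'bytes' so that the block ends exactly at a page boundary,
// followed by one inaccessible guard page. Assumes 4k pages.
// The returned pointer is only as aligned as 'bytes' allows.
void* AllocateProtected(size_t bytes)
{
    static const size_t pageSize = 4096;
    // Room for the block plus the base pointer stored just before it.
    const size_t numPages = ((bytes + sizeof(void*)) / pageSize) + 1;
    // One extra page at the end serves as the guard page.
    void* ptr = VirtualAlloc(NULL, (numPages + 1) * pageSize,
        MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (ptr == NULL)
        return NULL;
    // Place the block so that its last byte is the last byte before the guard page.
    unsigned char* retMem = (unsigned char*)ptr + (numPages * pageSize);
    retMem -= bytes;
    // Remember the base address of the region for FreeProtected.
    *((void**)retMem - 1) = ptr;
    DWORD oldProtect;
    VirtualProtect(retMem + bytes, 1, PAGE_NOACCESS, &oldProtect);
    return retMem;
}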
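// Releases a block obtained from AllocateProtected.
void FreeProtected(const void* ptr)
{
    if (ptr == NULL)
        return;
    // The base address of the whole region was stored just before the block.
    void* originalPtr = *((void**)ptr - 1);
    VirtualFree(originalPtr, 0, MEM_RELEASE);
}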
...
char* buffer = (char*)AllocateProtected(16);
// Deliberate off-by-one: i == 16 writes one byte past the end of the block.
for (char i = 0; i <= 16; ++i)
{
    buffer[i] = i;
}
FreeProtected(buffer);
If you run this piece of code under a debugger, it should break with an access violation on the last iteration (i == 16), because that write lands on the guard page. For completeness, you probably want ConstructProtected/DestructProtected templates, but that should be trivial.
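The ConstructProtected/DestructProtected pair mentioned above could look roughly like the sketch below. This is one possible shape, not the original implementation: the template signatures are my assumption, and the two malloc-based functions at the top are stand-ins that exist only so the snippet compiles on its own. In real use you would delete them and call the AllocateProtected/FreeProtected from the sample.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Stand-ins (assumption): they only make this sketch self-contained.
// Replace them with the Win32 AllocateProtected/FreeProtected shown earlier.
void* AllocateProtected(std::size_t bytes) { return std::malloc(bytes); }
void FreeProtected(const void* ptr) { std::free(const_cast<void*>(ptr)); }

// Constructs a T in guarded memory. sizeof(T) is always a multiple of
// alignof(T), so a block that ends on a page boundary is suitably aligned.
template <typename T, typename... Args>
T* ConstructProtected(Args&&... args)
{
    void* mem = AllocateProtected(sizeof(T));
    if (mem == nullptr)
        return nullptr;
    try
    {
        // Placement new: run the constructor in the memory we just obtained.
        return new (mem) T(std::forward<Args>(args)...);
    }
    catch (...)
    {
        // Do not leak the block if the constructor throws.
        FreeProtected(mem);
        throw;
    }
}

// Runs the destructor, then releases the guarded block.
template <typename T>
void DestructProtected(T* obj)
{
    if (obj == nullptr)
        return;
    obj->~T();
    FreeProtected(obj);
}
```

With a hypothetical type `Particle`, the suspicious `new Particle(x, y)` / `delete p` pair would temporarily become `ConstructProtected<Particle>(x, y)` / `DestructProtected(p)`. Any write past the end of the object then hits the guard page immediately.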
As mentioned, normally you don’t want to have those enabled; it’s your little AT squad. Apply when and where needed, squash the bug, retreat.
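To get a feel for the cost that makes this a targeted tool, the layout arithmetic of AllocateProtected can be replayed without touching the OS. The helper below is a hypothetical illustration (the struct and function names are mine), and it counts committed bytes only. As the EDIT note says, the address-space reservation may be rounded up further by the allocation granularity.

```cpp
#include <cstddef>

// Mirrors the arithmetic of AllocateProtected without calling the OS.
struct ProtectedLayout
{
    std::size_t totalBytes;   // committed size, guard page included
    std::size_t blockOffset;  // offset of the returned pointer from the base
    std::size_t guardOffset;  // offset of the guard page from the base
    std::size_t overhead;     // totalBytes minus the requested size
};

// pageSize and headerBytes are parameters only to make experimenting easy;
// the sample code fixes them at 4096 and sizeof(void*).
inline ProtectedLayout ComputeProtectedLayout(std::size_t bytes,
                                              std::size_t pageSize = 4096,
                                              std::size_t headerBytes = sizeof(void*))
{
    // Pages needed for the block plus the stored base pointer.
    const std::size_t numPages = ((bytes + headerBytes) / pageSize) + 1;
    ProtectedLayout layout;
    layout.guardOffset = numPages * pageSize;        // guard page follows the data pages
    layout.totalBytes = (numPages + 1) * pageSize;   // data pages + guard page
    layout.blockOffset = layout.guardOffset - bytes; // block ends where the guard begins
    layout.overhead = layout.totalBytes - bytes;
    return layout;
}
```

For the 16-byte buffer from the sample, this gives a block that starts 4080 bytes into an 8192-byte region, so 8176 bytes are spent on guarding 16.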
Old comments
Arseny Kapoulkine 2010-10-10 08:03:24
- A more thorough (but even more virtual space hungry, unless you’re on x64) approach is to VirtualProtect with PAGE_NOACCESS on free, and never release - this ensures lack of access to deleted objects.
- I’ve been using this in my unit tests for a long time, and I had to do a malloc instead of VirtualAlloc. Unfortunately, VirtualAlloc has a granularity of 64K, which means that the memory overhead is 64K for small allocations.
Simon Kozlov 2010-10-10 16:34:34
AppVerifier will do that and more for you.
I like the idea of protecting just a particular set of allocations though.
Also, on Windows you can use an instrumented heap, which logs and checks a lot of things with less overhead and gives a good chance of debugging the problem without additional re-runs.
Advanced Windows Debugging (http://www.amazon.com/Advanced-Windows-Debugging-Mario-Hewardt/dp/0321374460) has a good discussion on that.
admin 2010-10-10 16:38:58
Yeah, the problem with AppVerifier is that I never managed to even progress past loading a very simple game level, memory overhead was just too big.
castano 2010-10-11 22:09:32
In the past I’ve used electric fence for that. There’s no need to use it globally; you can instead use EF_malloc/EF_free just where needed. There’s a win32 port here:
http://code.google.com/p/electric-fence-win32/