Circular buffers (to the rescue)
22/Dec 2018
Circular buffers are one of my favorite data structures (if not my favorite) for some quick & dirty debugging. A simple, not production-ready version can be implemented in a few lines of code (not ashamed to admit, I usually just copy-paste these and remove them when I’m done) and they’re a great tool to “record” a history of generic events. Any time I run into a seemingly random/unpredictable issue that might take a long time to repro, they’re on my short list. They’re especially handy in a multithreaded environment, where adding log statements can affect the timing and make the problem harder to track down. Adding an item to a circular buffer is not “free” either (especially if thread-safe), but IME it doesn’t have a performance impact big enough to affect your timing.
The absolutely most basic version might look like this:
size_t sEvents[8192]; // size is a power of two, so we can wrap the index with a simple mask
int sEventIndex = 0;
#define ADD_EVT(e) sEvents[(sEventIndex++) & 8191] = e;
That’s it. You want to make it thread-safe?
volatile int sEventIndex = 0;
#define ADD_EVT(e) sEvents[AtomicInc(sEventIndex) & 8191] = e
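(AtomicInc here is just whatever atomic increment your engine already provides - InterlockedIncrement, __sync_fetch_and_add, you name it. If you have nothing handy, a minimal std::atomic sketch might look like the snippet below; relaxed ordering is usually enough, since all we need is a unique slot, not any synchronization with the payload writes.)
#include <atomic>

size_t sEvents[8192];
std::atomic<int> sEventIndex{0};
// fetch_add returns the previous value, same as the post-increment version above
#define ADD_EVT(e) sEvents[sEventIndex.fetch_add(1, std::memory_order_relaxed) & 8191] = e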
Done. What are “events”? In the simplest case, it’s just an arbitrary code and really depends on the problem you’re chasing. Most of the time I don’t even bother creating an enum (this is temp diagnostics, remember?), I just add an event at every ‘fork in the road’. One thing I like to do to make relationships easier to spot and the event codes a bit less cryptic is to use X, 100+X for matching pairs. So, for example, say we want to “log” all lock/unlock events - I use 1 for a lock, 101 for the corresponding unlock. Sometimes just a code might not be enough and you need some extra information, like a thread id, container size, etc. Easy enough, just make your “event” a pair or even a structure.
#define ADD_EVT(e) sEvents[AtomicInc(sEventIndex) & 8191] = std::make_pair(e, threadId);
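For the structure case the pattern is the same; a quick sketch (the fields here are made up, use whatever is relevant for your bug):
#include <atomic>
#include <cstdint>

struct DebugEvent
{
    int      code;      // arbitrary event code (e.g. 1 = lock, 101 = unlock)
    uint32_t threadId;  // which thread recorded it
    uint32_t payload;   // anything extra: container size, frame number, ...
};

DebugEvent sEvents[8192];
std::atomic<int> sEventIndex{0};
#define ADD_EVT(c, tid, p) \
    sEvents[sEventIndex.fetch_add(1) & 8191] = DebugEvent{ (c), (tid), (p) }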
We add our events, run the game, hopefully eventually repro the problem, and all we have to do now is inspect the event table (or log it, if you prefer) and try to recreate the history. It’s not exactly quick or easy and might require some more digging or adding extra events, but it’s a great help, especially if you can compare “good” sequences with “bad” ones. Here’s what it might look like for a bit more complex case (IIRC it was a simulation/render thread communication issue; the second entry is the thread id):
+ [0] (7, 51224)
+ [1] (8, 51224)
+ [2] (121, 268192)
+ [3] (0, 268192)
+ [4] (1, 268192)
+ [5] (9, 268192)
+ [6] (4, 51224)
+ [7] (10, 268192)
+ [8] (120, 268192)
+ [9] (5, 51224)
+ [10] (6, 51224)
+ [11] (7, 51224)
+ [12] (121, 268192)
+ [13] (0, 268192)
+ [14] (8, 51224)
+ [15] (1, 268192)
Again, it’s obviously very cryptic and makes zero sense without full context/event codes, but the important part here is that we can see exactly what the threads were “doing”. (In this particular case, the problem was that event 0 would sometimes occur between events 7/8 and increase a semaphore too early. The “good” sequence was 7/8/0/121).
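If you’d rather log the buffer than eyeball it in the debugger, dumping it in chronological order is mostly a matter of handling the wrap-around. A rough sketch, assuming the pair version from above:
#include <algorithm>
#include <cstdio>

// Call once the repro hits. Not synchronized with the writers, which is
// usually fine for this kind of post-mortem inspection.
void DumpEvents()
{
    const int total = sEventIndex;                          // events added so far
    const int count = std::min(total, 8192);                // we only keep the last 8192
    const int start = (total > 8192) ? (total & 8191) : 0;  // oldest surviving entry
    for (int i = 0; i < count; ++i)
    {
        const auto& e = sEvents[(start + i) & 8191];
        printf("+ [%d] (%d, %u)\n", i, (int)e.first, (unsigned)e.second);
    }
}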