Quick iteration, insertion, look-up, removal

²⁸/Dec 2010 By admin

Wouldn’t it be nice to get all four of those with some container type? Sadly, it seems quite tricky. Vector gives the most efficient iteration and pretty good insertion (at tail), but is more problematic when it comes to look-up & removal. Intrusive linked lists are pretty good here, but pointer chasing may hurt during iteration. Hash map falls somewhere in the middle, but again - iteration will most probably be slower than vector (we have to skip over empty buckets). Imagine quite typical game situation - a collection of objects to update every frame. Iteration seems the most important & heavy-traffic operation, updating is expensive enough, we don’t want to add even more overhead. Insertion/removal is a little bit less crucial, as ideally, it doesn’t happen too often, but having to linear search for element or reshuffle all other entries is a no-no. This is a fairly specialized scenario, so it may require specialized measures. I’d like to present a hybrid approach that should result in very quick iteration and reasonably fast look-up, insertion & removal operations.

Let’s start with the most simple structure, that also deals nicely with our most important operation (iteration) - a vector. Insertion is not a problem, either. That brings us to look-up/removal and here’s where problems start. Given object we need to determine its position (iterator in STl terms) in collection (look-up) & remove. Linear search is no good, we could try to keep elements sorted and use binary search, but it complicates removal (we’d have to preserve ordering). However, we could also use method similar to intrusive linked list and store index in the object itself. Index has a nice property of not being invalidated as new objects are added to the collection (as opposed to pointer/iterator). The only situation, where it can change is when object is removed from vector (and is before our object), but we’ll deal with it later. Now, removal. Traditional way is not too tempting, when you erase an object, you need to move all following entries one slot ‘down’. In the worst case (removing first object), this means n-1 copies. Luckily, there’s a better way. It’s quite popular method to quickly remove element from vector, without preserving ordering (we don’t care about it in our case). We simply copy last element over the one we’re removing, then pop_back. Sadly, it’s not a part of official STL, that’s maybe why it’s not widely popular. Here’s example implementation:

1template<typename T, typename TIter>
2void erase_unordered(std::vector<T>& v, TIter it)
3{
4    TIter newEnd = v.end();
5    --newEnd;
6    if (newEnd != it)
7        *it = *newEnd;
8    v.pop_back();
9}

In RDESTL, there’s a special vector method named erase_unordered, which does exactly that. Cost? One copy + pop_back. Well, not so fast. We still invalidate one index - of the last object, that we moved from its original position. It’s trivial to fix, though, just set it to whatever position object was copied to and we’re fine.

So much about theory, now some implementation details. It’s perfectly straightforward if you use scheme similar to the one described by Terrance Cohen in his excellent presentation. There are situations where same object may be on more than one list, though. Instead of storing multiple indices in object itself, we could add another indirection level. If you use handles as weak references, this should be fairly easy. Each subsystem has its own list of array indices, that’s indexed by object handle. It’s not terribly cache friendly (look-up will most likely mean two misses), so ideally, you’d like to use the first version if possible, but you shouldn’t be removing too many objects at runtime anyway.

Old comments

admin 2011-01-05 03:13:03

Good point, I’ve never experimented with this kind of hash map too much.

Pierre Terdiman 2011-01-04 14:00:33

“Hash map falls somewhere in the middle, but again â?? iteration will most probably be slower than vector (we have to skip over empty buckets)”
Hmmm but a coalesced hash map should cover that (one where you fill the holes when removing an entry). It’s not faster than your version though, just less intrusive.

GameCoder.it − EraseUnordered 2010-12-30 09:16:23

[…] sui miei feed preferiti ho letto un interessante post di Maciej Sinilo dal titolo Quick iteration, insertion, look-up, removal . Sostanzialmente si vorrebbe una struttura dati veloce in tutte le operazioni comuni (iterazione, […]

Ashkan 2010-12-29 08:33:38

Boost Multi-index Containers Library
Would love to see a benchmark.

.m.m.s.

Maciej Sinilo