Programming games is an uphill battle. We always fight for more, but our resources are limited: there’s never enough RAM, the CPUs are never powerful enough and, obviously, the same applies to bandwidth – we could always use more. The eternal problem of multiplayer games is trying to send as much information as possible using as little memory as possible. One of the weapons in our arsenal is compression. I will focus on integers today, as there seems to be enough information about floats/vectors out there.
Let’s start with easy cases first (I’ll link to Shawn Hargreaves blog, good descriptions there):
- “enum”-type integers with a limited range of possible values – use bitfields. If you know you only have 4 types of grenades, there’s no need to waste 8 bits on the grenade type; 2 bits are enough, obviously
- arithmetic encoding for even more effective bitpacking
That’s all nice and good, but things aren’t always that straightforward. Imagine a 32-bit variable that can take the complete range of values. A typical example is health/damage. Some level 1 creature might have 50hp, but your endgame boss sits on 250k. Same thing with damage: there’s a difference between poking someone with a stick and dropping a nuke from orbit. Our previous approaches won’t help us much here, because even though we know the range, it’s huge. We could give up and always send 32 bits, but that seems wasteful; after all, in the majority of cases we need much less memory (huge damage spikes are rare).
At first, a co-worker directed me towards Google’s Protocol Buffers (or rather the encoding scheme they use). They were created with more complicated applications in mind, but it seemed like they’d work for us quite nicely. You’ll want to read the linked document for details, but the general idea in a nutshell is: we employ base-128 varints. We only use 7 bits of every byte for our data, and the remaining bit signals whether there are more bytes to come. Some examples:
- 1 – 1 byte, 0000 0001 (MSB not set),
- 300 – 2 bytes, 1010 1100 0000 0010 (MSB set in the first byte, so there’s more data, MSB not set in the second (last) one)
Sounds better, but is it the best we can do? That depends on the typical distribution of values of the transmitted variable. For my purposes (damage/health) it still wasn’t perfect. My data was heavily skewed towards zero; values were most typically in the 0-230 range. The problem with base-128 varints is that values greater than 127 will not fit in a single byte (as we only have 7 bits of payload)…
I decided to go with a simpler solution that fit my data better – every encoded value starts with 2 bits telling us how many bytes follow.
- 1 = 10 bits, 00 00000001
- 128 = 10 bits, 00 10000000
- 300 = 18 bits, 01 00000001 00101100
It costs us an additional 2 bits in every case, so in the worst situation we’re sending 34 bits, but that should happen once in a blue moon. For values in the [0, 127] range it’s a little worse than base-128, as it requires 10 bits instead of 8. However, for [128, 255] it still only needs 10 bits (not 16). There’s no clear winner here – it all depends on your dataset. For me, solution #2 performed better (sending 10 bits in the overwhelming majority of cases).