Dynamic initializers strike back

Back in the day I wrote about the ‘dynamic initializers’ problem. Basically, older versions of MSVC (up to 2012, not sure about 2013, seems better in 2015) had problems with static const floats that depended on other static const floats. Values were not calculated compile-time, there was actually a short function generated and it’d do it. The immediate problem is a code bloat (if our constant is placed in a global header), but the other potential issue stems from the fact that these ‘dynamic initializer’ functions respect the optimization settings (/fp:fast).

Wordpress to Hugo

I’ve been planning on converting my blog to some static site generator for a while now, but the (perceived) amount of work involved seemed scary, so I kept coming up with reasons not to do it. I really like the idea of static sites. Using gamedev jargon, this basically means we “precompute” as much as possible. Instead of retrieving posts from the database and building pages on the fly (like WP), we do it all offline and then push static HTML files online.

Elo vs Glicko

If you’ve ever done any matchmaking coding (…or played chess) you’re most likely familiar with the Elo rating system (PSA: it’s Elo, not ELO, it’s not an acronym, it’s named after a person - Arpad Elo). It’s a rating system used by most chess federations (most notably, FIDE) and has been adapted by many video games. It’s trivial to implement and get up and running, but suffers from a few issues: it’s just one number, there’s no reliability (or deviation) information of any kind.

DD2016 - video

A video from my DD2016 talk has been uploaded and can be viewed here:

Digital Dragons 2016

A few days ago I came back from the Digital Dragons 2016 conference. I’ve been in talks with these guys in the past years, but timing was never quite right, this time I finally could make it. Luckily for me, it was the same year they invited John Romero, David Brevik and Chris Avellone. In December 1993 my classmate brought an old floppy disc to school. He claimed it contained the best game ever.

Elixir diaries

You have probably heard about an AI algorithm defeating a human professional in a game of Go. The algorithm itself has been developed by Deep Mind, an English company that’s now owned by Google. One of the founders is Demis Hassabis. If you were playing Bullfrog games in the late 90s, the name might ring a bell, he’s one of the designers of Theme Park. Back in the day I remember reading a column in Edge magazine titled “The Elixir Diaries”.


Recently, we’ve been collectively complaining on Twitter about going crazy in C++ (as we do every few weeks). This reminded me of my dark period around 2002 when I was really excited about templates and metaprogramming. I tried finding my old code, but turned out my previous website was completely gone. Fortunately (?), good folks from The Wayback Machine had some copies. I moved some of my articles to a new server.

Fantasy football and statistics

I’m playing in a fantasy football league with some coworkers this year. I have always liked all kinds of sports, but it’s a little bit challenging to follow NFL from Europe. There’s not too many stations that actually show it (except for the Super Bowl) and even if they do, the time difference make it difficult to watch. I still followed major news and managed to catch a game every now and then, but I was very far from expert.


I’ve been debugging a rare memory corruption bug recently and - as usual - it turned out to be an exercise in frustration. This time only part of that was because of the bug itself, the other was because methods I chose were not very helpful (in my defense, it’s been a while, so I was a little bit rusty). The bug itself was fairly boring - sometimes, when leaving a certain UI screen game would crash.

Know your assembly (part N)

An entertaining head-scratcher from today. Application has suddenly started crashing at launch, seemingly inside steam_api.dll. It’d be easy to blame external code, but as mentioned - it’s a new thing, so most likely related to our changes. To make things more interesting, it’s only 32-bit build that crashes, 64-bit seems fine. High-level code looks like (simplified): 1 struct tester 2 { 3 virtual bool foo(); 4 virtual void bar() 5 { 6 } 7 virtual void lol() 8 { 9 if(!foo()) 10 { 11 printf("Failed\n"); 12 return; 13 } 14 bar(); 15 } 16 }; Crash occurs when trying to call bar() [the original code was actually crashing ‘inside’ bar, which debugger claimed to be inside aforementioned DLL]: 1 00199FB3 8B 06 mov eax,dword ptr [esi] 2 00199FB5 8B 10 mov edx,dword ptr [eax] 3 00199FB7 FF D2 call edx 4 00199FB9 84 C0 test al,al 5 00199FBB 75 10 jne tester::lol+1Dh (199FCDh) 6 [...] 7 00199FCD 8B 06 mov eax,dword ptr [esi] 8 00199FCF 8B 50 04 mov edx,dword ptr [eax+4] *** 9 00199FD2 8B CE mov ecx,esi 10 00199FD4 5E pop esi 11 00199FD5 FF E2 jmp edx Crash line marked with stars - Access violation reading location 0x00000018, EAX=0x14 at this point.

C++ 11 final

I’ve been doing some micro-optimizations recently. One of the things I’ve been trying is eliminating virtual method calls in ‘leaf’ types. Consider the following snippet (simplified): 1 struct Bar 2 { 3 virtual bool Func1() { return false; } 4 }; 5 struct Foo : public Bar 6 { 7 virtual bool Func1() 8 { 9 return true; 10 } 11 virtual void Func2() 12 { 13 if (Func1()) 14 printf("Hello"); 15 } 16 }; 17 18 void DoSomething(Foo& f) 19 { 20 f.Func2(); 21 } Calling DoSomething will result in 2 virtual method calls - and rightly so, there’s no way to tell if Func1/Func2 were not modified in some class that’s derived from Foo.

Instrumenting crash dumps

I’ve been planning to write a post about debugging multiplayer games (post-mortem) for a while now, but it keeps getting bigger and bigger, so instead of waiting until I can get enough time to finish it, I thought it’d be easier to share some quick’n’easy tricks first. I’d like to show a simple way of “instrumenting” a crash dump so that it gives us more info about the crime scene. Let’s assume we’ve received a crash from the following piece of code (it’s actually very close to a real-life scenario I encountered).

NaN memories

A nice thing about Twitter is that single tweet can bring so many memories. It all started when Fiora posted this: This reminds me of how HLSL very carefully defines "saturate" in a way that makes NaNs turn into 0: https://t.co/0tWzaa6H2K @rygorous — Fiora Aeterna ☄ (@FioraAeterna) March 11, 2015 It reminded me of an old bug I encountered few years ago when working on a multi-platform (PC/X360/PS3) title. One day we started getting strange bug reports.

MESIng with cache

(Please excuse the terrible pun, couldn’t help myself). As we all know, computer cache is a touchy beast, seemingly little modifications to the code can result in major performance changes. I’ve been playing with performance monitoring counters recently (using Agner Fog’s library I mentioned before). I was mostly interested in testing how cmpxchg instruction behaves under the hood, but wanted to share some other tidbits as well. Let’s assume we’re working with a simple spinlock code.

Rust pathtracer

Last year I briefly described my adventure with writing a pathtracer in the Go language. This year, I decided to give Rust a try. It’s almost exact 1:1 port of my Go version, so I’ll spare you the details, without further ado - here’s a short list of observations and comparisons. As previously, please remember this is written from position of someone who didn’t know anything about the language 2 weeks ago and still is a complete newbie (feel free to point out my mistakes!): Rust is much more “different” than most mainstream languages.

Hidden danger of the BSF instruction

Had a very interesting debugging session recently. I’ve been investigating some of external crash reports, all I had was a crash dump related to a fairly innocent-looking piece of code. Here’s a minimal snippet demonstrating the problem: struct Tree { void* items[32]; }; #pragma intrinsic(_BitScanForward) __declspec(noinline) void* Grab(Tree* t, unsigned int n) { unsigned int j = 0; _BitScanForward((unsigned long *)&j, n); return t->items[j]; } Easy enough, right? Seemingly, nothing can go wrong here.

Z-Machine interpreter in Go

Recently, I had an inspiring discussion with fellow programmers, we were talking about interesting side projects/programs to quickly “try out” new programming language/job interview tasks. One that’s been mentioned was coding a Z-machine interpreter that’s capable of playing Zork I. The Z-machine is a virtual machine developed by Joel Berez and Marc Blank, used for numerous Infocom text adventure games, most notably the Zork series. In all honesty, I’m probably a few years too young so didn’t get to play Zork when it was big (I did play old Sierra adventures back when you actually had to type commands, though, one of the the reasons I started to learn English was Police Quest I.

A Byte Too Far

A short cautionary tale from today. I’ve been modifying some code and one of the changes I made was to use a type of Lol as a key in a map-like structure (key-pair container, uses < operator for comparisons). Structure itself looked like: 1 struct Lol 2 { 3 byte tab[16]; 4 short cat; 5 bool foo; 6 }; …and here’s the operator< 1 bool Lol::operator<(const Lol& other) const 2 { 3 return(memcmp(this, &other, sizeof(other)) < 0); 4 } The problem was - it seemed like sometimes, in seemingly random cases, we’d try to insert an instance of Lol to a container even though exactly the same element was already there.

Going deeper - addendum

There’s been some comments to my previous post wondering about C++ compilers and their capabilities. Normally, I’m all for compiler bashing, in this case I’d probably cut them some slack. It’s easy to optimize when you’re focused on a single piece of code, way more difficult when you have to handle plethora of cases. On top of that, uops handled differently on different CPUs, e.g. in my limited tests Haswell seems to care less.

Going deeper

Few weeks ago I encountered a discussion on a Polish gamedev forum – participants were wondering whether it’s faster to access stack or heap memory. I didn’t pay much attention (there should be no significant difference) until someone had posted a test case. Out of curiosity, I ran it and to my surprise discovered, it was consistently faster for buffers allocated on the stack. We’re not talking about few cycles here and there, too, the difference was almost 20% (Ivy Bridge laptop).