22/Dec 2014
By admin
7 min. read
(Please excuse the terrible pun, couldn’t help myself).
As we all know, computer cache is a touchy beast: seemingly small modifications to the code can result in major performance changes. I’ve been playing with performance monitoring counters recently (using Agner Fog’s library I mentioned before). I was mostly interested in testing how the cmpxchg instruction behaves under the hood, but wanted to share some other tidbits as well.
Let’s assume we’re working with simple spinlock code.
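The excerpt stops before the actual code, so here is only a minimal sketch of what such a spinlock might look like. The class name `SpinLock` and its methods are hypothetical, and it assumes `std::atomic`, whose compare-exchange typically compiles down to `lock cmpxchg` on x86:

```cpp
#include <atomic>

// Hypothetical minimal spinlock: 0 = free, 1 = taken.
class SpinLock
{
public:
    void Lock()
    {
        int expected = 0;
        // Spin until we manage to swap 0 -> 1.
        while (!m_state.compare_exchange_weak(expected, 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed))
        {
            // On failure 'expected' is overwritten with the observed value, reset it.
            expected = 0;
        }
    }

    // Single attempt, returns true if the lock was acquired.
    bool TryLock()
    {
        int expected = 0;
        return m_state.compare_exchange_strong(expected, 1,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    void Unlock()
    {
        m_state.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> m_state{0};
};
```

Every failed compare-exchange still needs exclusive ownership of the cache line, which is one reason small changes to a spin loop can show up clearly in the counters.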
14/Nov 2014
By admin
7 min. read
Last year I briefly described my adventure with writing a pathtracer in the Go language. This year, I decided to give Rust a try. It’s an almost exact 1:1 port of my Go version, so I’ll spare you the details. Without further ado, here’s a short list of observations and comparisons. As before, please remember this is written from the position of someone who didn’t know anything about the language 2 weeks ago and is still a complete newbie (feel free to point out my mistakes!
26/Oct 2014
By admin
3 min. read
Had a very interesting debugging session recently. I’ve been investigating some external crash reports, and all I had was a crash dump related to a fairly innocent-looking piece of code. Here’s a minimal snippet demonstrating the problem:
struct Tree
{
    void* items[32];
};
#pragma intrinsic(_BitScanForward)
__declspec(noinline) void* Grab(Tree* t, unsigned int n)
{
    unsigned int j = 0;
    _BitScanForward((unsigned long *)&j, n);
    return t->items[j];
}
Easy enough, right? Seemingly, nothing can go wrong here.
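The excerpt does not say what actually went wrong, so the following is a guess at one hazard rather than the conclusion of the post: `_BitScanForward` returns 0 and leaves the index undefined when the mask is zero. A defensive sketch under that assumption (the names `LowestSetBit` and `GrabChecked` are hypothetical, and the non-MSVC branch is a plain loop standing in for the intrinsic):

```cpp
#if defined(_MSC_VER)
#include <intrin.h>
#endif

struct Tree
{
    void* items[32];
};

// Writes the position of the lowest set bit to *index.
// Returns false (and leaves *index alone) when n has no bits set.
inline bool LowestSetBit(unsigned int n, unsigned long* index)
{
#if defined(_MSC_VER)
    return _BitScanForward(index, n) != 0;
#else
    if (n == 0)
        return false;
    unsigned long i = 0;
    while ((n & 1u) == 0)
    {
        n >>= 1;
        ++i;
    }
    *index = i;
    return true;
#endif
}

// Same idea as Grab, but refuses to index the array when the scan fails.
void* GrabChecked(Tree* t, unsigned int n)
{
    unsigned long j = 0; // unsigned long, so no pointer cast is needed
    if (!LowestSetBit(n, &j))
        return nullptr;
    return t->items[j];
}
```

Checking the return value costs a branch, but it means the array is never indexed with a value the intrinsic did not actually produce.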
23/Sep 2014
By admin
3 min. read
Recently, I had an inspiring discussion with fellow programmers. We were talking about interesting side projects/programs for quickly “trying out” a new programming language, or for job interview tasks. One that was mentioned was coding a Z-machine interpreter capable of playing Zork I. The Z-machine is a virtual machine developed by Joel Berez and Marc Blank, used for numerous Infocom text adventure games, most notably the Zork series. In all honesty, I’m probably a few years too young, so I didn’t get to play Zork when it was big (I did play old Sierra adventures back when you actually had to type commands, though; one of the reasons I started to learn English was Police Quest I.
17/Jul 2014
By admin
2 min. read
A short cautionary tale from today. I’ve been modifying some code and one of the changes I made was to use a type of Lol as a key in a map-like structure (key-pair container, uses < operator for comparisons). Structure itself looked like:
struct Lol
{
    byte tab[16];
    short cat;
    bool foo;
};
…and here’s the operator<
bool Lol::operator<(const Lol& other) const
{
    return(memcmp(this, &other, sizeof(other)) < 0);
}
The problem was - it seemed like sometimes, in seemingly random cases, we’d try to insert an instance of Lol to a container even though exactly the same element was already there.
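The excerpt ends before the diagnosis, so treat this as an assumption: with a typical layout the struct gets a padding byte after `foo`, and `memcmp` over the whole object compares that uninitialized byte too. A minimal sketch of a member-wise comparison that ignores padding (`MakeLol` is a hypothetical helper, and `unsigned char` stands in for `byte`):

```cpp
#include <cstring>
#include <tuple>

struct Lol
{
    unsigned char tab[16]; // 'byte' in the original snippet
    short cat;
    bool foo;

    // Compares members only, so padding bytes never take part.
    bool operator<(const Lol& other) const
    {
        const int c = std::memcmp(tab, other.tab, sizeof(tab));
        if (c != 0)
            return c < 0;
        return std::tie(cat, foo) < std::tie(other.cat, other.foo);
    }
};

// Hypothetical helper: fills the whole object (padding included) with 'fill',
// then sets every member to the same logical value.
Lol MakeLol(unsigned char fill, short cat)
{
    Lol l;
    std::memset(&l, fill, sizeof(l));
    std::memset(l.tab, 1, sizeof(l.tab));
    l.cat = cat;
    l.foo = true;
    return l;
}
```

Two objects built with different fill bytes but the same member values compare as equivalent here, which is what a map-like container needs in order to spot the duplicate.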
4/May 2014
By admin
1 min. read
There have been some comments on my previous post wondering about C++ compilers and their capabilities. Normally, I’m all for compiler bashing, but in this case I’d probably cut them some slack. It’s easy to optimize when you’re focused on a single piece of code, way more difficult when you have to handle a plethora of cases. On top of that, uops are handled differently on different CPUs, e.g. in my limited tests Haswell seems to care less.
27/Apr 2014
By admin
6 min. read
A few weeks ago I encountered a discussion on a Polish gamedev forum – participants were wondering whether it’s faster to access stack or heap memory. I didn’t pay much attention (there should be no significant difference) until someone posted a test case. Out of curiosity, I ran it and, to my surprise, discovered it was consistently faster for buffers allocated on the stack. We’re not talking about a few cycles here and there, either: the difference was almost 20% (Ivy Bridge laptop).
11/Nov 2013
By admin
6 min. read
There may come a time in a game programmer’s life when he has to fix a bug in a library he doesn’t have the source code for. It doesn’t happen often, it might never happen, but it’s good to be prepared. If I remember correctly, I’ve had to do it only twice, once fairly recently. We were getting quite a few crash reports and were assured that a fix in the third-party library was coming, but I decided to see if it was possible to do anything about it in the meantime.
23/Oct 2013
By admin
5 min. read
Today’s article is brought to you by a friend of mine. He’s been doing some home experiments and noticed a particular piece of code was running faster in Debug than in Release (using the Visual C++ compiler). He mentioned this to me, and I found it intriguing enough to ask him for the snippet and investigate it yesterday. It turned out to be a classic case of the compiler trying to be too smart and shooting itself in the foot.
6/Oct 2013
By admin
2 min. read
I must admit I am not as die-hard a fan of the ProDG debugger as some other coders out there; perhaps I’ve not been using it long enough. One tiny thing I do miss, though, is the ability to replace the assembly instruction under the cursor with a NOP with a single keystroke. Sure, with Visual Studio you can achieve the same result with the memory/immediate window, but it’s much more cumbersome. Today I decided to finally bite the bullet and recreate this little feature with a VB macro:
23/Sep 2013
By admin
6 min. read
Recently I’ve been experimenting with the Go programming language a little bit. It’s my second approach actually - I gave it a half-hearted try last year, but gave up pretty quickly (I think it was some petty reason, too, probably K&R braces). This time around I actually managed to stick to it a little bit longer and learn a thing or two. I decided my test application would be a simple pathtracer.
9/Aug 2013
By admin
3 min. read
When experimenting with some of the more esoteric features of modern CPUs it’s sometimes not immediately obvious if we’re actually taking advantage of them. Sure, you can compare cycles, but the differences are not always big enough to justify conclusions. Luckily for us, in the Pentium processor Intel introduced a set of performance-monitoring counters. They are model specific (not compatible among different processor families) and allow you to monitor just about every aspect of the CPU pipeline.
3/Aug 2013
By admin
2 min. read
Just a quick follow-up to my previous note. As mentioned by Michal, the Future Crew guys decided to celebrate the 20th anniversary of Second Reality in the best way possible - they released the full source code. Obviously, it’s more of a tidbit than anything else, but it’s still interesting to finally see how certain effects were done. Apparently Fabian is already working on a code analysis article, but in the meantime I’ll only mention two things that caught my eye so far:
1/Aug 2013
By admin
2 min. read
I know I claimed I would not write about Second Reality here, mostly because everyone knows it, but it’s a special day today… It’s been exactly 20 years since Second Reality was shown for the first time, at Assembly 1993. I saw it a few months later and still remember that day. I was living in Torun at that time and had just started high school. I would often visit the local computer store, just to see what’s new.
24/Jul 2013
By admin
3 min. read
I know I’ve been bitching about load-hit-store (too) many times before, but it’s been one of the most annoying things we had to deal with on the previous generation of consoles. LHS happens when we try to load data from an address that has recently been written to. X360/PPC CPUs were fairly simple: they could neither execute some other instruction in the meantime (no OOE) nor retrieve the data without waiting for it to reach cache.
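A common textbook illustration of the pattern, not taken from the post itself, is accumulating through a pointer inside a loop. The function names below are hypothetical, and whether the compiler can keep the value in a register on its own depends on what it can prove about aliasing:

```cpp
// Each iteration stores to *sum and the next one loads it straight back.
// If the compiler cannot rule out aliasing, that store/load pair stays in
// the loop, which is the LHS-prone shape on an in-order CPU.
void SumThroughPointer(const int* values, int count, int* sum)
{
    *sum = 0;
    for (int i = 0; i < count; ++i)
        *sum += values[i];
}

// Same result, but the running total lives in a local (ideally a register)
// and memory is written exactly once at the end.
void SumThroughLocal(const int* values, int count, int* sum)
{
    int acc = 0;
    for (int i = 0; i < count; ++i)
        acc += values[i];
    *sum = acc;
}
```

Both functions compute the same total; the second simply never reads back an address it has just written.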
17/Jul 2013
By admin
5 min. read
If you’ve been coding for current (prev?) gen consoles, you know your optimization guidelines - be nice to your cache, avoid LHS and branches, and in general do not stress the pipeline too much. With the next (current?) generation moving back to x86, things get a little bit more blurry. With out-of-order/speculative execution, register renaming and advanced branch predictors, it’s sometimes easy to shoot yourself in the foot when trying to be smarter than the compiler/CPU.
18/Jun 2013
By admin
6 min. read
With the unveiling of next gen console specifications it’s clear that multi-threaded code is here to stay (I don’t think anyone expected otherwise). If anything, it’ll be even more common: the new CPUs run at relatively low frequencies (compared to modern PCs), so we’ll definitely have to go wide to use their potential. Here’s a quick cheat sheet I usually follow when trying to move code to a background thread. Please note: move.
13/Jun 2013
By admin
1 min. read
It’s been a long time coming, but I finally found time to update MemTracer C# to support 64-bit applications (so 64-bit memory pointers + callstack addresses).
19/Apr 2013
By admin
9 min. read
Spent a little bit of time tweaking the RDE vector class again. As I already mentioned, the container itself is not terribly fascinating, there are not too many choices here. There’s another battle going on at the lower level though, and it’s interesting to see how an innocent call like size() can be the cause of a slowdown. Typically, a vector class has 3 properties it needs to keep track of:
buffer properties (pointer and size),
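The list is cut off in this excerpt, so the following only sketches the general trade-off and is not the RDE implementation. The two struct names and their members are hypothetical. Storing three pointers makes `size()` a subtraction plus a divide or shift by `sizeof(T)`, while storing a count makes `size()` a plain load but `end()` an addition:

```cpp
#include <cstddef>

// Layout A: three pointers. size() = (end - begin), which the compiler
// turns into a byte difference divided (or shifted) by sizeof(T).
template <typename T>
struct PointerLayout
{
    T* begin;
    T* end;
    T* capacityEnd;

    std::size_t size() const { return static_cast<std::size_t>(end - begin); }
    std::size_t capacity() const { return static_cast<std::size_t>(capacityEnd - begin); }
};

// Layout B: pointer plus counts. size() is a plain load,
// end() needs an addition instead.
template <typename T>
struct CountLayout
{
    T* data;
    std::size_t count;
    std::size_t cap;

    std::size_t size() const { return count; }
    std::size_t capacity() const { return cap; }
    T* end() const { return data + count; }
};
```

Which layout wins depends on whether the hot code calls `size()` or iterates with `begin`/`end` more often, so it is worth measuring rather than guessing.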
4/Apr 2013
By admin
2 min. read
Working in the game industry for more than a few years tends to desensitize one to all the news about mass layoffs & companies going bust. Sadly, it happens so often that we’re slowly becoming used to it. I first found out about Disney shutting down LucasArts at work. Sure, I was surprised, it’s big news after all, but I was busy, so didn’t think twice about it… I finished work, came home, read all the updates and then it suddenly hit me.