Pass by reference
23/Jul 2012
I’ve not been reading too many C++ books recently, but I have a foggy memory of a recommendation they include: do not pass arguments by value unless they are of a built-in type. A justification is probably given, but it does not seem to stick, and people tend to remember only the “pass by value - bad” part. There are plenty of programmers out there who do it automatically: built-in types by value, anything else by reference. Fortunately, this rule works OK in 90% of cases. Alas, it tends to break for UDTs (user-defined types) that are essentially wrappers over low-level types.
Let’s take interned strings for example. This technique is popular in games, as we don’t like the cost associated with passing strings around and we usually work with immutable strings. Implementation details differ, but often the string class contains only a pointer to the interned data, and it’s this pointer that gets copied and compared. A single instance of our class should fit nicely in a register, and passing it by reference will not make things better. Will it make things worse? Let’s see. Consider this simple code fragment:
static const IString MyEvent("MyEvent");
void HandleEvent(const IString& eventName)
{
    if (eventName == MyEvent) { ... }
}
// ebp+10h --> &eventName
mov edi,dword ptr [ebp+10h]
mov eax,dword ptr [edi]
cmp eax,dword ptr [MyEvent(1993EBCh)]
As you can see, the poor compiler has to deal with an additional level of indirection. Technically we’re passing a pointer to another pointer here, so it needs to dereference the first one in order to retrieve the value for comparison. It might not look like a big deal, but keep in mind that it most probably means an additional cache miss. It’s also more prone to cause problems in a multi-threaded environment, and it makes the code harder to optimize because of aliasing. Just for comparison, here’s the assembly code generated with the argument passed by value:
mov eax,dword ptr [ebp+10h]
cmp eax,dword ptr [MyEvent(199B3FCh)]
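To make the comparison concrete, here is a minimal sketch of what such an interned string might look like. The post does not show the real IString, so everything below is an assumption: the pool, the Intern helper and the two HandleEvent variants are hypothetical names invented for illustration, and the pool is not thread-safe.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <unordered_set>

// Hypothetical interned string: the object holds a single pointer into a shared pool.
class IString {
public:
    explicit IString(const char* text) : m_data(Intern(text)) {}

    // Equal text always interns to the same pool entry, so comparing
    // the two pointers is enough; no character data is touched.
    friend bool operator==(IString a, IString b) { return a.m_data == b.m_data; }

private:
    // Assumed pool implementation (not thread-safe). Element addresses in
    // std::unordered_set stay valid when the container rehashes.
    static const std::string* Intern(const char* text) {
        assert(text != nullptr);
        static std::unordered_set<std::string> pool;
        return &*pool.emplace(text).first;
    }

    const std::string* m_data;
};

// The wrapper is exactly one pointer wide and can be copied bit by bit.
static_assert(sizeof(IString) == sizeof(void*), "IString should be pointer-sized");
static_assert(std::is_trivially_copyable<IString>::value, "IString should be trivially copyable");

static const IString MyEvent("MyEvent");

// By value: the callee receives the interned pointer itself.
bool HandleEventByValue(IString eventName) { return eventName == MyEvent; }

// By reference: the callee receives the address of the pointer
// and has to load through it before it can compare.
bool HandleEventByRef(const IString& eventName) { return eventName == MyEvent; }
```

Both functions return the same result; the only difference is whether the callee gets the pointer or the address of the pointer, which is the extra load visible in the first assembly listing.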
Another popular example, one that might be more surprising for PC coders, is vector types (math vectors, not containers). There are architectures with vector registers (a single register holding one vector). On those, it’s obviously more efficient to pass vectors around by value, in registers, rather than on the stack (which would cause alignment issues anyway). See http://molecularmusings.wordpress.com/2011/10/18/simdifying-multi-platform-math/ for an example.
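As a rough, portable sketch of the interface style this suggests, the hypothetical Vec4 below is passed and returned by value. It is only a stand-in: whether the arguments really travel in vector registers depends on the platform ABI, and a real math library would typically wrap the platform's native vector type (for example __m128 from <xmmintrin.h> on SSE targets) rather than four separate floats.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical 4-float vector, aligned to 16 bytes as SIMD loads usually expect.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

static_assert(sizeof(Vec4) == 16, "Vec4 should be exactly 16 bytes");
static_assert(std::is_trivially_copyable<Vec4>::value, "Vec4 should be trivially copyable");

// Arguments and result are passed by value, so an ABI with vector
// registers is free to keep them in registers instead of memory.
inline Vec4 Add(Vec4 a, Vec4 b) {
    return Vec4{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

inline float Dot(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```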
Old comments
Martins Mozeiko 2012-07-23 17:50:31
Here’s some interesting information about passing arguments by value: http://www.macieira.org/blog/2012/02/the-value-of-passing-by-value/
The conclusion is similar: structures of up to 16 bytes containing integers and pointers should be passed by value.
sean barrett 2012-07-24 05:54:33
I’d draw a slightly different conclusion from the macieira.org post. For x86, he focuses on ILP32 x86-64; that is, integers and pointers are 32-bit, but registers are 64-bit. In this case, while it can pass 16-byte structures in registers, it’s not clear that there’s any actual advantage to it because (as he notes at the end) having two 32-bit integer or pointer values packed into 64-bit registers is not necessarily going to be more efficient.
If you’re computing the object on the fly, now the caller has to pack that data into registers and the callee has to unpack it. If you’ve already got an object of the type, then you’re going to load it into registers in the caller and the callee is going to unpack it. The first case might or might not be slower with registers, but the second case will definitely do unnecessary work compared to passing by reference.
Now, the 32-bit-pointers-on-a-64-bit-platform is certainly not the only case, although for certain people it is in fact the most common case and the one they need to optimize for the most (details are NDA). But that won’t be true forever, and certainly if it’s not the case on your slowest platform then things work out more like Maciej suggests. I’m just saying that blog post offers just about the strongest counterargument available.
obf 2012-07-24 07:56:50
As far as I can see there is a missing ‘=’ in the comparison, if this is a comparison of course:)
admin 2012-07-24 13:24:14
obf, you’re right of course, thanks.
Ricardo Costa 2012-07-26 10:43:14
We use a template helper to automatically select between passing by value or by ref based on the parameter type. It will pass by value when using a native type or a class/struct whose size is at most twice the register size and that does not have a custom copy constructor or assignment operator (which could cause a potential slowdown even when the class is small enough). See below:
Ricardo Costa 2012-07-26 10:45:30
HTML seems to have stripped some chars, trying again below:
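#include <cstddef>
#include <type_traits>

template<typename T>
struct ByRef {
    typedef typename std::conditional<
        std::is_fundamental<T>::value || (std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(std::size_t)*2),
        T,
        T const&
    >::type Type;
};
// example (T cannot be deduced through ByRef<T>::Type, so call it as f<int>(42)):
template<typename T>
void f(typename ByRef<T>::Type t) {}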
Ricardo Costa 2012-07-26 10:51:14
Note that it requires C++11 though. Those templates are defined in the <type_traits> header.
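As an illustration of how a helper like this picks the parameter type, here is a small self-contained sketch. It restates the helper so the snippet compiles on its own; Handle, BigBlob, Tracked and IsPassedByValue are hypothetical names that do not come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template<typename T>
struct ByRef {
    typedef typename std::conditional<
        std::is_fundamental<T>::value ||
            (std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(std::size_t) * 2),
        T,
        T const&
    >::type Type;
};

// Hypothetical test types.
struct Handle  { const char* data; };   // one pointer, trivially copyable
struct BigBlob { char bytes[256]; };    // trivially copyable, but too large
struct Tracked {                        // small, but has a custom copy constructor
    Tracked() {}
    Tracked(const Tracked&) {}
    int id = 0;
};

static_assert(std::is_same<ByRef<int>::Type, int>::value, "fundamental types go by value");
static_assert(std::is_same<ByRef<Handle>::Type, Handle>::value, "small trivial structs go by value");
static_assert(std::is_same<ByRef<BigBlob>::Type, const BigBlob&>::value, "large structs go by reference");
static_assert(std::is_same<ByRef<Tracked>::Type, const Tracked&>::value, "custom copy goes by reference");

// ByRef<T>::Type is a non-deduced context, so T has to be spelled out at
// the call site, e.g. IsPassedByValue<Handle>(h).
template<typename T>
bool IsPassedByValue(typename ByRef<T>::Type) {
    return !std::is_reference<typename ByRef<T>::Type>::value;
}
```

The need to name T explicitly at every call is the main cost of this approach, since the compiler cannot deduce it from the argument.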