Pointers to member functions
9/Sep 2012
Pointers to member functions are one of those aspects of C++ that I use every 2 years or so. I know how they work, but I usually need a refresher on the details (the syntax especially kills me every time, although I might actually remember it now). One of the gotchas is that you’re not supposed to cast them to void* (see this question for example). That’s one of those guidelines that I read a long time ago, absorbed and never gave a second thought. Recently, however, I’ve been working with some legacy code that required me to investigate this issue more carefully.
Let’s analyze how pointers to member functions behave for virtual methods. Consider the following snippet:
 1 struct Lol
 2 {
 3     virtual void FuncA() { printf("A"); }
 4     virtual void FuncC() { printf("C"); }
 5     virtual void FuncD() { printf("D"); }
 6 };
 7 struct Cat
 8 {
 9     virtual void FuncB() { printf("B"); }
10 };
11
12 template<typename T>
13 void* GetFuncAddress(void (T::*method)())
14 {
15     return *reinterpret_cast<void**>(&method);
16 }
17 int __cdecl main(int argc, char const* argv[])
18 {
19     printf("%p - %p - %p\n", GetFuncAddress(&Lol::FuncA),
20         GetFuncAddress(&Cat::FuncB), GetFuncAddress(&Lol::FuncC));
21     return 0;
22 }
Line 19 is the most interesting one, that’s where we actually try to obtain the address. With normal functions it’s easy; with virtuals, things get hairy. The compiler cannot simply use the vtable entry, because it doesn’t know the concrete type of the object the method will eventually be called on (someone can derive from Lol/Cat and override it). Here’s how MSVC handles it: for each vtable index, it generates a small helper function that redirects the call. It looks roughly like this:
Lol::`vcall'{0}':
0038A3DC 8B 01       mov eax,dword ptr [ecx]
0038A3DE FF 20       jmp dword ptr [eax]
Lol::`vcall'{4}':
0038A3E0 8B 01       mov eax,dword ptr [ecx]
0038A3E2 FF 60 04    jmp dword ptr [eax+4]
The number in braces is the byte offset into the vtable: Lol::‘vcall’{0} is index 0 and ‘vcall’{4} is index 1 (this is a 32-bit build, so each entry is 4 bytes). As you can see, all a helper does is load the vtable pointer from the given instance (passed in ecx) and jump to the function at the specified index. Note that there’s no ‘vcall’{8}: the compiler only generates these helpers when it actually needs to return an address for a given index, and we never take the address of Lol::FuncD.
Now, if you actually run this program, the results might be surprising: it’ll print exactly the same address twice, then some other value. It’s as if it thinks that Lol::FuncA and Cat::FuncB have the same address… Let’s see the generated code:
01119B9A 68 E0 A3 11 01       push offset Lol::`vcall'{4}' (111A3E0h)
01119B9F B8 DC A3 11 01       mov eax,offset Lol::`vcall'{0}' (111A3DCh)
01119BA4 50                   push eax
01119BA5 50                   push eax
01119BA6 68 AC 08 12 01       push 11208ACh
01119BAB FF 15 20 D2 11 01    call dword ptr [__imp__printf (111D220h)]
As you can see, that’s exactly what it does. It obtains the address of Lol::‘vcall’{0}, then pushes it twice, so both arguments have the same value*. It turns out the compiler is also smart enough not to generate multiple copies of those little helper functions. It only generates one per index and uses it everywhere (so there’s no Cat::‘vcall’{0}). That’s OK, because the function code would have been exactly the same. If you try to call it, it’ll work properly too, as the function that’s actually executed depends on the type of the given object and is ‘extracted’ from its vtable. It’s only dangerous if you rely on two different virtual methods having two different addresses. The address alone just doesn’t hold enough information in this situation; we need the object type as well.
*Sidenote: actually, if you’re using incremental linking, there’ll be yet another level of indirection (jump thunk), but the rest stays the same.
Old comments
Trillian 2012-09-09 18:38:35
This makes a whole lot of sense. Thanks for posting this kind of observations, I enjoy reading them.
xyz 2012-09-13 15:18:14
“Line 18 is the most interesting one,” - typo, I think it should be line 19 (line 18 is opening bracket)
admin 2012-09-14 02:39:05
Good point, bracket is not that interesting. Fixed.