C++ 11 final

I’ve been doing some micro-optimizations recently. One of the things I’ve been trying is eliminating virtual method calls in ‘leaf’ types. Consider the following snippet (simplified):

#include <cstdio>

struct Bar
{
  virtual bool Func1() { return false; }
};
struct Foo : public Bar
{
  virtual bool Func1()
  {
    return true;
  }
  virtual void Func2()
  {
    if (Func1())
      printf("Hello");
  }
};

void DoSomething(Foo& f)
{
  f.Func2();
}

Calling DoSomething results in 2 virtual method calls, and rightly so: the compiler has no way of knowing whether Func1 or Func2 is overridden in some class derived from Foo.

That can be a little wasteful, especially if we, the programmers, know for a fact that nothing derives from Foo. In that case Func2 calling Func1 will always, in every single case, end up in Foo::Func1. I used a not-so-sophisticated method to work around that:

  if (Foo::Func1())
    printf("Hello");

This works, but it can be dangerous. Imagine that one day some other programmer decides to derive from Foo and provide a new implementation of Func1. He’s in for a nasty surprise (and I’m in for some public shaming). Traditionally, C++ offered a few ways of preventing inheritance, but they’re all fairly ugly (private constructors etc.). Fortunately, C++ 11 introduced a new specifier, final, which does exactly what we want.
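To make the danger concrete, here is a minimal sketch. It assumes the workaround is a qualified call such as Foo::Func1(); the helper names ViaVirtual and ViaQualified and the class Child are made up for illustration and return strings instead of printing so the difference is easy to check.

```cpp
#include <string>

struct Bar
{
  virtual ~Bar() {}
  virtual bool Func1() { return false; }
};

// Marking this class 'final' would make Child below a compile error.
struct Foo : public Bar
{
  virtual bool Func1() { return true; }

  // Ordinary virtual dispatch: honours overrides in derived classes.
  std::string ViaVirtual() { return Func1() ? "Hello" : ""; }

  // Qualified call: always runs Foo::Func1, bypassing the vtable.
  std::string ViaQualified() { return Foo::Func1() ? "Hello" : ""; }
};

// Hypothetical class added later by someone unaware of the shortcut.
struct Child : public Foo
{
  virtual bool Func1() { return false; }
};
```

On a Child object, ViaVirtual respects the override while ViaQualified silently ignores it. Declaring `struct Foo final : public Bar` rejects Child at compile time, which is the safety net the qualified call needs.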

It also got me thinking: doesn’t this mean the compiler now has additional knowledge it was lacking before? Couldn’t it use that to apply the same optimization we’ve just tried to force by hand? Are my changes even necessary? Sadly, as often happens, the answer is: it depends.

AFAICT, Visual Studio doesn’t care much. Yes, it’ll prevent inheritance, but it doesn’t seem to affect code generation at all. Here’s the assembly for our code fragment (with the final keyword added):

// struct Foo final : public Bar

// DoSomething
000000013FE15460 48 8B 01             mov         rax,qword ptr [rcx]
000000013FE15463 48 FF 60 08          jmp         qword ptr [rax+8] ***

virtual void Func2()
00000001407426D0 48 83 EC 28          sub         rsp,28h
    if(Func1())
00000001407426D4 48 8B 01             mov         rax,qword ptr [rcx]
00000001407426D7 FF 10                call        qword ptr [rax] ***
00000001407426D9 84 C0                test        al,al
00000001407426DB 74 10                je          Foo::Func2+1Dh (01407426EDh)
      printf("Hello");
00000001407426DD 48 8D 0D 04 3B EF 00 lea         rcx,[string "Hello" (01416361E8h)]
00000001407426E4 48 83 C4 28          add         rsp,28h
00000001407426E8 E9 B3 FB 85 00       jmp         printf (0140FA22A0h)
00000001407426ED 48 83 C4 28          add         rsp,28h
00000001407426F1 C3                   ret

As you can see, there are still 2 vtable accesses (the lines marked with stars). Let’s see if GCC/Clang does any better (Compiler Explorer to the rescue). Just look at this beauty (that’s the body of DoSomething):

    movl    $.L.str, %edi
    xorl    %eax, %eax
    jmp    printf                  # TAILCALL

.L.str:
    .asciz    "Hello"

Not only did the compiler de-virtualize both calls, it also inlined them and sprinkled a tail call on top. Impressive.
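In source terms, the transformation amounts to roughly the following. This is only a sketch of what an optimizer is permitted to do once Foo is final, not a description of any particular compiler's internals; Func2 returns an int here (instead of calling printf) and DoSomethingDevirtualized is a made-up name, both purely so the two versions can be compared.

```cpp
struct Bar
{
  virtual ~Bar() {}
  virtual bool Func1() { return false; }
};

struct Foo final : public Bar
{
  bool Func1() override { return true; }
  virtual int Func2() { return Func1() ? 1 : 0; }
};

// What we write: two calls that are virtual on paper.
int DoSomething(Foo& f)
{
  return f.Func2();
}

// What the compiler may reduce it to: since Foo is final, a Foo& can only
// refer to an actual Foo, so both calls resolve statically to Foo::Func2
// and Foo::Func1, get inlined, and the branch folds away.
int DoSomethingDevirtualized(Foo&)
{
  return 1;
}
```

Whether a given compiler actually performs this depends on the compiler, its version and the optimization flags, so it is worth checking the generated assembly rather than assuming.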

Sadly, as mentioned, we can’t rely on these optimizations being applied consistently, but at the very least final will prevent another programmer from making a mistake he’d later regret.

Addendum: As pointed out by Adrian in the comments - use sealed instead of final in VS2012 for the same behavior but with better code gen.
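One way to act on this without forking the code base is a small macro. The sketch below is an assumption-laden illustration: LEAF_CLASS is a made-up name, and the choice to fall back to sealed only for compilers older than VS2013 (where `_MSC_VER` is below 1800; VS2012 reports 1700) is a guess at a sensible cut-off, not something measured here. Note that sealed is a Microsoft-specific extension and older MSVC versions may warn about it.

```cpp
// Pick the spelling that (reportedly) gives better code gen on VS2012,
// and the standard C++11 specifier everywhere else.
#if defined(_MSC_VER) && (_MSC_VER < 1800)
#  define LEAF_CLASS sealed   // Microsoft-specific extension
#else
#  define LEAF_CLASS final    // standard C++11
#endif

struct Bar
{
  virtual ~Bar() {}
  virtual bool Func1() { return false; }
};

struct Foo LEAF_CLASS : public Bar
{
  virtual bool Func1() { return true; }
};

// Dispatch through the base still works as usual; only further
// derivation from Foo is forbidden.
bool CallThroughBase(Bar& b)
{
  return b.Func1();
}
```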

Old comments

Chris Kline 2015-04-30 13:09:51

Maciej, I’d ping Stephan T. Lavavej @ Microsoft @exchange.microsoft.com. He’s great about responding and often fixes issues on the spot when they’re reported, especially if they’re filed as issues on MS’s Connect site first.

admin 2015-05-06 01:01:33

Apparently, it’s been much improved in VS2013.

Adrian 2015-05-08 09:03:12

Use sealed instead of final in VS2012 for the same behavior but with better code gen.

admin 2015-05-15 00:50:52

Awesome, that’s great to know, thanks Adrian!