Virtual functions in C++ – thoughts on performance penalty

Most C++ developers have heard about the performance penalty of using virtual functions. But is it really so high that you should avoid a feature that is useful and, in some cases, the only option available?

First, let’s see when this feature is really needed. When you want to write testable code, you will likely have to deal with some form of dependency injection.

Let’s say you have some class:

class SomeNastyBigClass
{
public:
  SomeNastyBigClass():
    m_dependency1(param1, param2, param3),
    m_dependency2(param1, param2, param3),
    m_dependency3(param1, param2, param3)
  {
    // some other work for dependencies initialization
  }

  // lots of different methods

private:
  SomeDependencyClass1 m_dependency1;
  SomeDependencyClass2 m_dependency2;
  SomeDependencyClass3 m_dependency3;
};

Because the class keeps all of its dependencies ‘self-contained’ and creates them in its constructor, your test case has no control over them, and you end up testing not only SomeNastyBigClass but all the dependency classes as well.

That problem naturally stops many developers from writing proper unit tests. This should be no surprise: writing a proper unit test can be a nightmare if one of the dependency classes is, for example, a database connector or a credit card charger of some sort.

The solution might seem obvious – we should mock those dependency classes so that we can change their behavior and gain full control over them. But how can we achieve that?

I strongly believe that a class like the one in the example should be refactored. Something like this would suffice:

class SomeLessNastyBigClass
{
public:
  explicit SomeLessNastyBigClass(
      const std::shared_ptr<SomeDependencyClass1>& pDependency1,
      const std::shared_ptr<SomeDependencyClass2>& pDependency2,
      const std::shared_ptr<SomeDependencyClass3>& pDependency3) :
    m_pDependency1(pDependency1),
    m_pDependency2(pDependency2),
    m_pDependency3(pDependency3)
  {
    // some other work for dependencies initialization
  }

  // lots of different methods

private:
  std::shared_ptr<SomeDependencyClass1> m_pDependency1;
  std::shared_ptr<SomeDependencyClass2> m_pDependency2;
  std::shared_ptr<SomeDependencyClass3> m_pDependency3;
};

It might look more awkward and even counter-intuitive, as it adds a lot of boilerplate: shared pointers, multiple constructor arguments, and the need to initialize the dependency classes somewhere externally. But when it comes to writing a test, you will thank yourself, because now you can easily mock those dependency classes and test just the SomeLessNastyBigClass functionality.
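Before reaching for a framework, the idea can be sketched with a hand-written stub. The names below (PriceSource, Checkout, FixedPriceSource) are hypothetical and stand in for a dependency and the class that uses it; this is only a minimal illustration of constructor injection, not the code from the example above.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical dependency: stands in for something expensive, such as a
// database connector. The method is virtual so that a test can override it.
class PriceSource
{
public:
  virtual std::uint32_t price_of(std::uint32_t itemId) const
  {
    // Imagine a real database query here.
    return itemId * 100;
  }

  virtual ~PriceSource() = default;
};

// Hypothetical class under test: it receives its dependency from outside
// instead of creating it in the constructor.
class Checkout
{
public:
  explicit Checkout(std::shared_ptr<PriceSource> pPrices) :
    m_pPrices(std::move(pPrices))
  {
  }

  std::uint32_t total(std::uint32_t itemId, std::uint32_t quantity) const
  {
    return m_pPrices->price_of(itemId) * quantity;
  }

private:
  std::shared_ptr<PriceSource> m_pPrices;
};

// Hand-written stub used only in tests: always reports the same price.
class FixedPriceSource : public PriceSource
{
public:
  std::uint32_t price_of(std::uint32_t) const override
  {
    return 5;
  }
};
```

A test would construct Checkout with std::make_shared<FixedPriceSource>() and check that total(123, 3) equals 15, without ever touching the real price lookup.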

Let’s see how we might do it using the googletest/googlemock framework.

Imagine that we want to mock SomeDependencyClass1 and the original class looks like this:

class SomeDependencyClass1
{
public:
  virtual uint32_t some_function_1(uint8_t param1) const;
  virtual std::string some_function_2(const std::string& param1,
                                      uint32_t param2);

  virtual ~SomeDependencyClass1();
};

First, we write a mock class, which is essentially a child class derived from SomeDependencyClass1. For that reason, the googlemock framework requires the parent class methods that we are going to override to be virtual, and the parent class itself should have a virtual destructor so that a child object is destroyed properly.
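The virtual destructor requirement can be illustrated without googlemock. The sketch below uses hypothetical Base and Derived classes and a string log (both assumptions made for illustration) to record which destructors run when a child object is destroyed through a pointer to its parent.

```cpp
#include <memory>
#include <string>

// Hypothetical parent class; the log records which destructors actually ran.
class Base
{
public:
  explicit Base(std::string& log) : m_log(log) {}
  virtual ~Base() { m_log += "~Base "; }

protected:
  std::string& m_log;
};

// Hypothetical child class, playing the role a mock class would play.
class Derived : public Base
{
public:
  explicit Derived(std::string& log) : Base(log) {}
  ~Derived() override { m_log += "~Derived "; }
};

// Destroys a Derived object through a pointer to Base and returns the log.
std::string destroy_through_base_pointer()
{
  std::string log;
  {
    std::unique_ptr<Base> p(new Derived(log));
  } // p goes out of scope here, so delete is called on a Base*
  return log;
}
```

Because ~Base() is virtual, the returned log is "~Derived ~Base ": the child destructor runs first, then the parent one. Without the virtual keyword, deleting a Derived object through a Base pointer is undefined behavior.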

Given all that, writing a mock is pretty straightforward:

#include <gmock/gmock.h>

class SomeDependencyClass1Mock : public SomeDependencyClass1
{
public:
  MOCK_CONST_METHOD1(some_function_1, uint32_t(uint8_t param1));
  MOCK_METHOD2(some_function_2, std::string(const std::string& param1,
                                            uint32_t param2));
};

The code is self-explanatory: we override both parent methods with special macros whose names encode the number of arguments and the constness of the method. Each macro takes the method name followed by its signature: return_type(arguments_list, …)

Going back to our unit test case, when we wish to test some of the SomeLessNastyBigClass’s methods, we just create an object of its type, passing a mock object when it asks for a dependency in the constructor:

auto pSomeDependencyClass1Mock = std::make_shared<SomeDependencyClass1Mock>();

SomeLessNastyBigClass testObject(pSomeDependencyClass1Mock,
                                 /*...other dependencies*/);

ASSERT_TRUE(testObject.some_method());

The mock we’ve just created is not just a dumb object which does nothing. We can specify a wide variety of expectations and behaviors depending on our needs. But by default it will just return the default value of the function’s return type and spam into stderr about the unexpected calls, which we can ignore for our simple test case.

More details on the googlemock configurations can be found in a cheatsheet.

Okay, so say we’ve refactored all the nasty code we had and written all the unit tests we were lacking. What about performance? Did we make everything testable, but terribly slow? Well... not exactly. As in many cases, the answer depends on your particular usage scenario.

Let’s use another Google framework, benchmark, to get numbers comparing virtual functions against non-virtual ones.

#include <benchmark/benchmark.h>
#include <memory>

class Interface
{
public:
  virtual int some_virtual_func() = 0;
  virtual ~Interface() {}
};

class Impl: public Interface
{
public:
  virtual int some_virtual_func() override
  {
    return ++m_nCounter;
  }

  int some_nonvirtual_func()
  {
    return ++m_nCounter;
  }

private:
  int m_nCounter = 0;
};

static void BM_VirtualFuncCreateEachTime(benchmark::State& state)
{
  while (state.KeepRunning()) {
    std::shared_ptr<Interface> p = std::make_shared<Impl>();
    benchmark::DoNotOptimize(p->some_virtual_func());
  }
}
BENCHMARK(BM_VirtualFuncCreateEachTime);

static void BM_NonVirtualFuncCreateEachTime(benchmark::State& state)
{
  while (state.KeepRunning()) {
    Impl impl;
    benchmark::DoNotOptimize(impl.some_nonvirtual_func());
  }
}
BENCHMARK(BM_NonVirtualFuncCreateEachTime);

static void BM_VirtualFuncCreateOnce(benchmark::State& state)
{
  std::shared_ptr<Interface> p = std::make_shared<Impl>();
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(p->some_virtual_func());
  }
}
BENCHMARK(BM_VirtualFuncCreateOnce);

static void BM_NonVirtualFuncCreateOnce(benchmark::State& state)
{
  Impl impl;
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(impl.some_nonvirtual_func());
  }
}
BENCHMARK(BM_NonVirtualFuncCreateOnce);

int main(int argc, char** argv)
{
  ::benchmark::Initialize(&argc, argv);
  ::benchmark::RunSpecifiedBenchmarks();
  return 0;
}

Here we simply create an abstract Interface class with a pure virtual function and a child Impl class with one overridden virtual function and one non-virtual function. Both functions do essentially the same thing (increment a counter and return the result).

We provide 4 options to benchmark:

  1. Initialize a shared pointer of Interface type with a shared pointer to Impl on every iteration of the benchmarking loop and call the virtual method. Note that std::make_shared allocates the object on the heap.
  2. Create a non-polymorphic object of Impl type on the stack on every iteration and call the non-virtual method.
  3. Create a polymorphic shared pointer once (outside the loop) and call the virtual method in the loop.
  4. Create a non-polymorphic object on the stack once and call the non-virtual method in the loop.

Tested on Lenovo Y70 laptop, Ubuntu 16.04, gcc 5.4, -O3:

Run on (8 X 3300 MHz CPU s)
2016-08-16 18:22:29
Benchmark                                Time           CPU Iterations
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------------
BM_VirtualFuncCreateEachTime            33 ns         33 ns   18817204
BM_NonVirtualFuncCreateEachTime          2 ns          2 ns  330188679
BM_VirtualFuncCreateOnce                 2 ns          2 ns  397727273
BM_NonVirtualFuncCreateOnce              2 ns          2 ns  324074074

Process finished with exit code 0

As you can see, options 2, 3 and 4 give the same result, 2 nanoseconds per iteration. Only the first option runs ~16 times slower.

My understanding is that the 1st option runs slower than the others just because it allocates heap memory inside the loop, literally each time. It’s not related to the virtual methods overhead at all, because we do not see any difference between options 3 and 4.

We could assume, however, that these results are purely down to compiler optimization, because the compiler might know in advance which polymorphic type is going to be used, based on our benchmark code. Let’s make it impossible for the compiler to guess by allocating an object based on a flag variable, which in my test was passed as an environment variable:

std::shared_ptr<Interface> p;

if(flag) {
  p = std::make_shared<Impl>();
} else {
  p = std::make_shared<SomeOtherImpl>();
}

The results after that change remain the same to the nanosecond, so I’m not going to post them twice.
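For completeness, here is one way the run-time selection could look as a self-contained sketch. SomeOtherImpl is not shown in the post, so its body below is an assumption, as are the make_interface and flag_from_environment helpers and the USE_IMPL variable name.

```cpp
#include <cstdlib>
#include <memory>

class Interface
{
public:
  virtual int some_virtual_func() = 0;
  virtual ~Interface() {}
};

class Impl : public Interface
{
public:
  int some_virtual_func() override { return ++m_nCounter; }

private:
  int m_nCounter = 0;
};

// Hypothetical second implementation; it counts down instead of up.
class SomeOtherImpl : public Interface
{
public:
  int some_virtual_func() override { return --m_nCounter; }

private:
  int m_nCounter = 0;
};

// Picks the implementation from a run-time flag, so the concrete type
// is not known when the calling code is compiled.
std::shared_ptr<Interface> make_interface(bool flag)
{
  if (flag) {
    return std::make_shared<Impl>();
  }
  return std::make_shared<SomeOtherImpl>();
}

// Reads the flag from an environment variable (the name is an assumption).
bool flag_from_environment()
{
  const char* value = std::getenv("USE_IMPL");
  return value != nullptr && value[0] == '1';
}
```

In the benchmark, the pointer returned by make_interface(flag_from_environment()) would replace the direct std::make_shared<Impl>() call.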

Obviously, this is a very synthetic example and your circumstances might be different. Compiler optimizations are not always predictable, so I would always suggest benchmarking if you’re not sure.

But in general, I believe it’s safe to say that it’s much better to have testable code than to hold on to old beliefs about the overhead of virtual methods, given that the benchmark can’t detect a difference at the nanosecond scale.

2 thoughts on “Virtual functions in C++ – thoughts on performance penalty”

  1. As far as I understand, there are only two ways to do dependency injection:
    1) pure interfaces
    2) everything DI must be a template parameter 🙂

    If you choose the second approach, you will not pay for a virtual function call, but it significantly increases the compile time :-))
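The second approach from the comment above can be sketched as follows. All names here (RealCounter, FakeCounter, Consumer) are hypothetical; the sketch only shows the general shape of passing a dependency as a template parameter.

```cpp
// Hypothetical production dependency with a plain, non-virtual method.
class RealCounter
{
public:
  int next() { return ++m_nCounter; }

private:
  int m_nCounter = 0;
};

// Hypothetical test double with the same method signature.
class FakeCounter
{
public:
  int next() { return 42; }
};

// The dependency type is a template parameter, so the call to next() is
// resolved at compile time and no virtual dispatch is involved.
template <typename Counter>
class Consumer
{
public:
  explicit Consumer(Counter& counter) : m_counter(counter) {}

  int take_two() { return m_counter.next() + m_counter.next(); }

private:
  Counter& m_counter;
};
```

Production code would instantiate Consumer<RealCounter>, while a test would instantiate Consumer<FakeCounter>. Each instantiation is compiled separately, which is where the extra compile time comes from.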
