Why does the virtual keyword increase the size of a derived class?

https://stackoverflow.com/questions/10903596

12-06-2021

Question

I have two classes, a base class and one derived from it:

#include <iostream>

class base {
    int i;

public:
    virtual ~base() {}
};

class derived : virtual public base { int j; };

int main() { std::cout << sizeof(derived); }

Here the answer is 16. But if I use non-virtual public inheritance instead, or make the base class non-polymorphic, I get 12. That is, with either of the following:

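#include <iostream>

class base {
    int i;

public:
    virtual ~base() {}
};

// non-virtual inheritance from a polymorphic base
class derived : public base { int j; };

int main() { std::cout << sizeof(derived); }

or:

#include <iostream>

class base {
    int i;

public:
    ~base() {}
};

// virtual inheritance from a non-polymorphic base
class derived : virtual public base { int j; };

int main() { std::cout << sizeof(derived); }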

In both cases the answer is 12.

Can someone please explain why the size of the derived class differs between the first case and the other two?

(I use Code::Blocks 10.05, in case anyone needs to know.)

Solution

The point of virtual inheritance is to allow sharing of base classes. Here's the problem:

struct base { int member; virtual void method() {} };
struct derived0 : base { int d0; };
struct derived1 : base { int d1; };
struct join : derived0, derived1 {};
join j;
j.method();                 // ambiguous: which base?
j.member;                   // ambiguous
(base *)&j;                 // ambiguous
dynamic_cast<base *>(&j);   // ambiguous

The last 4 lines are all ambiguous. You have to say explicitly whether you want the base inside derived0 or the base inside derived1.

If you change the second and third lines as follows, the problem goes away:

struct derived0 : virtual base { int d0; };
struct derived1 : virtual base { int d1; };

Your j object now only has one copy of base, not two, so the last 4 lines stop being ambiguous.
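The difference between the two hierarchies can be observed directly by comparing the base addresses reached through each path. This is only a minimal sketch: the type names (`plain_base`, `shared_base`, `dup_join`, `shared_join`, and so on) and the helper `same_base` are hypothetical and chosen for illustration, not taken from the question.

```cpp
// Non-virtual diamond: each path carries its own copy of plain_base.
struct plain_base { int member = 0; };
struct plain_left  : plain_base { int d0 = 0; };
struct plain_right : plain_base { int d1 = 0; };
struct dup_join : plain_left, plain_right {};

// Virtual diamond: both paths lead to a single shared_base subobject.
struct shared_base { int member = 0; };
struct shared_left  : virtual shared_base { int d0 = 0; };
struct shared_right : virtual shared_base { int d1 = 0; };
struct shared_join : shared_left, shared_right {};

// True when the base reached through the left path is the same subobject
// as the base reached through the right path.
template <class Join, class Left, class Right, class Base>
bool same_base(Join& j) {
    Base* via_left  = static_cast<Left*>(&j);
    Base* via_right = static_cast<Right*>(&j);
    return via_left == via_right;
}

bool dup_join_shares_base() {
    dup_join j;
    // j.member would not compile here: two plain_base subobjects exist.
    return same_base<dup_join, plain_left, plain_right, plain_base>(j);
}

bool shared_join_shares_base() {
    shared_join j;
    j.member = 42;  // unambiguous: there is only one shared_base
    return same_base<shared_join, shared_left, shared_right, shared_base>(j);
}
```

Called from `main`, `dup_join_shares_base()` reports two distinct base subobjects, while `shared_join_shares_base()` reports a single shared one.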

But think about how that has to be implemented. Normally, in a derived0, d0 comes right after member, and in a derived1, d1 comes right after member. But with virtual inheritance, both share the same member, so d0 and d1 can't both come right after it. So you need some form of extra indirection. That's where the extra pointer comes from.

If you want to know exactly what the layout is, it depends on your target platform and compiler. Knowing only that the compiler is "gcc" isn't enough. But for many modern non-Windows targets, the answer is defined by the Itanium C++ ABI, which is documented at http://mentorembedded.github.com/cxx-abi/abi.html#vtable.

OTHER TIPS

There are two separate things here that cause extra overhead.

Firstly, having virtual functions in the base class increases its size by a pointer size (4 bytes in this case), because it needs to store the pointer to the virtual method table:

normal inheritance with virtual functions:

0        4       8       12
|      base      |
| vfptr  |  i    |   j   |

Secondly, with virtual inheritance, extra information is needed in derived to be able to locate base. With normal inheritance the offset between derived and base is a compile-time constant (0 for single inheritance). With virtual inheritance the offset can depend on the runtime type and the actual type hierarchy of the object. Implementations may vary, but Visual C++, for example, lays it out something like this:

virtual inheritance with virtual functions:

0        4         8        12        16
                   |      base        |
|  xxx   |   j     |  vfptr |    i    |

Here xxx is a pointer to a type information record that makes it possible to determine the offset to base.

And of course it's possible to have virtual inheritance without virtual functions:

virtual inheritance without virtual functions:

0        4         8        12
                   |  base  |
|  xxx   |   j     |   i    |
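The claim that the offset to a virtual base depends on the runtime type can be checked by measuring it. The following is a rough sketch with hypothetical names (`vbase`, `left`, `right`, `join`, `base_offset`); the actual numbers depend on the compiler and ABI, so none are asserted here.

```cpp
#include <cstddef>

struct vbase { int i = 0; virtual ~vbase() {} };
struct left  : virtual vbase { int d0 = 0; };
struct right : virtual vbase { int d1 = 0; };
struct join  : left, right { int k = 0; };

// Byte distance from a `left` subobject to its vbase subobject.
// The function only sees a left&, so the conversion to vbase* generally
// has to look the offset up at run time.
std::ptrdiff_t base_offset(left& l) {
    vbase* b = &l;
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(&l);
}

// Offset when the complete object really is a `left`.
std::ptrdiff_t offset_in_complete_left() {
    left l;
    return base_offset(l);
}

// Offset when the `left` is a base subobject of a `join`.
std::ptrdiff_t offset_in_join() {
    join j;
    return base_offset(j);
}
```

On common implementations the two functions return different values, because in a `join` the `right` subobject and `k` have to be placed somewhere too, which moves the shared `vbase` further away from `left`.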

If a class has any virtual function, objects of this class need a vptr, that is, a pointer to the vtable, the table in which the address of the correct virtual function can be found. Which function is called depends on the dynamic type of the object, that is, the most derived class of which the object is a base subobject.

Because the derived class inherits virtually from a base class, the location of the base class relative to the derived class is not fixed; it depends on the dynamic type of the object too. With gcc, a class with virtual base classes needs a vptr to locate the base classes (even if there is no virtual function).

Also, the base class contains a data member, which is located just after the base class vptr. Base class memory layout is: { vptr, int }

If a base class needs a vptr, a class derived from it will need a vptr too, but often the "first" vptr of a base class subobject is reused (the base class whose vptr is reused is called the primary base). However, this is not possible in this case, because the derived class needs a vptr not only to determine how to call the virtual function, but also to find where the virtual base is. The derived class cannot locate its virtual base class without using the vptr; if the virtual base class were used as the primary base, the derived class would need to locate its primary base to read the vptr, and would need to read the vptr to locate its primary base.

So the derived cannot have a primary base, and it introduces its own vptr.

The layout of a base class subobject of type derived is thus: { vptr, int } with the vptr pointing to a vtable for derived, containing not only the address of virtual functions, but also the relative location of all its virtual base classes (here just base), represented as an offset.

The layout of a complete object of type derived is: { base class subobject of type derived, base }

So the minimum possible size of derived is (2 int + 2 vptr), or 4 words on common architectures where ptr = int = word, which is 16 bytes in this case. (Visual C++ makes bigger objects when virtual base classes are involved; I believe a derived would have one more pointer.)

So yes, virtual functions have a cost, and virtual inheritance has a cost. The memory cost of virtual inheritance in this case is one more pointer per object.
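To see both costs side by side on a particular compiler, one can print the sizes of all four combinations. This is a minimal sketch with hypothetical type names; the exact figures depend on pointer size, alignment, and ABI, so no output is claimed here.

```cpp
#include <iostream>

struct plain_base { int i; };                          // no virtual functions
struct poly_base  { int i; virtual ~poly_base() {} };  // carries a vptr

struct plain_derived     : plain_base         { int j; };  // no overhead
struct poly_derived      : poly_base          { int j; };  // virtual functions only
struct virt_derived      : virtual plain_base { int j; };  // virtual inheritance only
struct virt_poly_derived : virtual poly_base  { int j; };  // both, as in the question

// Prints the size of each variant next to the pointer size for comparison.
void print_sizes() {
    std::cout << "pointer:           " << sizeof(void*) << '\n'
              << "plain_derived:     " << sizeof(plain_derived) << '\n'
              << "poly_derived:      " << sizeof(poly_derived) << '\n'
              << "virt_derived:      " << sizeof(virt_derived) << '\n'
              << "virt_poly_derived: " << sizeof(virt_poly_derived) << '\n';
}
```

Calling `print_sizes()` from `main` on a 32-bit target like the one in the question should reproduce the 12 versus 16 difference; on 64-bit targets the numbers are larger because of the wider pointers and padding.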

In designs with many virtual base classes, the memory cost per object might be proportional to the number of virtual base classes, or not; we would need to discuss specific class hierarchies to estimate the cost.

In designs without multiple inheritance or virtual base classes (or even virtual functions), you might have to emulate many of the things the compiler would otherwise do for you automatically, with a bunch of pointers, possibly pointers to functions, possibly offsets... this could get confusing and error-prone.
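As an illustration of what such emulation might look like, here is a small hand-rolled dispatch table. Everything in it (`shape`, `shape_vtable`, `make_rect`, `make_tri`, `area`) is a hypothetical example, not something from the question, and it covers only virtual functions, not virtual bases.

```cpp
struct shape;

// A hand-written "vtable": one table of function pointers per concrete type.
struct shape_vtable {
    int (*area)(const shape*);
};

struct shape {
    const shape_vtable* vptr;  // the pointer the compiler would normally add
    int w;
    int h;
};

int rect_area(const shape* s) { return s->w * s->h; }
int tri_area(const shape* s)  { return s->w * s->h / 2; }

const shape_vtable rect_vtable = { rect_area };
const shape_vtable tri_vtable  = { tri_area };

// "Constructors" must remember to install the right table by hand.
shape make_rect(int w, int h) { return shape{ &rect_vtable, w, h }; }
shape make_tri(int w, int h)  { return shape{ &tri_vtable, w, h }; }

// Dynamic dispatch: follow the object's vptr to the function for its type.
int area(const shape& s) { return s.vptr->area(&s); }
```

Each object still pays for one pointer, just as with a compiler-generated vptr, and forgetting to set `vptr` correctly is exactly the kind of mistake the language feature prevents.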

What's going on is the extra overhead used to mark a class as having virtual members or involving virtual inheritance. How much extra depends on the compiler.

A word of caution: deriving from a class whose destructor is not virtual is usually asking for trouble. Big trouble. Deleting a derived object through a pointer to such a base is undefined behavior.
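The safe case can be sketched as follows; the names (`safe_base`, `safe_derived`, `destroyed_derived`) are hypothetical. The unsafe variant is deliberately not shown as running code, since deleting through a base pointer without a virtual destructor is undefined behavior.

```cpp
int destroyed_derived = 0;  // counts how often ~safe_derived has run

struct safe_base {
    virtual ~safe_base() {}  // virtual: deletion through safe_base* is fine
};

struct safe_derived : safe_base {
    ~safe_derived() override { ++destroyed_derived; }
};

// Creates a safe_derived, deletes it through a base pointer, and returns
// how many times the derived destructor ran.
int destroy_through_base_pointer() {
    destroyed_derived = 0;
    safe_base* p = new safe_derived;
    delete p;  // dispatches to ~safe_derived, then runs ~safe_base
    return destroyed_derived;
}
```

Because the destructor is virtual, `delete p` finds the derived destructor through the vptr, which is one more reason a polymorphic base needs that pointer.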

Possibly the extra 4 bytes are needed to record the class type at run time. For example:

#include <iostream>

class A {
public:
    virtual int f() { return 2; }
};

class B : virtual public A {
public:
    virtual int f() { return 3; }
};

int call_function(A *a) {
    // Here we don't know what a really is (A or B),
    // so to call the correct method we need some run-time
    // knowledge of the type and storage space to put it in (the extra 4 bytes).
    return a->f();
}

int main() {
    B b;
    A *a = &b;

    std::cout << call_function(a);
}

The extra size is due to the vtable pointer that is "invisibly" added to your class. It points to the vtable, which holds the member function pointers for objects of this class or its descendants/ancestors.

If that isn't clear, you'll need to do much more reading about virtual inheritance in C++.

Licensed under: CC-BY-SA with attribution
