What is the exact problem with multiple inheritance?

https://stackoverflow.com/questions/225929

03-07-2019
|

Question

I can see people asking all the time whether multiple inheritance should be included into the next version of C# or Java. C++ folks, who are fortunate enough to have this ability, say that this is like giving someone a rope to eventually hang themselves.

What’s the matter with multiple inheritance? Are there any concrete samples?

Solution

The most obvious problem is with function overriding.

Let's say have two classes A and B, both of which define a method doSomething. Now you define a third class C, which inherits from both A and B, but you don't override the doSomething method.

When the compiler seed this code...

C c = new C();
c.doSomething();

...which implementation of the method should it use? Without any further clarification, it's impossible for the compiler to resolve the ambiguity.

Besides overriding, the other big problem with multiple inheritance is the layout of the physical objects in memory.

Languages like C++ and Java and C# create a fixed address-based layout for each type of object. Something like this:

class A:
    at offset 0 ... "abc" ... 4 byte int field
    at offset 4 ... "xyz" ... 8 byte double field
    at offset 12 ... "speak" ... 4 byte function pointer

class B:
    at offset 0 ... "foo" ... 2 byte short field
    at offset 2 ... 2 bytes of alignment padding
    at offset 4 ... "bar" ... 4 byte array pointer
    at offset 8 ... "baz" ... 4 byte function pointer

When the compiler generates machine code (or bytecode), it uses those numeric offsets to access each method or field.

Multiple inheritance makes it very tricky.

If class C inherits from both A and B, the compiler has to decide whether to layout the data in AB order or in BA order.

But now imagine that you're calling methods on a B object. Is it really just a B? Or is it actually a C object being called polymorphically, through its B interface? Depending on the actual identity of the object, the physical layout will be different, and its impossible to know the offset of the function to invoke at the call-site.

The way to handle this kind of system is to ditch the fixed-layout approach, allowing each object to be queried for its layout before attempting to invoke the functions or access its fields.

So...long story short...it's a pain in the neck for compiler authors to support multiple inheritance. So when someone like Guido van Rossum designs python, or when Anders Hejlsberg designs c#, they know that supporting multiple inheritance is going to make the compiler implementations significantly more complex, and presumably they don't think the benefit is worth the cost.

OTHER TIPS

The problems you guys mention are not really that hard to solve. In fact e.g. Eiffel does that perfectly well! (and without introducing arbitrary choices or whatever)

E.g. if you inherit from A and B, both having method foo(), then of course you don't want an arbitrary choice in your class C inheriting from both A and B. You have to either redefine foo so it's clear what will be used if c.foo() is called or otherwise you have to rename one of the methods in C. (it could become bar())

Also I think that multiple inheritance is often quite useful. If you look at libraries of Eiffel you'll see that it's used all over the place and personally I've missed the feature when I had to go back to programming in Java.

The diamond problem:

an ambiguity that arises when two classes B and C inherit from A, and class D inherits from both B and C. If there is a method in A that B and C have overridden, and D does not override it, then which version of the method does D inherit: that of B, or that of C?

...It is called the "diamond problem" because of the shape of the class inheritance diagram in this situation. In this case, class A is at the top, both B and C separately beneath it, and D joins the two together at the bottom to form a diamond shape...

Multiple inheritance is one of those things that is not used often, and can be misused, but is sometimes needed.

I never understood not adding a feature, just because it might be misused, when there are no good alternatives. Interfaces are not an alternative to multiple inheritance. For one, they don't let you enforce preconditions or postconditions. Just like any other tool, you need to know when it is appropriate to use, and how to use it.

let's say you have objects A and B which are both inherited by C. A and B both implement foo() and C does not. I call C.foo(). Which implementation gets chosen? There are other issues, but this type of thing is a big one.

The main problem with multiple inheritance is nicely summed up with tloach's example. When inheriting from multiple base classes that implement the same function or field it's the compiler has to make a decision about what implementation to inherit.

This get's worse when you inherit from multiple classes that inherit from the same base class. (diamond inheritance, if you draw the inheritance tree you get a diamond shape)

These problems are not really problematic for a compiler to overcome. But the choice the compiler has to make here are rather arbitrary, this make code far less intuitive.

I find that when doing good OO design I never need multiple inheritance. In cases I do need it I usually find I've been using inheritance to reuse functionality while inheritance is only appropriate for "is-a" relations.

There are other techniques like mixins that solve the same problems and don't have the issues that multiple inheritance has.

I don't think the diamond problem is a problem, I would consider that sophistry, nothing else.

The worst problem, from my point of view, with multiple inheritance is RAD - victims and people who claim to be developers but in reality are stuck with half - knowledge (at best).

Personally, I would be very happy if I could finally do something in Windows Forms like this (it's not correct code, but it should give you the idea):

public sealed class CustomerEditView : Form, MVCView<Customer>

This is the main issue I have with having no multiple inheritance. You CAN do something similar with interfaces, but there is what I call "s*** code", it's this painful repetitive c*** you have to write in each of your classes to get a data context, for example.

In my opinion, there should be absolutely no need, not the slightest, for ANY repetition of code in a modern language.

The Common Lisp Object System (CLOS) is another example of something that supports MI while avoiding the C++-style problems: inheritance is given a sensible default, while still allowing you the freedom to explicitly decide how exactly to, say, call a super's behaviour.

There is nothing wrong in multiple inheritance itself. The problem is to add multiple inheritance to a language that was not designed with multiple inheritance in mind from the start.

The Eiffel language is supporting multiple inheritance without restrictions in a very efficient and productive way but the language was designed from that start to support it.

This feature is complex to implement for compiler developers, but it seems that that drawback could be compensated by the fact that a good multiple inheritance support could avoid the support of other features (i.e. no need for Interface or Extension Method).

I think that supporting multiple inheritance or not is more a matter of choice, a matter of priorities. A more complex feature takes more time to be correctly implemented and operational and may be more controversial. The C++ implementation may be the reason why multiple inheritance was not implemented in C# and Java...

One of the design goals of frameworks like Java and .NET is to make it possible for code which is compiled to work with one version of a pre-compiled library, to work equally well with subsequent versions of that library, even if those subsequent versions add new features. While the normal paradigm in languages like C or C++ is to distribute statically-linked executables that contain all of the libraries they need, the paradigm in .NET and Java is to distribute applications as collections of components that are "linked" at run-time.

The COM model which preceded .NET attempted to use this general approach, but it didn't really have inheritance--instead, each class definition effectively defined both a class and an interface of the same name which contained all its public members. Instances were of the class type, while references were of the interface type. Declared a class as deriving from another was equivalent to declaring a class as implementing the other's interface, and required the new class to re-implement all public members of the classes from which one derived. If Y and Z derive from X, and then W derives from Y and Z, it won't matter if Y and Z implement X's members differently, because Z won't be able to use their implementations--it will have to define its own. W might encapsulate instances of Y and/or Z, and chain its implementations of X's methods through theirs, but there would be no ambiguity as to what X's methods should do--they'd do whatever Z's code explicitly directed them to do.

The difficulty in Java and .NET is that code is allowed to inherit members and have accesses to them implicitly refer to the parent members. Suppose one had classes W-Z related as above:

class X { public virtual void Foo() { Console.WriteLine("XFoo"); }
class Y : X {};
class Z : X {};
class W : Y, Z  // Not actually permitted in C#
{
  public static void Test()
  {
    var it = new W();
    it.Foo();
  }
}

It would seem like W.Test() should creating an instance of W call the implementation of virtual method Foo defined in X. Suppose, however, that Y and Z were actually in a separately-compiled module, and although they were defined as above when X and W were compiled, they were later changed and recompiled:

class Y : X { public override void Foo() { Console.WriteLine("YFoo"); }
class Z : X { public override void Foo() { Console.WriteLine("ZFoo"); }

Now what should be the effect of calling W.Test()? If the program had to be statically linked before distribution, the static link stage might be able to discern that while the program had no ambiguity before Y and Z were changed, the changes to Y and Z have made things ambiguous and the linker could refuse to build the program unless or until such ambiguity is resolved. On the other hand, it's possible that the person who has both W and the new versions of Y and Z is someone who simply wants to run the program and has no source code for any of it. When W.Test() runs, it would no longer be clear what W.Test() should do, but until the user tried to run W with the new version of Y and Z there would be no way any part of the system could recognize there was a problem (unless W was considered illegitimate even before the changes to Y and Z).

The diamond is not a problem, as long as you don’t use anything like C++ virtual inheritance: in normal inheritance each base class resembles a member field (actually they are laid out in RAM this way), giving you some syntactic sugar and an extra ability to override more virtual methods. That may impose some ambiguity at compile-time but that’s usually easy to solve.

On the other hand, with the virtual inheritance it too easily goes out of control (and then becomes a mess). Consider as an example a “heart” diagram:

  A       A
 / \     / \
B   C   D   E
 \ /     \ /
  F       G
    \   /
      H

In C++ it is entirely impossible: as soon as F and G are merged into a single class, their As are merged too, period. That means you may never consider base classes opaque in C++ (in this example you have to construct A in H so you have to know that it present somewhere in the hierarchy). In other languages it may work, however; for example, F and G could explicitly declare A as “internal,” thus forbidding consequent merging and effectively making themselves solid.

Another interesting example (not C++-specific):

  A
 / \
B   B
|   |
C   D
 \ /
  E

Here, only B uses virtual inheritance. So E contains two Bs that share the same A. This way, you can get an A* pointer that points to E, but you can’t cast it to a B* pointer although the object is actually B as such cast is ambiguous, and this ambiguity can’t be detected at compile time (unless the compiler sees the whole program). Here is the test code:

struct A { virtual ~A() {} /* so that the class is polymorphic */ };
struct B: virtual A {};
struct C: B {};
struct D: B {};
struct E: C, D {};

int main() {
        E data;
        E *e = &data;
        A *a = dynamic_cast<A *>(e); // works, A is unambiguous
//      B *b = dynamic_cast<B *>(e); // doesn't compile
        B *b = dynamic_cast<B *>(a); // NULL: B is ambiguous
        std::cout << "E: " << e << std::endl;
        std::cout << "A: " << a << std::endl;
        std::cout << "B: " << b << std::endl;
// the next casts work
        std::cout << "A::C::B: " << dynamic_cast<B *>(dynamic_cast<C *>(e)) << std::endl;
        std::cout << "A::D::B: " << dynamic_cast<B *>(dynamic_cast<D *>(e)) << std::endl;
        std::cout << "A=>C=>B: " << dynamic_cast<B *>(dynamic_cast<C *>(a)) << std::endl;
        std::cout << "A=>D=>B: " << dynamic_cast<B *>(dynamic_cast<D *>(a)) << std::endl;
        return 0;
}

Moreover, the implementation may be very complex (depends on language; see benjismith’s answer).

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow