C++ Techniques: Type-Erasure vs. Pure Polymorphism

Question 1

C++ style virtual method based polymorphism:

You have to use classes to hold your data.
Every class has to be built with your particular kind of polymorphism in mind.
Every class has a common binary-level dependency, which restricts how the compiler creates the instance of each class.
The data you are abstracting must explicitly describe an interface that describes your needs.

C++ style template based type erasure (with virtual method based polymorphism doing the erasure):

You have to use template to talk about your data.
Each chunk of data you are working on may be completely unrelated to other options.
The type erasure work is done within public header files, which bloats compile time.
Each type erased has its own template instantiated, which can bloat binary size.
The data you are abstracting need not be written as being directly dependent on your needs.

Now, which is better? Well, that depends if the above things are good or bad in your particular situation.

As an explicit example, std::function<...> uses type erasure which allows it to take function pointers, function references, output of a whole pile of template-based functions that generate types at compile time, myraids of functors which have an operator(), and lambdas. All of these types are unrelated to one another. And because they aren't tied to having a virtual operator(), when they are used outside of the std::function context the abstraction they represent can be compiled away. You couldn't do this without type erasure, and you probably wouldn't want to.

On the other hand, just because a class has a method called DoFoo, doesn't mean that they all do the same thing. With polymorphism, it isn't just any DoFoo you are calling, but the DoFoo from a particular interface.

As for your sample code... your GetSomeText should be virtual ... override in the polymorphism case.

There is no need to reference count just because you are using type erasure. There is no need not to use reference counting just because you are using polymorphsm.

Your Object could wrap T*s like how you stored vectors of raw pointers in the other case, with manual destruction of their contents (equivalent to having to call delete). Your Object could wrap a std::shared_ptr<T>, and in the other case you could have vector of std::shared_ptr<T>. Your Object could contain a std::unique_ptr<T>, equivalent to having a vector of std::unique_ptr<T> in the other case. Your Object's ObjectModel could extract copy constructors and assignment operators from the T and expose them to Object, allowing full-on value semantics for your Object, which corresponds to the a vector of T in your polymorphism case.

Question 2

Here's one view: The question seems to ask how one should choose between late binding ("runtime polymorphism") and early binding ("compile-time polymorphism").

As KerrekSB points out in his comments, there are some things you can do with late binding that it just isn't realistic to do with early binding. Many uses of the Strategy pattern (decoding network I/O) or the Abstract Factory pattern (runtime-selected class factories) fall into this category.

If both approaches are viable, then choosing is a matter of the trade offs involved. In C++ applications, the main tradeoffs I see between early and late binding are implementation maintainability, binary size, and performance.

There are at least some people who feel that C++ templates in any shape or form are impossible to comprehend. Or possibly have some other, less dramatic reservation with templates. C++ templates have many little gotchas ("when do I need to use the 'typename' and 'template' keywords?"), and non-obvious tricks (SFINAE comes to mind).

Another tradeoff is optimization. When you bind early, you give the compiler more information about your program, and so it can (potentially) do a better job optimizing. When you bind late, the compiler (probably) doesn't know ahead of time as much information -- some of that information may be in other compilation units, and so the optimizer can't do as much.

Another tradeoff is program size. In C++ at least, using "compile-time polymorphism" sometimes balloons binary size, as the compiler creates, optimizes, and emits different code for each used specialization. In contrast, when binding late, there's only one code path.

It's interesting to compare the same tradeoff being made in a different context. Take web applications, where one uses (some type of) polymorphism to deal with differences between browsers, and possibly for internationalization (i18n)/localization. Now, a hand-written JavaScript web application would likely use what amounts to late binding here, by having methods which detect capabilities at runtime to figure out what to do. Libraries like jQuery take this tack.

Another approach is to write different code for each possible browser/i18n possibility. While this sounds absurd, it is far from unheard of. The Google Web Toolkit uses this approach. GWT has its "deferred binding" mechanism, used to specialize the compiler's output to different browsers and different localizations. GWT's "deferred binding" mechanism uses early binding: The GWT Java-to-JavaScript compiler figures out all possible ways the polymorphism might be needed, and spits out an entirely different "binary" for each.

The tradeoffs are similar. Wrapping your head around how you extend GWT using deferred binding can be a headache and a half; Having knowledge at compile time allows GWT's compiler to optimize each specialization separately, possibly yielding better performance, and smaller size for each specialization; The whole of a GWT application can end up being many times the size of a comparable jQuery application, due to all of the precompiled specializations.

Question 3

One benefit to runtime generics that no-one here has mentioned (?) is the possibility for code that is generated and injected into a running application, to use the same List, Hashmap / Dictionary etc. that everything else in that application is already using. Why you'd want to do that, is another question.