I ended up using the "thunk table" concept that I outlined in the question. For each operation, there is a single thunk table instance. The table is static and is shared through a template, so the compiler automatically makes sure that there is only one table per operation type, not one per invocation. As a result, my objects have no virtual functions whatsoever.
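To illustrate the idea, here is a minimal sketch of what such a per-operation thunk table might look like. It is not my production code: the names (`CBase`, `CRect`, `CSquare`, `CAreaOp`, `CThunkTable`, `SumAreas`) are made up for the example, and it assumes that every object stores a small integer type id that can index the table.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base type: no virtual functions, only a type id used as table index.
struct CBase {
    int n_type_id;
    explicit CBase(int n_id) : n_type_id(n_id) {}
};

struct CRect : CBase {
    double w, h;
    CRect(double f_w, double f_h) : CBase(0), w(f_w), h(f_h) {}
};

struct CSquare : CBase {
    double a;
    explicit CSquare(double f_a) : CBase(1), a(f_a) {}
};

// Hypothetical operation: a functor with one overload per concrete type.
struct CAreaOp {
    double f_sum;
    CAreaOp() : f_sum(0) {}
    void operator ()(const CRect &r) { f_sum += r.w * r.h; }
    void operator ()(const CSquare &s) { f_sum += s.a * s.a; }
};

template <class COp>
class CThunkTable {
    typedef void (*TThunk)(const CBase*, COp&);

    // One thunk per concrete type; the call to op is bound at compile time.
    template <class CObject>
    static void Thunk(const CBase *p, COp &op)
    {
        op(*static_cast<const CObject*>(p));
    }

public:
    static void Dispatch(const CBase *p, COp &op)
    {
        // Function-local static: one table per COp instantiation, not per call.
        // The order of entries must match the type ids assigned above.
        static const TThunk p_table[] = { &Thunk<CRect>, &Thunk<CSquare> };
        p_table[p->n_type_id](p, op);
    }
};

double SumAreas()
{
    CRect rect(2.0, 3.0);
    CSquare square(2.0);
    const CBase *p_objects[] = { &rect, &square };
    CAreaOp op;
    for (std::size_t i = 0; i < 2; ++i)
        CThunkTable<CAreaOp>::Dispatch(p_objects[i], op);
    return op.f_sum; // 2 * 3 + 2 * 2
}
```

The only per-object cost in this sketch is the integer type id. The table itself lives in the template instantiation for the given operation.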
Most importantly, the speed gain from using plain function pointers instead of virtual functions is negligible (but it is not slower, either). What does gain a lot of speed is implementing a decision tree and linking all the functions statically: that improved the runtime of some not very compute-intensive code by about 40%.
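As a rough sketch of the decision tree idea (again, not the actual code), the type id can be resolved by a compile-time generated binary tree of comparisons whose leaves make statically bound calls that the compiler is free to inline. Everything here (`CItem`, `CDecisionTree`, `CSumOp`, `SumValues`) is hypothetical, and it assumes that the type ids form a contiguous range.

```cpp
#include <cassert>

struct CBase {
    int n_type_id;
    explicit CBase(int n_id) : n_type_id(n_id) {}
};

// Hypothetical family of final types, one per type id.
template <int n_id>
struct CItem : CBase {
    CItem() : CBase(n_id) {}
    int n_Value() const { return 10 * n_id; } // non-virtual
};

// Binary decision tree over the type ids in [n_first, n_last].
template <int n_first, int n_last>
struct CDecisionTree {
    enum { n_mid = (n_first + n_last) / 2 };

    template <class COp>
    static void Do(const CBase *p, COp &op)
    {
        if (p->n_type_id <= n_mid)
            CDecisionTree<n_first, n_mid>::Do(p, op);
        else
            CDecisionTree<n_mid + 1, n_last>::Do(p, op);
    }
};

// Leaf: the concrete type is known here, so the call is statically bound.
template <int n_id>
struct CDecisionTree<n_id, n_id> {
    template <class COp>
    static void Do(const CBase *p, COp &op)
    {
        op(*static_cast<const CItem<n_id>*>(p));
    }
};

struct CSumOp {
    int n_sum;
    CSumOp() : n_sum(0) {}

    template <class CObject>
    void operator ()(const CObject &r) { n_sum += r.n_Value(); }
};

int SumValues()
{
    CItem<0> a;
    CItem<2> b;
    CItem<3> c;
    const CBase *p_list[] = { &a, &b, &c };
    CSumOp op;
    for (int i = 0; i < 3; ++i)
        CDecisionTree<0, 3>::Do(p_list[i], op);
    return op.n_sum; // 0 + 20 + 30
}
```

Unlike a call through a function pointer, every call in this sketch is visible to the optimizer, which is presumably where the speedup comes from.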
An interesting side effect is being able to have "virtual" template functions, which is not usually possible.
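A small sketch of what is meant by that, under the same assumption of an integer type id stored in the objects: C++ does not allow `template <class T> virtual void Accept(T &visitor);`, but a dispatch function that is itself a template gets the same effect, because its table of thunks is instantiated separately for every visitor type. The names (`CInt`, `CPair`, `Accept`, `CSum`, `CCount`, `Demo`) are hypothetical.

```cpp
#include <cassert>

struct CBase {
    int n_type_id;
    explicit CBase(int n_id) : n_type_id(n_id) {}
};

struct CInt : CBase {
    int n;
    explicit CInt(int n_value) : CBase(0), n(n_value) {}
};

struct CPair : CBase {
    int a, b;
    CPair(int n_a, int n_b) : CBase(1), a(n_a), b(n_b) {}
};

// Behaves like a "virtual" member template: dispatches on the dynamic type
// of the object for any visitor type, without virtual functions.
template <class CVisitor>
void Accept(const CBase *p_object, CVisitor &visitor)
{
    typedef void (*TThunk)(const CBase*, CVisitor&);

    struct CThunks {
        static void ForInt(const CBase *p, CVisitor &v)
        {
            v(*static_cast<const CInt*>(p));
        }
        static void ForPair(const CBase *p, CVisitor &v)
        {
            v(*static_cast<const CPair*>(p));
        }
    };

    // One table per CVisitor instantiation; the order must match the type ids.
    static const TThunk p_table[] = { &CThunks::ForInt, &CThunks::ForPair };
    p_table[p_object->n_type_id](p_object, visitor);
}

// Two unrelated visitor types; neither derives from a common interface.
struct CSum {
    int n_sum;
    CSum() : n_sum(0) {}
    void operator ()(const CInt &r) { n_sum += r.n; }
    void operator ()(const CPair &r) { n_sum += r.a + r.b; }
};

struct CCount {
    int n_count;
    CCount() : n_count(0) {}

    template <class CObject>
    void operator ()(const CObject&) { ++n_count; }
};

bool Demo()
{
    CInt number(5);
    CPair pair(1, 2);
    const CBase *p_list[] = { &number, &pair };
    CSum sum;
    CCount count;
    for (int i = 0; i < 2; ++i) {
        Accept(p_list[i], sum);
        Accept(p_list[i], count);
    }
    return sum.n_sum == 8 && count.n_count == 2;
}
```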
One problem that I needed to solve was that all my objects needed to have some interface, as they would end up being accessed by some calls other than the functors. I devised a detached facade for that. A facade is an abstract class that declares the interface of the objects. A detached facade is an instance of this abstract class, specialized for a given object class (for every type in the list, operator [] returns the detached facade for the type of the selected item).
    class CDetachedFacade_Base {
    public:
        virtual void DoStuff(BaseType *pthis) = 0;
    };

    template <class ObjectType>
    class CDetachedFacade : public CDetachedFacade_Base {
    public:
        virtual void DoStuff(BaseType *pthis)
        {
            static_cast<ObjectType*>(pthis)->DoStuff();
            // statically linked, ObjectType is a final type
        }
    };

    class CMakeFacade {
        BaseType *pthis;
        CDetachedFacade_Base *pfacade;

    public:
        CMakeFacade(BaseType *p, CDetachedFacade_Base *f)
            :pthis(p), pfacade(f)
        {}

        inline void DoStuff()
        {
            pfacade->DoStuff(pthis); // forward to the facade chosen for this object's type
        }
    };
To use this, one needs to do:
    static CDetachedFacade<CMyObject> facade;
    // this is generated and stored in a templated table
    // this needs to be separate to avoid having to call operator new all the time

    CMyObject myobj;
    myobj.DoStuff(); // statically linked

    BaseType *obj = &myobj;
    //obj->DoStuff(); // can't do, BaseType does not have virtual functions

    CMakeFacade obj_facade(obj, &facade); // choose facade based on type id
    obj_facade.DoStuff(); // calls CMyObject::DoStuff()
This allows me to use the optimized thunk table in the high-performance portion of the code and still have polymorphically behaving objects that are convenient to handle where performance is not required.