C++: Class specialization a valid transformation for a conforming compiler?

https://stackoverflow.com/questions/15148425

16-03-2022
|

Pergunta

Hopefully this isn't too specialized of a question for StackOverflow: if it is and could be migrated elsewhere let me know...

Many moons ago, I wrote a undergraduate thesis proposing various devirtualization techniques for C++ and related languages, generally based on the idea of precompiled specialization of code paths (somewhat like templates) but with checks to choose the correct specializations are chosen at runtime in cases where they cannot be selected at compile-time (as templates must be).

The (very) basic idea is something like the following...suppose you have a class C like the following:

class C : public SomeInterface
{
public:
    C(Foo * f) : _f(f) { }

    virtual void quack()
    {
        _f->bark();
    }

    virtual void moo()
    {
        quack(); // a virtual call on this because quack() might be overloaded
    }

    // lots more virtual functions that call virtual functions on *_f or this

private:
    Foo * const _f; // technically doesn't have to be const explicitly
                    // as long as it can be proven not be modified
};

And you knew that there exist concrete subclasses of Foo like FooA, FooB, etc, with known complete types (without necessarily having an exhaustive list), then you could precompile specialized versions of C for some selected subclasses of Foo, like, for example (note the constructor is not included here, purposely, since it won't be called):

class C_FooA final : public SomeInterface
{
public:
    virtual void quack() final
    {
        _f->FooA::bark(); // non-polymorphic, statically bound
    }

    virtual void moo() final
    {
        C_FooA::quack(); // also static, because C_FooA is final
        // _f->FooA::bark(); // or you could even do this instead
    }

    // more virtual functions all specialized for FooA (*_f) and C_FooA (this)

private:
    FooA * const _f;
};

And replace the constructor of C with something like the following:

C::C(Foo * f) : _f(f)
{
    if(f->vptr == vtable_of_FooA) // obviously not Standard C++
        this->vptr = vtable_of_C_FooA; 
    else if(f->vptr == vtable_of_FooB)
        this->vptr = vtable_of_C_FooB;
    // otherwise leave vptr unchanged for all other values of f->vptr
}

So basically, the dynamic type of the object being constructed is changed based on the dynamic type of the arguments to its constructor. (Note, you can't do this with templates because you can only create a C<Foo> if you know the type of f at compile-time). From now on, any call to FooA::bark() through C::quack() only involves one virtual call: either the call to C::quack() is statically bound to the non-specialized version which dynamically calls FooA::bark(), or the call to C::quack() is dynamically forwarded to C_FooA::quack() which statically calls FooA::bark(). Furthermore, dynamic dispatch might be eliminated completely in some cases if the flow analyzer has enough information to make a static call to C_FooA::quack(), which could be very useful in a tight loop if it allows inlining. (Although technically at that point you'd probably be OK even without this optimization...)

(Note that this transformation is safe, although less useful, even if _f is non-const and protected instead of private and C is inherited from a different translation unit...the translation unit creating the vtable for the inherited class won't know anything at all about the specializations and the constructor of the inherited class will just set the this->vptr to its own vtable, which will not reference any specialized functions because it won't know anything about them.)

This might seem like a lot of effort to eliminate one level of indirection, but the point is that you can do it to any arbitrary nesting level (any depth of virtual calls following this pattern could be reduced to one) based only on local information within a translation unit, and do it in a way that's resilient even if new types are defined in other translation units that you don't know about...you just might add a lot of code bloat that you wouldn't have otherwise if you did it naively.

Anyway, independent of whether this kind of optimization would really have enough bang-for-the-buck be worth the effort of implementation and also worth the space overhead in the resulting executable, my question is, is there anything in Standard C++ which would prevent a compiler from performing such a transformation?

My feeling is no, since the standard doesn't specify at all how virtual dispatch is done or how pointers-to-member-functions are represented. I'm pretty sure there's nothing about the RTTI mechanism preventing C and C_FooA from masquerading as the same type for all purposes, even if they have different virtual tables. The only other thing I could think of that could possibly matter is some close reading of the ODR, but probably not.

Am I overlooking something? Barring ABI/linking issues, would transformations like this be possible without breaking conforming C++ programs? (Furthermore, if yes, could this be done currently with the Itanium and/or MSVC ABIs? I'm fairly sure the answer there is yes, as well, but hopefully someone can confirm.)

EDIT: Does anyone know if anything like this is implemented in any mainstream compiler/JIT for C++, Java, or C#? (See discussion and linked chat in the comments below...) I'm aware JITs do speculative static-binding/inlining of virtuals directly at call sites, but I don't know if they do anything like this (with entirely new vtables being generated and chosen based on a single type check done at the constructor, rather than at each call site).

Solução

Is there anything in Standard C++ which would prevent a compiler from performing such a transformation?

Not if you're sure the observable behavior is unchanged - that's the "as-if rule" which is Standard section 1.9.

But this might make proving that your transformation is correct pretty difficult: 12.7/4:

When a virtual function is called directly or indirectly from a constructor (including the mem-initializer or brace-or-equal-initializer for a non-static data member) or from a destructor, and the object to which the call applies is the object under construction or destruction, the function called is the one defined in the constructor or destructor's own class or in one of its bases, but not a function overriding it in a class derived from the constructor or destructor's own class, or overriding it in one of the other base classes of the most derived object.

So if the destructor Foo::~Foo() happens to directly or indirectly call C::quack() on an object c, where c._f points at the object being destroyed, you need to call Foo::bark(), even if _f was a FooA when you constructed object c.

Outras dicas

On first reading, this sounds like a c++-focused variation of polymorphic inline caching. I think that V8 and Oracle's JVM both use it, and I know that .NET does.

To answer your original question: I don't think there's anything in the standard that forbids these kinds of implementations. C++ takes the "as-is" rule quite seriously; so long as you faithfully implement the right semantics, you can do the implementation in any crazy way you like. c++ virtual calls aren't very complicated, so I doubt you'd trip over any edge cases there either (unlike if, say, you were trying to do something clever with static binding).

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow