Wrapping C++ in C: Derived to base conversions

https://stackoverflow.com/questions/9169614

26-04-2021

Question

I am wrapping a simple C++ inheritance hierarchy into "object-oriented" C. I'm trying to figure out if there are any gotchas in treating the pointers to C++ objects as pointers to opaque C structs. In particular, under what circumstances would the derived-to-base conversion cause problems?

The classes themselves are relatively complex, but the hierarchy is shallow and uses single-inheritance only:

// A base class with lots of important shared functionality
class Base {
    public:
    virtual void someOperation();
    // More operations...

    private:
    // Data...
};

// One of several derived classes
class FirstDerived: public Base {
    public:
    virtual void someOperation();
    // More operations...

    private:
    // More data...
};

// More derived classes of Base..

I am planning on exposing this to C clients via the following, fairly standard object-oriented C:

// Opaque pointers to the types
typedef struct base_t base_t;
typedef struct first_derived_t first_derived_t;

void base_some_operation(base_t* object) {
     Base* base = (Base*) object;
     base->someOperation();
}

first_derived_t* first_derived_create() {
     return (first_derived_t*) new FirstDerived();
}

void first_derived_destroy(first_derived_t* object) {
     FirstDerived* firstDerived = (FirstDerived*) object;
     delete firstDerived;
}

The C clients only pass around pointers to the underlying C++ objects and can only manipulate them via function calls. So the client can finally do something like:

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object); // Note the derived-to-base cast here
...

and have the virtual call to FirstDerived::someOperation() succeed as expected.

These classes are not standard-layout but do not use multiple or virtual inheritance. Is this guaranteed to work?

Note that I have control over all the code (C++ and the C wrapper), if that matters.

Solution

// Opaque pointers to the types
typedef struct base_t base_t;
typedef struct first_derived_t first_derived_t;

// **********************//
// inside C++ stub only. //
// **********************//

// Ensures you always cast to Base* first, then to void*,
// then to stub type pointer.  This enforces that you'll
// get a consistent address in the presence of inheritance.
template<typename T>
T * get_stub_pointer ( Base * object )
{
     return reinterpret_cast<T*>(static_cast<void*>(object));
}

// Recover (intermediate) Base* pointer from stub type.
Base * get_base_pointer ( void * object )
{
     return reinterpret_cast<Base*>(object);
}

// Get derived type pointer validating that it's actually
// the right type.  Returns null pointer if the type is
// invalid.  This ensures you can detect invalid use of
// the stub functions.
template<typename T>
T * get_derived_pointer ( void * object )
{
    return dynamic_cast<T*>(get_base_pointer(object));
}

// ***********************************//
// public C exports (stub interface). //
// ***********************************//

void base_some_operation(base_t* object)
{
     Base* base = get_base_pointer(object);
     base->someOperation();
}

first_derived_t* first_derived_create()
{
     return get_stub_pointer<first_derived_t>(new FirstDerived());
}

void first_derived_destroy(first_derived_t* object)
{
     FirstDerived * derived = get_derived_pointer<FirstDerived>(object);
     assert(derived != 0);

     delete derived;
}

This means that you can always perform a cast such as the following.

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object);

This is safe because the base_t* pointer will be cast to void*, then to Base*. This is one step less than what happened before. Notice the order:

FirstDerived*
Base* (via implicit static_cast<Base*>)
void* (via static_cast<void*>)
first_derived_t* (via reinterpret_cast<first_derived_t*>)
base_t* (via (base_t*), a C-style cast in the client that behaves like reinterpret_cast<base_t*>)
void* (via implicit static_cast<void*>)
Base* (via reinterpret_cast<Base*>)

For calls that wrap a FirstDerived method, you get an extra cast:

FirstDerived* (via dynamic_cast<FirstDerived*>)

OTHER TIPS

You can certainly make a C interface to some C++ code. All you need is extern "C", and I recommend a void * as your opaque data type:

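// library.h, for C clients

typedef void * Handle;

// extern "C" is C++-only syntax, so hide it from C compilers.
#ifdef __cplusplus
extern "C" {
#endif

Handle create_foo(void);
void destroy_foo(Handle);
int magic_foo(Handle, char const *);

#ifdef __cplusplus
}
#endif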

Then implement it in C++:

#include "library.h"
#include "foo.hpp"

Handle create_foo()
{
    Foo * p = nullptr;

    try { p = new Foo; }
    catch (...) { p = nullptr; }

    return p;
}

void destroy_foo(Handle p)
{
    delete static_cast<Foo*>(p);
}

int magic_foo(Handle p, char const * s)
{
    Foo * const f = static_cast<Foo*>(p);

    try
    {
        f->prepare();
        return f->count_utf8_chars(s);
    }
    catch (...)
    {
        errno = E_FOO_BAR;  // set the error code before returning
        return -1;
    }
}

Remember never to allow any exceptions to propagate through a calling C function!
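
One way to apply this rule consistently is to route every exported function through a single guard that turns exceptions into error codes. The following is a minimal sketch, not part of the answer above: guarded_call and the WRAP_* codes are hypothetical names chosen for illustration.

```cpp
#include <new>

// Hypothetical error codes returned to C callers.
enum { WRAP_OK = 0, WRAP_NO_MEMORY = 1, WRAP_FAILED = 2 };

// Runs body() and converts any exception into an error code,
// so nothing can escape into the calling C code.
template <typename F>
int guarded_call(F&& body) noexcept {
    try {
        body();
        return WRAP_OK;
    } catch (const std::bad_alloc&) {
        return WRAP_NO_MEMORY;
    } catch (...) {
        return WRAP_FAILED;
    }
}
```

An exported function could then wrap its body in a lambda, for example guarded_call([&] { f->prepare(); }), and return the resulting code to the C caller.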

This is the approach I've used in the past (perhaps as implied by Aaron's comment). Note that the same type names are used in both C and C++. All casts are done in C++, which gives good encapsulation irrespective of questions of legality. (Obviously you need delete functions as well.) Note that to call someOperation() with a Derived*, an explicit upcast to Base* via upcastToBase() is required. If Derived does not provide any new methods such as someOtherOperation, then you do not need to expose Derived* to clients at all, which avoids the client-side casts.

Header file: "BaseDerived.H"

#ifdef __cplusplus
extern "C"
{
#endif
    typedef struct Base Base;
    typedef struct Derived Derived;

    Derived* createDerived();
    Base* createBase();
    Base* upcastToBase(Derived* derived);
    Derived* tryDownCasttoDerived(Base* base);
    void someOperation(Base* base);
    void someOtherOperation(Derived* derived);
#ifdef __cplusplus
}
#endif

Implementation: "BaseDerived.CPP"

#include "BaseDerived.H"
#include <iostream>
struct Base 
{
    virtual void someOperation()
    {
        std::cout << "Base" << std::endl;
    }
};
struct Derived : public Base
{
public:
    virtual void someOperation()
    {
        std::cout << "Derived" << std::endl;
    }
private:
};

Derived* createDerived()
{
    return new Derived;
}

Base* createBase()
{
    return new Base;
}

Base* upcastToBase(Derived* derived)
{
    return derived;
}

Derived* tryDownCasttoDerived(Base* base)
{
    return dynamic_cast<Derived*>(base);
}

void someOperation(Base* base)
{
    base->someOperation();
}

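// Matches the someOtherOperation declaration in the header.
// The virtual call stands in for a Derived-only method.
void someOtherOperation(Derived* derived)
{
    derived->someOperation();
}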

I think these two lines are the nub of the question:

first_derived_t* object = first_derived_create();
base_some_operation((base_t*) object); // Note the derived-to-base cast here
...

There is no really safe way to allow this in the C code. In C, such a cast never changes the raw integral value of the pointer, but sometimes C++ casts will do so and therefore you need a design that never has any casts within the C code.
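
The address change is easiest to see with multiple inheritance. The sketch below uses hypothetical types (Left, Right, Both) that are not part of the question's hierarchy. The exact layout is up to the implementation, but on mainstream ABIs the second base subobject sits at a non-zero offset, so a proper C++ conversion yields a different address than reusing the pointer bits unchanged, which is all a cast between opaque struct pointers in C can do.

```cpp
#include <cstdint>

// Hypothetical hierarchy with two polymorphic, non-empty bases.
struct Left  { virtual ~Left() {}  int left_data = 1; };
struct Right { virtual ~Right() {} int right_data = 2; };
struct Both : Left, Right {};

// Address after a real derived-to-base conversion:
// the compiler adjusts the pointer to the Right subobject.
std::uintptr_t address_via_static_cast(Both* b) {
    return reinterpret_cast<std::uintptr_t>(static_cast<Right*>(b));
}

// Address when the pointer value is passed through unchanged,
// as happens when C code casts one opaque pointer type to another.
std::uintptr_t address_via_raw_cast(Both* b) {
    return reinterpret_cast<std::uintptr_t>(static_cast<void*>(b));
}
```

If C++ code later treats the unadjusted value as a Right*, it reads the wrong subobject. With single, non-virtual inheritance the two addresses usually coincide, but that is a property of the implementation's layout rather than something the C code can rely on.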

Here is one (overly complex?) solution. First, decide on a policy that the C code will always strictly deal with a value which is effectively a Base*. This is a somewhat arbitrary policy to ensure consistency. It means that the C++ code will sometimes have to do a dynamic_cast; we'll come to that later.

(You can make the design work correctly with C code simply by using casts, as has been mentioned by others. But I would be worried that the compiler will allow all sorts of crazy casts, such as (Derived1*) derived2_ptr or even casts to types in a different class hierarchy. My goal here is to enforce the proper object-oriented is-a relationship within the C code.)

Then, the C handle classes could be something like

struct base_t_ptr {
    void * this_; // holds the Base pointer
};
struct derived1_t_ptr {
    struct base_t_ptr get_base;
};

This should make it easy to get the effect of a cast in a concise and safe way. Note how we pass in object.get_base in this code:

struct derived1_t_ptr object = first_derived_create();
base_some_operation(object.get_base);

where the declaration of base_some_operation is

extern "C" void base_some_operation(struct base_t_ptr);

This will be quite type safe, as you won't be able to pass a derived1_t_ptr to this function without going via the .get_base data member. It will also help your C code to know a little about the types and which conversions are valid - you don't want to accidentally convert Derived1 to Derived2.

Then, when implementing the non-virtual methods defined only in a derived class, you'll need something like:

extern "C" void derived1_nonvirtual_operation(struct derived1_t_ptr); // The C-style interface. Type safe.

void derived1_nonvirtual_operation(struct derived1_t_ptr d) {
    // we *know* this refers to a Derived1 type, so we can trust these casts:
    Base * bp = static_cast<Base*>(d.get_base.this_);
    Derived1 *this_ = dynamic_cast<Derived1*>(bp);
    this_ -> some_operation();
}
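
Putting the pieces of this design together, a minimal end-to-end sketch might look like the following. The classes, the return values, derived1_create and base_destroy are assumptions made for illustration, as is the convention of returning -1 for a bad handle. The key point is that this_ always stores a Base*, so every function recovers the object the same way.

```cpp
#include <new>

// Hypothetical classes standing in for the real hierarchy.
class Base {
public:
    virtual ~Base() {}
    virtual int some_operation() { return 0; }
};
class Derived1 : public Base {
public:
    int some_operation() override { return 1; }
    int nonvirtual_operation() { return 11; }  // exists only on Derived1
};

// Handle types visible to C. this_ always stores a Base*.
struct base_t_ptr { void* this_; };
struct derived1_t_ptr { struct base_t_ptr get_base; };

extern "C" {
struct derived1_t_ptr derived1_create(void);
int base_some_operation(struct base_t_ptr b);
int derived1_nonvirtual_operation(struct derived1_t_ptr d);
void base_destroy(struct base_t_ptr b);
}

struct derived1_t_ptr derived1_create(void) {
    struct derived1_t_ptr handle = { { nullptr } };
    // Convert to Base* before erasing the type, so the stored
    // address is always that of the Base subobject.
    // nothrow new returns null instead of throwing on allocation failure.
    Base* bp = new (std::nothrow) Derived1;
    handle.get_base.this_ = bp;
    return handle;
}

int base_some_operation(struct base_t_ptr b) {
    Base* bp = static_cast<Base*>(b.this_);
    return bp ? bp->some_operation() : -1;
}

int derived1_nonvirtual_operation(struct derived1_t_ptr d) {
    Base* bp = static_cast<Base*>(d.get_base.this_);
    Derived1* self = dynamic_cast<Derived1*>(bp);  // null if not a Derived1
    return self ? self->nonvirtual_operation() : -1;
}

void base_destroy(struct base_t_ptr b) {
    // The virtual destructor ensures the derived destructor runs.
    delete static_cast<Base*>(b.this_);
}
```

A C client would call base_some_operation(object.get_base) for the shared operations and pass the whole derived1_t_ptr only to functions that need a Derived1, so passing the wrong handle type is a compile error rather than a silent bad cast.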

Licensed under: CC-BY-SA with attribution
