Question

If you take an existing C code base and compile it with a C++ compiler, what sort of issues can you expect to crop up? For example, I think that assigning an integer to a value with an enumerated type will fail in C++, whereas it's legal (if a bit nasty) in C.

If I don't wrap all my C files in extern "C" { ... }, am I going to get name-mangling where I least expect it? Is there some reason why I really shouldn't do this?

For background, we have a very large code base written in C. For a few years we've been jumping through hoops to do things that would come naturally via C++ (homebrew inheritance, for example). We'd like to start moving towards C++, but in a gradual fashion: getting our CORBA-like framework to support it, and refactoring modules as we go along to take advantage of the more natural approach C++ would provide.

Solution

I've done something like this once. The main source of problems was that C++ is stricter about types, as you suspected. You'll have to add casts wherever a void* is assigned to a pointer of another type, for example when allocating memory:

Foo *foo;
foo = malloc(sizeof(*foo));

The above is typical C code, but it'll need a cast in C++:

Foo *foo;
foo = (Foo*)malloc(sizeof(*foo));

There are new reserved words in C++, such as "class", "and", "bool", "catch", "delete", "explicit", "mutable", "namespace", "new", "operator", "or", "private", "protected", "friend", etc. These cannot be used as variable names, for example.
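The usual fix is a rename. As a rough sketch, here is a struct whose fields were called class and new in C, renamed so the file builds as either language. The names Item, item_class, is_new and item_describe are made up for illustration and don't come from any real code base.

```cpp
#include <stdio.h>

/* Original C version (rejected by a C++ compiler because `class` and
 * `new` are keywords there):
 *
 *     struct Item { int class; int new; };
 */

/* Renamed version: valid in both C and C++. */
struct Item {
    int item_class;  /* was: class */
    int is_new;      /* was: new   */
};

/* Formats an item into buf and returns the number of characters written. */
int item_describe(const struct Item *item, char *buf, size_t size)
{
    return snprintf(buf, size, "class=%d new=%d", item->item_class, item->is_new);
}
```

Every use of the old field names has to change too, so a compiler run over the whole tree is the easiest way to find them all.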

The above are probably the most common problems when you compile old C code with a C++ compiler. For a complete list of incompatibilities, see Incompatibilities Between ISO C and ISO C++.

You also ask about name mangling. In the absence of extern "C" wrappers, the C++ compiler will mangle the symbols. That's not a problem as long as everything is built with a C++ compiler and you don't rely on dlsym() or something similar to pull symbols from libraries by name.
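To illustrate the difference, here is a small sketch with two hypothetical plugin functions (plugin_add and plugin_mul are invented names). The mangled name in the comment assumes the Itanium C++ ABI used by GCC and Clang; other compilers mangle differently.

```cpp
/* Hypothetical plugin entry points; names are for illustration only. */

/* C linkage: exported under the plain name "plugin_add", so a lookup such
 * as dlsym(handle, "plugin_add") can find it. */
extern "C" int plugin_add(int a, int b)
{
    return a + b;
}

/* C++ linkage: the exported symbol is mangled (for example
 * "_Z10plugin_mulii" under the Itanium ABI), so looking up "plugin_mul"
 * by its plain name would fail. */
int plugin_mul(int a, int b)
{
    return a * b;
}
```

Both functions are called the same way from C++ source; the difference only shows up in the symbol table, which is what the linker and dlsym() look at.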

OTHER TIPS

See Incompatibilities between ISO C and ISO C++ for a very detailed list of all of the incompatibilities. There are a lot of subtle issues, including some which don't immediately manifest in a compiler error. For example, one issue that can be a problem is the size of character constants:

// In C, prints sizeof(int), typically 4.  In C++, prints 1
printf("%d\n", (int)sizeof('A'));

If I don't wrap all my C files in extern "C" { ... }, am I going to get name-mangling where I least expect it?

It bites you when you try to link together C and C++.

I've written a lot of header files containing:

#ifdef __cplusplus
    extern "C" {
#endif

// rest of file

#ifdef __cplusplus
    }
#endif

After a while it merges into the existing multiple-include boilerplate and you stop seeing it. But you do have to be careful where you put it - usually it belongs after any includes your header does.
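Put together with an include guard, a header might look roughly like this. WIDGET_H and widget_area are hypothetical names, and the definition at the bottom would normally live in a .c file; it's only here so the sketch is complete.

```cpp
#ifndef WIDGET_H          /* usual multiple-include guard */
#define WIDGET_H

#include <stddef.h>       /* includes stay outside the extern "C" block */

#ifdef __cplusplus
extern "C" {
#endif

/* Declarations seen with C linkage by both C and C++ callers. */
size_t widget_area(size_t width, size_t height);

#ifdef __cplusplus
}
#endif

#endif /* WIDGET_H */

/* Normally in widget.c; shown here so the sketch is complete. Because the
 * declaration above is in scope, this definition also gets C linkage. */
size_t widget_area(size_t width, size_t height)
{
    return width * height;
}
```

Keeping the #include lines above the extern "C" block means the included headers decide their own linkage rather than inheriting yours.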

Is there some reason why I really shouldn't do this?

If you know for sure you aren't going to combine C and C++ then there's no reason to do it that I know of. But with the gradual migration you describe, it's essential for anything with a published interface that both C components and C++ components need to use.

The big reason not to do it is that it prevents you from overloading functions (at least, in those headers). You might find you want to do that once you've migrated all your code to C++ and started maintaining/refactoring/extending it.
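One possible middle ground, sketched below, is to keep the C-callable functions under distinct names inside the extern "C" block and add C++-only overloads outside it. All names here (clamp_int, clamp_double, clamp_to) are invented for the example.

```cpp
#ifdef __cplusplus
extern "C" {
#endif

/* C-callable functions: each needs a distinct name. */
int    clamp_int(int value, int lo, int hi);
double clamp_double(double value, double lo, double hi);

#ifdef __cplusplus
}

/* C++-only convenience overloads, declared outside the extern "C" block.
 * Two functions with the same name could not both have C linkage. */
inline int    clamp_to(int v, int lo, int hi)          { return clamp_int(v, lo, hi); }
inline double clamp_to(double v, double lo, double hi) { return clamp_double(v, lo, hi); }
#endif

/* Definitions (normally in a .c file). */
int clamp_int(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

double clamp_double(double value, double lo, double hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}
```

C callers keep using clamp_int and clamp_double, while C++ callers can write clamp_to and let overload resolution pick.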

Another example: there's not an implicit conversion from ints to enums in C++, while there's one in C. You'll need a cast if you really want to do it in C++.
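A small sketch of what that looks like, using a hypothetical Color enum. The range check and the fallback value are choices made for this example only, since the cast itself checks nothing.

```cpp
enum Color { COLOR_RED = 0, COLOR_GREEN = 1, COLOR_BLUE = 2 };

/* Converts a raw integer (for example one read from a file) to a Color.
 * In C, `enum Color c = raw;` compiles; C++ requires the explicit cast. */
enum Color color_from_int(int raw)
{
    /* The cast performs no range check, so validate first. */
    if (raw < COLOR_RED || raw > COLOR_BLUE) {
        return COLOR_RED;  /* fallback chosen for this sketch only */
    }
    return (enum Color)raw;  /* or static_cast<Color>(raw) in C++-only code */
}
```

The C-style cast keeps the file compilable as C as well, which matters during a gradual migration.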

In general, you won't get any problems at all. Yes, there are some incompatibilities between C and C++, but they don't seem to come up that often except for the malloc casting mentioned above, which is pretty trivial to fix.

I have successfully compiled and used the following open-source C libraries as C++:

  • the Expat XML parser
  • the FreeType2 font rasterizer
  • libjpeg: handles JPEG images
  • libpng: handles PNG images
  • the Zlib compression library

The hardest part was adding namespace wrappers, which took some hours, largely because of #include statements buried deep in the code, which had to be outside the C++ namespace.

Why did I do this? Because I sell a commercial library which people were linking directly into their apps; and sometimes their apps were linked to other versions of Expat, FreeType, etc. This caused multiply-defined-symbol errors. The cleanest thing to do was to move everything inside my library and hide it in my namespace.
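The wrapping looks roughly like the sketch below. The namespace name myproduct and the function count_char are placeholders standing in for a real library's code, not taken from any of the libraries listed above.

```cpp
/* System headers must be included before the namespace opens; including
 * them inside it would declare strlen and friends as myproduct::strlen. */
#include <string.h>

namespace myproduct {   /* hypothetical namespace name */

/* Stand-in for the body of a third-party C file compiled as C++. */
size_t count_char(const char *text, char wanted)
{
    size_t count = 0;
    size_t length = strlen(text);
    for (size_t i = 0; i < length; ++i) {
        if (text[i] == wanted) {
            ++count;
        }
    }
    return count;
}

}  /* namespace myproduct */
```

The exported symbol is now myproduct::count_char, so it no longer collides with a count_char from another copy of the library linked into the same application.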

However, I didn't do that with all the open-source libraries I use. Some haven't caused conflicts yet, and I haven't got round to converting them, which, although trouble-free, is quite tedious. The interesting exception is SQLite, which I couldn't get to compile in C++. So I did a massive search and replace, adding a prefix (the name of my product) to every single externally visible symbol. That solved my client's problem.

I've done this before using MSVC. If you're using MSVC, a good strategy is:

  1. Set individual files to build as C++, so you can move to the C++ compiler incrementally.
  2. Work through file by file, using Ctrl+F7 to build just that one file.
  3. Rather than casting every malloc, create a template version instead:

foo = (Foo*)malloc(sizeof(*foo));

becomes

foo = malloc<Foo>();

And of course you can have an overload for the cases where you want a Foo plus n extra bytes.
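Such a template could be sketched as follows. I've called it typed_malloc here rather than malloc to keep it clearly apart from the C library function; that name and the Foo layout are assumptions for the example. No constructors run, so this only suits plain C structs.

```cpp
#include <stdlib.h>
#include <stddef.h>

/* Allocates uninitialised storage for one T; returns NULL on failure. */
template <typename T>
T *typed_malloc()
{
    return static_cast<T *>(malloc(sizeof(T)));
}

/* Overload for a T followed by extra_bytes of trailing storage. */
template <typename T>
T *typed_malloc(size_t extra_bytes)
{
    return static_cast<T *>(malloc(sizeof(T) + extra_bytes));
}

/* Hypothetical C-style struct with a variable-length tail. */
struct Foo {
    int id;
    char name[1];
};
```

With this in place, foo = typed_malloc<Foo>(); replaces the cast, typed_malloc<Foo>(n) covers the Foo plus n bytes case, and the memory is still released with free().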

I'd also recommend switching the memory allocations to use RAII where possible. I found that some functions were complex enough that switching to RAII was too high a risk, but in most cases it was simple enough to do.
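One way to do that for malloc'd memory, assuming a C++11 compiler, is std::unique_ptr with a deleter that calls free(). FreeDeleter, CBuffer and duplicate_string are names invented for this sketch.

```cpp
#include <stdlib.h>
#include <string.h>
#include <memory>

/* Deleter that releases malloc'd memory with free(). */
struct FreeDeleter {
    void operator()(void *p) const { free(p); }
};

/* Owning pointer for malloc'd buffers; frees on scope exit. */
typedef std::unique_ptr<char, FreeDeleter> CBuffer;

/* Returns a malloc'd copy of text, or an empty CBuffer on allocation failure. */
CBuffer duplicate_string(const char *text)
{
    size_t size = strlen(text) + 1;
    CBuffer copy(static_cast<char *>(malloc(size)));
    if (copy) {
        memcpy(copy.get(), text, size);
    }
    return copy;   /* ownership moves to the caller; no manual free() needed */
}
```

The gain is mostly in functions with many early returns, where each return path would otherwise need its own free().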

C++ has stricter type checking, so you might need to add a cast to each call to malloc/realloc/calloc.

Try compiling this with a C++ compiler (bool, true, and false are keywords there):

typedef enum{ false = 0, true = 1} bool;
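A common workaround, sketched here, is to hide the typedef from the C++ compiler so the rest of the code can keep using bool, true and false unchanged. The function is_even is just a placeholder to show the shared code still compiles.

```cpp
/* Only define these for C; in C++ they are built in. */
#ifndef __cplusplus
typedef enum { false = 0, true = 1 } bool;
#endif

/* Code below compiles under both compilers. */
bool is_even(int n)
{
    return n % 2 == 0 ? true : false;
}
```

If the code base can assume C99 or later, including <stdbool.h> on the C side is the more conventional alternative to a homemade typedef.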
Licensed under: CC-BY-SA with attribution