Why does C++ need language modifications to be “managed”?

https://stackoverflow.com/questions/819425

03-07-2019

Question

Why can't a compiler be written that manages what needs to be managed in C++ code (i.e. to make it "CLR compatible")?

Maybe with some compromises, like prohibiting void pointers in some situations etc. But why all these extra keywords? What's the problem that has to be solved by these additions?

I have my thoughts about some aspects and what might be hard to solve, but a good solid explanation would be highly appreciated!

Solution

I'd have to disagree with the answers so far.

The main problem to understand is that a C++ compiler creates code which is suitable for a very dumb environment. Even a modern CPU does not know about virtual functions; hell, even functions are a stretch. A CPU really doesn't care that exception handling code to unwind the stack is outside any function, for instance. CPUs deal in instruction sequences, with jumps and returns. Functions certainly do not have names as far as the CPU is concerned.

Hence, everything that's needed to support the concept of a function is put there by the compiler. E.g. vtables are just arrays of the right size, with the right values, from the CPU's viewpoint. __func__ ends up as a sequence of bytes in the string table, the last one of which is 00.
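To make the vtable point concrete, here is a rough sketch of what a compiler might generate for a class with virtual functions, written out by hand. All names (Shape, ShapeVTable, area_of and so on) are made up for illustration, and real compilers use their own layouts.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout: this mimics by hand roughly what a
// compiler emits for a class with two virtual functions.
struct Shape;

// The "vtable": to the CPU it is just a couple of addresses in memory.
struct ShapeVTable {
    double (*area)(const Shape*);
    const char* (*name)(const Shape*);
};

// Every object carries a hidden pointer to the table of its dynamic type.
struct Shape {
    const ShapeVTable* vptr;
    double a;
    double b;
};

double rect_area(const Shape* s) { return s->a * s->b; }
const char* rect_name(const Shape*) { return "rect"; }
double tri_area(const Shape* s) { return 0.5 * s->a * s->b; }
const char* tri_name(const Shape*) { return "triangle"; }

// One table per "class", filled in with the right function addresses.
const ShapeVTable rect_vtable = {rect_area, rect_name};
const ShapeVTable tri_vtable = {tri_area, tri_name};

Shape make_rect(double w, double h) { return Shape{&rect_vtable, w, h}; }
Shape make_triangle(double base, double h) { return Shape{&tri_vtable, base, h}; }

// A "virtual call" is nothing more than an indirect call through the table.
double area_of(const Shape& s) {
    assert(s.vptr != nullptr);
    return s.vptr->area(&s);
}

std::string name_of(const Shape& s) {
    assert(s.vptr != nullptr);
    return s.vptr->name(&s);
}
```

Calling area_of(make_rect(2.0, 3.0)) goes through rect_vtable, while the same call on a triangle goes through tri_vtable. Nothing in the generated code needs the target to know what a "class" is.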

Now, there's nothing that says the target environment has to be dumb. You could definitely target the JVM. Again, the compiler has to fill in what's not natively offered. No raw memory? Then allocate a big byte array and use it instead. No raw pointers? Just use integer indices into that big byte array.
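A minimal sketch of that emulation, written in C++ only to show the idea (a real compiler would emit the equivalent for the target VM). ByteArena, Addr and sum_array are hypothetical names, and the little-endian byte order is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: "raw memory" is one big byte array, and a
// "pointer" is just an integer offset into it.
using Addr = std::uint32_t;

class ByteArena {
public:
    // Offsets 0..3 are reserved so that Addr 0 can act as the null pointer.
    explicit ByteArena(std::size_t size) : bytes_(size, 0), next_(4) {}

    // Bump allocator: hands out the next free offset and never frees.
    Addr allocate(std::size_t n) {
        assert(next_ + n <= bytes_.size());
        const Addr result = static_cast<Addr>(next_);
        next_ += n;
        return result;
    }

    // Store a 32-bit value one byte at a time (little-endian).
    void store32(Addr p, std::uint32_t v) {
        assert(p != 0 && p + 4 <= bytes_.size());
        for (std::size_t i = 0; i < 4; ++i)
            bytes_[p + i] = static_cast<std::uint8_t>(v >> (8 * i));
    }

    // Reassemble a 32-bit value from its four bytes.
    std::uint32_t load32(Addr p) const {
        assert(p != 0 && p + 4 <= bytes_.size());
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(bytes_[p + i]) << (8 * i);
        return v;
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t next_;
};

// "Pointer arithmetic" becomes plain integer arithmetic on offsets.
std::uint32_t sum_array(const ByteArena& mem, Addr base, std::uint32_t count) {
    std::uint32_t total = 0;
    for (std::uint32_t i = 0; i < count; ++i)
        total += mem.load32(base + 4 * i);
    return total;
}
```

Every load and store that used to be a single machine instruction now goes through bounds-checked array accesses, which is where the cost comes from.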

The main problem is that the C++ program looks quite unrecognizable to the hosting environment. The JVM isn't dumb, it knows about functions, but it expects them to be class members. It doesn't expect them to have < and > in their names. You can circumvent this, but what you end up with is basically name mangling. And unlike name mangling today, this kind of name mangling isn't intended for C linkers but for smart environments. So the JVM's reflection engine may become convinced that there is a class c__plus__plus with member function __namespace_std__for_each__arguments_int_pointer_int_pointer_function_address, and that's still a nice example. I don't want to know what happens if you have a std::map of strings to reverse iterators.
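For illustration only, here is a toy mangler of the kind described. The scheme (mangle_for_vm and its replacement rules) is invented for this sketch and does not correspond to any real compiler's mangling.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, lossy mangling scheme: turn a C++ name into something a
// VM would accept as a plain identifier.
std::string mangle_for_vm(const std::string& cpp_name) {
    assert(!cpp_name.empty());
    std::string out;
    for (std::size_t i = 0; i < cpp_name.size(); ++i) {
        const char c = cpp_name[i];
        if (c == ':' && i + 1 < cpp_name.size() && cpp_name[i + 1] == ':') {
            out += "__";  // scope operator
            ++i;          // consume the second ':'
        } else if (c == '<') {
            out += "_lt_";
        } else if (c == '>') {
            out += "_gt_";
        } else if (c == '*') {
            out += "_ptr";
        } else if (c == ',') {
            out += "_";
        } else if (c == ' ') {
            // whitespace carries no meaning here, so drop it
        } else if (std::isalnum(static_cast<unsigned char>(c)) || c == '_') {
            out += c;
        } else {
            out += '_';   // anything else collapses to an underscore
        }
    }
    return out;
}
```

With these rules, mangle_for_vm("std::for_each<int*, int*>") yields "std__for_each_lt_int_ptr_int_ptr_gt_". That is a legal identifier, but hardly something you want to see in a reflection API.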

The other way around is actually a lot easier, in general. Pretty much all abstractions of other languages can be massaged away in C++. Garbage collection? That's already allowed in C++ today, so you could support that even for void*.

One thing I didn't address yet is performance. Emulating raw memory in a big byte array? That's not going to be fast, especially if you put doubles in it. You can play a whole lot of tricks to make it faster, but at what price? You're probably not going to get a commercially viable product. In fact, you might end up with a language that combines the worst parts of C++ (lots of unusual implementation-dependent behavior) with the worst parts of a VM (slow).
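A rough sketch of why doubles are especially painful, assuming the emulated memory is a plain byte array and a 64-bit double. The helper names store_double and load_double are made up. In native C++ each of these is a single store or load instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers: one emulated double access when "memory" is a
// byte array that can only be touched one byte at a time.
void store_double(std::vector<std::uint8_t>& mem, std::size_t offset, double value) {
    assert(offset + sizeof(double) <= mem.size());
    std::uint64_t bits = 0;
    static_assert(sizeof bits == sizeof value, "sketch assumes a 64-bit double");
    // Get the bit pattern; on the JVM this is what Double.doubleToRawLongBits does.
    std::memcpy(&bits, &value, sizeof bits);
    // Then scatter it over eight array elements (little-endian).
    for (std::size_t i = 0; i < 8; ++i)
        mem[offset + i] = static_cast<std::uint8_t>(bits >> (8 * i));
}

double load_double(const std::vector<std::uint8_t>& mem, std::size_t offset) {
    assert(offset + sizeof(double) <= mem.size());
    // Gather the eight bytes back into one 64-bit pattern.
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(mem[offset + i]) << (8 * i);
    double value = 0.0;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

That is eight array accesses plus shifting and masking for every double read or written, before any of the "tricks" are applied.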

OTHER TIPS

Existing correct code, i.e. code written according to the C++ standard, must not change its behaviour inadvertently.

Well, C++/CLI is mostly meant to be a glue between managed and unmanaged code. As such you need to have the ability to mix managed and unmanaged concepts. You need to be able to allocate managed and unmanaged objects in the same code, so there is no way around separate keywords.

Why can't you compile native C++ code targeting the CLR?

Yes, you guessed right: there would be so many compromises that it would be useless. I'd like to name just three examples...

1.) Templates: C++ supports them, the CLR doesn't (generics are different). So you couldn't use the STL, boost etc. in your code.

2.) Multiple inheritance: supported in C++, not in CLI. You couldn't even use the standard iostream class and derivatives (like stringstream, fstream), which inherit both from istream and ostream.

Almost none of the code out there would compile; you couldn't even implement the standard library.

3.) Garbage collection: Most C++ apps manage their memory manually (using smart pointers etc.), whereas the CLR has automatic memory management. Thus the C++ style "new" and "delete" would be incompatible with "gcnew", making existing C++ code useless for this new compiler.

If you had to root out all the important features, even the standard library, and no existing code would compile... then what's the point?
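Points 1 and 2 can be checked directly against a standard library implementation. The sketch below is only an illustration: Factorial and round_trip are made-up names, and the remark about generics reflects my understanding that CLR generics accept only type parameters and offer no explicit specialization.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Point 2: std::iostream really has two base classes, and the string
// and file streams build on it.
static_assert(std::is_base_of<std::istream, std::iostream>::value, "iostream is an istream");
static_assert(std::is_base_of<std::ostream, std::iostream>::value, "iostream is an ostream");
static_assert(std::is_base_of<std::iostream, std::stringstream>::value, "stringstream is an iostream");

// Point 1: a template can take a value as its parameter and be
// specialized per argument; the compiler does the whole computation.
template <std::size_t N>
struct Factorial {
    static constexpr std::size_t value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr std::size_t value = 1;
};

static_assert(Factorial<5>::value == 120, "evaluated at compile time");

// One stream object used through both of its bases.
// Expects text to contain at least one non-whitespace word.
std::string round_trip(const std::string& text) {
    std::stringstream ss;
    std::ostream& out = ss;  // write through the ostream base
    std::istream& in = ss;   // read through the istream base
    out << text;
    std::string word;
    in >> word;
    assert(static_cast<bool>(in));
    return word;
}
```

Ordinary library code like this relies on both features, which is why dropping either one rules out most existing C++.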

First of all, the distinction between "simple C++" and "managed C++" was intentional, because one of the purposes of MC++ was to provide a bridge between existing C++ code and the CLR.

Next, there are just too many C++ features that don't fit into the CLR model: multiple inheritance, templates, pointer arithmetic... Without drawing a clear line, programmers would be doomed to face cryptic errors, both at compile time and at run time.

I think this is because adding managed code features to C++ would make C++ slower and the compiler more complex, so much so that C++ would lose what it was designed for in the first place. One of the nice things about C++ is that it's a nice language to work with: it's low-level enough and yet somewhat portable. And the C++ Standard Committee probably plans to keep it that way. Anyway, I do not think C++ can ever be fully "managed", because that would mean programs written in C++ need a VM to execute. If that's the case, why not just use C++/CLI?

The Qt framework does almost that, i.e. it has smart pointers that are automatically set to null when the object they point to is destroyed. And it is still native C++ after being processed by moc (the meta-object compiler).
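I don't want to pull Qt into an example, so here is the same idea sketched with the standard library's std::weak_ptr instead. This is an analogy and not Qt's mechanism; Widget and observer_sees_destruction are made-up names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type standing in for some framework object.
struct Widget {
    int id;
};

// Returns true if the observing pointer reports "null" once the owner
// has destroyed the object. No VM or garbage collector is involved.
bool observer_sees_destruction() {
    std::shared_ptr<Widget> owner = std::make_shared<Widget>(Widget{7});
    std::weak_ptr<Widget> observer = owner;  // does not keep the object alive
    assert(!observer.expired());

    owner.reset();  // last owner gone, the Widget is destroyed here

    return observer.expired() && observer.lock() == nullptr;
}
```

The point is the same as with Qt: a pointer that notices the death of its target can be had in plain native C++.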

Yes, I suppose C++ could become managed. But then .NET would need to be rewritten for C++ and not with a bias towards BASIC. With many languages all under the same roof, certain features have got to go. It was a choice between VB.NET and C++.NET, and VB.NET was chosen. The funny thing I hear is that C# is more popular than VB.NET (although I use neither!).

The .NET CLR requires that no reference to a managed object can ever exist anyplace the run-time doesn't know about except while the object is pinned; good performance requires that objects be pinned as little as possible. Since the .NET CLR cannot understand all of the data structures that are usable within C++, it's imperative that no references to managed objects ever be persisted in such structures. It would be possible to have "ordinary" C++ code interact with .NET code without any changes to the C++ language, but the only way the C++ code could keep any sort of "reference" to any .NET objects would be to have some code on the .NET side assign each object a handle of some sort, and keep a static table of the objects associated with the handles. C++ code which wanted to manipulate the objects would then have to ask the .NET wrapper to perform some operation upon the object identified by a handle. Adding the new syntax makes it possible for the compiler to identify the kinds of objects the .NET framework will need to know about, and enforce the necessary restrictions upon them.
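A sketch of that handle-table arrangement, with everything written in C++ for brevity. ManagedSide stands in for the wrapper code on the .NET side and std::string plays the role of a managed object; none of these names come from a real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

using Handle = std::uint32_t;

// Hypothetical stand-in for the wrapper on the managed side: it owns the
// objects and a table mapping handles to them.
class ManagedSide {
public:
    Handle create(std::string text) {
        const Handle h = next_++;
        table_.emplace(h, std::move(text));
        return h;
    }

    // Native code never touches the object; it asks for an operation
    // to be performed on whatever the handle identifies.
    void append(Handle h, const std::string& suffix) {
        auto it = table_.find(h);
        assert(it != table_.end());
        it->second += suffix;
    }

    std::size_t length_of(Handle h) const {
        auto it = table_.find(h);
        assert(it != table_.end());
        return it->second.size();
    }

    void release(Handle h) { table_.erase(h); }
    bool is_live(Handle h) const { return table_.count(h) != 0; }

private:
    std::unordered_map<Handle, std::string> table_;
    Handle next_ = 1;
};

// "Ordinary" native data structure: it stores only an integer, so the
// runtime never has to understand its layout.
struct NativeRecord {
    Handle name;
};
```

Since the table is the only place that holds real references, the collector can move or free objects without knowing anything about NativeRecord. The price is an indirection on every operation, which is what the extra syntax in C++/CLI lets you avoid.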

The first thing to consider is that everything that makes C++ "fast" will disappear. A full garbage collection system in C++ is next to impossible, because in C++ you can have pointers nearly anywhere in the code. Runtime type information becomes costly if it is not built directly into the language system itself. You can no longer take advantage of true native performance. Templates will disappear. True pointers will disappear. Direct access to memory is gone.

List of things that would have to be enforced:

1. No direct pointers (pointers will get replaced with complex references).
2. No templates (generics pay for it in performance).
3. No simple C-style arrays (they will get wrapped in array structures).
4. The programmer no longer has control over whether data is on the stack or the heap.
5. Garbage collection will be enforced (this will cause the most changes to the syntax).
6. Runtime type data will get added extensively to the code (larger code size).
7. Inlining will become more difficult for the compiler (no more inline keyword).
8. No more inline assembly.
9. The new language will by now have become incompatible with C code (unless you jump through hoops).

I agree with 5hammer! If I left Java and other managed languages, it was not for nothing: it was to have FULL control over the computer, to access and manage memory myself, to have control over how the computer will run my code, and to integrate with C libraries (such as Lua). If I lose that flexibility, then I would just leave C++ and fall back to C, and if C becomes managed too, then I would go to assembler.

Managed languages are the worst ones ever for all gaming platforms and complex programs, as they confine you to some kind of sandbox with no direct access to the hardware, and are much slower than compiled languages.

The main purpose of C++ has always been performance. It's one of the best languages for big games, and without this language's performance a lot of games would not exist!

Licensed under: CC-BY-SA with attribution
