Question

Note that this is not about the .NET CLR that Microsoft is thrusting into the atmosphere to evangelize the concept of managed code. Most of you know that managed code has been around for quite some time and isn't very related to rocket science.

What I would like to know is why the concept of runtime security in the evolution of computers came so late.

I know this is like asking "why didn't the first Model T Ford come with airbags and seat belts?". The relevance of the question still stands despite this, because it's well within human nature to protect against known dangers. E.g. the first Model T didn't go fast enough to motivate airbag research, and it didn't go fast enough for people to make fatal errors of judgment often enough to motivate seat belts becoming law and standard in many countries.

In computer evolution it's almost the other way around. We started out with assembler, the equivalent of driving a Model T at 200 mph with an eye patch. I've had the pleasure of conversing with a few old truckers from this era, hearing stories about hand-compiling assembly code, human debuggers, grillions of lines of code, etc. If we make a really nasty error in C, we might end up with a bluescreen. Decades ago, you could end up with damaged hardware and god knows what. But it's a mystery to me - so many decades, and all we did to make crashing less painful was the bluescreen (sorry for using MS as the archetype for anything).

It's not only within human nature to protect against known dangers, it's also within any programmer's nature to automate and systemize common facilities, like error checking, memory diagnostics, logging frameworks, backup maintenance etc etc.

Why didn't programmers/humans start to automate the task of ensuring that the code they feed to the system won't harm the system? Yes, of course: performance. But hey, this was well before any seriously penetrating hardware standard. Why didn't motherboards get designed with bus architectures and extra processors to facilitate "managed code"?

Is there any metaphor to Model T Fords not being fast enough that I'm missing?


Solution

Let's think this through from first principles.

A managed platform provides a relatively sandboxed area in which to run program code that has been compiled from a high-level language into a form more suitable for the platform to execute (IL bytecode). It also provides utility features such as garbage collection and module loading.

Now think about a native application: the OS provides a relatively sandboxed area (a process) in which to run program code that has been compiled from a high-level language into a form more suitable for the platform to execute (x86 opcodes). It also provides utility features such as virtual memory management and module loading.

There's not much difference. I think the reason we have managed platforms in the first place is simply that they make coding for the platform easier. They should also make the code portable between OSes, but MS didn't care for that. Security is part of the managed platform, but it should be part of the OS - e.g. your managed app can write files and so on, just like a normal process. Restricting that is a security feature; it isn't an aspect of a managed platform that doesn't exist for native code.

Ultimately, they could have put all those managed features into a set of native DLLs and scrapped the idea of the intermediary bytecode, compiling straight to native code instead. "Managed" features like GC are easily possible on native heaps - see the Boehm collector for C/C++ for an example.
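
For instance, the Boehm-Demers-Weiser collector gives plain native C code garbage collection with no bytecode or VM involved. A minimal sketch, assuming libgc is installed and linked with -lgc (build details vary by system):

    /* Allocations come from GC_MALLOC and are never explicitly freed;
     * the collector reclaims blocks once no pointers to them remain -
     * garbage collection on an ordinary native heap. */
    #include <gc.h>
    #include <stdio.h>

    int main(void)
    {
        GC_INIT();                               /* initialise the collector */

        for (int i = 0; i < 1000000; i++) {
            int *p = GC_MALLOC(sizeof *p);       /* collectable allocation */
            *p = i;
            /* no free(): unreachable blocks are collected automatically */
        }

        printf("heap size: %lu bytes\n", (unsigned long)GC_get_heap_size());
        return 0;
    }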

I think MS did it partly because it made the compiler easier to write, and partly because that's how Java was made (and .NET is very much a descendant of Java, if only in spirit), though Java did it that way to make cross-platform coding possible, something MS doesn't care for.

So, why didn't we get managed code from the start? Because all the things you mention as being part of 'managed' code are native code. The managed platforms we have today are simply an additional abstraction on top of an already abstracted platform. High-level languages have had more features added to them to protect you from yourself, and buffer overflows are a thing of the past, but there's no reason those protections couldn't have been implemented in C when C was first invented. It's just that they weren't. Perhaps hindsight makes it seem like these features were missing, but I'm sure in 10 years' time we'll be asking "why didn't C# implement the obviously useful feature XYZ like we have today?"
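
To illustrate that point, a bounds-checked array is perfectly expressible in C itself; a minimal sketch (the struct and function names here are invented purely for illustration):

    /* A bounds-checked array in plain C - the kind of "protect you from
     * yourself" feature argued above to have been possible from the start. */
    #include <stdio.h>
    #include <stdlib.h>

    struct safe_array {
        int   *data;
        size_t len;
    };

    static int safe_get(const struct safe_array *a, size_t i)
    {
        if (i >= a->len) {                      /* the check a managed runtime does for you */
            fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, a->len);
            abort();
        }
        return a->data[i];
    }

    int main(void)
    {
        int backing[4] = { 10, 20, 30, 40 };
        struct safe_array a = { backing, 4 };

        printf("%d\n", safe_get(&a, 2));        /* fine: prints 30 */
        printf("%d\n", safe_get(&a, 7));        /* caught: aborts instead of overflowing */
        return 0;
    }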

OTHER TIPS

Managed code, built-in security, etc. have been around for a long time.

There just wasn't room for it in the original PC platform, and it never got added later.

The venerable IBM mainframe has had protected addressing, untouchable kernel libraries, role-based security, etc. since the 70s. Plus, all that assembler code was managed by a sophisticated (for the time) change management system. (Univac, Burroughs, etc. had something similar.)

Unix had fairly decent security built in from the beginning (and it hasn't changed very much over the years).

So I think this is very much a Windows/web-space problem.

There has never been a mainframe virus! Most of the financial transactions in the world pass through these systems at some point, so it's not as if they weren't an attractive target.

The internal IBM mail system did host the first 'trojan' though!

Actually, managed code has been around for a very long time. Consider:

  • LISP
  • Smalltalk
  • BASIC (original flavour)

All provided operating-system-like environments which protected the user from memory and other resource-control issues. And all were relative failures (BASIC only really succeeded when features like PEEK & POKE, which allowed you to mess with the underlying system, were introduced).

Computers weren't powerful enough and making them powerful enough was too expensive. When you've only got limited resources at your disposal, every byte and CPU cycle counts.

The first computer I used was a Sinclair ZX Spectrum in 1982. It had less RAM (16K) than the size of a single Windows font file today. And that was relatively recently, in the home-computer age. Before the mid-1970s the idea of having a computer in your home was inconceivable.

Just for the record, we never hand-compiled assembly. We hand-assembled assembly language code. Now that that's clear...

Your analogy is clouding the question, because the speed of the car is not analogous to the speed of the computer in this sense: the increasing speed of the car necessitated the changes in auto safety, but it's not the increased speed of the computer that drives the need for changes in computer security - it's the increase in connectivity. From a slightly different angle: for the car, increasing speed is the driving technology for increasing safety; for computers, increasing speed is the enabling technology for increasing safety.

So, the first cars were safe in accidents because they were slow. The first computers were safe because they weren't networked.

Now, cars are made safer through seat belts, air bags, ABS, anti-collision devices, and so forth. Computers are made safe through additional techniques, although you still can't beat unplugging the network cable.

This is a simplification, but I think it gets at the heart of it. We didn't need that stuff back then, because computers weren't connected to the network.

The same reason why there were no trains 300 years ago. The same reason why there were no cell phones 30 years ago. The same reason why we still don't have a teleportation machine.

Technology evolves over time; it is called evolution.

Computers weren't powerful enough back then. Running a garbage collector in the background would have killed your application's performance.

Speaking to your question of why computers didn't have protection mechanisms on the level of managed code, rather than why VMs couldn't run on slow hardware (already explained in other posts): the short answer is that they did. CPUs were designed to throw an exception when bad code happened so that it wouldn't damage the system. Windows handles this notoriously poorly, but there are other OSes out there. Unix delivers it as a signal, so the offending program gets terminated without bringing down the system. Really, whether or not you are running managed code, a null pointer dereference will end the same way - in program termination. Virtual memory ensures that programs don't mess with other code, so all they can do is hurt themselves.
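
A minimal sketch of that mechanism on a POSIX system (the handler and message text are mine, purely for illustration): the hardware traps the bad access, the kernel turns it into SIGSEGV, and only the offending process dies.

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* SIGSEGV handler: only async-signal-safe calls (write, _exit) are used. */
    static void on_segv(int sig)
    {
        (void)sig;
        static const char msg[] = "SIGSEGV: this process dies, the OS keeps running\n";
        write(STDERR_FILENO, msg, sizeof msg - 1);
        _exit(1);
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_segv;
        sigaction(SIGSEGV, &sa, NULL);   /* install the handler */

        puts("About to dereference a null pointer...");
        int * volatile p = NULL;         /* volatile so the compiler keeps the access */
        return *p;                       /* hardware trap -> kernel -> SIGSEGV */
    }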

Which brings me to my second point: all this is unnecessary if you know what you are doing. If I want to keep my furniture clean, I simply don't drop food on it. I don't need to cover my house in plastic; I just have to be careful. If you're a sloppy coder, the best VM in the world isn't going to save you - it will just allow you to run your sloppy code without any noise. Also, porting code is easy if you use proper encapsulation. When you are a good coder, managed code doesn't help that much. That is why not everyone is using it. It is simply a matter of preference, not better/worse.

As far as run-time security goes, there's nothing a P-code compiler can predict that a machine-code compiler can't, and nothing a managed-code interpreter can handle that the OS can't (or doesn't) already. Motherboards with extra buses, CPUs and instruction sets would cost a lot more money - IT is all about the cost/performance ratio.

In 1970, the cost of memory was around $1/bit (not adjusted for inflation). You could not afford the luxury of garbage collection with costs like that. Even a modest 16 KB of RAM is 131,072 bits, which at that rate works out to over $130,000 before a single line of code has run.

I think that, like most questions of the form "why did we not have X in programming Y years ago", the answer is speed/resource allocation. With limited resources, they needed to be managed as effectively as possible. The general-purpose kind of management associated with managed code would have consumed too many resources to be of benefit in the performance-critical applications of the time. This is also part of why today's performance-critical code is still written in C, Fortran or assembler.

Why didn't we just build airplanes and spaceships at once, instead of mucking around with horse-and-carriage and all that tedious stuff?

The use of an intermediate language requires one of two things:

  1. Run-time interpretation, which will carry a substantial performance penalty (widely variable: occasionally 2x or less, but sometimes 100x or more) - see the interpreter sketch after this list
  2. A just-in-time compiler, which will require extra RAM, and which will add delay roughly proportional to program size, rather than number of statements executed
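
To make option 1 concrete, here is a toy bytecode interpreter in C (the opcodes and program layout are invented for illustration). The point is that every interpreted "statement" pays for a fetch/decode branch that directly compiled native code does not:

    #include <stdio.h>

    enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

    static void run(const int *code)
    {
        int stack[64];
        int sp = 0;                     /* stack pointer */
        for (int pc = 0; ; ) {          /* fetch/decode/execute loop */
            switch (code[pc++]) {
            case OP_PUSH:  stack[sp++] = code[pc++];                break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp];        break;
            case OP_PRINT: printf("%d\n", stack[--sp]);             break;
            case OP_HALT:  return;
            }
        }
    }

    int main(void)
    {
        /* Equivalent to the single native statement: printf("%d\n", 2 + 3); */
        const int program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
        run(program);
        return 0;
    }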

One thing that has changed over the years is that many programs run the most heavily used portions of their code many more times than they used to. Suppose the first time any particular statement is executed incurs a penalty 1,000 times as long as subsequent executions. What will be the effect of that penalty in a program where each statement is run an average of 100 times? What will be the effect of that penalty in a program where each statement is run an average of 1,000,000 times?
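
Working that arithmetic through (under the simplifying assumption, not stated above, that a subsequent execution costs one time unit while the first costs 1,000):

    \text{slowdown} \approx \frac{1000 + n}{n}
    n = 100:\quad \frac{1100}{100} = 11\times
    n = 10^{6}:\quad \frac{1001000}{1000000} \approx 1.001\times

So at 100 executions per statement the first-run penalty dominates, while at a million executions it vanishes into the noise.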

Just-in-time compiling has been possible for a long time, but in the 1980s or 1990s the performance cost would have been unacceptable. As technologies have changed, the cost of JIT compilation has come down to the point that it's entirely practical.

The answer becomes clearer - humans weren't built for writing programs. Machines should be doing it and letting us relax by playing pacman.

For what it's worth, I read a couple of papers for my computing languages class (one by C.A.R. Hoare and another by Niklaus Wirth) advocating exactly this back in the 60s and 70s, among other things.

I can't speak to exactly why these things didn't happen, but my guess is that it's just one of those things that look obvious in hindsight but weren't obvious at the time. It's not that earlier compiler writers weren't concerned about security; it's that they had different ideas about how to achieve it.

Hoare mentions the idea of a "checkout compiler". As far as I can tell, this is essentially a compiler that does static analysis. To him, this was a popular technique that had failed (or at least hadn't solved as many problems as it was intended to solve). The solution, to him, was to make programming languages more secure by creating managed code (or at least that's how he would have put it in modern terms).

I'd imagine that once C (and later C++) caught on, the idea of managed code was essentially dead. It's not that C was a bad language, just that it was intended to be an assembly language rather than an application programming language.

If you get a chance, you might read Hints on Programming Language Design. It's a pretty good read if you're interested in this kind of thing.

The best answer to this question is, IMHO, that nobody had the idea of managed code at that time. Knowledge actually evolves over time. Compared to fields like architecture or agriculture, computer science is a very young field, so the collective knowledge about it is also young and will evolve over time. Perhaps in a few years we'll come across some new phenomenon and someone will be asking the same question: "why didn't somebody think of XYZ before?"

I'd say it's largely been resistance to change, coupled with a false perception of the inefficiency of garbage collection, that delayed the adoption of GC and related techniques. Of course, the brain-dead segmented memory model of the Intel 8086 didn't exactly help promote sane memory management on the PC.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow