Technical confusion between compilation and interpretation

https://stackoverflow.com/questions/11565917

21-06-2021

Question

I've read many definitions and statements about "Interpretation" and "compilation". But I am still very much confused.

Technically speaking, what is REALLY the difference between interpretation and compilation under the hood? Let me elaborate (please correct any wrong concept I might have):

In Java, the source code is "compiled" into bytecode, which is then "interpreted" and/or "just-in-time compiled" into machine code. But what is the difference between just-in-time compilation and interpretation? I mean, in the end, as far as my guess goes, the host's CPU will run machine code only. Thus, in interpretation as well, instructions ARE being converted into machine code that the CPU can understand. So, where do we draw the line between just-in-time compilation and interpretation?

P.S. This is my conception. It might be totally wrong. In that case, kindly excuse my stupidity and correct me.

Thanks.

Solution

1. Frankly speaking, the idea that Java has a separate compiler and interpreter is something of a myth; "compilation" and "interpretation" are labels for different aspects of its behavior.

2. The Java compiler compiles the human-readable source code to bytecode, which the JIT (just-in-time) compiler then converts into machine-level executable code at run time.

3. At run time, the JIT identifies the runtime-intensive parts of the code and converts them into machine-level executable code. Such a part of the code is known as a hot spot, which is why the JIT is called the HotSpot compiler.

4. The JIT uses the virtual method table (vtable), which holds pointers to the methods of a class. Once hot-spot code has been converted to machine-level executable code, its address is stored there, and when that code is called again it is reached directly through the stored address. The JIT's behavior of compiling small amounts of code during run time is regarded as interpreted behavior, and its behavior of storing the result for later use is regarded as compilation.

5. The virtual method table also stores the address of the bytecode, which can be used if needed.
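
To make the "hot spot" idea from the list above concrete, here is a minimal sketch of a program with one small method that is called very often. The class and method names (`HotLoopDemo`, `square`, `sumOfSquares`) are made up for this illustration. Whether and when the JVM actually compiles `square` depends on the JVM and its settings, so treat this as a probe, not a guarantee.

```java
class HotLoopDemo {

    // Small, frequently called method: a typical candidate for JIT compilation.
    static int square(int x) {
        return x * x;
    }

    // Calls square() `iterations` times and sums the results.
    static long sumOfSquares(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += square(i % 10);
        }
        return sum;
    }

    public static void main(String[] args) {
        // One million calls should be enough for most HotSpot configurations
        // to consider square() and sumOfSquares() hot (an assumption, not a guarantee).
        System.out.println(sumOfSquares(1_000_000)); // prints 28500000
    }
}
```

On a HotSpot JVM you can compile this with `javac` and run it as `java -XX:+PrintCompilation HotLoopDemo` to have the JVM log the methods it JIT-compiles; the exact log format varies between JVM versions.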

OTHER TIPS

When code is compiled, the generated artifact can be understood directly by the hardware. Basically, it is machine code sent directly to the CPU. This also means that an artifact compiled for a given CPU architecture won't run on another. The advantages are immediate startup and great performance.

In interpreted environments there is either no compilation at all, or the result of that step is an intermediate code. This code is too abstract to be sent directly to the processing unit. Instead, a separate layer is needed (a virtual machine or interpreter) that reads this artifact and executes it in some sandboxed environment. The advantage of this approach is portability: intermediate code can run on any platform for which a native interpreter is available. Unfortunately, the performance is almost always worse.
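
A rough sketch of what such a "separate layer" looks like may help. The toy stack machine below uses three invented opcodes (`PUSH`, `ADD`, `MUL`); they are not real JVM bytecodes, and the class name `TinyInterpreter` is hypothetical. Note that the program being interpreted is never turned into machine code: the CPU only ever executes the interpreter's own loop, which looks at each instruction and performs the matching action.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyInterpreter {
    // Hypothetical opcodes for illustration only.
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product

    // Executes the program one instruction at a time and returns the top of the stack.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // index of the next instruction
        while (pc < code.length) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + op);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        System.out.println(run(program)); // prints 20
    }
}
```

The decode-and-dispatch work in the `switch` is repeated every time an instruction runs, which is one reason pure interpretation tends to be slower than running native code.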

JIT in Java is a hybrid technology. At first the bytecode is interpreted: each bytecode instruction is executed by the interpreter. However, at some point in time (and under some conditions) the bytecode is translated into machine code and sent directly to the CPU to improve performance. This approach brings the best of both worlds: the portability of intermediate code and the speed of native code. Moreover, the JIT knows much more about the runtime behaviour of your code (how many times is a given loop executed on average? Is this method really virtual?), so the machine code can be even faster than that generated by an ordinary compiler (!).

You're right that eventually everything has to be converted to machine code. The basic difference is that in the case of an interpreter, this translation occurs every time the code runs, whereas a compiler does this translation ahead of time, after which the compiler is not required for running the program.

Just-in-time compilation is a combination of both, where the JIT compiler is still required for running the program, and the code is compiled at run-time.

Compilation takes time, but it pays off when the same piece of code is run several times, e.g. in a loop. The Java HotSpot VM takes this approach further by initially interpreting bytecode directly and then JIT-compiling a piece of code once it has run a certain number of times.
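
The "interpret first, compile once it has run often enough" policy can be sketched as a simple invocation counter. This is only a toy model: the class `TieredMethod`, the threshold of 3, and the use of two lambdas to stand in for the interpreted and the compiled version of a method are all assumptions made for illustration. A real JVM uses different, tunable thresholds and generates actual machine code.

```java
import java.util.function.IntUnaryOperator;

class TieredMethod {
    // Hypothetical threshold, chosen small so the switch is easy to observe.
    static final int COMPILE_THRESHOLD = 3;

    private final IntUnaryOperator interpreted;  // stands in for interpreting bytecode
    private final IntUnaryOperator compiledCode; // stands in for JIT-compiled machine code
    private int invocations = 0;
    private boolean compiled = false;

    TieredMethod(IntUnaryOperator interpreted, IntUnaryOperator compiledCode) {
        this.interpreted = interpreted;
        this.compiledCode = compiledCode;
    }

    int invoke(int arg) {
        if (compiled) {
            // Fast path: reuse the previously "compiled" version.
            return compiledCode.applyAsInt(arg);
        }
        invocations++;
        if (invocations >= COMPILE_THRESHOLD) {
            // The method is now considered hot; later calls take the fast path.
            compiled = true;
        }
        return interpreted.applyAsInt(arg);
    }

    boolean isCompiled() {
        return compiled;
    }

    public static void main(String[] args) {
        TieredMethod doubler = new TieredMethod(x -> x + x, x -> x << 1);
        for (int call = 1; call <= 5; call++) {
            int result = doubler.invoke(21);
            System.out.println("call " + call + ": result=" + result
                    + ", compiled=" + doubler.isCompiled());
        }
    }
}
```

In this sketch the first two calls report `compiled=false`, and the flag flips on the third call. The one-time cost of "compiling" is paid only for code that has shown it will be run repeatedly.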

Interpreters process code line by line and determine the machine code to execute at run time.

Compilers consume code in chunks and determine the machine code at compile time.

A JIT compiler is a hybrid approach: the machine code is generated at run time (though it may already be cached to improve performance), but the code is consumed in chunks.

An interpreted environment involves instructions being executed immediately after parsing, where both the parsing and the execution are done by the interpreter. This means that the machine you run the code on must have the interpreter in order to run the program.

A compiler will parse the instructions into machine code and store them for later execution. Java, however, is byte-compiled, which means that this process turns the instructions into bytecode, which will then be used by the interpreter.
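
As a small illustration of what "byte-compiled" means, consider the method below. The class name `AddDemo` is made up for this example. After compiling it with `javac`, the JDK's `javap -c AddDemo` tool disassembles the class file; the instructions shown in the comment are what one would typically expect to see for `add`, though the surrounding output differs between JDK versions.

```java
class AddDemo {

    // Typical bytecode for this method as shown by `javap -c`:
    //   iload_0   // push the first int argument
    //   iload_1   // push the second int argument
    //   iadd      // pop both values, push their sum
    //   ireturn   // return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

These instructions target the JVM's abstract stack machine and not any particular CPU, which is why the same class file can run wherever a JVM is available.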

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow