Question

"A bytecode program is normally executed by parsing the instructions one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators, or "just-in-time" (JIT) compilers, translate bytecode into machine language as necessary at runtime: this makes the virtual machine unportable."

My question about this paragraph is: after the bytecode gets processed, what is the difference between a parsed instruction and machine language (or machine code)?

Solution

A JIT compiler is different from a bytecode interpreter.

Consider the following C function:

int sum() {
   return 5 + 6;
}

This will be compiled directly into machine code. The exact instructions emitted for, say, x86 and ARM processors will be different.
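
For illustration, the comments below sketch roughly what an optimizing compiler might emit for this function on x86-64 and on ARM64; the exact output depends on the compiler, flags, and ABI, so treat it as an approximation rather than definitive output.

/* Roughly what an optimizing x86-64 compiler might emit
   (the constant expression 5 + 6 is folded to 11 at compile time):

      mov eax, 11     ; return value goes in EAX
      ret

   Roughly what an ARM64 compiler might emit:

      mov w0, #11     // return value goes in W0
      ret

   Same C source, two different sets of machine instructions. */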

If we wrote a basic bytecode interpreter, it might look something like this:

for(;;) {
   switch(*currentInstruction++) {
   case OP_PUSHINT:
      // Read the integer operand that follows the opcode and push it.
      *stack++ = nextInt(currentInstruction);
      break;
   case OP_ADD:
      // Pop the top value and add it to the new top of the stack.
      --stack;
      stack[-1].add(*stack);
      break;
   case OP_RETURN:
      // Whatever is left on top of the stack is the result.
      return stack[-1];
   }
}

This can then interpret the following set of instructions:

OP_PUSHINT (5)
OP_PUSHINT (6)
OP_ADD
OP_RETURN
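
To make this concrete, here is a minimal, self-contained C++ sketch of such an interpreter running exactly that program. The opcode values, the one-byte operand encoding, and the plain int operand stack are assumptions chosen for illustration; the original answer does not specify an encoding.

#include <cstdint>
#include <cstdio>

// Assumed opcode values; the answer above does not define a concrete encoding.
enum Op : uint8_t { OP_PUSHINT, OP_ADD, OP_RETURN };

// Interpret a bytecode program: each OP_PUSHINT is followed by a one-byte
// integer operand, and values live on a small operand stack.
static int interpret(const uint8_t* currentInstruction) {
   int stackStorage[16];
   int* stack = stackStorage;              // points one past the top element

   for(;;) {
      switch(*currentInstruction++) {
      case OP_PUSHINT:
         *stack++ = *currentInstruction++; // push the operand byte
         break;
      case OP_ADD:
         --stack;                          // pop the right operand...
         stack[-1] += *stack;              // ...and add it to the left one
         break;
      case OP_RETURN:
         return stack[-1];                 // top of stack is the result
      }
   }
}

int main() {
   // The same program as above: OP_PUSHINT 5, OP_PUSHINT 6, OP_ADD, OP_RETURN
   const uint8_t program[] = { OP_PUSHINT, 5, OP_PUSHINT, 6, OP_ADD, OP_RETURN };
   std::printf("%d\n", interpret(program)); // prints 11
   return 0;
}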

If you compiled the bytecode interpreter for both x86 and ARM, you would be able to run the same bytecode on either processor without doing any further rewriting of the interpreter.

If you wrote a JIT compiler, you would need to emit processor-specific instructions (machine code) for each supported processor, whereas the bytecode interpreter relies on the C++ compiler to emit the processor-specific instructions.
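
To illustrate what "emitting processor-specific instructions" means, here is a toy sketch, assuming Linux on x86-64: it copies hand-assembled x86-64 bytes equivalent to return 5 + 6 into an executable page and calls them as a function. A real JIT would generate these bytes from the bytecode at runtime, and porting it to ARM would mean emitting entirely different bytes.

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>

int main() {
   // Hand-assembled x86-64 machine code for: mov eax, 5 ; add eax, 6 ; ret
   const uint8_t code[] = {
      0xB8, 0x05, 0x00, 0x00, 0x00,   // mov eax, 5
      0x05, 0x06, 0x00, 0x00, 0x00,   // add eax, 6
      0xC3                            // ret
   };

   // Allocate a page we are allowed to execute. (Production JITs map the
   // page writable first and flip it to read+execute once code is emitted.)
   void* mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   if (mem == MAP_FAILED)
      return 1;
   std::memcpy(mem, code, sizeof(code));

   // Call the freshly generated machine code as if it were a C function.
   int (*jittedSum)() = reinterpret_cast<int (*)()>(mem);
   std::printf("%d\n", jittedSum());        // prints 11

   munmap(mem, 4096);
   return 0;
}

The interpreter sketch above, by contrast, contains no processor-specific bytes at all; the C++ compiler supplies them when the interpreter itself is built.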

OTHER TIPS

In a bytecode interpreter, the instruction format is usually designed for very fast "parsing" using shift and mask operators. The interpreter, after "parsing" (I prefer "decoding") the instruction, immediately updates the state of the virtual machine and then begins decoding the next instruction. So after the bytecode gets processed in an interpreter, no remnant remains.
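
As a hypothetical illustration of that kind of decoding (the layout below is an assumption, not from the original answer): if each instruction were packed into a 32-bit word with the opcode in the top byte and an operand in the low 24 bits, extracting the fields is a single shift and mask.

#include <cstdint>

// Assumed layout: bits 31..24 = opcode, bits 23..0 = operand.
inline uint8_t  decodeOpcode(uint32_t insn)  { return insn >> 24; }
inline uint32_t decodeOperand(uint32_t insn) { return insn & 0xFFFFFF; }

// Example: 0x01000005 decodes to opcode 0x01 with operand 5.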

In a JIT compiler, bytes are processed in units larger than a single instruction. The minimum unit is the basic block, but modern JITs will convert larger paths to machine code. This is a translation step, and the output of that step is machine code. The original bytecode may remain in memory, but it is not what gets executed, so at that point there is no real difference from ordinary machine code. (It is still typical, though, that the machine code for a JITted virtual machine does different things from the machine code emitted by a native-code compiler.)

There's no difference: a JIT compiler exists for exactly that purpose; it produces machine code that is executed directly on the hardware.

Ultimately it all boils down to machine instructions.

  1. Native App - contains machine instructions that are executed directly.
  2. JIT App - bytecode is compiled into machine instructions and executed.
  3. Translated App - bytecode is translated by a virtual machine that is a Native App.

As you can tell, #1 has the least overhead and #3 has the most, so performance should be fastest with #1 and just as fast with #2 once the initial compilation overhead has been paid.
