Question

In dynamic languages, how is dynamically typed code JIT compiled into machine code? More specifically: does the compiler infer the types at some point? Or is it strictly interpreted in these cases?

For example, if I have something like the following pseuocode

def func(arg)
    if (arg)
        return 6
    else
        return "Hi"

How can the execution platform know before running the code what the return type of the function is?

Was it helpful?

Solution

In general, it doesn't. However, it can assume either type, and optimize for that. The details depend on what kind of JIT it is.

The so-called tracing JIT compilers interpret and observe the program, and record types, branches, etc. for a single run (e.g. loop iteration). They record these observations, insert a (quite fast) check that these assumptions are still true when the code is executed, and then optimize the heck out of the following code based on these assuptions. For example, if your function is called in a loop with a constantly true argument and adds one to it, the JIT compiler first records instructions like this (we'll ignore call frame management, memory allocation, variable indirection, etc. not because those aren't important, but because they take a lot of code and are optimized away too):

; calculate arg
guard_true(arg)
boxed_object o1 = box_int(6)
guard_is_boxed_int(o1)
int i1 = unbox_int(o1)
int i2 = 1
i3 = add_int(res2, res3)

and then optimizes it like this:

; calculate arg
; may even be elided, arg may be constant without you realizing it
guard_true(arg)
; guard_is_boxed_int constant-folded away
; unbox_int constant-folded away
; add_int constant-folded away
int i3 = 7

Guards can also be moved to allow optimizing earlier code, combined to have fewer guards, elided if they are redundant, strengthened to allow more optimizations, etc. If guards fail too frequently, or some code is otherwise rendered useless, it can be discarded, or at least patched to jump to a different version on guard failure.

Other JITs take a more static approach. For instance, you can do quick, inaccurate type inference to at least recognize a few operations. Some JIT compilers only operate on function scope (they are thus called method JIT compilers by some), so they probably can't make much of your code snippet (one reason tracing JIT compilers are very popular for). Nevertheless, they exist -- an example is the latest revision of Mozilla's JavaScript engine, Ion Monkey, although it apparently takes inspiration from tracing JITs as well. You can also insert add not-always-valid optimizations (e.g. inline a function that may be changed later) and remove them when they become wrong.

When all else fails, you can do what interpreters do, box objects, use pointers to them, tag the data, and select code based on the tag. But this is extremely inefficient, the whole purpose of JIT compilers is getting rid of that overhead, so they will only do that when there is no reasonable alternative (or when they are still warming up).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top