Question

Pardon my English. I have recently been trying to understand the different parts of a compiler and to implement them for a toy language. I am wondering what the jobs of a semantic analyzer are, because many of the things I have read that a semantic analyzer is supposed to do, such as type checking and scope checking, don't really apply to dynamic languages, since those things are checked at run time.

So I think a few of the jobs of a semantic analyzer for a dynamic language (like Lua, Python, or Ruby) are to

  1. make sure that assignments are not malformed, e.g. 1 = a or 5 = 5 (see the sketch below)
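
For example, here is a minimal sketch of such a check over a made-up toy-language AST. The Num, Name, and Assign node classes are purely illustrative, not taken from any real compiler:

    # Hypothetical AST nodes for a toy language (illustrative only).
    class Num:                         # a literal such as 1 or 5
        def __init__(self, value):
            self.value = value

    class Name:                        # an identifier such as a
        def __init__(self, ident):
            self.ident = ident

    class Assign:                      # target = value
        def __init__(self, target, value):
            self.target = target
            self.value = value

    def check_assignment(node):
        # Semantic check: the left-hand side must be assignable.
        if isinstance(node, Assign) and not isinstance(node.target, Name):
            raise SyntaxError("cannot assign to a literal")

    check_assignment(Assign(Name("a"), Num(1)))   # fine: a = 1
    check_assignment(Assign(Num(1), Name("a")))   # rejected: 1 = a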

However, I am not sure what the other jobs of the semantic analysis phase of a compiler for a dynamic language are. It seems to have very little to do in dynamic languages, because most of the work happens at run time. What other common jobs does the semantic analyzer take care of for dynamic languages? I feel as if I am missing a large part of semantic analysis. Thank you.

Solution

You're right, many analysis tasks don't exist in dynamic language compilers (that's why they are relatively simple to implement). However, there are a few more tasks I can think of:

  • Scoping. It is correct that the type and sometimes even the existence of variables is determined dynamically, but at least for Lua and Python there is one part of scoping that can (and should, if you don't want to complicate the implementation needlessly) be done at compile time: the scope of non-global variables.

    • What has to be analyzed? That part is easy in Lua, as there is an explicit local keyword (though it still requires the compiler to be aware of it!), and it requires relatively extensive analysis in Python, where an assignment implicitly makes a variable local and two keywords (global and nonlocal in 3.x; only global in 2.x) change that behaviour.

    • Why does it matter? In Python, accessing a local variable that hasn't been initialized yet is as much of an error as accessing a non-existent global, but it is a different error (an UnboundLocalError instead of a NameError). In Lua, both reads yield nil, and local doesn't change the scope of previous assignments, but the semantics of subsequent reads/writes still change. Also, the bytecode instructions emitted for locals and globals are very different in both cases (see the first sketch after this list).

  • Optimizations. Well, obviously you can have only limited (if any, in some cases) information about what variables/"constants" contain. Nonetheless, at least CPython has a wide variety of constant folding and bytecode optimization passes (see peephole.c), and even Lua, with its insanely fast one-pass compiler, does some constant folding on arithmetic instructions (a short demonstration follows this list). And the PyPy interpreter (independently of its JIT) introduced a CALL_LIKELY_BUILTIN opcode that is emitted for calls to globals that, judging by their names, are probably builtin functions. It's obvious that this requires some scope analysis.

  • As you said yourself, complaining about the few constructs that are forbidden at compile time. However, this could be counted under parsing as well (many of these rules are actually encoded in the grammar). Another example, one that isn't easily encoded in the grammar, is duplicate function parameter names (demonstrated in the last sketch below).
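
To see the Python scoping point concretely, here is a small runnable example (CPython; the exact error messages vary by version). The compiler marks x in f as local because of the assignment below the read, so the early read raises UnboundLocalError, while the read in g compiles to a global lookup and raises NameError:

    def f():
        print(x)   # x is local to f (it is assigned below), so this
        x = 1      # read raises UnboundLocalError

    def g():
        print(y)   # y is never assigned inside g, so it compiles to a
                   # global lookup and raises NameError instead

    for func in (f, g):
        try:
            func()
        except Exception as exc:
            print(type(exc).__name__)   # UnboundLocalError, then NameError

In CPython you can also see the bytecode difference mentioned above: dis.dis(f) reads x with LOAD_FAST, while dis.dis(g) reads y with LOAD_GLOBAL.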
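The constant-folding point is also easy to observe in CPython (the exact disassembly format and opcodes differ between versions):

    import dis

    # In CPython the compiler folds 2 * 3 at compile time, so the
    # disassembly shows a LOAD_CONST of 6 instead of a multiplication.
    dis.dis(lambda: 2 * 3)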
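And the duplicate-parameter case can be checked directly, since CPython rejects it at compile time rather than at run time (the exact message may differ between versions):

    # Compiling the definition is enough to trigger the error;
    # the function never has to be called.
    try:
        compile("def f(a, a): pass", "<test>", "exec")
    except SyntaxError as exc:
        print(exc.msg)   # e.g. duplicate argument 'a' in function definition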
