LLVM, Parrot, JVM, PyPy + python

https://stackoverflow.com/questions/5328295

26-10-2019

Question

What stands in the way of implementing a language such as Python on top of an optimizing platform like LLVM or Parrot?

PyPy, LLVM and Parrot are the main technologies for building a common language platform.
This is how I see them:

PyPy - a framework for building VMs, with a built-in optimized VM for Python.
So it is quite a general solution. The process goes as listed below:
1. leaving PyPy code and:
  a. a PyPy backend for some VM (like the JVM)
  b. some kit to build your own VM
  c. processing/running PyPy internal code

Am I right about this process? Is there an optimized VM for Python? In particular, is there by default a built-in VM for optimized PyPy code (step 5.c), which is meant for Python, so that processing for every language can stop there and be run by it?

Parrot - much like PyPy, but without 5.a and 5.b? It has some internal improvements for dynamic processing (Parrot Magic Cookies).

Both Parrot and PyPy are designed to provide a common runtime for dynamic languages, but PyPy wants more: it also wants to generate further VMs.
What is the point of that in PyPy? Why do we need to create more VMs? Wouldn't it be better to focus on one VM (as Parrot does), since there is a single common code level anyway, either PyPy internal bytecode or Parrot bytecode? I don't think we gain anything by translating PyPy bytecode for VMs newly created with PyPy.

LLVM - I see this as very similar to PyPy, but without the VM generator.
It is a mature, well-designed environment with goals similar to PyPy's, but it works on low-level structures and has great optimization/JIT techniques implemented.

I see it this way: LLVM is general purpose, while Parrot and PyPy are designed for dynamic languages. Is it easier to introduce complicated techniques in PyPy / Parrot because they are higher level than LLVM, like a sophisticated compiler that understands high-level code better and so produces better assembly code (which humans can't write in reasonable time) than LLVM does?

Questions:

Am I right? Is there any reason why porting a dynamic language to LLVM would be better than porting it to, for example, Parrot?
I haven't seen any activity on developing Python on Parrot. Is that because Python C extensions don't work on Parrot? PyPy has the same problem.
Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under very active development and has big companies investing in it.
I know the problem might be the need to recompile existing resources if the bytecode has to change, but that is not obligatory: we could port old bytecode to the new one while new compilers produce the new bytecode (nevertheless Java would still need to interpret its own bytecode, so the frontend could check it and translate it to the new bytecode)?
What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?
Can you correct me if I'm wrong somewhere?

Some additions:

=============

CLARIFICATION

I want to figure out how all this software fits together, and what the problems are in porting one to another.

Solution

That's not something anybody can possibly answer in a Stack Overflow question, but I'll give it a minimal shot.

First, what problems do the three projects solve?

PyPy lets you implement an interpreter in a high-level language and get a generated JIT for free. The good thing about this is that you don't have a dependency mismatch between the language and the platform. That's the reason why pypy-clr is faster than IronPython. (More info here: http://codespeak.net/pypy/dist/pypy/doc/extradoc.html --> High performance implementation of Python for CLI/.NET with JIT compiler generation for dynamic)
LLVM is a low-level infrastructure for compilers. The general idea is to have one "high-level assembly" on which all the optimizations work. Around that there is a ton of infrastructure to help you build compilers (JIT or AOT). Implementing a dynamic language on LLVM is possible but needs more work than implementing it on PyPy or Parrot. For example, you don't get a GC for free (there are GCs you can use together with LLVM; see http://llvm.org/devmtg/2009-10/ --> the VMKit video). There are attempts to build a platform better suited to dynamic languages on top of LLVM: http://www.ffconsultancy.com/ocaml/hlvm/
I don't know that much about Parrot, but as I understand it they want to build one standard VM specialized for dynamic languages (Perl, PHP, Python ...). The problem here is the same as with compiling to the JVM/CLR: there is a dependency mismatch, just a much smaller one. The VM still does not know the semantics of your language. As I understand it, Parrot is still pretty slow for user code (http://confreaks.net/videos/118-elcamp2010-parrot).
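
To make the PyPy point more concrete, here is a minimal sketch of what "an interpreter written in a high-level language" can look like. It is plain Python and is not run through PyPy's translation toolchain; the opcode names and the run function are made up for illustration. A real RPython interpreter has to respect RPython's restrictions and carries extra hints for the JIT generator, which are left out here.

```python
# Hypothetical toy instruction set; the opcode names are illustrative only.
PUSH, ADD, MUL, HALT = range(4)


def run(program):
    """Interpret a list of (opcode, argument) pairs on a value stack."""
    stack = []
    pc = 0  # index of the next instruction
    while True:
        op, arg = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# Bytecode for (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)]
result = run(program)
```

The interpreter loop is the only description of the language's semantics. As described above, the idea behind PyPy is that a JIT is generated from such a description instead of being written by hand.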

The answer to your question:

Am I right? Is there any reason why porting a dynamic language to LLVM would be better than porting it to, for example, Parrot?

That's a question of effort. Building everything yourself, specialized for your language, will eventually be faster, but it's a LOT more effort.

I haven't seen any activity on developing Python on Parrot. Is that because Python C extensions don't work on Parrot? PyPy has the same problem.

Targeting Parrot would (at this point) probably have no benefit over PyPy. Why nobody else does it, I don't know.

Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under very active development and has big companies investing in it.

OK, there is a lot of stuff in that question.

Like I said, LLVM is hard to move to and Parrot is not that fast (correct me if I'm wrong).
Ruby has Rubinius, which tries to do a lot in Ruby and JITs to LLVM (http://llvm.org/devmtg/2009-10/ --> Accelerating Ruby with LLVM).
There are implementations of the CLR/JVM on LLVM, but both already have very mature implementations with big investments behind them.
LLVM is not high level.
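
To illustrate what "not high level" means in practice, here is a toy sketch of an optimizer pass over a made-up three-address IR. This is not LLVM's real IR or API; the instruction names add_i64 and call_runtime_add and the function fold_constants are assumptions for illustration. The pass can fold an addition whose operands are known integers, but a dynamic language's "a + b" reaches this level as an opaque call into a runtime helper, which the pass has to leave alone.

```python
def fold_constants(instructions):
    """Fold add_i64 on known integer operands; pass everything else through.

    Each instruction is a tuple (dest, op, lhs, rhs). Operands are either
    int constants or the names of earlier results.
    """
    known = {}      # result name -> constant value proven at compile time
    remaining = []  # instructions that could not be folded away
    for dest, op, lhs, rhs in instructions:
        # Replace operand names by their constant value when it is known.
        lhs = known.get(lhs, lhs)
        rhs = known.get(rhs, rhs)
        if op == "add_i64" and isinstance(lhs, int) and isinstance(rhs, int):
            known[dest] = lhs + rhs
        else:
            remaining.append((dest, op, lhs, rhs))
    return remaining, known


# Statically typed input: everything folds down to constants.
static_code = [("t0", "add_i64", 2, 3), ("t1", "add_i64", "t0", 4)]
static_rest, static_known = fold_constants(static_code)

# Dynamic-language input: the types of a and b are unknown, so the front end
# emitted a call to a runtime helper, which this pass cannot see through.
dynamic_code = [("t0", "call_runtime_add", "a", "b")]
dynamic_rest, dynamic_known = fold_constants(dynamic_code)
```

In this sketch the static code disappears entirely, while the dynamic code comes out unchanged. The knowledge of what "+" means for the language lives in the runtime helper, not in the IR.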

I know the problem might be the need to recompile existing resources if the bytecode has to change, but that is not obligatory: we could port old bytecode to the new one while new compilers produce the new bytecode (nevertheless Java would still need to interpret its own bytecode, so the frontend could check it and translate it to the new bytecode)?

I have no idea what the question is.

What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?

Watch the VMKit video I linked above; it shows how far they got, what the problem is, and how they solved it.

Can you correct me if I'm wrong somewhere?

A lot of what you wrote is wrong, or I just don't understand what you mean, but the material I linked should make many things clearer.

Some examples:

Clojure

The creator didn't want all the work of implementing his own VM and all the libraries. So where to go? Since Clojure is a new language, it could be designed to work well on a platform like the JVM by restricting a lot of the dynamic features that a language like Python or Ruby has.

Python

The language can't (practically) be changed to work well on the JVM/CLR, so implementing Python on those won't bring massive speedups. A static compiler won't work very well either, because there are not many static guarantees. Writing a JIT in C will be fast but very hard to change (see the Psyco project). Using the LLVM JIT could work and is being explored by the Unladen Swallow project (again http://llvm.org/devmtg/2009-10/ --> Unladen Swallow: Python on LLVM). Some people wanted to have Python in Python, so they started PyPy, and their idea seems to work really well (see above). Parrot could work as well, but I have not seen anybody try (feel free).
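
A small illustration of the "not many static guarantees" point. The class and function names are made up for this example; the behaviour shown is standard Python.

```python
class Point:
    def __init__(self, x):
        self.x = x

    def double(self):
        return self.x * 2


def call_double(obj):
    # Nothing in this function tells a static compiler which code
    # obj.double refers to, or even that obj is a Point.
    return obj.double()


def add(a, b):
    # The meaning of + depends on the runtime types of a and b.
    return a + b


p = Point(3)
first = call_double(p)

# The method can be replaced while the program runs, so a call compiled
# against the old definition would now be wrong.
Point.double = lambda self: self.x + 100
second = call_double(p)

int_sum = add(1, 2)
str_sum = add("a", "b")
```

The same call site produces different results before and after the method is rebound, and the same add function does integer addition in one call and string concatenation in the next. Any ahead-of-time compilation has to stay correct for all of these cases.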

On everything:

I think you're confused, and I can understand that. Take your time and read, listen to and watch everything you can get. Don't stress yourself. There are a lot of parts to this; eventually you will see how they fit together and what makes sense, and even when you know a lot there is still plenty left to discuss. The question of where to implement a new language, or how to speed up an old one, has many answers, and if you ask three people you're likely to get three different answers.

OTHER TIPS

What are you trying to implement? Your question is very confusingly worded (I realize English is likely not your first language).

LLVM and PyPy are both mature, useful projects, but really don't overlap much at this point. (At one point, PyPy could generate LLVM bytecode—which was statically compiled to an interpreter—as opposed to C code, but it didn't provide much of a performance benefit and is no longer supported.)

PyPy lets you write an interpreter in RPython and use that as a description to generate a native code interpreter or JIT; LLVM is a C++ framework for building a compiler toolchain which can also be used to implement a JIT. LLVM's optimizers, code generation and platform support are significantly more advanced than those of PyPy, but it isn't as well suited to building a dynamic language runtime (see the Unladen Swallow retrospective for some examples of why). In particular, it is not as effective at collecting/using runtime feedback (which is absolutely essential for making dynamic languages perform well) as PyPy's trace-based JIT. Also, LLVM's garbage collection support is still somewhat primitive, and it lacks PyPy's unique ability to automatically generate a JIT.
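
As a rough illustration of what "collecting and using runtime feedback" means, here is a toy sketch of a single "+" site that records the operand types it sees and switches to a guarded fast path once one type pair becomes hot. This is not PyPy's tracing JIT and not LLVM code; the class AddSite, the threshold and the counters are hypothetical, and the "fast path" is only simulated, since both paths end up doing the same Python addition.

```python
HOT_THRESHOLD = 3  # hypothetical number of observations before specializing


class AddSite:
    """One '+' site that profiles operand types and specializes when hot."""

    def __init__(self):
        self.seen = {}           # (type, type) -> number of observations
        self.specialized = None  # type pair the fast path currently assumes
        self.fast_hits = 0       # executions that took the guarded fast path
        self.guard_failures = 0  # times the assumption turned out wrong

    def execute(self, a, b):
        pair = (type(a), type(b))
        if self.specialized is not None:
            if pair == self.specialized:
                # Guard passed: a real JIT would run type-specific
                # machine code here.
                self.fast_hits += 1
                return a + b
            # Guard failed: drop the assumption, go back to profiling.
            self.guard_failures += 1
            self.specialized = None
        # Generic path: record feedback, then do the fully dynamic add.
        count = self.seen.get(pair, 0) + 1
        self.seen[pair] = count
        if count >= HOT_THRESHOLD:
            self.specialized = pair
        return a + b


site = AddSite()
results = [site.execute(i, 1) for i in range(5)]  # int + int, five times
text = site.execute("a", "b")                     # str + str breaks the guard
```

The first three integer additions run on the generic path and build up the profile. The next two hit the specialized path, and the string addition fails the guard and falls back to the generic path. A runtime that cannot cheaply gather and act on this kind of information has to treat every operation as fully dynamic.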

Incidentally two Java implementations are built on LLVM—J3/VMKit and Shark.

You might consider watching the PyPy talk from Stanford last week; it provides a pretty decent overview of how PyPy works. Carl Friedrich Bolz's presentation also provides a good overview of the state of VM implementation.

The main reason? Because VM design is not a settled technology, and having a variety of VMs with different goals and objectives allows a variety of mechanisms to be tried in parallel rather than all having to be tried in series.

The JVM, CLR, PyPy, Parrot, LLVM and the rest all target different kinds of problems in different ways. It's similar to the reasons why Chrome, Firefox, Safari and IE all use their own JavaScript engines.

Unladen Swallow attempted to apply LLVM to CPython, and spent more of their time fixing issues in LLVM than they did in doing anything Python specific.

Python-on-Parrot suffered from semantic differences between Perl 6 and Python causing problems with the front-end compilation process, so future efforts in this area are likely to use the PyPy front-end to target the Parrot VM.

Different VM developers certainly keep an eye on what the others are doing, but even when they lift good ideas they will put their own spin on them before incorporating them.

Licensed under: CC-BY-SA with attribution