Question

From what I know, CPython programs are compiled into intermediate bytecode, which is executed by the virtual machine. How, then, does one identify that CPython is written in C without knowing it beforehand? Isn't there some common DNA for both which can be matched to identify this?


Solution 2

Python isn't written in C. Arguably, Python is written in an esoteric English dialect using BNF.

However, all the following statements are true:

  1. Python is a language, consisting of a language specification and a bunch of standard modules
  2. Python source code is compiled to a bytecode representation
  3. this bytecode could in principle be executed directly by a suitably-designed processor but I'm not aware of one actually existing
  4. in the absence of a processor that natively understands the bytecode, some other program must be used to translate the bytecode to something a hardware processor can understand
  5. one real implementation of this runtime facility is CPython
  6. CPython is itself written in C, but ...
    1. C is a language, consisting of a language specification and a bunch of standard libraries
    2. C source code is compiled to some bytecode format (typically something platform-specific)
    3. this platform specific format is typically the native instruction set of some processor (in which case it may be called "object code" or "machine code")
    4. this native bytecode doesn't retain any magical C-ness: it is just instructions. It doesn't make any difference to the processor which language the bytecode was compiled from
    5. so the CPython executable which translates your Python bytecode is a sequence of instructions executing directly on your processor
    6. so you have: Python bytecode being interpreted by machine code being interpreted by the hardware processor
  7. Jython is another implementation of the same Python runtime facility
  8. Jython is written in Java, but ...
    1. Java is a language, consisting of a spec, standard libraries etc. etc.
    2. Java source code is compiled to a different bytecode
    3. Java bytecode is also executable either on suitable hardware, or by some runtime facility
    4. The Java runtime environment which provides this facility may also be written in C
    5. so you have: Python bytecode being interpreted by Java bytecode being interpreted by machine code being interpreted by the hardware processor
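
The compile-then-interpret steps in the list above can be observed from inside Python itself. What follows is a minimal sketch, assuming it runs on CPython (other implementations may expose a different bytecode, or none); the one-line program and the filename `<example>` are made up for illustration. Instruction names differ between CPython versions, so the listing is printed rather than quoted here.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = "x = 1 + 2"

# Compile the source text into a code object that holds the bytecode.
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is an ordinary bytes string: just instructions,
# with nothing in it that says which language the interpreter is written in.
raw_bytecode = code_obj.co_code
print(type(raw_bytecode).__name__, len(raw_bytecode))

# Decode the instruction names (these vary between CPython versions).
opnames = [instr.opname for instr in dis.get_instructions(code_obj)]
print(opnames)

# Hand the code object back to the interpreter to execute it.
namespace = {}
exec(code_obj, namespace)
print(namespace["x"])  # prints 3
```

The same code object could in principle be executed by any runtime that implements the same bytecode specification; nothing in it refers to C.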

You can add more layers indefinitely: consider that your "hardware processor" may really be a software emulation, or that hardware processors may have a front-end that decodes their "native" instruction set into another internal bytecode.

All of these layers are defined by what they do (executing or interpreting instructions according to some specification), not how they implement it.
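
To make the layering concrete, here is a toy interpreter written in Python. It is only a sketch: the instruction set (`PUSH`, `ADD`, `MUL`) and the function name `run` are invented for this example and have nothing to do with real Python or Java bytecode. The point is that the toy "bytecode" is defined by what each instruction does, and any program in any language that honours those rules would be an equally valid implementation.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on a simple stack machine."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            # Put a constant on top of the stack.
            stack.append(arg)
        elif opcode == "ADD":
            # Replace the top two values with their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            # Replace the top two values with their product.
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode!r}")
    # The result of the program is whatever is left on top of the stack.
    return stack.pop()


# Toy program equivalent to (2 + 3) * 4.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
print(result)  # prints 20
```

When this runs under CPython, the stack is already several layers deep: toy instructions interpreted by Python bytecode, interpreted by machine code, executed by the processor.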

Oh, and I skipped over the compilation step. The C compiler is typically written in C (and getting any language to the stage where it can compile itself is traditionally significant), but it could just as well be written in Python or Java. Again, the compiler is defined by what it does (transforms some source language to some output such as a bytecode, according to the language spec), rather than how it is implemented.
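
As an illustration of a compiler being defined by its input and output rather than by its implementation language, here is a toy compiler written in Python. It is a sketch under stated assumptions: the source language is limited to numeric constants with `+` and `*`, the output format of `(opcode, argument)` pairs is invented for this example, and the standard `ast` module is borrowed as a ready-made parser. A program producing the same output from the same input would be the same compiler, whether written in C, Java, or anything else.

```python
import ast


def compile_expr(source):
    """Translate an arithmetic expression into made-up stack instructions."""
    tree = ast.parse(source, mode="eval")
    instructions = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            # A literal number becomes a single PUSH.
            instructions.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            # Operands first, operator last (postfix order).
            emit(node.left)
            emit(node.right)
            opcode = "ADD" if isinstance(node.op, ast.Add) else "MUL"
            instructions.append((opcode, None))
        else:
            raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    emit(tree.body)
    return instructions


compiled = compile_expr("(2 + 3) * 4")
print(compiled)
# prints [('PUSH', 2), ('PUSH', 3), ('ADD', None), ('PUSH', 4), ('MUL', None)]
```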

Other tips

The interpreter is written in C.

It compiles Python code into bytecode, and then an evaluation loop interprets that bytecode to run your code.

You identify what Python is written in by looking at its source code. See the source for the evaluation loop for example.

Note that the Python.org implementation is but one Python implementation. We call it CPython, because it is implemented in C. There are other implementations too, written in other languages. Jython is written in Java, IronPython in C#, and then there is PyPy, which is written in a (subset of) Python, and runs many tasks faster than CPython.
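
If the goal is simply to find out which implementation is running a given script, the standard library can report it. This short sketch uses only `platform` and `sys` from the standard library; the values printed depend on the interpreter and build, so no output is shown. Note that this reports what the interpreter says about itself. It is not a way of detecting C by inspecting the bytecode.

```python
import platform
import sys

# Name of the implementation, e.g. "CPython", "PyPy", "Jython" or "IronPython".
implementation = platform.python_implementation()

# Lowercase identifier of the same implementation, e.g. "cpython".
internal_name = sys.implementation.name

# Description of the compiler that was used to build this interpreter.
build_compiler = platform.python_compiler()

print(f"implementation: {implementation}")
print(f"sys.implementation.name: {internal_name}")
print(f"built with: {build_compiler}")
```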

I found a good explanation that resolved my original question here: http://amitsaha.github.io/site/notes/articles/c_python_compiler_interpreter.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow