Module system for OOP language
https://softwareengineering.stackexchange.com/questions/213055
Question
I'm designing a simple OO programming language.
It's statically typed, compiled, and executed by a VM - similar to Java.
The difference is that I don't want to have such a strong emphasis on OOP. The code itself will mostly resemble C++ (classes, functions, and variables allowed in the file scope).
One of the things that I need to have is a module system. I have the following figured out:
- Every file is a module (once compiled) - like Python
- Programmers must import a module with the
import
keyword, which causes the compiler to search for modules in standard directories and the file directory (the VM has to do this at runtime as well)
And now I have no idea how should I introduce the concept of submodules and module hierarchy.
One option, for example, is to depend on the directory hierarchy, so that import engine.graphics.renderer
would expect to find a directory called "engine" in the working directory, and inside a directory called "graphics", with a module called "renderer".
What are the drawbacks of such a design? Am I missing anything?
Solution
Take a look at Python's package/module hierarchy/organization, and especially in historical aspect, major additions over the course of years, most importantly the latest ones. Probably, there is no sense in inventing the wheel.
I do not know how far you want to go with your language, eg
- Do you want modules bytecode to be read from zip?
- How do you separate modules/libraries for different interpreter versions?
- Do you plan to make it seemless with distros?
- Do not forget easiness of language's own distribution (think distutils / pypi - each platform/language has its own, not always good)
I guess, Java may be another interesting example. Things can be learned from Erlang's way as well (eg https://stackoverflow.com/questions/2968914/code-hot-swapping-in-erlang).
There is a whole lot of design issues (some touched above) if you plan to see your programming language mainstream one day. Fortunately, there are great examples out there.
Some directions programming language module/package/library system design should address:
- intuitive code for both writing and using the module
- module/package objects in the code
- module/package introspection capabilities
- module/package encapsulation (how to hide unnecessary details?)
- use of interfaces/header files?
- fine-graned dispatching system, which both language own distribution utils as well as system level utils can use (avoid "DLL hell")
- storage places for bytecompiled modules
- setup system, which automates compiling both "pure" modules and modules/dependencies
- canonical place of compiled code, tests, documentation, assets in your module
OTHER TIPS
First of all, let's assume that you map your namespaced model to another namespace, such as the filesystem, as you suggested. Second, I'm assuming that modules can import other modules. In other words, engine.graphics.renderer
could very well contain a line like import circle.arc
. That raises two issues:
Where would the VM look for
circle.arc
? According to to the filesystem mappimg mentioned, there should be a directory like/etc/mylang/modules/circle/arc
(/etc/mylang/modules
being the root of your module structure).How would an application reference
circle.arc
: byimport
ingcircle.arc
orengine.graphics.render.circle.arc
? The first would "spoil" (if not ruin) the hierarchy, becausecircle.arc
is clearly a submodule ofengine.graphics.renderer
, and the second would imply that there should be a/etc/mylang/modules/engine/graphics/renderer/circle/arc
in your filesystem, placingcircle.arc
in two locations simultaneously.
Having said that, and should you decide to go with the namespace approach, mapping it to the filesystem seems too restrictive to me. I think modules can reside in all different kinds of places (even zip files, as already mentioned, even urls). The VM would start by looking up an entry in some kind of index (maybe a config file) that would map the namespace to actual module locations.