How do modern VMs handle memory allocation?

https://stackoverflow.com/questions/11324117

19-06-2021

Question

I'm working on a simple stack machine written in C, mostly for learning purposes. After using malloc/free for my memory operations, I thought it would be a good idea to read some memory allocation specific code from modern virtual machines.

I downloaded the Lua source code and started reading it. After a while, I realized that it relies heavily on macros, and I couldn't find the code where the real memory allocation is done (i.e. a malloc call).

find . -exec grep -i "malloc" '{}' \; -print

It printed only some Lua macros that have the word malloc in their names. The Lua VM (and programming language) doesn't use malloc at all!

So this leads me to the question: how do modern VMs handle memory allocation? How does Lua allocate memory from the heap? Are there any ways for allocation other than malloc? What are the pros/cons of other methods?

I'm also wondering about best practices, design patterns, etc. for safely working with allocated memory. I see in Lua's source that there is a lot of indirection before memory is allocated. Where can I learn about this stuff?

Solution

Lua most definitely uses malloc, in the form of realloc (you can also pass in a custom allocator). However, because Lua uses a GC, like most VM-based languages, it uses macros to automatically add the GC header block to each allocation.
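To make the "GC header" idea concrete, here is a minimal sketch in C. It is not Lua's code: the names gc_object, gc_heap, gc_new and gc_sweep are made up for illustration, and it calls malloc directly to stay short. The only point is that every collectable object starts with the same header, so the collector can walk and free objects without knowing their concrete type. Lua gets a similar effect with a macro (CommonHeader) pasted at the start of each object struct; the sketch embeds a header struct instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header shared by every collectable object (hypothetical layout). */
typedef struct gc_object {
    struct gc_object *next;  /* links all live objects into one list */
    unsigned char type;      /* type tag, meaning is up to the VM */
    unsigned char marked;    /* set by the mark phase if reachable */
} gc_object;

/* Example object type: the header must be the first member. */
typedef struct vm_number {
    gc_object hdr;
    double value;
} vm_number;

typedef struct gc_heap {
    gc_object *all;  /* head of the list of all allocated objects */
    size_t count;
} gc_heap;

/* Allocate an object of `size` bytes and register it with the heap. */
static void *gc_new(gc_heap *h, size_t size, unsigned char type) {
    assert(size >= sizeof(gc_object));
    gc_object *o = malloc(size);
    if (o == NULL) return NULL;
    o->type = type;
    o->marked = 0;
    o->next = h->all;
    h->all = o;
    h->count++;
    return o;
}

/* Free every unmarked object, clear the mark on survivors. */
static size_t gc_sweep(gc_heap *h) {
    size_t freed = 0;
    gc_object **link = &h->all;
    while (*link != NULL) {
        gc_object *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next;
        } else {
            *link = o->next;
            free(o);
            h->count--;
            freed++;
        }
    }
    return freed;
}
```

A real collector would also need a mark phase that starts from the VM's roots; here the caller sets hdr.marked by hand.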

You'll find that Lua's memory is all handled by the luaM_ routines in lmem.c and lmem.h. These all use the global state of the VM to store an allocator, which is set to l_alloc (from lauxlib.c) when the state is created with luaL_newstate, but can be changed with lua_setallocf.
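The pattern described here (allocator stored in the VM state, call sites going through macros) also explains why grepping for malloc finds nothing. A rough sketch of that pattern follows. All names (vm_state, vm_realloc, vm_new, vm_free) are hypothetical and only modeled on the description above; this is not Lua's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature with realloc-like semantics (hypothetical typedef). */
typedef void *(*vm_alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

/* Hypothetical VM state: the allocator lives here, not in a global. */
typedef struct vm_state {
    vm_alloc_fn frealloc;  /* every allocation goes through this pointer */
    void *ud;              /* opaque user data passed back to the allocator */
    size_t total_bytes;    /* bytes currently in use, e.g. for GC pacing */
} vm_state;

/* Default allocator: free when the new size is 0, otherwise realloc. */
static void *default_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud;
    (void)osize;
    if (nsize == 0) {
        free(ptr);
        return NULL;
    }
    return realloc(ptr, nsize);
}

/* Single choke point: allocate, resize or free a block. */
static void *vm_realloc(vm_state *vm, void *block, size_t osize, size_t nsize) {
    void *newblock = vm->frealloc(vm->ud, block, osize, nsize);
    if (newblock == NULL && nsize > 0)
        return NULL;  /* a real VM might run the GC and retry, or raise an error */
    vm->total_bytes = vm->total_bytes - osize + nsize;
    return newblock;
}

/* Convenience macros, so call sites never name the allocator directly. */
#define vm_new(vm, type)  ((type *)vm_realloc((vm), NULL, 0, sizeof(type)))
#define vm_free(vm, ptr)  ((void)vm_realloc((vm), (ptr), sizeof(*(ptr)), 0))
```

With this layout, the only place that names realloc and free is the default allocator, and swapping allocators means changing one function pointer in the state.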

Recently, LuaJIT added allocation sinking, and there are plans for some really cool memory features, which you can read about in this article on LuaJIT Garbage Collection. The article covers a lot of strategy and design revolving around VM/JIT memory allocation, sinking, aggregation, and garbage collection.

As you can see, memory allocation and sinking strategies are very closely linked to the GC one employs (if any).

In terms of the pros and cons of various memory allocators: standard malloc is simple to use, but it costs speed and wastes memory on alignment padding and on the extra bookkeeping blocks tagged on to each allocation.

Moving to more advanced arena, pool, slab and block allocators, we can speed things up dramatically (especially for fixed-size internal VM allocations) and avoid a lot of the fragmentation and overhead that can occur with more general allocators such as malloc. Of course, these allocators are more complex, and you have to debug them yourself if you start from scratch (which in a bigger system like a VM is just asking for problems), as opposed to relying on the tried-and-tested CRT malloc implementation.
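As an illustration of the fixed-size case, here is a minimal pool allocator sketch. The names (pool, pool_init, pool_alloc, pool_free, pool_destroy) are invented for this example, and it assumes a C11 compiler for max_align_t. One malloc call provides all the slots, and free slots are chained into a list threaded through the slots themselves, so allocating and freeing are each a couple of pointer operations with no per-object header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct pool {
    unsigned char *memory;  /* one big block obtained from malloc */
    void *free_list;        /* free slots, linked through their first bytes */
    size_t slot_size;       /* size of each slot after rounding up */
} pool;

/* Create a pool of `count` slots, each large enough for `obj_size` bytes. */
static int pool_init(pool *p, size_t obj_size, size_t count) {
    size_t align = sizeof(max_align_t);
    if (obj_size < sizeof(void *))
        obj_size = sizeof(void *);  /* a free slot must hold a link pointer */
    size_t slot = (obj_size + align - 1) / align * align;
    if (count == 0 || slot > SIZE_MAX / count)
        return -1;
    p->memory = malloc(slot * count);
    if (p->memory == NULL)
        return -1;
    p->slot_size = slot;
    p->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        void *s = p->memory + i * slot;
        *(void **)s = p->free_list;  /* push slot onto the free list */
        p->free_list = s;
    }
    return 0;
}

/* Pop a slot from the free list; returns NULL when the pool is exhausted. */
static void *pool_alloc(pool *p) {
    void *s = p->free_list;
    if (s != NULL)
        p->free_list = *(void **)s;
    return s;
}

/* Return a slot to the pool. `s` must have come from pool_alloc on `p`. */
static void pool_free(pool *p, void *s) {
    assert(s != NULL);
    *(void **)s = p->free_list;
    p->free_list = s;
}

/* Release the whole pool at once. */
static void pool_destroy(pool *p) {
    free(p->memory);
    p->memory = NULL;
    p->free_list = NULL;
}
```

The trade-off mentioned above is visible even at this size: there is no protection against double frees or against freeing a pointer that belongs to another pool, which malloc implementations often detect for you.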

OTHER TIPS

The Lua core does not use malloc and friends. It relies on a user-supplied memory allocation function that has realloc-like semantics (but is more precise when treating NULL pointers and sizes of 0). See lua_Alloc.
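For a feel of what such a function looks like, here is a sketch of an allocator that follows the lua_Alloc parameter list (ud, ptr, osize, nsize) and enforces a memory budget. The mem_budget struct and the limited_alloc name are made up for this example, and the code does not include lua.h so that it stands alone. It assumes the caller always passes the true old size in osize, as the Lua core does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user data: a simple memory budget. */
typedef struct mem_budget {
    size_t used;   /* bytes currently allocated through limited_alloc */
    size_t limit;  /* maximum bytes allowed at any one time */
} mem_budget;

/* Same parameter list as lua_Alloc: nsize == 0 frees, otherwise (re)allocate. */
static void *limited_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    mem_budget *b = ud;
    if (ptr == NULL)
        osize = 0;  /* for new blocks, Lua 5.2+ passes an object kind here, not a size */
    if (nsize == 0) {
        free(ptr);  /* free(NULL) is a no-op */
        b->used -= osize;
        return NULL;
    }
    if (nsize > osize && nsize - osize > b->limit - b->used)
        return NULL;  /* over budget; the old block is left untouched */
    void *np = realloc(ptr, nsize);
    if (np == NULL)
        return NULL;
    b->used = b->used - osize + nsize;
    return np;
}
```

In Lua 5.1 through 5.4, a function like this would be handed to the core with lua_newstate(limited_alloc, &budget), where budget is a mem_budget that outlives the Lua state.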

The auxiliary Lua library provides a convenience luaL_newstate function that creates a Lua state via the core lua_newstate function using a memory allocation function based on standard realloc and free. Other clients can use whatever memory allocation is suitable for their app.

Licensed under: CC-BY-SA with attribution
