Loading/unloading ELF sections on demand?

https://stackoverflow.com/questions/1649084

linux
elf

22-07-2019
|

Question

For a rather obscure use case I'd like to have a (large) statically linked Linux executable made up of a small piece of control code and large pieces of static (read-only) data. Is it possible, to save memory, to get the loader to load only the sections for the control code, and then manually load the sections of RO data as they are needed, and unload them again once the processing is done?

Is this possible?

(I suppose data streams (on the filesystem level) could be used to solve this, but they aren't available to me (EXT3) and distribution would be tricky since data streams easily get lost.)

Solution

This is (very probably) already taken care of for you.

The real answer of course will be system-dependent, but in general, modern operating systems (and certainly Linux) use demand paging for executables, so no RAM will be actually allocated for sections of the ELF file you don't reference.

OTHER TIPS

Instead of linking your blobs into the binary, append them to it. They won't be mapped, but you can read or map them as you need them.

Some answers are somewhat misleading as they imply that the whole binary will be mapped. No, that's wrong. Not everything will be mapped!

Proof:

$ cat /proc/self/maps
**08048000**-08052000 r-xp 00000000 08:03 78433      /bin/cat
**08052000**-08053000 rw-p 0000a000 08:03 78433      /bin/cat
...

That's the only two parts from the that will be mapped. Why? Because these are the only ones that have the type DT_LOAD:

$ readelf -l /bin/cat

Elf file type is EXEC (Executable file)
Entry point 0x8049cf8
There are 8 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
  INTERP         0x000134 0x08048134 0x08048134 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 **0x08048000** 0x08048000 0x09ba0 0x09ba0 R E 0x1000  <<
  LOAD           0x00a000 **0x08052000** 0x08052000 0x00228 0x00804 RW  0x1000  <<
  DYNAMIC        0x00a014 0x08052014 0x08052014 0x000c8 0x000c8 RW  0x4
  NOTE           0x000148 0x08048148 0x08048148 0x00044 0x00044 R   0x4
  GNU_EH_FRAME   0x008c48 0x08050c48 0x08050c48 0x002a4 0x002a4 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

You'll also notice that the virtual address is the same as defined in the ELF file.

In practice you'll only have access to the first 40 KiB (0x8052000-0x8048000 = 40960 Bytes) of the file. This is enough for the ELF headers but you won't be able to access say the DWARF .debug headers, let alone the string table (.strtab).

If you want access to all ELF sections, the easiest would be to map the whole file.

No, if it's part of the ELF file, it will be mapped. I'm not sure of the particulars of ELF, but if this was a PE file, you could simply tag on your added data to the end of the PE file, beyond the PE structure. Data that isn't part of the PE structure doesn't get mapped to memory in windows. I suspect the same exists in ELF.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow