Question

I'm working on a utility which needs to resolve hex addresses to a symbolic function name and source code line number within a binary. The utility will run on Linux on x86, though the binaries it analyzes will be for a MIPS-based embedded system. The MIPS binaries are in ELF format, using DWARF for the symbolic debugging information.

I'm currently planning to fork an objdump process, passing in a list of hex addresses and parsing the output to get function names and source line numbers. I have compiled an objdump with support for MIPS binaries, and it is working.
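For reference, the objdump plan described above might be sketched roughly as follows. This assumes GNU objdump is run with -d -l (disassemble with line numbers) and that its listing interleaves `name():` lines, `file:line` lines, and instruction lines in the usual layout. The names `parse_listing` and `resolve` and the `objdump` parameter are placeholders, not part of any existing tool, and disassembling the whole binary for every lookup is slow.

```python
import re
import subprocess

# Assumed shapes of the lines in an `objdump -d -l` listing.
INSN = re.compile(r'^\s*([0-9a-f]+):\s')                   # " 8048475:<tab>90 ..."
FUNC = re.compile(r'^(\S+)\(\):$')                         # "main():"
LINE = re.compile(r'^(.+):(\d+)(?: \(discriminator \d+\))?$')  # "/path/main.c:36"


def parse_listing(text, addresses):
    """Map each requested address to (function, source file, line number)."""
    wanted = set(addresses)
    found = {}
    func = src = lineno = None
    for row in text.splitlines():
        m = INSN.match(row)
        if m:
            addr = int(m.group(1), 16)
            if addr in wanted:
                # Report the most recent function and file:line seen so far.
                found[addr] = (func, src, lineno)
            continue
        m = FUNC.match(row)
        if m:
            func = m.group(1)
            continue
        m = LINE.match(row)
        if m:
            src, lineno = m.group(1), int(m.group(2))
    return found


def resolve(binary, addresses, objdump='objdump'):
    """Run objdump (pass the MIPS-capable build as `objdump`) and parse it."""
    out = subprocess.run([objdump, '-d', '-l', binary],
                         capture_output=True, text=True, check=True).stdout
    return parse_listing(out, addresses)
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser against a saved listing before running it on real binaries.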

I'd prefer to have a package that lets me look things up natively from Python code without forking another process. I can find no mention of libdwarf, libelf, or libbfd on python.org, nor any mention of Python on dwarfstd.org.

Is there a suitable module available somewhere?

Solution

Please check pyelftools - a new pure Python library meant to do this.
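As a rough illustration of what an address lookup with pyelftools might look like, here is a sketch that walks each compilation unit's line program. It assumes pyelftools is installed and that the binary uses DWARF versions 2 to 4, where file indices in the line program are 1-based. The helper names `find_file_line` and `lookup` are made up for this example.

```python
def find_file_line(dwarfinfo, address):
    """Return (filename, line) covering `address`, or (None, None)."""
    for cu in dwarfinfo.iter_CUs():
        lineprog = dwarfinfo.line_program_for_CU(cu)
        if lineprog is None:
            continue  # this CU carries no line number information
        prev = None
        for entry in lineprog.get_entries():
            state = entry.state
            if state is None:
                continue  # entry did not produce a row in the line table
            # A row covers addresses from its own up to the next row's.
            if prev is not None and prev.address <= address < state.address:
                # Assumption: DWARF 2-4, where file indices start at 1.
                name = lineprog['file_entry'][prev.file - 1].name
                return name, prev.line
            prev = None if state.end_sequence else state
    return None, None


def lookup(path, address):
    """Open an ELF file and resolve one address to (filename, line)."""
    from elftools.elf.elffile import ELFFile  # third-party: pyelftools

    with open(path, 'rb') as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return None, None
        return find_file_line(elf.get_dwarf_info(), address)
```

The file name comes back as stored in the line program, typically a bytes object without the directory part. Function names would need a separate pass over the DW_TAG_subprogram entries.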

OTHER TIPS

You might be interested in the DWARF library from pydevtools:

>>> from bintools.dwarf import DWARF
>>> dwarf = DWARF('test/test')
>>> dwarf.get_loc_by_addr(0x8048475)
('/home/emilmont/Workspace/dbg/test/main.c', 36, 0)

You should give Construct a try. It is very useful for parsing binary data into Python objects.

There is even an example for the ELF32 file format.
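To give a feel for what declarative parsing of an ELF32 header involves, here is a minimal sketch using the standard library's struct module rather than Construct itself, so that no particular Construct version is assumed. The field names follow the ELF specification; `parse_elf32_header` and `Elf32Header` are names invented for this example.

```python
import struct
from collections import namedtuple

# Field order of the ELF32 file header, as given in the ELF specification.
Elf32Header = namedtuple(
    'Elf32Header',
    'e_ident e_type e_machine e_version e_entry e_phoff e_shoff '
    'e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx')

EM_MIPS = 8  # e_machine value for MIPS


def parse_elf32_header(data):
    """Decode the 52-byte ELF32 header at the start of `data`."""
    if data[:4] != b'\x7fELF':
        raise ValueError('not an ELF file')
    if data[4] != 1:  # EI_CLASS: 1 means 32-bit objects
        raise ValueError('not a 32-bit ELF file')
    # EI_DATA selects the byte order: 1 = little-endian, 2 = big-endian.
    endian = {1: '<', 2: '>'}.get(data[5])
    if endian is None:
        raise ValueError('unknown ELF data encoding')
    return Elf32Header(*struct.unpack_from(endian + '16sHHIIIIIHHHHHH', data))
```

Reading the byte order from the header matters here because MIPS targets exist in both big-endian and little-endian variants.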

I don't know of any, but if all else fails you could use ctypes to directly use libdwarf, libelf or libbfd.
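A minimal sketch of the ctypes route, assuming a shared libelf is installed on the host. It only loads the library and performs the `elf_version(EV_CURRENT)` handshake that libelf requires before any other call. The function name `load_libelf` is made up for this example, and real use would still need ctypes declarations for every libelf or libdwarf function and structure involved.

```python
import ctypes
import ctypes.util

EV_CURRENT = 1  # current ELF version constant from <elf.h>


def load_libelf():
    """Return an initialised libelf handle, or None if it is unavailable."""
    name = ctypes.util.find_library('elf')
    if name is None:
        return None  # libelf is not installed on this machine
    try:
        lib = ctypes.CDLL(name)
        lib.elf_version.argtypes = [ctypes.c_uint]
        lib.elf_version.restype = ctypes.c_uint
    except (OSError, AttributeError):
        return None  # library failed to load or lacks the symbol
    # elf_version returns EV_NONE (0) when the requested version is unsupported.
    if lib.elf_version(EV_CURRENT) == 0:
        return None
    return lib
```

Returning None instead of raising lets a caller fall back to another approach, such as parsing objdump output, when the library is missing.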

I've been developing a DWARF parser using Construct. It is currently fairly rough, and parsing is slow, but I thought I should at least let you know. It may suit your needs with a bit of work.

I've got the code in Mercurial, hosted at Bitbucket:

Construct is a very interesting library. DWARF is a complex format (as I'm discovering) and pushes Construct to its limits, I think.

Hachoir is another library for parsing binary data.

Licensed under: CC-BY-SA with attribution