Question

As far as I understand, when we use GDB to debug an executable compiled from C source code, the compiler leaves the source file path in the ELF/PE file, so GDB can use the source code to help with debugging.

But how does GDB provide the assembly code in this process? Here is an example captured on my computer:

[Screenshot: GDB displaying assembly code]

So my questions are:

  1. How does GDB provide the assembly code? Is it generated by disassembly?
  2. If so, how can GDB guarantee the accuracy of the disassembly? As far as I know, a linear disassembly algorithm like the one objdump uses cannot distinguish data from code, and even commercial tools like IDA Pro make mistakes from time to time.

Could anyone give me some help? Thank you!

Solution

Remember that the "compiler" does several things:

  • preprocess (#includes, macros and such)
  • compilation proper (convert pre-processed C to assembly)
  • assembly (convert assembly to object code)
  • link (appropriately package objects up into libraries/executables)
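
The four stages can be run one at a time to see what each one produces. The sketch below assumes GCC as the compiler driver (the answer does not name one) and a hypothetical file name `squares.c`; the flags in the comments are GCC's usual ones for stopping after each stage.

```c
/* squares.c: hypothetical file name, used only for illustration.
 *
 * Assuming GCC, each stage can be run separately:
 *
 *   gcc -E squares.c -o squares.i    preprocess: expand #include and macros
 *   gcc -S squares.i -o squares.s    compile proper: C to assembly text
 *   gcc -c squares.s -o squares.o    assemble: assembly to object code
 *   gcc squares.o main.o -o prog     link: main.o is a hypothetical object
 *                                    file that supplies main()
 */
#include <stddef.h>   /* pasted in by the preprocessor */

/* Expanded by the preprocessor before compilation proper begins. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Returns 1*1 + 2*2 + 3*3 + 4*4. */
int sum_first_squares(void)
{
    static const int values[] = {1, 2, 3, 4};
    int total = 0;

    for (size_t i = 0; i < ARRAY_LEN(values); i++)
        total += values[i] * values[i];
    return total;
}
```

Opening `squares.s` after the `-S` step shows the assembly that the assembler then turns into object code, which is the same information GDB later recovers by disassembly.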

By far the most complex part of this is the compilation-proper phase.

Remember also that assembly is more-or-less a direct representation of the machine instructions contained in the object code.

So, to answer your questions:

  1. GDB reads your libraries/executables, extracts the machine instructions (which is relatively trivial), and decodes them into assembly code. This is the disassembly process.
  2. Since GDB gets the machine instructions directly from the libraries/executables, the disassembly is accurate as long as GDB can correctly convert machine code to assembly instructions. GDB also normally starts decoding at a known instruction address, such as a function's symbol or the current program counter, which sidesteps most of the data-versus-code ambiguity you mention.
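
As a rough illustration of the two answers above, the sketch below pairs a small function with GDB commands that ask for its disassembly. The file name `add.c` and the function `add_ints` are hypothetical, and GCC is assumed as the compiler. `disassemble`, its `/m` modifier, and `x/i` are standard GDB commands, but their output depends on the compiler, options, and architecture, so none is shown here.

```c
/* add.c: hypothetical example file.
 *
 * Build an object file with debug info, then ask GDB to disassemble
 * the function by name:
 *
 *   gcc -g -c add.c -o add.o
 *   gdb -batch -ex "disassemble add_ints" add.o
 *   gdb -batch -ex "disassemble /m add_ints" add.o
 *
 * The plain form decodes the machine instructions stored for add_ints,
 * starting at the address of its symbol. The /m form also interleaves
 * source lines, which GDB locates through the debug info and the
 * source file on disk.
 *
 * In a live debugging session, `x/5i $pc` decodes five instructions
 * starting at the current program counter.
 */
int add_ints(int a, int b)
{
    return a + b;
}
```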

In other words, there is a 1:many mapping from source code to assembly: a given piece of source code can produce many different assembly outputs depending on the compiler, compiler options, and so on. This makes it difficult, if not impossible, to derive the source code from object code alone. Thus, for effective debugging of C source code, the source must be available to GDB, whether embedded or in its original .c form.

Conversely, the mapping from assembly code to object code is much closer to 1:1, as both more-or-less represent the same thing: the layout of instructions in memory that makes up the program. The disassembly process is therefore much more straightforward than any potential "decompilation" process.
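
To make the near-1:1 correspondence concrete, here is a toy sketch (not how GDB is implemented) that maps a handful of one-byte x86-64 opcodes to mnemonics and back. The names `toy_disassemble` and `toy_assemble` are made up for this example, and real instruction sets have variable-length encodings with operands, so an actual disassembler is far more involved.

```c
#include <stddef.h>
#include <string.h>

/* A tiny subset of one-byte x86-64 instructions. This table is only a
 * toy to show the byte <-> mnemonic correspondence. */
struct opcode_entry {
    unsigned char byte;
    const char *mnemonic;
};

static const struct opcode_entry OPCODES[] = {
    {0x55, "push rbp"},
    {0x5D, "pop rbp"},
    {0x90, "nop"},
    {0xC3, "ret"},
    {0xCC, "int3"},
};

#define OPCODE_COUNT (sizeof(OPCODES) / sizeof(OPCODES[0]))

/* "Disassemble": map one machine-code byte to its mnemonic,
 * or NULL if the byte is not in the toy table. */
const char *toy_disassemble(unsigned char byte)
{
    for (size_t i = 0; i < OPCODE_COUNT; i++)
        if (OPCODES[i].byte == byte)
            return OPCODES[i].mnemonic;
    return NULL;
}

/* "Assemble": map a mnemonic back to its byte,
 * or -1 if the mnemonic is not in the toy table. */
int toy_assemble(const char *mnemonic)
{
    for (size_t i = 0; i < OPCODE_COUNT; i++)
        if (strcmp(OPCODES[i].mnemonic, mnemonic) == 0)
            return OPCODES[i].byte;
    return -1;
}
```

For example, `toy_disassemble(0xC3)` returns `"ret"` and `toy_assemble("ret")` returns `0xC3`. Both directions are plain table lookups, whereas no comparable table exists for going from object code back to C source.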

Licensed under: CC-BY-SA with attribution