Why are the first 4 bytes of 64-bit addresses printed as 0x00000001?

https://stackoverflow.com/questions/16445302

14-04-2022

Question

I'm looking at the disassembly of some x86_64 code with Apple's otool. Here's a sample of the disassembly, as output by otool:

0000000100055de4    movq    $0x00000000,%rax

Only the last 4 bytes of that address, 00055de4, correspond to the file offset of that instruction. I can open a hex editor, navigate to 0x55de4, and the movq instruction is there.

However, I noticed that gdb only works when I include all 8 bytes in the address, including the mysterious 1. break *0x0000000100055de4 works as expected, while break *0x00055de4 never triggers.

Every 64-bit binary I have analyzed with otool shows this pattern. It obviously doesn't apply to 32-bit addresses.

So, if 0x55de4 is the actual address, why do otool and gdb use 0x0000000100055de4?

Solution

__PAGEZERO, the segment described by the first load command in a 64-bit Mach-O binary, has a size of 0x100000000 (4 GiB) in virtual memory.

$ otool -lV binary

command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
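
The same fields can be read straight from the file. The following is a minimal sketch, not otool's implementation: it assumes a thin (non-universal), little-endian 64-bit Mach-O file, and the function name list_segments is made up for illustration. It walks the load commands and returns the name, vmaddr, vmsize, fileoff and filesize of every LC_SEGMENT_64.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a little-endian 64-bit Mach-O file
LC_SEGMENT_64 = 0x19       # load command that describes a 64-bit segment


def list_segments(data):
    """Return (segname, vmaddr, vmsize, fileoff, filesize) per LC_SEGMENT_64."""
    # mach_header_64 is eight 32-bit fields:
    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")

    segments = []
    offset = 32  # load commands start right after the 32-byte header
    for _ in range(ncmds):
        # Every load command begins with its type and its total size.
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_SEGMENT_64:
            # segname[16], then vmaddr, vmsize, fileoff, filesize (64-bit each)
            name, vmaddr, vmsize, fileoff, filesize = struct.unpack_from(
                "<16s4Q", data, offset + 8
            )
            segments.append(
                (name.rstrip(b"\0").decode("ascii"), vmaddr, vmsize, fileoff, filesize)
            )
        offset += cmdsize
    return segments
```

Passing the bytes of an executable, for example list_segments(open("binary", "rb").read()), should list __PAGEZERO first with the vmaddr and vmsize shown in the otool output. A universal (fat) binary starts with a different header and would be rejected by the magic check.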

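When you do break *0x00055de4, the breakpoint lands inside __PAGEZERO. Nothing from the file is mapped into that range and no code ever executes there, which explains why it's never hit. The __TEXT segment typically starts right after __PAGEZERO, at 0x100000000, so the instruction found at offset 0x55de4 in the binary sits at 0x0000000100055de4 once loaded into virtual memory.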
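
The arithmetic can be sketched in a few lines of Python. The segment table here is an assumption for illustration: the __PAGEZERO values come from the otool output above, while the __TEXT entry (vmaddr 0x100000000, file offset 0, a made-up file size of 0x100000) reflects the usual layout of a 64-bit executable rather than anything read from this particular binary. The helper name file_offset_to_vmaddr is likewise hypothetical.

```python
def file_offset_to_vmaddr(offset, segments):
    """Translate a file offset to the virtual address it is loaded at."""
    for name, vmaddr, fileoff, filesize in segments:
        # A segment covers the file bytes [fileoff, fileoff + filesize).
        if fileoff <= offset < fileoff + filesize:
            return vmaddr + (offset - fileoff)
    raise ValueError("offset is not backed by any segment")


# (segname, vmaddr, fileoff, filesize)
SEGMENTS = [
    ("__PAGEZERO", 0x0, 0, 0),             # occupies no bytes in the file
    ("__TEXT", 0x100000000, 0, 0x100000),  # assumed typical values
]

address = file_offset_to_vmaddr(0x55de4, SEGMENTS)
print(hex(address))  # 0x100055de4
```

Because __PAGEZERO has no bytes in the file, the offset can only fall inside __TEXT, and the result is the segment's vmaddr plus the offset.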

For 32-bit binaries the __PAGEZERO size is only 0x1000, so code is loaded just above 0x1000 and the pattern does not appear.

Licensed under: CC-BY-SA with attribution
