What is register %eiz?

https://stackoverflow.com/questions/2553517

23-09-2019

Question

In the following assembly code that I dumped out using objdump:

lea    0x0(%esi,%eiz,1),%esi

What is register %eiz? What does the preceding code mean?

Solution

See "Why Does GCC LEA EIZ?":

Apparently %eiz is a pseudo-register that just evaluates to zero at all times (like r0 on MIPS).

...

I eventually found a mailing list post by binutils guru Ian Lance Taylor that reveals the answer. Sometimes GCC inserts NOP instructions into the code stream to ensure proper alignment and stuff like that. The NOP instruction takes one byte, so you would think that you could just add as many as needed. But according to Ian Lance Taylor, it’s faster for the chip to execute one long instruction than many short instructions. So rather than inserting seven NOP instructions, they instead use one bizarro LEA, which uses up seven bytes and is semantically equivalent to a NOP.
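To make the quoted claim concrete, here is a small Python sketch (Python is used only for illustration; `lea_result`, `LEA_PAD` and `EIZ` are made-up names, not anything from the article or from binutils). It models what lea 0x0(%esi,%eiz,1),%esi computes, assuming %eiz always reads as zero, and compares the two ways of filling seven bytes. The byte string is the encoding I would expect for this instruction (opcode 8d, ModR/M b4, SIB 26, four zero displacement bytes); treat it as an assumption and check it against your own objdump output.

```python
# Assumed encoding of: lea 0x0(%esi,%eiz,1),%esi
# (opcode 8d, ModR/M b4, SIB 26, 32-bit displacement of zero)
LEA_PAD = bytes.fromhex("8db42600000000")
NOP = b"\x90"  # the classic one-byte nop

EIZ = 0  # assumption from the quote: the pseudo-register always evaluates to zero


def lea_result(base, index, scale, disp):
    """Address computed by lea: base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFF


for esi in (0, 0x1234, 0xFFFFFFFF):
    # Writing the result back to %esi leaves it unchanged, so the lea does nothing.
    assert lea_result(esi, EIZ, 1, 0x0) == esi

print(len(LEA_PAD), "bytes of padding in 1 instruction")
print(len(NOP * 7), "bytes of padding in 7 instructions")
```

Both paddings occupy the same seven bytes; the difference is only in how many instructions the processor has to decode and execute.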

Other tips

(Very late to the game, but this seemed like an interesting addition): It's not a register at all, it's a quirk of the Intel instruction encoding. When using a ModRM byte to load from memory, there are 3 bits used for the register field to store 8 possible registers. But the spot where ESP (the stack pointer) "would" be is instead interpreted by the processor as "a SIB byte follows this instruction" (i.e. it's an extended addressing mode, not a reference to ESP). For reasons known only to the authors, the GNU assembler has always represented this "zero where a register would otherwise be" as a "%eiz" register. The Intel syntax just drops it.

Andy Ross provides a lot more of the underlying reasoning, but is unfortunately wrong, or at the very least confusing, about the technical details. It is true that an effective address of just (%esp) cannot be encoded with the ModR/M byte alone: the r/m value that would otherwise mean (%esp) instead signals that a SIB byte follows. However, %eiz does not appear every time a SIB byte is used, so it is not simply a marker that a SIB byte is present.
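As a rough illustration of the ModR/M rule, the following Python sketch splits a ModR/M byte into its fields and reports whether a SIB byte follows. It assumes 32-bit addressing, and the function names `split_modrm` and `sib_follows` are hypothetical helpers invented for this example.

```python
def split_modrm(modrm):
    """Split a ModR/M byte into its (mod, reg, rm) fields: mm rrr mmm."""
    return modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111


def sib_follows(modrm):
    """True when a SIB byte follows the ModR/M byte (32-bit addressing)."""
    mod, _reg, rm = split_modrm(modrm)
    # mod == 0b11 selects a register operand, so r/m == 0b100 names a
    # register directly there (e.g. %esp for 32-bit operands), not a SIB byte.
    return mod != 0b11 and rm == 0b100


print(sib_follows(0x01))  # r/m = 001: plain (%ecx), no SIB byte
print(sib_follows(0x04))  # r/m = 100: the slot where (%esp) "would" be
print(sib_follows(0xC4))  # mod = 11: register-direct operand
```

Only the middle call reports a SIB byte, which matches the point that the (%esp) slot is repurposed rather than decoded as a register.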

The SIB byte (scale/index/base) has three pieces: the index (a register such as %eax or %ecx that the scale is applied to), the scale (a power of two from 1 to 8 that the index register is multiplied by), and the base (another register that is added to the scaled index). This is what allows for instructions such as add %al,(%ebx,%ecx,2) (machine code: 00 04 4b, i.e. opcode, ModR/M, SIB; note that no %eiz appears even though a SIB byte is used), or in Intel syntax, "add BYTE PTR [ecx*2+ebx], al".
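The layout of those three pieces can be sketched in a few lines of Python. This is only an illustration: `encode_sib`, `REGS` and `SCALE_BITS` are names made up for the example, and the register numbering is the standard 32-bit one (eax = 0 through edi = 7).

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def encode_sib(scale, index, base):
    """Pack scale, index and base into one SIB byte (bit layout: ss iii bbb)."""
    if index == "esp":
        raise ValueError("%esp cannot be used as an index register")
    return (SCALE_BITS[scale] << 6) | (REGS.index(index) << 3) | REGS.index(base)


# add %al,(%ebx,%ecx,2): opcode 00, ModR/M 04, then the SIB byte
sib = encode_sib(2, "ecx", "ebx")
print(f"{sib:02x}")  # 4b, the third byte of the machine code above
```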

However, %esp cannot be used as the index register in a SIB byte. Intel instead uses that index encoding to mean "no index": the base register is used as is, with no scaling or indexing. Therefore, to distinguish add %al,(%ecx) encoded without a SIB byte (machine code: 00 01, i.e. opcode, ModR/M) from the same add %al,(%ecx) encoded with one (machine code: 00 04 21, i.e. opcode, ModR/M, SIB), the second form is written with the alternate syntax add %al,(%ecx,%eiz,1) (or in Intel syntax: add BYTE PTR [ecx+eiz*1],al).
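The two encodings can be compared with a tiny decoder. This is a sketch under stated assumptions, not how objdump is actually implemented: `format_mem_operand` is a hypothetical helper, it takes the bytes after the opcode, and it handles only the simplest case (mod = 00, no displacement).

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def format_mem_operand(code):
    """Format a memory operand in AT&T style from [ModR/M, optional SIB] bytes."""
    modrm = code[0]
    mod, rm = modrm >> 6, modrm & 0b111
    if mod != 0 or rm == 0b101:
        raise ValueError("sketch only handles mod=00 without a displacement")
    if rm != 0b100:
        return f"(%{REGS[rm]})"  # no SIB byte: r/m names the base directly
    sib = code[1]
    scale, index, base = 1 << (sib >> 6), (sib >> 3) & 0b111, sib & 0b111
    if base == 0b101:
        raise ValueError("sketch only handles mod=00 without a displacement")
    # Index 100 is the slot where %esp would be; it means "no index" and is
    # printed as the %eiz pseudo-register.
    index_name = "eiz" if index == 0b100 else REGS[index]
    return f"(%{REGS[base]},%{index_name},{scale})"


print(format_mem_operand(bytes([0x01])))        # from 00 01
print(format_mem_operand(bytes([0x04, 0x21])))  # from 00 04 21
print(format_mem_operand(bytes([0x04, 0x4B])))  # from 00 04 4b
```

The first two operands address the same memory, (%ecx); only the second went through a SIB byte, and %eiz is what makes that visible in the disassembly.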

And as explained in the article linked to by Sinan, this specific instruction (lea 0x0(%esi,%eiz,1),%esi) is merely used as a multi-byte nop (equivalent to esi = &*esi) so that only one nop-like instruction has to be executed instead of multiple nop instructions.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow