Question

Can one deduce which operand "owns" the REG part of the ModR/M byte, and which one has the Mod+R/M part, from the standard opcode map in Intel's developer manual, optionally together with the direction bit? Does the direction bit apply to all operations with two operands (where direction can be ambiguous)?

Some information about where I am at

I started at the top of the One-byte Opcode Map. Here we have:

        0         1        2        3        4        5         6         7
    +--------|--------|--------|--------|--------|---------|---------|---------|
  0 |                         ADD                          |   PUSH  |   POP   |
    | Eb, Gb | Ev, Gv | Gb, Eb | Gv, Ev | AL, Ib | rAX, Iz | ES(i64) | ES(i64) |
    +--------|--------|--------|--------|--------|---------|---------|---------|
  1 |                         ADC
  …

I am looking at ADD as opcode 00h and 02h.

Description of the letters states:

  Addressing methods:
  …
  E    A ModR/M byte follows the opcode and specifies the operand. The
       operand is either a general-purpose register or a memory address. If
       it is a memory address, the address is computed from a segment
       register and any of the following values: a base register, an index
       register, a scaling factor, a displacement.
  G    The reg field of the ModR/M byte selects a general register.
       Example: AX (000)
  …
  -------------------------------------------------------------------------     
  Operand types:
  …
  b    Byte, regardless of operand-size attribute.
  v    Word, doubleword or quadword (in 64-bit mode), depending on
       operand-size attribute.
  …

As mentioned, I am looking at the ADD opcodes 00h and 02h. Further, I'm looking at 32-bit only (for now). The description table for these opcodes is:

Opcode   Instruction  Op/en   Description
00 /r    ADD r/m8,r8   MR     Add r8 to r/m8.
02 /r    ADD r8,r/m8   RM     Add r/m8 to r8.

Some test cases:

00 05 e25a4600    add [0x465ae2], al
00 0d e25a4600    add [0x465ae2], cl

00 06             add [esi], al          ; Eb, Gb
02 06             add al, [esi]          ; Gb, Eb

00 c8             add al, cl             ; Eb, Gb
02 c8             add cl, al             ; Gb, Eb

00 05 e0514600    add [0x4651e0], al
02 05 e0514600    add al, [0x4651e0]

ModR/M byte:

    7   6          5   4   3         2   1   0        Bit
+------------+------------------+------------------+
|    MOD     |    REG/Opcode    |       R/M        |
+------------+------------------+------------------+

Opcode and ModR/M from test cases in binary:

   Opcode   Mod   REG/opc   R/M
0000 0000    00       000   101           add [0xNNNNNN], al
0000 0000    00       001   101           add [0xNNNNNN], cl
0000 0000    00       000   110           add [esi], al
0000 0010    00       000   110           add al, [esi]
0000 0000    11       001   000           add al, cl
0000 0010    11       001   000           add cl, al
0000 0000    00       000   101           add [0xNNNNNN], al
0000 0010    00       000   101           add al, [0xNNNNNN]
       |
       +-----> Direction bit?
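
For reference, the field split used in the table above can be sketched in a few lines of Python. This is only an illustration: `split_modrm` is a made-up helper name, and the byte values are the ModR/M bytes from the test cases.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11    # bits 7..6
    reg = (byte >> 3) & 0b111   # bits 5..3 (REG/Opcode)
    rm = byte & 0b111           # bits 2..0 (R/M)
    return mod, reg, rm

# ModR/M bytes taken from the test cases above.
for modrm in (0x05, 0x0D, 0x06, 0xC8):
    mod, reg, rm = split_modrm(modrm)
    print(f"{modrm:02x}: mod={mod:02b} reg={reg:03b} rm={rm:03b}")
```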

Question section:

(Oops, that introduction grew and grew. TL;DR?) But, finally:

  1. The description of E in the Opcode Map (first table) states: "A ModR/M byte follows the opcode and specifies the operand. The operand is either a general-purpose register or a memory address."

    So my question becomes: what determines whether it is a register or a memory address? As in: you read the bytes one by one, and the opcode has the E notation.
    1. Is it based on the direction bit of the opcode?
    2. Can it be deduced from the second operand? As in: in this concrete example the second operand has the G flag, which states "The reg field of the ModR/M byte selects a general register". Would this always be true?
    3. Would that override a direction bit?

Appendix B has a short section stating:

B.1.4.8 Direction (d) Bit

In many two-operand instructions, a direction bit (d) indicates which operand is considered the source and which is the destination. See Table B-11.

Table B-11. Encoding of Operation Direction (d) Bit
+---+-----------------------------+---------------------------------+
| d |          Source             |          Destination            |
+---+-----------------------------+---------------------------------+
| 0 | reg Field                   | ModR/M or SIB byte              |
| 1 | ModR/M or SIB Byte          | reg Field                       |
+---+-----------------------------+---------------------------------+

2. What does "many" mean here? Where is it defined which instructions it applies to, and which it does not?

Hope this didn't become too long.


Solution

See this table for Mod/RM encoding.

Firstly, there isn't always a direction bit. As far as I know it exists mainly in the classic ALU ops* (the MOV opcodes 88h to 8Bh follow the same d/s layout). The direction bit says which of the E operand and the G operand is the source and which is the destination.

Whether the E operand is a register operand or a memory operand is determined by the Mod field of the ModR/M byte: 11 means register, anything else means memory, as you've probably seen from that table I linked to. The reg field doesn't always encode a general-purpose register. It can encode other types of registers, and it can even extend the opcode (both also shown by that table). A G operand, however, always encodes a GPR: it refers to the same reg field of the ModR/M byte, but when the opcode map calls it G you know it has to be a GPR. Without that notation, what the field means depends on which instruction it is an operand of.
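
To make the Mod rule concrete, here is a rough Python sketch for the byte-sized Eb/Gb case with 32-bit addressing. It is a simplification under stated assumptions: `describe_operands` is a hypothetical helper, the SIB byte is not decoded (it is just shown as `sib`), and displacements are shown as placeholders rather than read from the instruction stream.

```python
REG8 = ("al", "cl", "dl", "bl", "ah", "ch", "dh", "bh")
REG32 = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")

def describe_operands(modrm):
    """Return (E operand, G operand) as text for a byte-sized instruction."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    g = REG8[reg]                      # G: always the reg field, always a GPR
    if mod == 0b11:
        return REG8[rm], g             # Mod == 11: E is a register
    if mod == 0b00 and rm == 0b101:
        return "[disp32]", g           # special case: absolute address, no base
    base = "sib" if rm == 0b100 else REG32[rm]   # rm == 100 means a SIB byte follows
    disp = ("", "+disp8", "+disp32")[mod]        # Mod 00/01/10 selects displacement size
    return f"[{base}{disp}]", g

print(describe_operands(0x06))  # ModR/M from "add [esi], al"
print(describe_operands(0xC8))  # ModR/M from "add al, cl"
```

Note that the direction bit plays no part here. It only decides which of the two returned operands is printed first.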

None of the newer instructions really have direction bits. Most of them cannot write to memory at all; the exceptions are instructions that only store and don't read/modify/write. For example there are movaps r, r/m and movaps r/m, r, and movdqa r, r/m and movdqa r/m, r. They could be interpreted as having a direction bit, but for movaps it's bit 0 of the opcode and for movdqa it's bit 4. You might as well say that there are just two different encodings.
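
A quick way to see the bit 0 versus bit 4 point is to XOR the two opcode bytes of each pair. The sketch below assumes the usual encodings (movaps as 0F 28 / 0F 29, movdqa as 66 0F 6F / 66 0F 7F) and looks only at the final opcode byte; `differing_bits` is a made-up helper name.

```python
# (load form r, r/m; store form r/m, r), last opcode byte only.
# The 0F escape byte and the 66 prefix are left out.
PAIRS = {
    "movaps": (0x28, 0x29),
    "movdqa": (0x6F, 0x7F),
}

def differing_bits(a, b):
    """Return the bit positions in which two opcode bytes differ."""
    diff = a ^ b
    return [bit for bit in range(8) if (diff >> bit) & 1]

for name, (load, store) in PAIRS.items():
    print(name, differing_bits(load, store))
```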

*Specifically, the group described by 00aa a0ds, where aaa selects the operation (add, or, adc, sbb, and, sub, xor, cmp, in that order from 0 to 7), d is the direction bit, and s indicates the size (0 for byte operations, 1 for everything else, as distinguished by the current mode and prefixes).
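
As a rough illustration of that bit pattern, the following Python sketch pulls the operation, d and s out of a one-byte opcode and prints the operand order in opcode-map notation. The function names are hypothetical, and the mnemonic order is the one from the one-byte opcode map (ADD, OR, ADC, SBB, AND, SUB, XOR, CMP).

```python
ALU_OPS = ("add", "or", "adc", "sbb", "and", "sub", "xor", "cmp")

def decode_alu_opcode(opcode):
    """Return (mnemonic, d, s) for opcodes of the form 00 aaa 0 d s, else None."""
    if opcode & 0b11000100:        # top two bits and bit 2 must all be zero
        return None
    return ALU_OPS[(opcode >> 3) & 7], (opcode >> 1) & 1, opcode & 1

def operand_order(opcode):
    """Render the operands in opcode-map notation, e.g. 'add Eb, Gb'."""
    decoded = decode_alu_opcode(opcode)
    if decoded is None:
        return None
    mnemonic, d, s = decoded
    size = "v" if s else "b"       # s = 0: byte, s = 1: operand-size dependent
    e, g = "E" + size, "G" + size
    # d = 0: reg field is the source (E, G); d = 1: reg field is the destination (G, E)
    return f"{mnemonic} {g}, {e}" if d else f"{mnemonic} {e}, {g}"

for op in (0x00, 0x02, 0x31, 0x3B, 0x04):
    print(f"{op:02x}: {operand_order(op)}")
```

Opcodes such as 04h (ADD AL, Ib) fall outside the pattern because bit 2 is set, so the sketch returns None for them.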

Licensed under: CC-BY-SA with attribution