Question

I am sort of a newbie to assembly language, and I need help understanding how mnemonics are converted directly to bytes.

For example, I have a line saying

b 0x00002B78

which is located at the memory address 0x00002A44. How does this translate to EA00004B (the encoded form of the above assembly)? I am under the impression that the "EA00" signifies the "b" branching part of the assembly, but what about the "004B"? If anyone can give a general understanding of this and point me to resources for looking up these conversions, that would be appreciated. I tried googling this, but I am not really sure what to search for, and what I have found so far has not been helpful.


Solution

All the information you're looking for is in the ARM Architecture Reference Manual. If you look up the b instruction, you'll see its encoding and how it works. Here's the specific instruction you care about:

[Image: excerpt from ARM docs]

The E is the condition field, which you can look up in this table:

[Image: condition fields]

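For you, it's "execute always". Next comes the A, which in binary is 1010: bits 27:25 are 101, which marks a branch, and bit 24 is the L bit, which is 0 because you have a branch instruction, not a branch & link instruction. Lastly, the remaining 24 bits of the instruction are the signed immediate offset field. It holds a PC-relative offset counted in 4-byte words, not the absolute target address, which is why it's encoded as 0x00004b rather than anything resembling 0x2B78.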
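To make the field layout concrete, here is a minimal Python sketch that pulls the encoded word apart again. The function names `decode_branch` and `branch_target` are made up for illustration, and the sketch assumes a plain ARM-state B/BL with an ordinary condition code (it does not handle the special 0xF condition used by other encodings).

```python
def decode_branch(word):
    """Split a 32-bit ARM-state B/BL instruction word into its fields."""
    cond = (word >> 28) & 0xF      # bits 31:28, condition field
    opcode = (word >> 25) & 0x7    # bits 27:25, must be 0b101 for B/BL
    link = (word >> 24) & 0x1      # bit 24, the L bit (1 = branch & link)
    imm24 = word & 0xFFFFFF        # bits 23:0, signed offset in words
    if opcode != 0b101:
        raise ValueError("not a B/BL instruction")
    # Sign-extend the 24-bit field so backward branches come out negative.
    if imm24 & 0x800000:
        imm24 -= 1 << 24
    # The field counts 4-byte words, so multiply to get a byte offset.
    return cond, link, imm24 * 4


def branch_target(word, address):
    """Absolute target of a B/BL located at `address`."""
    _, _, byte_offset = decode_branch(word)
    # In ARM state the PC reads as the instruction's address + 8.
    return address + 8 + byte_offset


cond, link, byte_offset = decode_branch(0xEA00004B)
print(hex(cond), link, hex(byte_offset))       # 0xe 0 0x12c
print(hex(branch_target(0xEA00004B, 0x2A44)))  # 0x2b78
```

Running it on your instruction gives condition 0xE (always), L = 0, and a byte offset of 0x12c, which lands on 0x2b78 once the PC value is added.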

Let's look at your specific example now. You have the instruction:

b 0x00002B78

located at address 0x00002a44. OK, great. So first off, we can stick in the opcode bits:

cccc 101L xxxx xxxx xxxx xxxx xxxx xxxx

Now, the L bit is zero for our case:

cccc 1010 xxxx xxxx xxxx xxxx xxxx xxxx

We want to execute this instruction unconditionally, so we add the AL condition code bits:

1110 1010 xxxx xxxx xxxx xxxx xxxx xxxx

And now all we have to do is calculate the offset. The PC will read as 0x2a4c when this instruction is executed (in ARM state the PC always reads as "current instruction + 8", and 0x2a44 + 8 = 0x2a4c), so our relative jump needs to be:

0x2b78 - 0x2a4c = 0x12c

Great - now we apply the reverse of the transformations described in the documentation above, right-shifting 0x12c by two bits (that is, dividing by 4, because the offset field counts 4-byte words rather than bytes):

0x12c / 4 = 0x4b = 0b1001011

And that's the last field:

1110 1010 0000 0000 0000 0000 0100 1011

Turning that binary instruction back into hex gives you the instruction encoding you were looking for:

0xea00004b
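
The whole calculation above can also be sketched in a few lines of Python. This is only an illustration of the steps in this answer, not a real assembler: `encode_branch` is a hypothetical helper, and it assumes an ARM-state B/BL with the target a whole number of words away.

```python
def encode_branch(address, target, cond=0xE, link=False):
    """Encode an ARM-state B (or BL if link=True) at `address` jumping to `target`."""
    # The PC reads as the instruction's address + 8 in ARM state.
    byte_offset = target - (address + 8)
    if byte_offset % 4 != 0:
        raise ValueError("offset must be a multiple of 4 bytes")
    word_offset = byte_offset >> 2          # the field counts words, not bytes
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target out of range for a 24-bit signed offset")
    imm24 = word_offset & 0xFFFFFF          # two's complement, 24 bits
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24


word = encode_branch(0x2A44, 0x2B78)
print(f"{word:08X}")                    # EA00004B
print(word.to_bytes(4, "little").hex())  # 4b0000ea
```

The second line of output shows the same word as it would sit in memory on a little-endian system, where the least significant byte comes first.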
Licensed under: CC-BY-SA with attribution