Question

While going through some C code having inline assembly I came across the .byte (with a Dot at the beginning) directive.

On checking the assembly reference on web I found that it is used to reserve a byte in memory.

But in the code there was no label before the statement. So I was wondering what is use of an unlabeled .byte directive or any other data storage directive for that matter.

For e.g. if i code .byte 0x0a, how can i use it ?

Was it helpful?

Solution

There are a few possibilities... here are a couple I can think of off the top of my head:

  1. You could access it relative to a label that comes after the .byte directive. Example:

      .byte 0x0a
    label:
      mov (label - 1), %eax
    
  2. Based on the final linked layout of the program, maybe the .byte directives will get executed as code. Normally you'd have a label in this case too, though...

  3. Some assemblers don't support generating x86 instruction prefixes for operand size, etc. In code written for those assemblers, you'll often see something like:

      .byte 0x66
      mov $12, %eax
    

    To make the assembler emit the code you want to have.

OTHER TIPS

Here's an example with inline assembly:

#include <stdio.h>
void main() {
   int dst;
   // .byte 0xb8 0x01 0x00 0x00 0x00 = mov $1, %%eax
   asm (".byte 0xb8, 0x01, 0x00, 0x00, 0x00\n\t"
    "mov %%eax, %0"
    : "=r" (dst)
    : : "eax"  // tell the compiler we clobber eax
   );
   printf ("dst value : %d\n", dst);
return;
}

(See compiler asm output and also disassembly of the final binary on the Godbolt compiler explorer.)

You can replace .byte 0xb8, 0x01, 0x00, 0x00, 0x00 with mov $1, %%eax the run result will be the same. This indicated that it can be a byte which can represent some instruction eg- move or others.

Minimal runnable example

.byte spits out bytes wherever you are. Whether there is a label or not pointing to the byte, does not matter.

If you happen to be in the text segment, then that byte might get run like code.

Carl mentioned it, but here is a complete example to let it sink in further: a Linux x86_64 implementation of true with a nop thrown in:

.global _start
_start:
    mov $60, %rax
    nop
    mov $0, %rdi
    syscall

produces the exact same executable as:

.global _start
_start:
    mov $60, %rax
    .byte 0x90
    mov $0, %rdi
    syscall

since nop is encoded as the byte 0x90.

One use case: new instructions

One use case is when new instructions are added to a CPU ISA, but only very edge versions of the assembler would support it.

So project maintainers may choose to inline the bytes directly to make it compilable on older assemblers.

See for example this Spectre workaround on the Linux kernel with the analogous .inst directive: https://github.com/torvalds/linux/blob/94710cac0ef4ee177a63b5227664b38c95bbf703/arch/arm/include/asm/barrier.h#L23

#define CSDB    ".inst  0xe320f014"

A new instruction was added for Spectre, and the kernel decided to hardcode it for the time being.

The .byte is a directive that allows you to declare a constant byte only known through inspection without any context.

From the GNU Assembler Guide:

.byte  74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top