Question

I have a short set of machine instructions (160 bytes), and I don't know what it does.

I'm on a Mac, and I ran it through the GDB disassembler, which produced this:

....f3c0:   jmp    0x7fff5fbff3c6
....f3c2:   scas   %es:(%rdi),%eax
....f3c3:   retq   $0xa3bf
....f3c6:   sub    $0x100,%esp
....f3cc:   xor    %ecx,%ecx
....f3ce:   mov    %cl,(%rsp,%rcx,1)
 + 50 more lines....

I know very little assembler, but some of the instructions looked funny (like rex.RXB, rex.WB, rex.B). So after a bit of googling I found this command, which told me it was a DOS executable:

   $ file program
   program: DOS executable (COM)
  • Is there a program that can disassemble a DOS executable?

If not, I will try to disassemble it manually, since there are only 160 bytes. However, I will need a reference for what each byte means. E.g.

90 = NOP
8a = MOV
....
  • Is there a reference like this for DOS machine code instructions?

  • How else might I find out what the program does?


Update:

After a great suggestion from IGOR I disassembled the code using a different program. However, there are still some bad instructions:

 e:  88 0c                  mov    BYTE PTR [si],cl
10:  0c fe                  or     al,0xfe
12:  c1                     (bad)  
13:  75 f9                  jne    0xe
......
......
96:  90                     nop
97:  e8 9d ff               call   0x37
9a:  ff                     (bad)  
9b:  ff 41 41               inc    WORD PTR [bx+di+0x41]
  • Any idea why it says (bad)?

Solution

If it's a COM file, then it's just raw real-mode x86 code. You can tell objdump to use 8086 mode, e.g.:

objdump -b binary -D -m i8086 file.com

To see Intel-style mnemonics (used by most of Intel and DOS documentation), add "-M intel".

For the instruction reference, try this or this.
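If you do end up decoding by hand from such a reference, note that most opcodes are followed by a ModRM byte that selects the operands, so a plain "byte = mnemonic" table is not quite enough. Here is a minimal sketch of how that byte splits into fields in 16-bit mode, using the bytes 88 0c from the listing in the question. The function and table names are my own, and it only handles opcode 0x88 (MOV r/m8, r8); the output formatting only approximates what objdump prints.

```python
# 16-bit effective-address forms selected by the r/m field when mod != 3.
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]
# 8-bit register names selected by a 3-bit register field.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def split_modrm(byte):
    """Return the (mod, reg, rm) bit fields of a ModRM byte."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7


def decode_mov_rm8_r8(code):
    """Decode opcode 0x88 in 16-bit mode; returns (text, length in bytes)."""
    if code[0] != 0x88:
        raise ValueError("only opcode 0x88 is handled in this sketch")
    mod, reg, rm = split_modrm(code[1])
    if mod == 3:
        # Register-to-register form, no memory operand.
        return f"mov {REG8[rm]},{REG8[reg]}", 2
    if mod == 0 and rm == 6:
        # Special case: a bare 16-bit address instead of [bp].
        disp = code[2] | (code[3] << 8)
        return f"mov BYTE PTR [0x{disp:x}],{REG8[reg]}", 4
    if mod == 0:
        return f"mov BYTE PTR [{RM16[rm]}],{REG8[reg]}", 2
    if mod == 1:
        # 8-bit displacement, sign-extended.
        disp = code[2] - 256 if code[2] > 127 else code[2]
        return f"mov BYTE PTR [{RM16[rm]}{disp:+#x}],{REG8[reg]}", 3
    # mod == 2: 16-bit displacement.
    disp = code[2] | (code[3] << 8)
    return f"mov BYTE PTR [{RM16[rm]}+0x{disp:x}],{REG8[reg]}", 4


print(split_modrm(0x0C))
print(decode_mov_rm8_r8(bytes([0x88, 0x0C])))
```

For 0x0c the fields are mod=0, reg=1, rm=4, which in 16-bit mode means "[si]" and "cl", matching the line at offset e in the second listing.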

OTHER TIPS

You can run it on a DOS machine through the DOS debugger. Might be quite cryptic though, if it's been built with defence against that in mind.

If you're brave, you could try installing DOSBox and just running it!

Don't assume that everything you see is an instruction: it could just be data, and the bytes preceding it that look like real instructions could be data too. It is a variable-length instruction set, so disassembly is difficult anyway. Simulation might be the easiest way, or a combination of the two. Don't wait for a clean disassembly before starting your analysis; take several different disassemblies, from as many tools as you can easily get, and just dig in. You might have to do some of it by hand anyway; that's the nature of the beast for instruction sets like this one.
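One cheap way to spot data mixed in with code is to look at the raw bytes next to their ASCII interpretation before trusting any disassembly. The following is only a sketch: the hexdump helper is my own, and the sample bytes are the ones shown at offsets 96 to 9d in the second listing of the question.

```python
def hexdump(data, width=16):
    """Return lines of 'offset  hex bytes  ascii' for the given bytes."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:04x}  {hex_part:<{width * 3 - 1}}  {text}")
    return lines


# Bytes taken from the question's listing (offsets 0x96..0x9d).
sample = bytes([0x90, 0xE8, 0x9D, 0xFF, 0xFF, 0xFF, 0x41, 0x41])
for line in hexdump(sample, width=8):
    print(line)
```

The trailing 41 41 shows up as "AA", which hints that the end of the file may be data rather than an inc instruction. To dump the whole file you could pass open("program", "rb").read() instead of the sample.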

Google pcemu to find an emulator for the 8086/88 with DOS call support, etc. Pcemu itself is easy to dig into and modify so that it dumps instructions as they execute. Then follow along in the disassembly dumps you have to see whether they make sense. If not, you may need to write your own disassembler.

If this code was originally written in something other than assembler, it may be tough to follow, especially if you don't know the assembly language. If you are doing this as a learning exercise in assembly, there are many better ways to learn. Granted, writing a disassembler (or emulator) for an instruction set is an excellent way to learn it. Variable-length instruction sets are an advanced case, though: you have to follow the code in execution order, not linearly through memory, to find the instructions, then make a second, linear pass that disassembles what you have identified as instructions and leaves the rest as data. It might be better to get your feet wet with something much simpler, like the MSP430, before attacking something as painful as x86. The quick and dirty way to get a disassembler for the 8088/86 would be to take something like pcemu, add printfs to it, and disassemble in execution order, which is what you are interested in anyway from an analysis perspective (I assume).
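To make the "execution order, not linear order" idea concrete, here is a toy sketch of that kind of traversal. It is not a real 8086 decoder: the opcode table covers only five one-, two-, and three-byte instructions that I picked for illustration, the sample bytes are made up, and all names are my own. Anything the trace never reaches is treated as data.

```python
# opcode -> (mnemonic, length in bytes, control-flow kind); a toy subset only.
TOY_OPCODES = {
    0x90: ("nop", 1, "fall"),
    0xC3: ("ret", 1, "stop"),
    0xEB: ("jmp short", 2, "jump"),   # 8-bit relative, unconditional
    0x75: ("jne", 2, "branch"),       # 8-bit relative, conditional
    0xE8: ("call", 3, "call"),        # 16-bit relative in 16-bit mode
}


def signed(value, bits):
    """Interpret an unsigned value as a two's-complement number."""
    return value - (1 << bits) if value >> (bits - 1) else value


def trace(code, entry=0):
    """Return {offset: text} for every instruction reachable from entry."""
    found = {}
    work = [entry]
    while work:
        pc = work.pop()
        while 0 <= pc < len(code) and pc not in found:
            info = TOY_OPCODES.get(code[pc])
            if info is None or pc + info[1] > len(code):
                break  # unknown opcode or truncated instruction: give up here
            name, length, kind = info
            nxt = pc + length
            target = None
            if kind in ("jump", "branch"):
                target = (nxt + signed(code[pc + 1], 8)) & 0xFFFF
            elif kind == "call":
                disp = code[pc + 1] | (code[pc + 2] << 8)
                target = (nxt + signed(disp, 16)) & 0xFFFF
            found[pc] = name if target is None else f"{name} 0x{target:x}"
            if target is not None:
                work.append(target)  # visit the branch target later
            if kind in ("stop", "jump"):
                break  # execution does not fall through to the next byte
            pc = nxt
    return found


def uncovered(code, found):
    """Offsets not covered by any traced instruction (treated as data)."""
    covered = set()
    for off in found:
        covered.update(range(off, off + TOY_OPCODES[code[off]][1]))
    return [off for off in range(len(code)) if off not in covered]


# Made-up sample: a jump over two data bytes ("AA"), a loop, a call, two rets.
sample = bytes([0xEB, 0x02, 0x41, 0x41, 0x90, 0x75, 0xFD,
                0xE8, 0x01, 0x00, 0xC3, 0x90, 0xC3])
found = trace(sample)
for off in sorted(found):
    print(f"{off:2x}:  {found[off]}")
print("data at offsets:", uncovered(sample, found))
```

A linear disassembler would try to decode the two bytes after the first jmp as instructions; the trace skips them because nothing ever jumps there. A real tool would need the full opcode map and would also have to cope with indirect jumps, which this sketch ignores.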

Licensed under: CC-BY-SA with attribution