Easiest way is to build the engine with debugging enabled, then use the IONFLAGS env var: per the spew handling code, you can enable spew channels like codegen.
How to get the assembly code from IonMonkey
02-06-2022
Question
For studying purposes, I am trying to find out the memory address of a variable after JIT compilation by IonMonkey (IonMonkey is part of SpiderMonkey, Mozilla's JavaScript engine).
So far I have followed these instructions: https://developer.mozilla.org/en-US/docs/SpiderMonkey/Hacking_Tips#Printing_the_generated_assembly_code_(from_gdb)
I use GDB and I run the same procedure with 2 different test files.
function f(a, b) { return a + b; }
var shell = "AAAA";
for (var i = 0; i < 1000000; i++){ f( shell[0], shell[1] ); }
and this one:
function f(a, b) { return a + b; }
var shell = "AAAA";
for (var i = 0; i < 1000000; i++){ f( shell[1], shell[1] ); }
I believed that this way I would spot the difference between the two versions of the generated code and find out where the "shell" variable is located. The problem is that the generated code is exactly the same. I also tried different versions of the simple function, such as minus or print, but then the generated code is totally different.
Can anyone suggest any way so I can get the memory address of the variable?
The generated assembly code is:
0x7ffff7ff3ac8: mov 0x20(%rsp),%r10
0x7ffff7ff3acd: shr $0x2f,%r10
0x7ffff7ff3ad1: cmp $0x1fff2,%r10d
0x7ffff7ff3ad8: je 0x7ffff7ff3ae3
0x7ffff7ff3ade: jmpq 0x7ffff7ff3b85
0x7ffff7ff3ae3: mov 0x28(%rsp),%r10
0x7ffff7ff3ae8: shr $0x2f,%r10
0x7ffff7ff3aec: cmp $0x1fff5,%r10d
0x7ffff7ff3af3: je 0x7ffff7ff3afe
0x7ffff7ff3af9: jmpq 0x7ffff7ff3b85
0x7ffff7ff3afe: mov 0x30(%rsp),%r10
0x7ffff7ff3b03: shr $0x2f,%r10
0x7ffff7ff3b07: cmp $0x1fff5,%r10d
0x7ffff7ff3b0e: je 0x7ffff7ff3b19
0x7ffff7ff3b14: jmpq 0x7ffff7ff3b85
0x7ffff7ff3b19: mov 0x28(%rsp),%r8
0x7ffff7ff3b1e: movabs $0x7fffffffffff,%rax
0x7ffff7ff3b28: and %r8,%rax
0x7ffff7ff3b2b: mov 0x30(%rsp),%r9
0x7ffff7ff3b30: movabs $0x7fffffffffff,%rdi
0x7ffff7ff3b3a: and %r9,%rdi
0x7ffff7ff3b3d: mov $0x1670b78,%r11d
0x7ffff7ff3b43: mov (%r11),%rcx
0x7ffff7ff3b46: cmp %rcx,%rsp
0x7ffff7ff3b49: jbe 0x7ffff7ff3b8f
0x7ffff7ff3b4f: callq 0x7ffff7ff39a0
0x7ffff7ff3b54: test %rbp,%rbp
0x7ffff7ff3b57: je 0x7ffff7ff3bd6
0x7ffff7ff3b5d: movabs $0xfffa800000000000,%rcx
0x7ffff7ff3b67: or %rbp,%rcx
0x7ffff7ff3b6a: retq
0x7ffff7ff3b6b: nop
...
0x7ffff7ff3b72: nop
0x7ffff7ff3b73: movabs $0xffffffffffffffff,%r11
0x7ffff7ff3b7d: push %r11
0x7ffff7ff3b7f: callq 0x7ffff7fe9400
0x7ffff7ff3b84: int3
0x7ffff7ff3b85: pushq $0x0
0x7ffff7ff3b8a: jmpq 0x7ffff7ff3c40
0x7ffff7ff3b8f: sub $0x28,%rsp
0x7ffff7ff3b93: mov %r9,0x20(%rsp)
0x7ffff7ff3b98: mov %r8,0x18(%rsp)
0x7ffff7ff3b9d: mov %rdi,0x10(%rsp)
0x7ffff7ff3ba2: mov %rcx,0x8(%rsp)
0x7ffff7ff3ba7: mov %rax,(%rsp)
0x7ffff7ff3bab: pushq $0x280
0x7ffff7ff3bb0: callq 0x7ffff7fee880
0x7ffff7ff3bb5: mov 0x20(%rsp),%r9
0x7ffff7ff3bba: mov 0x18(%rsp),%r8
0x7ffff7ff3bbf: mov 0x10(%rsp),%rdi
0x7ffff7ff3bc4: mov 0x8(%rsp),%rcx
0x7ffff7ff3bc9: mov (%rsp),%rax
0x7ffff7ff3bcd: add $0x28,%rsp
0x7ffff7ff3bd1: jmpq 0x7ffff7ff3b4f
0x7ffff7ff3bd6: sub $0x40,%rsp
0x7ffff7ff3bda: mov %r9,0x38(%rsp)
0x7ffff7ff3bdf: mov %r8,0x30(%rsp)
0x7ffff7ff3be4: mov %rdi,0x28(%rsp)
0x7ffff7ff3be9: mov %rsi,0x20(%rsp)
0x7ffff7ff3bee: mov %rbx,0x18(%rsp)
0x7ffff7ff3bf3: mov %rdx,0x10(%rsp)
0x7ffff7ff3bf8: mov %rcx,0x8(%rsp)
0x7ffff7ff3bfd: mov %rax,(%rsp)
0x7ffff7ff3c01: push %rdi
0x7ffff7ff3c02: push %rax
0x7ffff7ff3c03: pushq $0x500
0x7ffff7ff3c08: callq 0x7ffff7fec370
0x7ffff7ff3c0d: mov %rax,%rbp
0x7ffff7ff3c10: mov 0x38(%rsp),%r9
0x7ffff7ff3c15: mov 0x30(%rsp),%r8
0x7ffff7ff3c1a: mov 0x28(%rsp),%rdi
0x7ffff7ff3c1f: mov 0x20(%rsp),%rsi
0x7ffff7ff3c24: mov 0x18(%rsp),%rbx
0x7ffff7ff3c29: mov 0x10(%rsp),%rdx
0x7ffff7ff3c2e: mov 0x8(%rsp),%rcx
0x7ffff7ff3c33: mov (%rsp),%rax
0x7ffff7ff3c37: add $0x40,%rsp
0x7ffff7ff3c3b: jmpq 0x7ffff7ff3b5d
0x7ffff7ff3c40: pushq $0x0
0x7ffff7ff3c45: jmpq 0x7ffff7fe9008
0x7ffff7ff3c4a: hlt
Solution
OTHER TIPS
My guess is that you're only looking at the function f, which simply adds its arguments; tracing through the code quickly, I notice that there is only one reverse branch within the function, and there is no flow path that could possibly loop.
It looks like this function reads two arguments off the stack, typechecks them (and breaks into the interpreter if the typecheck fails), calls one function on those objects, and returns the result; everything after the "..." is error-handling code. If you want a better test, try shoving the for loop into its own function, and printing the result of adding shell[0] and shell[1].
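A restructured test along those lines might look like the sketch below. The wrapper name `run` and the `out` helper are hypothetical; `print` is the js shell builtin, and the fallback to `console.log` is only an assumption so the snippet also runs outside the shell.

```javascript
function f(a, b) { return a + b; }

// Hypothetical wrapper: keeps the hot loop in its own function so that
// the loop itself (and its use of `shell`) is what gets compiled.
function run() {
  var shell = "AAAA";
  var result;
  for (var i = 0; i < 1000000; i++) {
    result = f(shell[0], shell[1]);
  }
  return result;
}

// `print` exists in the SpiderMonkey js shell; fall back to console.log elsewhere.
var out = typeof print === "function" ? print : console.log;
out(run());
```

Printing the result gives the value a visible use, and it also gives you a known string to look for in memory.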
Alternatively, you're already using GDB... set a breakpoint in the JITed code, and poke at the arguments. Chances are, that breakpoint will be hit 1000000 times, and the arguments will be the same each time.
Finally, you can get the current instruction pointer with a sequence like (x86/gas):
call 1f
1: pop %eax
This may be easier for your purposes :-)
It's not much but I found a way to locate simple integer variables and strings.
- Run the js shell under gdb
- Place a breakpoint, following the instructions here
- Do an x/50i $pc-1 and start!
For integers:
JavaScript code
var sum = 10;
for (var i = 0; i < 100000 ; i++ ) { sum = sum + 1; }
Generated code
0x7ffff7ff34a7: movabs $0x7ffff5e4c060,%rax
0x7ffff7ff34b1: mov 0x10(%rax),%rax
0x7ffff7ff34b5: movabs $0x1670b98,%r11
0x7ffff7ff34bf: cmpl $0x0,(%r11)
0x7ffff7ff34c3: jne 0x7ffff7ff3542
0x7ffff7ff34c9: mov 0x6c0(%rax),%ecx -- Load var i
0x7ffff7ff34cf: cmp $0x186a0,%ecx -- compare with 100000
0x7ffff7ff34d5: jge 0x7ffff7ff34fe -- jump greater or equal [loop end]
0x7ffff7ff34db: mov 0x6b8(%rax),%edx -- load var sum
0x7ffff7ff34e1: add $0x1,%edx -- +1 to sum
0x7ffff7ff34e4: jo 0x7ffff7ff3561
0x7ffff7ff34ea: mov %edx,0x6b8(%rax) -- store sum
0x7ffff7ff34f0: add $0x1,%ecx -- +1 to i
0x7ffff7ff34f3: mov %ecx,0x6c0(%rax) -- store i
0x7ffff7ff34f9: jmpq 0x7ffff7ff34b5 -- continue loop
0x7ffff7ff34fe: movabs $0xfff9000000000000,%rcx
0x7ffff7ff3508: retq
Now we check the memory
var i -- 0x1696230 + 0x6c0 = 0x16968f0
var sum -- 0x1696230 + 0x6b8 = 0x16968e8
0x00007ffff7ff34b5 (gdb) info registers rax 0x1696230
(gdb) x/w 0x16968f0: 0x0000044a
(gdb) x/w 0x16968e8: 0x00000454
0x0000044a = hex( 44a ) = dec( 1098 )
0x00000454 = hex( 454 ) = dec( 1108 )
So, that's the way we find the memory address of an integer.
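The arithmetic above is easy to double-check with a few lines of JavaScript. This is only a sketch using the values from this particular gdb session; the variable names are made up for illustration, and the base address in rax will differ from run to run.

```javascript
// Base address read from rax in the gdb session above (session-specific).
const base = 0x1696230;

// Offsets taken from the mov instructions in the generated code.
const addrI = base + 0x6c0;
const addrSum = base + 0x6b8;

console.log(addrI.toString(16));   // 16968f0
console.log(addrSum.toString(16)); // 16968e8

// Values read back with x/w at those addresses.
const i = 0x44a;
const sum = 0x454;
console.log(i, sum, sum - i);      // 1098 1108 10
```

The difference of 10 matches the initial value of sum, which is a good sign that the two slots really are i and sum.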
For strings:
JavaScript code
for (var i = 0; i < 100000; i++){ shell="ABCDEFG" }
info ascii( "ABCDEFG" ) = hex( 41 42 43 44 45 46 47 )
Generated code
0x7ffff7ff341f: movabs $0x7ffff5e4c060,%rax
0x7ffff7ff3429: mov 0x10(%rax),%rax
0x7ffff7ff342d: movabs $0x1670b98,%r11
0x7ffff7ff3437: cmpl $0x0,(%r11) -- compare addr 0
0x7ffff7ff343b: jne 0x7ffff7ff34d7 -- jump not equal
0x7ffff7ff3441: mov 0x6b8(%rax),%ecx -- load var i
0x7ffff7ff3447: cmp $0x186a0,%ecx -- compare i with 100000
0x7ffff7ff344d: jge 0x7ffff7ff3493 -- jump greater or equal [loop end]
0x7ffff7ff3453: jmpq 0x7ffff7ff3470 -- continue loop
0x7ffff7ff3458: push %rdx
0x7ffff7ff3459: lea 0x6c0(%rax),%rdx
0x7ffff7ff3460: callq 0x7ffff7fe9a48
0x7ffff7ff3465: pop %rdx
0x7ffff7ff3466: jmpq 0x7ffff7ff3470
0x7ffff7ff346b: hlt
0x7ffff7ff3...: hlt -- filled with hlt
0x7ffff7ff346f: hlt
0x7ffff7ff3470: movabs $0xfffafffff5f3a280,%r11 -- Load address "shell"
0x7ffff7ff347a: mov %r11,0x6c0(%rax) -- store address to var shell
0x7ffff7ff3481: mov $0x1,%edx
0x7ffff7ff3486: add %ecx,%edx -- +1 to i
0x7ffff7ff3488: mov %edx,0x6b8(%rax) -- store var i
0x7ffff7ff348e: jmpq 0x7ffff7ff342d -- continue loop
0x7ffff7ff3493: movabs $0xfff9000000000000,%rcx
0x7ffff7ff349d: retq
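The constant loaded by the movabs at 0x7ffff7ff3470 is not a raw pointer. Judging from the listings (the `shr $0x2f` before each type check and the `0x7fffffffffff` mask in function f), the value seems to be boxed: a type tag in the upper 17 bits and the payload in the lower 47 bits. Under that assumption, a small sketch can split it apart; `tagOf` and `payloadOf` are hypothetical helper names, not engine functions.

```javascript
// Assumed layout, inferred from the listings: upper 17 bits = type tag,
// lower 47 bits = payload (here, a pointer).
const TAG_SHIFT = 47n;                 // matches `shr $0x2f`
const PAYLOAD_MASK = 0x7fffffffffffn;  // matches the movabs mask in f

function tagOf(boxed) { return boxed >> TAG_SHIFT; }
function payloadOf(boxed) { return boxed & PAYLOAD_MASK; }

const boxedShell = 0xfffafffff5f3a280n; // constant stored into var shell
console.log(tagOf(boxedShell).toString(16));     // 1fff5
console.log(payloadOf(boxedShell).toString(16)); // 7ffff5f3a280

const returned = 0xfff9000000000000n;   // value loaded into rcx before retq
console.log(tagOf(returned).toString(16));       // 1fff2
```

The tag 0x1fff5 is the same constant that f compares its two arguments against, and the payload 0x7ffff5f3a280 is the address to inspect for the string.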
var shell -- 0x1696230 + 0x6c0 = 0x16968f0
(gdb) info registers rax 0x1696230: 0xf5f3a280
(gdb) x/w 0x16968f0: 0xf5f3a280
(gdb) x/20w 0x7ffff5f3a280
0x7ffff5f3a280: 0x00000078 0x00000000 0xf5f3a290 0x00007fff
0x7ffff5f3a290: 0x00420041 0x00440043 0x00460045 0x00000047
0x7ffff5f3a2a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffff5f3a290: 0x00420041 0x00440043 0x00460045 0x00000047 -- the characters 0x41 to 0x47, stored as two bytes each. We got our string!
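To read the dump without squinting, the words can be decoded mechanically. The sketch below assumes what the dump suggests: each 32-bit word holds two 16-bit characters in little-endian order, and a zero unit ends the string. `decodeWords` and `joinPointer` are hypothetical helpers written for this example.

```javascript
// Decode 32-bit words (as printed by x/w) into a string, assuming two
// 16-bit characters per word, low half first, terminated by a zero unit.
function decodeWords(words) {
  let s = "";
  for (const w of words) {
    for (const unit of [w & 0xffff, w >>> 16]) {
      if (unit === 0) return s;
      s += String.fromCharCode(unit);
    }
  }
  return s;
}

// Combine two 32-bit words (low word first in memory) into one 64-bit value.
function joinPointer(lo, hi) {
  return (BigInt(hi) << 32n) | BigInt(lo);
}

// Words at 0x7ffff5f3a290 from the dump above.
console.log(decodeWords([0x00420041, 0x00440043, 0x00460045, 0x00000047])); // ABCDEFG

// Third and fourth words at 0x7ffff5f3a280: a pointer to the characters.
console.log(joinPointer(0xf5f3a290, 0x00007fff).toString(16)); // 7ffff5f3a290
```

The reconstructed pointer 0x7ffff5f3a290 is exactly the address of the second row of the dump, which is where the characters sit.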