Assuming the stack pointer is 16-byte aligned when func
is entered, the combination of
pushq %rbp ; <- 8 bytes
movq %rsp, %rbp
subq $8, %rsp ; <- 8 bytes
will keep it 16-byte aligned for the subsequent call to foo().
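To make the arithmetic concrete, here is a small sketch in C that models what the three instructions do to %rsp. The entry address and the helper names (is_16_byte_aligned, rsp_after_prologue, check_alignment_example) are made up for illustration, and the sketch simply takes the assumption above (16-byte aligned on entry) as given.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when addr is a multiple of 16. */
int is_16_byte_aligned(uint64_t addr) {
    return (addr & 0xF) == 0;
}

/* Models the effect of the prologue on %rsp (hypothetical helper):
 *   pushq %rbp        -> rsp decreases by 8
 *   movq  %rsp, %rbp  -> rsp unchanged
 *   subq  $8, %rsp    -> rsp decreases by 8
 */
uint64_t rsp_after_prologue(uint64_t rsp_at_entry) {
    uint64_t rsp = rsp_at_entry;
    rsp -= 8;   /* pushq %rbp */
    rsp -= 8;   /* subq $8, %rsp */
    return rsp;
}

/* Walks through one made-up entry address. */
void check_alignment_example(void) {
    uint64_t entry = 0x7fffffffe000ULL;  /* assumed value, 16-byte aligned */

    assert(is_16_byte_aligned(entry));
    /* After the push alone, rsp is only 8-byte aligned. */
    assert(!is_16_byte_aligned(entry - 8));
    /* The extra subq $8 brings the total adjustment to 16 bytes. */
    assert(is_16_byte_aligned(rsp_after_prologue(entry)));
}
```

The push and the subtraction together move %rsp by exactly 16 bytes, so whatever 16-byte alignment it had on entry is preserved at the call.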
It seems that because the compiler can see the implementation of foo()
and knows that it's a no-op, it doesn't bother with the stack alignment. If foo()
is visible only as a declaration or prototype in the translation unit where func()
is compiled, you'll see the stack alignment you expect.