Question

This is related but NOT the same as frame pointer omitting ? Any risk?

I am trying to follow this old (but still relevant) article:

http://blogs.msdn.com/b/larryosterman/archive/2007/03/12/fpo.aspx

Larry (the author) writes:

machines got sufficiently faster since 1995 that the performance improvements that were achieved by FPO weren't sufficient to counter the pain in debugging and analysis that FPO caused

However, in the discussion further down the page, one user writes:

Disabling FPO can have both serious code size and performance impact. Tail call optimizations have to be disabled when a frame pointer is present, leading to much greater stack usage in affected paths. Small functions are also disproportionately affected by prolog/epilog code. Third, although there are still six registers available with a frame pointer on X86, only three of them are nonvolatile with respect to nested calls: EBX, ESI, and EDI. Opening up a fourth register can drop out a bunch of spill code.

I have a couple of questions.

  1. Spill code == Register spillage?
  2. Is the author correct that FPO is generally considered a pain and that the gain does not outweigh the cost?
  3. Is FPO still relevant today on the x64 architecture, since there are a LOT more registers to play with?
  4. Do you use FPO? What for (if yes) and does it make a difference to you?

Finally in this article

http://www.altdevblogaday.com/2012/05/24/x64-abi-intro-to-the-windows-x64-calling-convention/

The author says

[with respect to the Windows x64 calling convention].....

All parameters have space reserved on the stack, even the ones passed in registers. In fact, there’s stack space for 4 parameters even if your function doesn’t have any params. Those parameters are 8 bytes so that’s at least 32 bytes on the stack for every function (every function actually has at least 48 bytes on the stack…I’ll explain that another time). This stack area is called the home space. There are few reasons behind this home space:

  1. If the registers need to be used for something else, the called function can store the data in the home space without moving the stack pointer.
  2. It keeps the stack structure easy to determine. That’s very handy for debugging, and perhaps necessary for x64’s stack metadata (another point I’ll come back to another time). ...... The compiler can use it for whatever it wants, and an optimized build will likely make great use of it.

Wouldn't an optimized build optimize the excess allocation away?

Solution

1. Spill code == Register spillage?

Almost. Strictly speaking, spill code is the code the compiler adds to implement a register spill: the stores and loads that move a value between a register and memory. The spill itself is the allocator's decision that a live range cannot be kept in a register.
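As a rough illustration, here is a small function (the name `mix10` is made up for this sketch) that keeps ten values live at once. On a target with few general-purpose registers, such as 32-bit x86, a compiler will probably have to spill some of them, but whether it actually does depends on the compiler, the target, and the options, so treat this as something to inspect rather than a guaranteed reproduction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: all ten inputs are needed by both expressions,
 * so their live ranges overlap. With only a handful of registers the
 * allocator may keep some of them in stack slots (a spill) and emit
 * extra loads/stores (the spill code). */
uint32_t mix10(const uint32_t *v)
{
    assert(v != NULL);

    uint32_t a = v[0], b = v[1], c = v[2], d = v[3], e = v[4];
    uint32_t f = v[5], g = v[6], h = v[7], i = v[8], j = v[9];

    /* First use of every value. */
    uint32_t p = a * b + c * d + e * f + g * h + i * j;
    /* Second use of every value, paired differently. */
    uint32_t q = (a ^ j) + (b ^ i) + (c ^ h) + (d ^ g) + (e ^ f);

    return p * q;
}
```

With gcc, comparing the output of `gcc -O2 -m32 -S` with and without `-fomit-frame-pointer` is one way to see whether the extra register changes the number of stack accesses.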

2. Is the author correct that FPO is generally considered a pain and that the gain does not outweigh the cost?

The author is probably correct that on modern processor architectures, the set of functions where FPO yields a significant performance gain is smaller than it used to be. Yet FPO does make code smaller, which reduces cache pressure, and it frees a register, which reduces register pressure. These can be important in some settings. It also shortens prolog and epilog code by a few instructions. The point about debuggers not working well without the frame pointer is noteworthy: it means core dumps are less useful for post-mortems on production-optimized code. You'd never want to use FPO during development except for final testing.

3. Is FPO still relevant today on the x64 architecture, since there are a LOT more registers to play with?

Modern processors are so various and complex that you just about never know what's "relevant" until you try it and measure.

4. Do you use FPO? What for (if yes) and does it make a difference to you?

I have written a medium-size C library (20K SLOC) where it made a small (~5%) difference in run time overall under gcc. This was a native language extension to a scripting language that had to compile under both gcc and Visual C. Using it would have split the build path. I decided 5% was not worth it for the purpose the extension served. But if it had been a dynamic fluid simulation to predict the weather, 5% could have been worth many millions of dollars. The decision would have been different.

5. Wouldn't an optimized build optimize the excess allocation away?

That's entirely up to the compiler and optimizer designer. It looks from the MS documentation that MS has defined the ABI to require home space for the data even if its whole lifetime is spent in a register.
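To put numbers on the home space, here is a minimal sketch of the arithmetic. The function name `outgoing_area_bytes` is made up, and the model is simplified: it assumes every parameter occupies one 8-byte slot and it ignores the return address and any alignment padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SLOT_BYTES     = 8, /* one parameter slot on x64 */
    REGISTER_SLOTS = 4  /* RCX, RDX, R8, R9 each get a home slot */
};

/* Hypothetical helper: bytes a caller reserves for a call that passes
 * `param_count` parameters. The four home slots are always reserved,
 * even when the callee takes fewer than four parameters. */
size_t outgoing_area_bytes(size_t param_count)
{
    assert(param_count <= SIZE_MAX / SLOT_BYTES); /* guard the multiply */

    size_t slots = param_count > REGISTER_SLOTS ? param_count
                                                : REGISTER_SLOTS;
    return slots * SLOT_BYTES;
}
```

Under these assumptions a call with zero or two parameters reserves 32 bytes, and a call with six parameters reserves 48.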

OTHER TIPS

1) When you need to use a register and don't have any unused ones, you need to write code to save some register value on the stack and later restore it.

2) FPO was a pain back when unwinding was primarily done by walking the stack. Nowadays standard unwind ABIs exist anyway (e.g. to enable exception handling), so the information already exists, and is organized more efficiently (away from the hot code), so there's no pain. Sure, there would be some pain if you wrote all your machine code by hand, but that's not the typical use case.
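For readers who have not seen it, "walking the stack" with frame pointers amounts to following a linked list. The sketch below is a toy model, not real stack inspection: `struct frame` and `walk_frames` are made-up names, and the frames are ordinary objects built by hand rather than actual stack memory. It only shows why the walk breaks as soon as one function stops maintaining the link.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the classic layout: each frame stores the caller's
 * frame pointer next to the return address. */
struct frame {
    const struct frame *caller;   /* saved frame pointer of the caller */
    const void *return_address;   /* where execution resumes in the caller */
};

/* Follow the caller links from `fp`, recording up to `max` return
 * addresses in `out`. Returns the number of frames visited. */
size_t walk_frames(const struct frame *fp, const void **out, size_t max)
{
    assert(out != NULL || max == 0);

    size_t n = 0;
    while (fp != NULL && n < max) {
        out[n++] = fp->return_address;
        fp = fp->caller; /* with FPO this link would not exist */
    }
    return n;
}
```

A table-driven unwinder does not need the `caller` link at all; it looks up, per code address, how to recover the caller's state.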

3) Typical x86_64 ABIs don't use frame pointers at all (except when absolutely necessary, like for variable-length arrays in C).
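A minimal sketch of the variable-length array case, assuming a compiler that supports C99 VLAs (gcc and clang do; Visual C does not). The function name `sum_of_squares` and the 1024-element cap are made up for the example. Because the size of `squares` is only known at run time, the distance from the stack pointer to the other locals is not a compile-time constant, which is why compilers typically keep a frame pointer in such a function even when it is omitted elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: returns 0^2 + 1^2 + ... + (n-1)^2 using a VLA. */
long sum_of_squares(size_t n)
{
    assert(n <= 1024);      /* keep the run-time stack allocation small */
    if (n == 0)
        return 0;           /* a zero-length VLA is undefined behaviour */

    long squares[n];        /* size known only at run time */
    for (size_t i = 0; i < n; i++)
        squares[i] = (long)i * (long)i;

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += squares[i];
    return total;
}
```

Compiling this with `gcc -O2 -S` and looking for a register that holds the frame base across the function is a quick way to check what a particular compiler does.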

4) I'm not a compiler. My compiler doesn't generate frame pointers.

Optimize excess away) Not sure what your question is. The space consumption for the home area isn't a problem. The benefit of not having to adjust any stack pointers is a big advantage, since you need a lot less code. The same goes for the red zone just beyond the stack frame, which allows leaf code to use a lot of memory without ever needing any stack pointer gymnastics.

Licensed under: CC-BY-SA with attribution