Usage of frame pointer optimizations

Question 1

1.Spill code == Register spillage?

Almost. Stricly speaking, spill code is the code added by the compiler to implement a register spill. The spill itself is the decision to tag a live range as not able to be placed in a register.

2.Is the author correct that FPO is generally considered a pain and the gain doe not out-weigh the benefits.

The author is probably correct that in modern processor architectures, the kinds of functions where FPOs will generate a significant performance gain is a smaller set than in the past. Yet FPO's do make code smaller, reducing cache pressure. They do reduce register pressure. These can be important in some settings. They do speed up prolog and epilog code by a few instructions. The point about debuggers not working well without the FP is noteworthy. It means core dumps are less useful for post mortems on production-optimized code. You'd never want to use FPO during development except for final testing.

3.Is FPO still relevant today in x64 architecture since there are a LOT more registers o play with.

Modern processors are so various and complex that you just about never know what's "relevant" until you try it and measure.

4.Do you use FPO? What for (if yes) and does it make a difference to you?

I have written a medium-size C library (20K SLOC) where it made a small (~5%) difference in run time overall under gcc. This was a native language extension to a scripting language that had to compile under both gcc and Visual C. Using it would have split the build path. I decided 5% was not worth it for the purpose the extension served. But if it had been a dynamic fluid simulation to predict the weather, 5% could have been worth many millions of dollars. The decision would have been different.

5.Wouldn't an optimized build optimize the excess allocation away?

That's entirely up to the compiler and optimizer designer. It looks from the MS documentation here that MS has defined the ABI to require home space for all data even if it's whole lifetime is spent in a register.

Question 2

1) When you need to use a register and don't have any unused ones, you need to write code to save some register value on the stack and later restore it.

2) FPO was a pain back when unwinding was primarily done by walking the stack. Nowadays standard unwind ABIs exist anyway (e.g. to enable exception handling), so the information already exists, and is organized more efficiently (away from the hot code), so there's no pain. Sure, there would be some pain if you wrote all your machine code by hand, but that's not the typical use case.

3) Typical x86_64 ABIs don't use frame pointers at all (except when absolutely necessary, like for variable-length arrays in C).

4) I'm not a compiler. My compiler doesn't generate frame pointers.

Optimize excess away) Not sure what your question is. The space consumption for the home area isn't a problem. The benefit of not having to adjust any stack pointers is a big advantage, since you need a lot less code. The same goes for the red zone just beyond the stack frame, which allows leaf code to use a lot of memory without ever needing any stack pointer gymnastics.