Speed differences between proc, Proc.new, lambda, and stabby lambda

Question

So it seems you have three questions. The middle one is unclear to me, so I will address the other two:

Why is normal method invocation so much faster?

This is the easier of the questions.

First realize that the times involved here are for function call overhead. I did my own timings based on your code (but with an identity function instead of multiplication), and non-direct invocations took 49% longer. With one multiplication, non-direct invocations took only 43% longer. In other words, one reason why you're seeing a large disparity is that your function itself is doing almost nothing. Even a single multiplication makes 6% of the difference "vanish". In a method of any reasonable complexity, the method call overhead is usually going to be a relatively small percentage of the overall time.

Next, remember that a proc/block/lambda is essentially a chunk of code that is being carried around (though a block literal cannot be saved into a variable). This implies one more level of indirection than a method call...meaning that at the very least the CPU is going to have to traverse a pointer to something.

Also, remember that Ruby supports closures, and I'm betting there is some overhead in deciding which environment the indirect code should run in.

On my machine, running a C program that invokes a function directly has 10% less overhead than one that uses a pointer to a function. An interpreted language like Ruby, where closures are also involved, is definitely going to use more.

My measurements (Ruby 2.2) indicate a direct method invocation takes about as long as 6 multiplications, and an indirect invocation takes about as long as 10.

So while the overhead is nearly twice as large, remember that the overhead in both cases is often relatively small.

Are there any other (performance based) reasons to chose between the different approaches?

I'd say given the above data the answer is usually no: you're much better off using the construct that gives you the most maintainable code.

There are definitely good reasons to choose one over the other. To be honest, I'm surprised about the difference I see between lambdas and blocks (on my machine, lambdas have 20% less overhead). Lambdas create anonymous methods that include parameter list checking, so if anything I would expect it to be slightly slower, but my measurements put it ahead.

That the direct invocation is faster simply isn't surprising at all.

The place where this kind of thing makes a difference is in very frequently called code where the overhead adds up to be noticeable in wall-clock kinds of ways. In this case, it can make sense to do all manner of ugly optimizations to try to squeeze a bit more speed, all the way up to inlining code (dodge the function-call overhead altogether).

For example, say your application uses a database framework. You profile it and find a frequently-called, small (e.g. less than 20 multiplications worth of work) function that copies the data from the database result into data structures. Such a function might comprise the lion's share of the result processing, and simply inlining that function might shave off significant amounts of CPU time at the expense of some code clarity.

Simply inlining your "square function" into a long numeric calculation with a billion steps could save you dramatic amounts of time because the operation itself takes a lot less time than even a direct method invocation.

In most cases, though, you're better off with clean, clear code.