Are nested functions faster than global functions in Python?

Question 1

There are several parts that have an effect here:

1) The time to define the function (create the function object),
2) The time to look up the function object by name,
3) The time to actually call the function.

Your global function example is faster with 1) (no need to redefine a in each call to b1). However, it is slower with in 2) because global variable lookup is slower than local lookup.

Why can't we have both then?

I've extended your benchmark with a solution that uses the global function, but avoids the global lookup using a local variable. It seems to be the fastest of the three on my machine:

        5         6         7
b1:  0.04147   0.44421   4.46508
b2:  0.03399   0.43321   4.41121
b3:  0.03258   0.41821   4.25542

b1:  0.03240   0.42998   4.39774
b2:  0.03320   0.43465   4.42229
b3:  0.03155   0.42109   4.23669

b1:  0.03273   0.43321   4.37266
b2:  0.03326   0.43551   4.42208
b3:  0.03137   0.42356   4.25341

b1:  0.03253   0.43104   4.40466
b2:  0.03401   0.43719   4.42996
b3:  0.03155   0.41681   4.24132

b1:  0.03244   0.42965   4.37192
b2:  0.03310   0.43629   4.42727
b3:  0.03117   0.41701   4.23932

Question 2

LOAD_GLOBAL is much slower than LOAD_FAST.

b1

  5           0 LOAD_GLOBAL              0 (sum)
              3 BUILD_LIST               0
              6 LOAD_GLOBAL              1 (range)
              9 LOAD_FAST                0 (loopcount)
             12 CALL_FUNCTION            1
             15 GET_ITER
        >>   16 FOR_ITER                18 (to 37)
             19 STORE_FAST               1 (n)
             22 LOAD_GLOBAL              2 (a)
             25 LOAD_FAST                1 (n)
             28 CALL_FUNCTION            1
             31 LIST_APPEND              2
             34 JUMP_ABSOLUTE           16
        >>   37 CALL_FUNCTION            1
             40 RETURN_VALUE

b2

  8           0 LOAD_CONST               1 (<code object a at 01E57A40, ...>)
              3 MAKE_FUNCTION            0
              6 STORE_FAST               1 (a)

 10           9 LOAD_GLOBAL              0 (sum)
             12 BUILD_LIST               0
             15 LOAD_GLOBAL              1 (range)
             18 LOAD_FAST                0 (loopcount)
             21 CALL_FUNCTION            1
             24 GET_ITER
        >>   25 FOR_ITER                18 (to 46)
             28 STORE_FAST               2 (n)
             31 LOAD_FAST                1 (a)
             34 LOAD_FAST                2 (n)
             37 CALL_FUNCTION            1
             40 LIST_APPEND              2
             43 JUMP_ABSOLUTE           25
        >>   46 CALL_FUNCTION            1
             49 RETURN_VALUE

Whether the difference makes up for creating a function every time depends on the function (and how many times you use LOAD_GLOBAL).

Creating a local reference to a global function to use in an inner loop is a relatively common optimisation:

def a(n):
    return n + 1

def b1(loopcount):
    local_a = a
    return sum([local_a(n) for n in range(loopcount)])

This should still be faster than looking up a global and more maintainable than a nested function.

Question 3

In general, no, as each time the outer function is run, the inner function must be redefined.

This said, performance really doesn't matter unless you can show it does (for example, a particular loop is too slow and causing performance issues) - so I'd recommend using what is most readable - premature optimization is a bad thing.

That said, I would argue global functions are probably a better solution as they are easier to reuse across your code and I would say more readable - after all, flat is better than nested.

Question 4

I think it is caused by the order of searching object definition in python.

Whenever the interpreter meets an object name, it will first search the local object definition dict which records the mapping between object name and the object itself. If not found in local dict, then global, then builtin.

And, all things in python are objects, including function.