I was writing a blog post about Python coding styles and came across something strange that I'm hoping someone can explain. I have two versions of the same function:
    a = lambda x: (i for i in range(x))

    def b(x):
        for i in range(x):
            yield i
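As a quick sanity check that the two definitions really are interchangeable, a small sketch like the following should confirm that both return generator objects yielding the same values. The names gen_a and gen_b are just for illustration.

```python
import types

a = lambda x: (i for i in range(x))

def b(x):
    for i in range(x):
        yield i

# Calling either one only creates a generator object; nothing is consumed yet.
gen_a = a(5)
gen_b = b(5)

print(isinstance(gen_a, types.GeneratorType), isinstance(gen_b, types.GeneratorType))

# Consuming both should give the same sequence of values.
print(list(gen_a) == list(gen_b))
```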
I want to compare the cost of just setting these two up, that is, calling them without iterating over the result. In my mind this should involve a negligible amount of computation, so both should come out close to zero. However, when I actually timed them with timeit:
    import timeit

    import matplotlib.pyplot as plt
    import numpy as np

    def timing(x, number=10):
        # Time only the call that creates each generator; nothing is iterated.
        implicit = timeit.timeit('a(%s)' % int(x), 'from __main__ import a', number=number)
        explicit = timeit.timeit('b(%s)' % int(x), 'from __main__ import b', number=number)
        return (implicit, explicit)

    def plot_timings(*args, **kwargs):
        fig = plt.figure()
        ax = fig.add_subplot(1, 1, 1)
        x_vector = np.linspace(*args, **kwargs)
        # np.vectorize returns one array per element of timing's result tuple.
        timings = np.vectorize(timing)(x_vector)
        ax.plot(x_vector, timings[0], 'b--')  # a (implicit)
        ax.plot(x_vector, timings[1], 'r--')  # b (explicit)
        ax.set_yscale('log')
        plt.show()

    plot_timings(1, 1000000, 20)
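To look at the raw numbers without the plotting layer, the same measurement can be sketched in a few lines. This is only a rough sketch: the helper name time_setup and the three sizes are my own choices, and it passes a callable to timeit instead of a formatted string.

```python
import timeit

a = lambda x: (i for i in range(x))

def b(x):
    for i in range(x):
        yield i

def time_setup(func, n, number=10):
    # Hypothetical helper: times only the call func(n), never the iteration.
    # Passing a callable avoids the string formatting and the __main__ import.
    return timeit.timeit(lambda: func(n), number=number)

for n in (1, 1000, 1000000):
    print(n, time_setup(a, n), time_setup(b, n))
```

Printing the two columns side by side makes it easier to see which function's setup time depends on x, independent of how the curves are coloured.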
I get a huge difference between the two methods in the resulting plot, where a is in blue and b is in red.
Why is the difference so large? It looks like the explicit for-loop version is growing logarithmically, while the implicit version is doing nothing (as it should).
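One thing that might be worth checking is when each version actually calls range(x). As far as I understand, a generator expression evaluates its outermost iterable as soon as the expression is created, whereas a generator function runs none of its body until the first next(). Below is a minimal sketch to probe that; noisy_range and calls are hypothetical helpers introduced only for this test, not part of the code above.

```python
calls = []

def noisy_range(n):
    # Hypothetical stand-in for range() that records every time it is called.
    calls.append(n)
    return range(n)

a = lambda x: (i for i in noisy_range(x))

def b(x):
    for i in noisy_range(x):
        yield i

gen_a = a(3)
print(calls)  # shows whether creating the generator expression called noisy_range

gen_b = b(3)
print(calls)  # shows whether merely calling the generator function did

next(gen_b)
print(calls)  # shows the state after the generator function's body has started
```

If the two versions differ in when range(x) is evaluated, and range(x) is expensive for large x on the interpreter in use, that alone could make one setup time depend on x while the other stays flat.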
Any thoughts?