Question

In Python, len is a function to get the length of a collection by calling an object's __len__ method:

def len(x):
    return x.__len__()

So I would expect a direct call to __len__() to be at least as fast as len().

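import timeit

# Empty setup: the string is created inside each timed statement.
setup = '''
'''

# repeat(10) returns 10 timings, each for the default 1,000,000 executions.
print(timeit.Timer('a="12345"; x=a.__len__()', setup=setup).repeat(10))
print(timeit.Timer('a="12345"; x=len(a)',      setup=setup).repeat(10))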


But the results of running the above code show len() to be faster. Why?


Solution

The builtin len() function does not look up the .__len__ attribute. In CPython it reads the type's tp_as_sequence pointer, which in turn has an sq_length slot (for mappings, it falls back to the mp_length slot of tp_as_mapping).

The .__len__ attribute on built-in objects is indirectly mapped to the same slot, and it is that indirection (plus the attribute lookup) that takes more time.
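The indirection can be made visible by inspecting what the attribute lookup actually returns. This is a minimal sketch that assumes CPython; the wrapper type names it prints are implementation details, not part of the language specification.

```python
# CPython-specific: the names of these wrapper types are implementation details.
s = "12345"

descriptor = str.__len__   # lives in str's namespace and wraps the C-level slot
bound = s.__len__          # a new wrapper object produced by the attribute lookup

descriptor_kind = type(descriptor).__name__
bound_kind = type(bound).__name__

print(descriptor_kind)
print(bound_kind)
print(bound() == len(s))   # both paths end up at the same C function
```

On CPython this typically reports a wrapper descriptor on the type and a method wrapper on the instance. Building that wrapper and then calling through it is the extra work that len() avoids.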

For Python-defined classes, the type object looks up the .__len__ method when the sq_length is requested.
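A small sketch of that behaviour for a Python-defined class follows. The class name Box and its attributes are hypothetical, chosen only for illustration. It shows that len() consults the type rather than the instance, so an instance attribute named __len__ is ignored by len() but found by an explicit attribute lookup.

```python
class Box:
    """Hypothetical container used only for illustration."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


box = Box([1, 2, 3])

# Shadow __len__ on the instance. len() ignores this, because it goes
# through the type's slot rather than the instance's attributes.
box.__len__ = lambda: 99

via_builtin = len(box)              # uses type(box).__len__
via_attribute = box.__len__()       # finds the instance attribute first
via_type = type(box).__len__(box)   # what len() effectively does

print(via_builtin, via_attribute, via_type)   # 3 99 3
```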

OTHER TIPS

From the excellent book Python Object-Oriented Programming: Build robust and maintainable object-oriented Python applications and libraries, 4th Edition, by Steven F. Lott and Dusty Phillips:

You may wonder why these objects don't have a length property instead of having to call a function on them. Technically, they do. Most objects that len() will apply to have a method called __len__() that returns the same value. So len(myobj) seems to call myobj.__len__().

Why should we use the len() function instead of the __len__() method? Obviously, __len__() is a special double-underscore method, suggesting that we shouldn't call it directly. There must be an explanation for this. The Python developers don't make such design decisions lightly.

The main reason is efficiency. When we call the __len__() method of an object, the object has to look the method up in its namespace, and, if the special __getattribute__() method (which is called every time an attribute or method on an object is accessed) is defined on that object, it has to be called as well. Furthermore, the __getattribute__() method may have been written to do something clever, for example, refusing to give us access to special methods such as __len__()! The len() function doesn't encounter any of this. It actually calls the __len__() method on the underlying class, so len(myobj) maps to MyObj.__len__(myobj).
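The __getattribute__() point in the passage above can be sketched as follows. The class Guarded is hypothetical and not taken from the book; it blocks explicit access to double-underscore names, yet len() still works because implicit special-method lookup goes through the type and bypasses __getattribute__().

```python
class Guarded:
    """Hypothetical class whose __getattribute__ hides special methods."""

    def __len__(self):
        return 5

    def __getattribute__(self, name):
        # Refuse explicit access to double-underscore names.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(f"access to {name} is blocked")
        return object.__getattribute__(self, name)


g = Guarded()

length = len(g)   # implicit lookup on the type; __getattribute__ is not called

try:
    g.__len__()   # explicit lookup goes through __getattribute__
    blocked = False
except AttributeError:
    blocked = True

print(length, blocked)   # 5 True
```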

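Calling x.__len__() is slower than len(x) because the explicit call involves an attribute lookup (a dict lookup on the type), whereas len() goes straight to the type's slot.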
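One way to check this on your own machine is a benchmark that builds the string once in setup, so that only the length call is timed. This is a rough sketch; the iteration counts are arbitrary choices, and the absolute numbers (and the size of the gap) will vary with the Python version and hardware.

```python
import timeit

# Build the string once in setup so only the length call is timed.
setup = 'a = "12345"'

# Taking the minimum of several repeats reduces the effect of background noise.
t_method = min(timeit.repeat('a.__len__()', setup=setup, number=100_000, repeat=5))
t_builtin = min(timeit.repeat('len(a)', setup=setup, number=100_000, repeat=5))

print(f"a.__len__(): {t_method:.4f} s")
print(f"len(a):      {t_builtin:.4f} s")
```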

Licensed under: CC-BY-SA with attribution