Question

From the ‘Special method lookup for new-style classes’ section of the ‘Data model’ chapter in the Python documentation (bold emphasis mine):

For new-style classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception (unlike the equivalent example with old-style classes):

>>> class C(object):
...     pass
...
>>> c = C()
>>> c.__len__ = lambda: 5
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'C' has no len()

The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:

>>> 1 .__hash__() == hash(1)
True
>>> int.__hash__() == hash(int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' of 'int' object needs an argument

Incorrectly attempting to invoke an unbound method of a class in this way is sometimes referred to as ‘metaclass confusion’, and is avoided by bypassing the instance when looking up special methods:

>>> type(1).__hash__(1) == hash(1)
True
>>> type(int).__hash__(int) == hash(int)
True

I don't quite follow the words in bold…


Solution

To understand what's going on here, you need to have a (basic) understanding of the conventional attribute lookup process. Take a typical introductory object-oriented programming example - fido is a Dog:

class Dog(object):
    pass

fido = Dog()

If we say fido.walk(), the first thing Python does is to look for a function called walk in fido (as an entry in fido.__dict__) and call it with no arguments - so, one that's been defined something like this:

def walk():
    print "Yay! Walking! My favourite thing!"

fido.walk = walk

and fido.walk() will work. If we hadn't done that, Python would look for an attribute walk in type(fido) (which is Dog) and call it with the instance as the first argument (i.e., self). That is what happens with the usual way we define methods in Python:

class Dog(object):
    def walk(self):
        print "Yay! Walking! My favourite thing!"
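The two lookup steps above can be put side by side in a small sketch. It returns strings instead of printing so that it should run unchanged on Python 3 as well; the names bark, instance_result, class_result and explicit_result are made up for illustration.

```python
class Dog(object):
    def walk(self):
        # Found on the class: Python passes the instance in as `self`.
        return "walking as %s" % type(self).__name__


def bark():
    # A plain function stored on the instance: called with no arguments.
    return "woof"


fido = Dog()
fido.bark = bark  # lands in fido.__dict__

instance_result = fido.bark()     # found in fido.__dict__, no self passed
class_result = fido.walk()        # found on Dog, fido passed as self
explicit_result = Dog.walk(fido)  # the same call spelled out by hand
```

Here fido.walk() and Dog.walk(fido) give the same result, which is the "call it with the instance as the first argument" step written out explicitly.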

Now, when you call repr(fido), it ends up calling the special method __repr__. It might be (poorly, but illustratively) defined like this:

class Dog(object):
    def __repr__(self):
        return 'Dog()'

But, the bold text is saying that it also makes sense to do this:

repr(Dog)

Under the lookup process I just described, the first thing it looks for is a method called __repr__ assigned to Dog... and hey, look, there is one, because we just poorly but illustratively defined it. So, Python calls:

Dog.__repr__()

And it blows up in our face:

>>> Dog.__repr__()
Traceback (most recent call last):
  File "<pyshell#38>", line 1, in <module>
    Dog.__repr__()
TypeError: __repr__() takes exactly 1 argument (0 given)

because __repr__() expects a Dog instance to be passed to it as its self argument. We could do this to make it work:

class Dog(object):
    def __repr__(self=None):
        if self is None:
            return 'Dog'  # repr of the Dog class itself (hard-coded here)
        return 'Dog()'  # repr of the instance self

But then we would need to do this every time we write a custom __repr__ method. Having to know how to produce the repr of the class is a problem, but not much of one: the method can just delegate to Dog's own class (type(Dog)) and call its __repr__ with Dog as the self argument:

if self is None:
    return type(Dog).__repr__(Dog)

First, this breaks if the class name changes in the future, since we had to mention it twice on the same line. The bigger problem is that this is basically boilerplate: 99% of implementations would just delegate up the chain, or forget to and hence be buggy. So Python takes the approach described in those paragraphs: repr(foo) skips looking for a __repr__ attached to foo itself, and goes straight to:

type(foo).__repr__(foo) 
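A quick way to see this rule in action is to attach a __repr__ to an instance and compare explicit and implicit lookup. This is only a sketch, written so that it should also run on Python 3; the variable names are made up for illustration.

```python
class Dog(object):
    def __repr__(self):
        return "Dog()"


fido = Dog()
# An instance-level __repr__ is visible to explicit attribute access...
fido.__repr__ = lambda: "instance-level repr"

explicit = fido.__repr__()           # normal lookup: finds the lambda
implicit = repr(fido)                # implicit lookup: skips fido.__dict__
by_hand = type(fido).__repr__(fido)  # what repr(fido) effectively does

# The same rule makes repr(Dog) work: it goes through type(Dog),
# the metaclass, instead of finding Dog.__repr__ on Dog itself.
class_repr = repr(Dog)
class_by_hand = type(Dog).__repr__(Dog)
```

The lambda assigned to fido.__repr__ is only reached by the explicit call; repr(fido) still returns 'Dog()', and repr(Dog) never tries to call Dog.__repr__ without an instance.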

OTHER TIPS

What you have to remember is that classes are instances of their metaclass. Some operations need to be performed not just on instances, but on types as well. If the method found on the object itself were run, it would fail, since that method (the object really being a class in this case) requires an instance of the class rather than an instance of the metaclass.

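class MC(type):
    def foo(self):
        print 'foo'

class C(object):
    __metaclass__ = MC
    def bar(self):
        print 'bar'

C.foo()    # prints foo: foo is found on type(C), which is MC, and C is passed as self
C().bar()  # prints bar: bar is found on C and the new instance is passed as self
C.bar()    # TypeError: bar is found on C itself, so no instance is passed as self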
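For readers on Python 3, roughly the same experiment can be sketched as below. The metaclass=MC spelling assumes Python 3, the methods return strings instead of printing, and the failing call is wrapped in try/except so the script runs to the end; bar_error is just an illustrative name.

```python
class MC(type):
    def foo(cls):
        return "foo"


# Python 3 spelling of `__metaclass__ = MC` (assumes Python 3).
class C(metaclass=MC):
    def bar(self):
        return "bar"


foo_result = C.foo()    # found on type(C), i.e. MC; C is passed as cls
bar_result = C().bar()  # found on C; the new instance is passed as self

try:
    C.bar()             # found on C itself, so nothing is passed as self
except TypeError as exc:
    bar_error = exc
else:
    bar_error = None
```

C.foo() works because foo lives on the class of C, while C.bar() fails because bar lives on C itself and so gets no receiver.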

Normal attribute retrieval obj.attr looks up attr in the instance attributes and class attributes of obj. It is defined in object.__getattribute__ and type.__getattribute__.

An implicit special method call special(obj, *args, **kwargs) (e.g. hash(1)) looks up __special__ (e.g. __hash__) in the class attributes of obj (e.g. 1), bypassing the instance attributes of obj, instead of performing the normal attribute retrieval obj.__special__, and then calls it. The rationale concerns the receiver argument (usually called self). An instance attribute of obj, such as a function, may require a receiver that is an instance of obj, and special(obj, *args, **kwargs) does not provide one. A class attribute of obj, by contrast, may require a receiver that is an instance of the class type(obj), and special(obj, *args, **kwargs) does provide one: obj.
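The implicit lookup described above can be modelled in a few lines of Python. This is a rough sketch only: implicit_special_call is a hypothetical helper, not a real API, and the interpreter actually does this work in C through type slots. The helper walks the MRO of type(obj) because the special method may be defined on a base class rather than on type(obj) itself.

```python
def implicit_special_call(obj, name, *args):
    """Rough model of how e.g. hash(obj) finds and calls `__hash__`.

    Hypothetical helper for illustration only.
    """
    cls = type(obj)
    # Search only the class and its bases, never obj's own __dict__.
    for base in cls.__mro__:
        if name in vars(base):
            attr = vars(base)[name]
            break
    else:
        raise TypeError("%r object has no %s" % (cls.__name__, name))
    # Bind the class attribute to obj through the descriptor protocol,
    # which supplies obj as the receiver (self).
    bound = attr.__get__(obj, cls)
    return bound(*args)


hash_of_instance = implicit_special_call(1, "__hash__")
hash_of_class = implicit_special_call(int, "__hash__")
```

Both calls succeed, including the one on int, because the receiver is always supplied from type(obj) downwards and obj's own attributes are never consulted.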

Example

The special method __hash__ takes a single argument. Compare these two expressions:

>>> 1 .__hash__
<method-wrapper '__hash__' of int object at 0x103c1f930>
>>> int.__hash__
<slot wrapper '__hash__' of 'int' objects>
  • The first expression retrieves the method vars(type(1))['__hash__'].__get__(1) bound to 1 from the class attribute vars(type(1))['__hash__']. So the class attribute requires a receiver argument which is an instance of type(1) to be called, and we have already provided one: 1.
  • The second expression retrieves the function vars(int)['__hash__'].__get__(None, int) from the instance attribute vars(int)['__hash__']. So the instance attribute requires a receiver argument which is an instance of int to be called, and we have not provided one yet.
>>> 1 .__hash__()
1
>>> int.__hash__(1)
1

Since the built-in function hash takes a single argument, hash(1) can provide the 1 required in the first call (a class attribute call) while hash(int) cannot provide the 1 required in the second call (an instance attribute call). Consequently, hash(obj) should bypass the instance attribute vars(obj)['__hash__'] and directly access the class attribute vars(type(obj))['__hash__']:

>>> hash(1) == vars(type(1))['__hash__'].__get__(1)()
True
>>> hash(int) == vars(type(int))['__hash__'].__get__(int)()
True
Licensed under: CC-BY-SA with attribution