Question

So, Python is happy for me to write code like,

class A(): pass

a1 = A()
a2 = A()
a1.some_field = 5
a2.other_field = 7

Now, I have learned to stop worrying and love duck typing when it comes to passing objects to methods. And I accept that allowing different instances of a class to have different fields can be convenient sometimes.

My problem is, I'm building a medium-sized web application with a team of 4 developers, and I can't help but think that adding arbitrary fields to objects is going to make it much harder to reason about application state.

I guess my question is this: is the practice of adding arbitrary fields to objects just a natural extension of duck typing, or is it something to be avoided?

Here's a specific example:

class Visitor():
    def __init__(self, name, address, dob):
        self.name = name
        self.address = address
        self.dob = dob

    def summarize_visits(self, visits):
        self.most_recent_visit = find_most_recent_visit(visits)

In this case, code that deals with Visitor objects has to be aware of the fact that visitor.most_recent_visit will raise an AttributeError unless somebody has previously called summarize_visits on the same object. Seems like it will lead to a lot of if hasattr(...) blocks, no?


Solution

Being able to write code like that is in fact one of the greatest benefits of Python. My rule of thumb is to use instance-specific fields only internally (i.e. within one function, and only when necessary) and not to expect external modules to rely on them.

If my object was expected to be consumed by another person, I'd want for them to look at the class definition and find everything they need to know about its structure clearly delineated in one spot. Remember, explicit is better than implicit.

OTHER TIPS

It's often convenient to do, but I see your concern that it could lead to confusion when working with other developers. You could stop people from adding arbitrary attributes to instances by defining the __slots__ class variable. This would force people to be explicit about the attributes they want in an object, which could help avoid confusion.
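As a minimal sketch of the __slots__ idea, reusing the Visitor class from the question (the sample values and the favourite_colour attribute are made up for illustration):

```python
class Visitor:
    # Only these attribute names may be set on instances.
    __slots__ = ("name", "address", "dob")

    def __init__(self, name, address, dob):
        self.name = name
        self.address = address
        self.dob = dob


visitor = Visitor("Ada", "1 Example Street", "1990-01-01")

slot_error = None
try:
    visitor.favourite_colour = "blue"  # not listed in __slots__
except AttributeError as exc:
    slot_error = exc  # the assignment is rejected
```

Note that a subclass which does not declare its own __slots__ gets a per-instance __dict__ again, so the restriction only holds where every class in the hierarchy opts in.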

I have an urge to cite the "be conservative in what you emit, be liberal in what you accept" principle here. I'd consider setting the design of "my own" objects (used within a module) "in stone", but go easy on type checking of objects that cross module boundaries. It's always feasible to extend the behaviour of your internal classes using adapters or some other explicit pattern rather than attaching it ad hoc.
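One way to read the adapter suggestion, as a rough sketch: wrap the Visitor in a second object that carries the extra state, instead of attaching it to the Visitor itself. The VisitorWithVisits name is hypothetical, and max() merely stands in for the question's find_most_recent_visit helper.

```python
from datetime import date


class Visitor:
    def __init__(self, name, address, dob):
        self.name = name
        self.address = address
        self.dob = dob


class VisitorWithVisits:
    """Hypothetical adapter: adds visit data without touching Visitor."""

    def __init__(self, visitor, visits):
        self.visitor = visitor
        # Always defined, so callers never need hasattr() checks.
        self.most_recent_visit = max(visits) if visits else None


visits = [date(2020, 1, 5), date(2021, 3, 9)]
summary = VisitorWithVisits(
    Visitor("Ada", "1 Example Street", date(1990, 1, 1)), visits
)
```

The wrapped Visitor keeps exactly the fields its class definition promises, and anything that needs visit data asks for a VisitorWithVisits explicitly.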

I believe the Python term for this is "monkey-patching"; here's some related information:

Stack Overflow - monkey patching

A benefit of this is that your Python code is very dynamic; you can add fields to customize a class, a module, or any other Python construct that has attributes.

One very clever use of this is the "Bunch" Python recipe, which defines a general-purpose container that can be constructed using keyword arguments:

Bunch recipe
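For readers who can't follow the link, the recipe is usually written along these lines (a sketch from memory rather than a copy of the linked page; the point variable is just for illustration):

```python
class Bunch:
    """General-purpose container built from keyword arguments."""

    def __init__(self, **kwargs):
        # Each keyword argument becomes an instance attribute.
        self.__dict__.update(kwargs)


point = Bunch(x=1, y=2)
point.z = 3  # more fields can still be attached later
```

The standard library's types.SimpleNamespace offers much the same behaviour out of the box.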

A cost is in potential maintenance issues; it is difficult to track where a class or module was last "monkey-patched".

Of course, if you're really into duck typing, your code that uses some_field and other_field ought to try/except to make sure that the attributes are really present, and handle both cases.
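A small sketch of what that looks like with the A class from the question; the describe function is a made-up name for illustration:

```python
class A:
    pass


a1 = A()
a2 = A()
a1.some_field = 5
a2.other_field = 7


def describe(obj):
    # Ask forgiveness rather than permission: try the attribute, handle its absence.
    try:
        return "some_field=%d" % obj.some_field
    except AttributeError:
        return "some_field not set"


# getattr with a default is a shorter alternative to try/except.
value = getattr(a2, "some_field", None)
```

Either form works, but every caller has to repeat it, which is the maintenance cost the question is worried about.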

In your specific example, it seems like most_recent_visit could be initialized to None, or some other sentinel, in Visitor.__init__. IMHO, if you can initialize an attribute to a value that makes sense in the constructor, you should do so, and reserve "monkey-patching" for extreme cases. Also, monkey-patching of library classes and functions seems like trouble.
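Applied to the question's class, that might look like the sketch below. The find_most_recent_visit body is an assumption, since the question never shows it; here it simply returns the largest item.

```python
def find_most_recent_visit(visits):
    # Stand-in for the question's helper, which the post never defines.
    return max(visits, default=None)


class Visitor:
    def __init__(self, name, address, dob):
        self.name = name
        self.address = address
        self.dob = dob
        # Declared up front so the attribute always exists.
        self.most_recent_visit = None

    def summarize_visits(self, visits):
        self.most_recent_visit = find_most_recent_visit(visits)


visitor = Visitor("Ada", "1 Example Street", "1990-01-01")
before = visitor.most_recent_visit  # None, no AttributeError
visitor.summarize_visits(["2020-01-05", "2021-03-09"])
after = visitor.most_recent_visit
```

Callers then test for None instead of wrapping every access in hasattr(...), and the class definition documents every field a Visitor can have.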

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow