Duck typing: how to avoid name collisions?

https://stackoverflow.com/questions/10033953

29-05-2021
|

Question

I think understand the idea of duck typing, and would like to use it more often in my code. However, I am concerned about one potential problem: name collision.

Suppose I want an object to do something. I know the appropriate method, so I simply call it and see what happens. In general, there are three possible outcomes:

The method is not found and AttributeError exception is raised. This indicates that the object isn't what I think it is. That's fine, since with duck typing I'm either catching such an exception, or I am willing to let the outer scope deal with it (or let the program terminate).
The method is found, it does precisely what I want, and everything is great.
The method is found, but it's not the method that I want; it's a same-name method from an entirely unrelated class. The execution continues, until either inconsistent state is detected later, or, in the worst case, the program silently produces incorrect output.

Now, I can see how good quality names can reduce the chances of outcome #3. But projects are combined, code is reused, libraries are swapped, and it's quite possible that at some point two methods have the same name and are completely unrelated (i.e., they are not intended to substitute for each other in a polymorphism).

One solution I was thinking about is to add a registry of method names. Each registry record would contain:

method name (unique; i.e., only one record per name)
its generalized description (i.e., applicable to any instance it might be called on)
the set of classes which it is intended to be used in

If a method is added to a new class, the class needs to be added to the registry (by hand). At that time, the programmer would presumably notice if the method is not consistent with the meaning already attached to it, and if necessary, use another name.

Whenever a method is called, the program would automatically verify that the name is in the registry and the class of the instance is one of the classes in the record. If not, an exception would be raised.

I understand this is a very heavy approach, but in some cases where precision is critical, I can see it might be useful. Has it been tried (in Python or other dynamically typed languages)? Are there any tools that do something similar? Are there any other approaches worth considering?

Note: I'm not referring to name clashes at the global level, where avoiding namespace pollution would be the right approach. I'm referring to clashes at the method names; these are not affected by namespaces.

Solution

Well, if this is critical, you probably should not be using duck typing...

In practice, programs are finite systems, and the range of possible types passed into any particular routine does not cause the issues you are worrying about (most often there's only ever one type passed in).

But if you want address this issue anyway, python provides ABCs (abstract base classes). these allow you to associated a "type" with any set of methods and so would work something like the registry you suggest (you can either inherit from an ABC in the normal way, or simply "register" with it).

You can then check for these types manually or automate the checking with decorators from pytyp.

But, despite being the author of pytyp, and finding these questions interesting, I personally do not find such an approach useful. In practice, what you are worrying about simply does not happen (if you want to worry about something, focus on the lack of documentation from types when using higher order functions!).

PS note - ABCs are purely metadata. They do not enforce anything. Also, checking with pytyp decorators is horrendously inefficient - you really want to do this only where it is critical.

OTHER TIPS

If you are following good programming practice or let me rather say if your code is Pythoic then chances are you would seldom face such issues. Refer the FAQ What are the “best practices” for using import in a module?. It is generally not advised to clutter the namespace and the only time when there could be a conflict if you are trying to reuse the Python reserved names and or standard libraries or name conflicts with module name. But if you encounter conflict as such then there is a serious issue with the code. For example

Why would someone name a variable as list or define a function called len?
Why would someone name a variable difflib when s/he is intending to import it in the current namespace?

To address your problem, look at abstract base classes. They're a Pythonic way to deal with this issue; you can define common behavior in a base class, and even define ways to determine if a particular object is a "virtual baseclass" of the abstract base class. This somewhat mimics the registry you're describing, without requiring all classes know about the registry beforehand.

In practice, though, this issue doesn't crop up as often as you might expect. Objects with an __iter__ method or a __str__ method are simply broken if the methods don't work the way you expect. Likewise, if you say an argument to your function requires a .callback() method defined on it, people are going to do the right thing.

If you are worried that the lack of static type checking will let some bugs get through, the answer isn't to bolt on type checking, it is to write tests.

In the presence of unit tests, a type checking system becomes largely redundant as a means of catching bugs. While it is true that a type checking system can catch some bugs, it will only catch a small subset of potential bugs. To catch the rest you'll need tests. Those unit tests will necessarily catch most of the type errors that a type checking system would have caught, as well as bugs that the type checking system cannot catch.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow