Question

"The principle of duck typing says that you shouldn't care what type of object you have - just whether or not you can do the required action with your object. For this reason the isinstance keyword is frowned upon." -- Definition

In the function group_tweets_by_state below, following this definition of duck typing, setdefault is called on the object tweets_by_state and the tweet is appended to the resulting list:

def group_tweets_by_state(tweets):
    tweets_by_state = {}
    USA_states_center_position = {n: find_center(s) for n, s in us_states.items()}
    for tweet in tweets:                     # tweets is a list of dictionaries
        if hasattr(tweet, 'setdefault'):     # tweet is a dictionary
            state_name_key = find_closest_state(tweet, USA_states_center_position)
            tweets_by_state.setdefault(state_name_key, []).append(tweet)
    return tweets_by_state

My understanding is that hasattr(tweet, 'setdefault') checks, in duck typing style, that tweet is of type <class 'dict'> before the append.

Is my understanding correct? Does function group_tweets_by_state follow duck typing?


Solution

A test like hasattr(tweet, 'setdefault') is not a good way to make sure tweet is a dictionary, since it does not guarantee that tweet provides all the methods and properties of a dictionary. Unless tweet.setdefault is the only method find_closest_state calls (which I think is unlikely), this test is not strict enough. On the other hand, a test like isinstance(tweet, dict) is too strict, because it forbids the use of other dictionary-like structures, and allowing those is exactly the idea of duck typing.
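To make both failure modes concrete, here is a minimal sketch. The classes OnlySetdefault and TweetRecord and the keys "latitude" and "longitude" are hypothetical, invented only for illustration; the real tweets may look different.

```python
from collections import UserDict


class OnlySetdefault:
    """Hypothetical object: has setdefault() but no item access."""

    def setdefault(self, key, default=None):
        return default


class TweetRecord(UserDict):
    """Hypothetical dictionary-like tweet; UserDict does not inherit from dict."""


fake = OnlySetdefault()
record = TweetRecord({"text": "hello", "latitude": 38.0, "longitude": -121.0})

# hasattr is too weak: the check passes, yet item access still fails.
passes_hasattr = hasattr(fake, "setdefault")
try:
    fake["latitude"]
    item_access_works = True
except TypeError:
    item_access_works = False

# isinstance is too strict: a perfectly usable mapping is rejected.
rejected_by_isinstance = not isinstance(record, dict)
usable_as_mapping = record["latitude"] == 38.0
```

Here fake passes the hasattr test but cannot be indexed, while record fails the isinstance test even though it supports everything a dictionary-based consumer is likely to need.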

In your example the requirement is not really that tweet is a dictionary. The requirement is that find_closest_state can process the tweet, whatever methods it calls on it and regardless of its real type. The following solution handles this generically, without needing to know exactly which methods find_closest_state uses:

def group_tweets_by_state(tweets):
    tweets_by_state = {}
    USA_states_center_position = {n: find_center(s) for n, s in us_states.items()}
    for tweet in tweets:              # a tweet should behave like a dictionary
        try:
            state_name_key = find_closest_state(tweet, USA_states_center_position)
            tweets_by_state.setdefault(state_name_key, []).append(tweet)
        except (AttributeError, TypeError):
            pass
    return tweets_by_state

The code catches AttributeError because that is the exception you get when find_closest_state calls a method that tweet does not provide. It also catches TypeError, because that is what you get when you evaluate tweet["abc"] on an object that does not support string keys, such as an integer or a list. You may need to add other exceptions, depending on how find_closest_state is implemented internally, but you should not add any artificial constraints.
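As a runnable sketch of this approach: the find_closest_state below is a hypothetical stand-in, not the real implementation. It assumes a tweet exposes "latitude" and "longitude" keys and that the centers are (lat, lon) tuples passed in as a parameter. Because this stand-in uses key lookup, KeyError is added to the caught exceptions, which is one example of the "other exceptions" mentioned above.

```python
def find_closest_state(tweet, centers):
    """Hypothetical stand-in: return the name of the nearest state center."""
    lat, lon = tweet["latitude"], tweet["longitude"]

    def squared_distance(name):
        center_lat, center_lon = centers[name]
        return (center_lat - lat) ** 2 + (center_lon - lon) ** 2

    return min(centers, key=squared_distance)


def group_tweets_by_state(tweets, centers):
    tweets_by_state = {}
    for tweet in tweets:
        try:
            state_name_key = find_closest_state(tweet, centers)
        except (AttributeError, TypeError, KeyError):
            # The object cannot be processed like a tweet, so skip it.
            continue
        tweets_by_state.setdefault(state_name_key, []).append(tweet)
    return tweets_by_state


# Made-up sample data, for illustration only.
centers = {"CA": (37.0, -120.0), "NY": (43.0, -75.0)}
tweets = [
    {"latitude": 38.0, "longitude": -121.0},   # valid, closest to CA
    42,                                        # not subscriptable: TypeError
    "not a tweet",                             # str index must be int: TypeError
    {"text": "no position"},                   # missing key: KeyError
    {"latitude": 42.5, "longitude": -74.0},    # valid, closest to NY
]
grouped = group_tweets_by_state(tweets, centers)
```

Only the two well-formed tweets end up in grouped. Keeping the try block limited to the find_closest_state call narrows what the except clause can swallow, although a broad except can still hide genuine bugs inside that function.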

And that's how duck typing should really be applied - by not making assumptions about the type of the object passed, only by testing whether or not you can do the required action (here: call find_closest_state without getting one of the above exceptions).

Other tips

First, I want to say that not everybody agrees that duck typing is a good thing at all, let alone that it is some sort of holy principle that should be followed. Duck typing often leads to error-prone, hard-to-debug code, although it can be very flexible in how you can use it.

Duck typing just takes an object and does with it what it wants to do. For example, if it expects an open file object that it wants to call .write() on, then it just does that. If that throws an exception because the object doesn't have that method, then you passed in a non-duck. Your fault. If you passed in a completely different kind of object that also works because it has a .write(), that is the benefit of duck typing.
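A small sketch of the .write() case; save_report and ListWriter are hypothetical names made up for this illustration:

```python
import io


class ListWriter:
    """Hypothetical non-file object that happens to provide write()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


def save_report(out):
    # No type check: just use the object the way a file would be used.
    out.write("report\n")


buffer = io.StringIO()      # a real file-like object
save_report(buffer)

collector = ListWriter()    # a different kind of object that also works
save_report(collector)

try:
    save_report(42)         # a non-duck: int has no write()
    non_duck_failed = False
except AttributeError:
    non_duck_failed = True
```

The function never asks what out is. The io.StringIO and the ListWriter both work, and the integer fails at the point of use with an AttributeError.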

So your code would use "duck typing" if it simply assumes that tweets is a list of dictionaries (like the comment says), and doesn't go to any trouble to check that.

def group_tweets_by_state(tweets):
    tweets_by_state = {}
    USA_states_center_position = {n: find_center(s) for n, s in us_states.items()}
    for tweet in tweets:                     # tweets is assumed to be a list of dictionaries
        state_name_key = find_closest_state(tweet, USA_states_center_position)
        tweets_by_state.setdefault(state_name_key, []).append(tweet)
    return tweets_by_state

Licensed under: CC-BY-SA with attribution