Question

Suppose i have a dataframe:

df = pd.DataFrame({'Type' : ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],'Name' : ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip','Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})  

and i group it according to the type:

print df.groupby(['Type','Name'])['Type'].agg({'Frequency':'count'})

                           Frequency
Type    Name                  
Bird    Flappy Bird          1
        Pigeon               2
Pokemon Jerry                3
        Mudkip               2

Could i create a dictionary from the above group ?? The key "Bird" will have a value of list containing ['Pigeon',Flappy Bird'] note that higher frequency name should appear first in the Value list.

Expected Output:

dict1 = { 'Bird':['Pigeon','Flappy Bird'] , 'Pokemon':['Jerry','Mudkip'] }
Was it helpful?

Solution

You can create a dictionary using a dictionary comprehension as below

df = pd.DataFrame({'Type' : ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],'Name' : ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip','Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})  
f = df.groupby(['Type','Name'])['Type'].agg({'Frequency':'count'})
f.sort('Frequency',ascending=False, inplace=True)

d = {k:list(f.ix[k].index) for k in f.index.levels[0]}
print(d)
# {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}

The dictionary comprehension will iterate through the outer index ('Bird', 'Pokemon') and then set the value as the inner index for your dictionary.

It is necessary to first sort your MultiIndex by the Frequency column to get the ordering you wish.

OTHER TIPS

Here's a one-liner.

df.groupby(['Type'])['Name'].apply(lambda grp: list(grp.value_counts().index)).to_dict()

# output
#{'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}

The value_counts function implicitly groups the Name field by count and returns descending order by default.

Bonus: if you want to include counts, you can do the following.

df.groupby(['Type']).apply(lambda grp: grp.groupby('Name')['Type'].count().to_dict()).to_dict()

# {'Bird': {'Flappy Bird': 1, 'Pigeon': 2}, 'Pokemon': {'Jerry': 3, 'Mudkip': 2}}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top