Question

I'm trying to create a 'rank' column that ranks the values in column 'a' in descending order. Below is what I have after sorting; the index now holds what I want as my 'rank' column. How can I use the index to create a new column?

Or is there a rank() function in Python that I can use to get a descending rank based on column 'a'?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 2), columns=list('ab'))
df.sort_values('a', ascending=False).reset_index()  # df.sort() in older pandas
# df.reset_index()

Solution

Use the Series rank method:

In [11]: df.a.rank()
Out[11]: 
0     4
1     1
2     8
3    10
4     6
5     2
6     3
7     9
8     7
9     5
Name: a, dtype: float64

It has a corresponding ascending argument:

In [12]: df.a.rank(ascending=False)
Out[12]: 
0     7
1    10
2     3
3     1
4     5
5     9
6     8
7     2
8     4
9     6
Name: a, dtype: float64
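
To store the result as the 'rank' column the question asks for, the ranked Series can be assigned back to the DataFrame. This is a minimal sketch, not the only way to do it: the seeded generator and the `ranked` variable name are assumptions for illustration, and the cast to int is only safe here because random floats are assumed to have no ties (tied values get fractional average ranks).

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the question)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 2)), columns=list("ab"))

# Rank 1 goes to the largest value in 'a'; with no ties the ranks are whole numbers
df["rank"] = df["a"].rank(ascending=False).astype(int)

# Optionally order the rows by rank, discarding the old index
ranked = df.sort_values("rank").reset_index(drop=True)
print(ranked)
```

Because rank() aligns on the index, the assignment works without sorting the DataFrame first.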

In the case of ties, this will take the average rank; you can also choose min, max or first via the method argument:

In [21]: df = pd.DataFrame(np.random.randint(1, 5, (10, 2)), columns=list('ab'))

In [22]: df
Out[22]: 
   a  b
0  2  2
1  3  4
2  1  1
3  3  1
4  4  2
5  2  4
6  1  4
7  2  1
8  1  2
9  3  4

In [23]: df.a.rank()  # there are several 2s (which have rank 5)
Out[23]: 
0     5
1     8
2     2
3     8
4    10
5     5
6     2
7     5
8     2
9     8
Name: a, dtype: float64

In [24]: df.a.rank(method='first')
Out[24]: 
0     4
1     7
2     1
3     8
4    10
5     5
6     2
7     6
8     3
9     9
Name: a, dtype: float64
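
For comparison, the other tie-breaking methods can be put side by side. The sketch below hardcodes the 'a' values from the example above rather than drawing new random numbers; the `comparison` name is hypothetical, and method='dense' is an additional pandas option not mentioned in the answer.

```python
import pandas as pd

# Same 'a' values as in the tie example above, hardcoded for reproducibility
a = pd.Series([2, 3, 1, 3, 4, 2, 1, 2, 1, 3], name="a")

comparison = pd.DataFrame({
    "a": a,
    "average": a.rank(),              # default: mean of the tied positions
    "min": a.rank(method="min"),      # lowest position in the tied group
    "max": a.rank(method="max"),      # highest position in the tied group
    "dense": a.rank(method="dense"),  # like min, but with no gaps between groups
})
print(comparison)
```

With method='min' the three 2s all get rank 4, with 'max' they get 6, and with 'dense' they get 2.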
Licensed under: CC-BY-SA with attribution