Question

Consider a trivial example with a DataFrame df and a Series s:

import pandas as pd

matching_vals = range(20,30)

df = pd.DataFrame(columns=['a'], index=range(0,10))
df['a'] = matching_vals
s  = pd.Series(list("ABCDEFGHIJ"), index=matching_vals)

df['b'] = s[df['a']]

At this point I would expect df['b'] to contain the letters A through J, but instead it's all NaN. However, if I replace the last line with

n = df['a'][2]
df['c'] = s[n]

then df['c'] is filled with Cs, as I'd expect, so I'm pretty sure it's not some strange type error.

I'm new to pandas, and this is driving me crazy.


Solution

s[df['a']] has an index that differs from df's index:

In [104]: s[df['a']]
Out[104]: 
a
20    A
21    B
22    C
23    D
24    E
25    F
26    G
27    H
28    I
29    J

When you assign a Series to a column of a DataFrame, pandas aligns the values on the index. Here s[df['a']] is labeled 20 through 29 while df is labeled 0 through 9, so no label matches and every row is assigned NaN. The assignment does not add new rows to df.
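To see the mismatch directly, here is a minimal sketch that rebuilds the question's data and compares the two indexes. The variable names selected and aligned are only illustrative, and using reindex to mimic what the assignment does is my way of showing the alignment, not necessarily how pandas implements it internally.

```python
import pandas as pd

df = pd.DataFrame({'a': range(20, 30)})                  # df is labeled 0..9
s = pd.Series(list("ABCDEFGHIJ"), index=range(20, 30))   # s is labeled 20..29

selected = s[df['a']]                  # label lookup, so the result is labeled 20..29

# No label is shared between the selection and df.
common = selected.index.intersection(df.index)

# Aligning the selection to df's labels leaves nothing but NaN,
# which is what ends up in the new column.
aligned = selected.reindex(df.index)
df['b'] = selected
```

Since common is empty, every row of df['b'] ends up as NaN.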

If you don't want the index to enter into the assignment, you could use

df['b'] = s[df['a']].values
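Two other spellings that should give the same result, sketched here as alternatives rather than as part of the original answer: to_numpy() (assuming a pandas version that has it, 0.24 or later) does the same job as .values, and Series.map looks up each value of df['a'] in s while keeping df's own index. The column name 'c' is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': range(20, 30)})
s = pd.Series(list("ABCDEFGHIJ"), index=range(20, 30))

# Positional assignment: converting to a NumPy array drops the index,
# so values are placed row by row.
df['b'] = s[df['a']].to_numpy()

# Per-row label lookup: each value in 'a' is looked up in s,
# and the result keeps df's index, so no alignment problem arises.
df['c'] = df['a'].map(s)
```

Both columns contain the letters A through J in order.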

For a demonstration of the matching of indices, notice how

import pandas as pd

df = pd.DataFrame(columns=['a'], index=range(0,10))
df['a'] = range(0,10)[::-1]
s  = pd.Series(list("ABCDEFGHIJ"), index=range(0,10)[::-1])
df['b'] = s[df['a']]

yields

In [123]: s[df['a']]
Out[123]: 
a
9    A
8    B
7    C
6    D
5    E
4    F
3    G
2    H
1    I
0    J
dtype: object

In [124]: df
Out[124]: 
   a  b
0  9  J
1  8  I
2  7  H
3  6  G
4  5  F
5  4  E
6  3  D
7  2  C
8  1  B
9  0  A

[10 rows x 2 columns]

The values of df['b'] appear "flipped" relative to s[df['a']] because each value is placed in the row whose index label matches its own.
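One consequence worth checking: in this reversed example the aligned column no longer holds s looked up at each row's 'a' value. The sketch below puts the aligned and positional assignments side by side. The column names aligned and positional are made up for the comparison, and to_numpy() is assumed to be available (pandas 0.24 or later).

```python
import pandas as pd

df = pd.DataFrame({'a': list(range(10))[::-1]})                  # a = 9, 8, ..., 0
s = pd.Series(list("ABCDEFGHIJ"), index=list(range(10))[::-1])   # s[9] = 'A', ..., s[0] = 'J'

# Matched on index labels: row 0 receives the value labeled 0, which is 'J'.
df['aligned'] = s[df['a']]

# Matched on row order: row 0 receives the first selected value, s[9] = 'A'.
df['positional'] = s[df['a']].to_numpy()
```

Row 0 has a = 9, so a per-row lookup would give s[9] = 'A'. Only the positional column matches that; the aligned column holds 'J' there.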

Licensed under: CC-BY-SA with attribution