Question

I have a pandas DataFrame, e.g.:

import pandas as pd

df = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                   'fruit': ['apple', 'apple', 'pear', 'pear'],
                   '2014': [10, 12, 6, 8],
                   '2015': [11, 13, 7, 9]})

i.e.:

   2014  2015 farm  fruit
0    10    11    A  apple
1    12    13    B  apple
2     6     7    A   pear
3     8     9    B   pear

How can I convert it to the following?

  farm  fruit  value  year
0    A  apple     10  2014
1    B  apple     12  2014
2    A   pear      6  2014
3    B   pear      8  2014
4    A  apple     11  2015
5    B  apple     13  2015
6    A   pear      7  2015
7    B   pear      9  2015

I have tried stack and unstack but haven't been able to make it work.


Solution

This can be done with pd.melt():

# value_name is 'value' by default, but setting it here to make it clear
pd.melt(df, id_vars=['farm', 'fruit'], var_name='year', value_name='value')

Result:

  farm  fruit  year  value
0    A  apple  2014     10
1    B  apple  2014     12
2    A   pear  2014      6
3    B   pear  2014      8
4    A  apple  2015     11
5    B  apple  2015     13
6    A   pear  2015      7
7    B   pear  2015      9

[8 rows x 4 columns]

I'm not sure how common "melt" is as the name for this kind of operation, but that's what it's called in R's reshape2 package, which probably inspired the name here.
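For anyone who wants to run this end to end, here is a minimal sketch using the frame from the question. The integer cast and the column reordering are my own additions (assumptions about what the asker wants, not part of the answer above): melt takes the year labels from the column names, so they come back as strings, and it places the variable column before the value column.

```python
import pandas as pd

df = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                   'fruit': ['apple', 'apple', 'pear', 'pear'],
                   '2014': [10, 12, 6, 8],
                   '2015': [11, 13, 7, 9]})

# Every column not listed in id_vars is unpivoted into (year, value) pairs.
long_df = pd.melt(df, id_vars=['farm', 'fruit'],
                  var_name='year', value_name='value')

# Optional: the years were column labels (strings), so cast them to integers.
long_df['year'] = long_df['year'].astype(int)

# Optional: reorder the columns to match the layout asked for in the question.
long_df = long_df[['farm', 'fruit', 'value', 'year']]

print(long_df)
```

If the year is only needed as a label, the cast can be skipped.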

Other tips

This can also be done with stack(), as long as set_index() is called first so that farm and fruit are repeated for each year-value pair.

long_df = df.set_index(['farm', 'fruit']).rename_axis(columns='year').stack().reset_index(name='value')

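A self-contained sketch of the stack() route, assuming the same frame as in the question. The rows come out grouped by farm/fruit pair rather than by year, so the order differs from the melt output even though the contents are the same. Depending on the pandas version, stack() may also drop rows whose value is NaN, whereas melt keeps them.

```python
import pandas as pd

df = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                   'fruit': ['apple', 'apple', 'pear', 'pear'],
                   '2014': [10, 12, 6, 8],
                   '2015': [11, 13, 7, 9]})

stacked = (
    df.set_index(['farm', 'fruit'])   # keep the identifier columns out of the stack
      .rename_axis(columns='year')    # name the column axis so the new level is called 'year'
      .stack()                        # move the year columns into the row index
      .reset_index(name='value')      # back to a flat frame; the stacked values become 'value'
)

print(stacked)
```

Sorting by year afterwards, e.g. stacked.sort_values('year', kind='stable'), gives the year-by-year ordering shown in the question.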

melt is also a DataFrame method, so it can be called as:

long_df = df.melt(id_vars=['farm', 'fruit'], var_name='year', value_name='value')

Another interesting function is pd.wide_to_long, which can also be used to "melt" a frame. However, it requires a stub name (a prefix shared by the columns to unpivot), so it would not work for the frame in the question, but it does work in other cases. For example, it works when the year columns carry a value_ prefix (note how the years in the column labels are prefixed with value_):

long_df = pd.wide_to_long(df, 'value', i=['farm', 'fruit'], j='year', sep='_').reset_index()

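To make the wide_to_long case concrete, here is a small sketch. The frame `wide` and its column names value_2014 and value_2015 are hypothetical, made up to match the stub name and separator used above; they are not the frame from the question. The farm/fruit pairs passed as i must uniquely identify each row, which holds here.

```python
import pandas as pd

# Hypothetical variant of the question's frame: year columns carry a 'value_' prefix.
wide = pd.DataFrame({'farm': ['A', 'B', 'A', 'B'],
                     'fruit': ['apple', 'apple', 'pear', 'pear'],
                     'value_2014': [10, 12, 6, 8],
                     'value_2015': [11, 13, 7, 9]})

# stubnames='value' and sep='_' split 'value_2014' into the stub 'value'
# and the suffix '2014'; the suffix ends up in the new 'year' column.
long_df = pd.wide_to_long(wide, stubnames='value',
                          i=['farm', 'fruit'], j='year', sep='_').reset_index()

print(long_df)
```

Unlike melt, wide_to_long converts the numeric suffix to a number when it can, so no separate cast of the year column should be needed here.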

License: CC-BY-SA with attribution
Not affiliated with StackOverflow