Add multi-index to pandas dataframe and keep current index

https://stackoverflow.com/questions/20085308

02-08-2022

Question

I am trying to merge time-course data from different participants. I am iteratively extracting a dataframe per participant and concatenating them at the end of the loop. Before I concatenate, I would like to add each participant's ID as an additional index level.

This seems REALLY straightforward, but I was unable to find anything on this issue :(

I would like to turn this

    col
0     1
1   1.1
2   NaN

Into:

          col
ID    0     1
      1   1.1
      2   NaN

I know I could make a new index like:

multindex = [np.array([ID] * len(data)), np.arange(len(data))]

But that's hopelessly inelegant and, seeing as I am measuring at high frequency over half an hour, would even get kind of slow :/

I would like to mention that I have recently found my question to be a duplicate of this other question. However, mine apparently has more upvotes and better answers. “Prepend” apparently doesn't draw as many hits.

Solution

Maybe you can use the keys argument of concat:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(3, 2))
df2 = pd.DataFrame(np.random.rand(4, 2))
df3 = pd.DataFrame(np.random.rand(5, 2))

print(pd.concat([df1, df2, df3], keys=["A", "B", "C"]))

output:

            0         1
A 0  0.863774  0.794880
  1  0.578503  0.418619
  2  0.215317  0.146167
B 0  0.655829  0.116917
  1  0.862316  0.812847
  2  0.500126  0.689218
  3  0.653439  0.270427
C 0  0.825213  0.882963
  1  0.579436  0.332047
  2  0.456948  0.718893
  3  0.795074  0.826773
  4  0.049676  0.697471
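
Applied to the per-participant loop from the question, the same idea might look like the sketch below. The participant IDs, the load_participant helper, and the level names "ID" and "sample" are all made up for illustration; the only pandas feature relied on is that concat accepts a dict and uses its keys as the outer index level.

```python
import numpy as np
import pandas as pd

# Hypothetical participant IDs; replace with the real ones.
participant_ids = ["P01", "P02", "P03"]


def load_participant(pid, n_samples=4):
    # Placeholder for the real extraction step (assumption, not from the question).
    return pd.DataFrame({"col": np.arange(n_samples, dtype=float)})


# Collect one frame per participant, keyed by participant ID.
frames = {pid: load_participant(pid) for pid in participant_ids}

# A dict passed to concat supplies the keys; each frame keeps its own index
# as the inner level. names labels the levels of the resulting MultiIndex.
combined = pd.concat(frames, names=["ID", "sample"])

# Rows of a single participant can be selected via the outer level.
print(combined.loc["P02"])
```

Collecting the frames first and concatenating once at the end avoids re-copying the accumulated data on every loop iteration.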

If you want to append other dataframes later:

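# df holds the keyed result of the first concat
df = pd.concat([df1, df2, df3], keys=["A", "B", "C"])
df4 = pd.DataFrame(np.random.rand(6, 2))
df = pd.concat([df, pd.concat([df4], keys=["D"])])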
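
The inner pd.concat([df4], keys=["D"]) is also a way to get exactly what the question asks for on a single dataframe: a new outer level, with the current index kept as the inner level. A minimal sketch, where the helper name add_outer_level and the level name "participant" are made up for illustration:

```python
import numpy as np
import pandas as pd


def add_outer_level(frame, key, name=None):
    # Hypothetical helper: wrap the single frame in a list so that concat
    # prepends `key` as a new outermost index level.
    return pd.concat([frame], keys=[key], names=[name])


data = pd.DataFrame({"col": [1.0, 1.1, np.nan]})
labelled = add_outer_level(data, "ID", name="participant")

print(labelled)
# The original index survives unchanged as the inner level.
print(labelled.index.get_level_values(1).tolist())
```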

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow