Question

I like using nested data structures, and now I'm trying to understand how to do the equivalent with pandas.

Here is a toy model:

import numpy as np
import pandas as pd

a = pd.DataFrame({'x': [1, 2], 'y': [10, 20]})
b = pd.DataFrame({'x': [3, 4], 'y': [30, 40]})
c = [a, b]

Now I would like to get:

sol=np.array([[[1],[3]],[[2],[4]]])

I can get sol[0] and sol[1] separately with:

s0 = np.array([item[['x']].loc[0] for item in c])
s1 = np.array([item[['x']].loc[1] for item in c])

but to build sol this way I would have to loop over the index, and that doesn't feel very Pythonic...


Solution

It looks like you want just the x columns from a and b. You can concatenate two Series (or DataFrames) into a new DataFrame using pd.concat:

In [132]: pd.concat([a['x'], b['x']], axis=1)
Out[132]: 
   x  x
0  1  3
1  2  4

[2 rows x 2 columns]
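
If the frames live in a list such as c from the question, the same pd.concat call can probably be driven by a comprehension instead of naming a and b by hand. This is only a sketch: the keys argument and the name wide are my additions, used here to give each column a distinct label rather than two columns both called x.

```python
import pandas as pd

a = pd.DataFrame({'x': [1, 2], 'y': [10, 20]})
b = pd.DataFrame({'x': [3, 4], 'y': [30, 40]})
c = [a, b]

# Pull the 'x' column out of every frame in the list and place them
# side by side. keys= labels each column by its position in c, so the
# result does not end up with duplicate 'x' column names.
wide = pd.concat([df['x'] for df in c], axis=1, keys=list(range(len(c))))
print(wide)
```

With distinct column labels, wide[0] and wide[1] select the x column that came from a and b respectively.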

Now, if you want a numpy array, use the values attribute:

In [133]: pd.concat([a['x'], b['x']], axis=1).values
Out[133]: 
array([[1, 3],
       [2, 4]], dtype=int64)
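
On more recent pandas versions (0.24 and later), the to_numpy() method is the documented way to do the same conversion. A minimal sketch, assuming such a version is installed; the name arr is just for illustration:

```python
import pandas as pd

a = pd.DataFrame({'x': [1, 2], 'y': [10, 20]})
b = pd.DataFrame({'x': [3, 4], 'y': [30, 40]})

# to_numpy() returns the underlying data as a NumPy array,
# equivalent to .values for this all-integer frame.
arr = pd.concat([a['x'], b['x']], axis=1).to_numpy()
print(arr)
```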

And if you want a numpy array with the same shape as sol, then use the reshape method:

In [134]: pd.concat([a['x'], b['x']], axis=1).values.reshape(2,2,1)
Out[134]: 
array([[[1],
        [3]],

       [[2],
        [4]]], dtype=int64)

In [136]: np.allclose(pd.concat([a['x'], b['x']], axis=1).values.reshape(2,2,1), sol)
Out[136]: True
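
The reshape(2,2,1) call hard-codes the shape. If the list c can hold any number of frames, one possible variation (not part of the answer above, just a sketch under that assumption) is to stack the x columns with np.stack and append the trailing axis with np.newaxis, so the shape follows from the data:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'x': [1, 2], 'y': [10, 20]})
b = pd.DataFrame({'x': [3, 4], 'y': [30, 40]})
c = [a, b]

# Each 'x' column becomes one column of a 2-D array:
# shape is (number of rows, number of frames).
stacked = np.stack([df['x'].to_numpy() for df in c], axis=1)

# Append a trailing axis of length 1 to match the nesting of sol:
# shape becomes (number of rows, number of frames, 1).
result = stacked[..., np.newaxis]

sol = np.array([[[1], [3]], [[2], [4]]])
print(np.array_equal(result, sol))
```

This assumes every frame in c has the same number of rows; np.stack raises an error otherwise.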
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow