Cleaning or referencing pandas dataframes with mixed data

https://stackoverflow.com/questions/22856310

27-06-2023

Question

I have a pandas dataframe that contains mixed data in table format:

import datetime
import numpy as np
import pandas as pd

d1 = datetime.datetime(2014, 1, 1)
d2 = datetime.datetime(2014, 1, 1)
d3 = datetime.datetime(2014, 1, 1)
a = [[True, False, True], [100.0, 200.0, 200.0], [2, 5, 5], [d1, d2, d3]]
df = pd.DataFrame(a, columns=['Series0', 'Series1', 'Series3'], index=['row1', 'row2', 'row3', 'row4'])
df

      Series0              Series1              Series3
row1  True                 False                True
row2  100                  200                  200
row3  2                    5                    5
row4  2014-01-01 00:00:00  2014-01-01 00:00:00  2014-01-01 00:00:00

Now, if I slice a row of data from the dataframe and try to multiply it by a value of type float64:

row2 = df.T['row2']
x = np.tan(1)
row2 * x

I get:

TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.float64'

I have checked, and the error goes away with the latest versions of NumPy and pandas. For reference, the error occurs with pandas 0.10.0 and NumPy 1.6.2.

The obvious answer of upgrading to the latest versions is not available to me, as the code has to be robust across versions. Is there another syntax for pulling out the row of data so that I can do calculations on it? I definitely want something that will not break if I eventually upgrade the packages to the latest versions. Manipulating the data before it goes into the dataframe is also not an option, as the pandas dataframe is generated by the DataNitro df method.

Solution

If you are okay with manipulating the data after the dataframe is created and before doing your operation, you can cast the row to float first:

row2 = df.T['row2']
row2 = row2.astype(float)
x = np.tan(1)
row2 * x
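A likely reason the cast helps: every column of the dataframe mixes booleans, floats, integers and datetimes, so the values are stored with the generic object dtype, and the sliced row inherits that. The sketch below rebuilds the dataframe from the question and checks the dtype before and after the cast. It assumes a reasonably recent pandas and NumPy; the names row2_float and result are only illustrative and are not part of the original answer.

```python
import datetime

import numpy as np
import pandas as pd

d = datetime.datetime(2014, 1, 1)
a = [[True, False, True], [100.0, 200.0, 200.0], [2, 5, 5], [d, d, d]]
df = pd.DataFrame(a, columns=['Series0', 'Series1', 'Series3'],
                  index=['row1', 'row2', 'row3', 'row4'])

# Each column holds bool, float, int and datetime values, so the row
# pulled out of the transposed frame is stored as generic Python objects.
row2 = df.T['row2']
print(row2.dtype)

# Casting to float gives a proper numeric Series (illustrative name).
row2_float = row2.astype(float)
print(row2_float.dtype)

# Arithmetic with a NumPy float64 scalar now operates on float64 data.
result = row2_float * np.tan(1)
print(result)
```

Note that the cast only makes sense for rows that are entirely numeric; applying astype(float) to row4, which holds datetimes, would be expected to fail.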

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow