Question

I've created the following DataFrame:

User    Week1  Week2
UserA   5      7
UserB   7      0
UserC   8      20

from this original list:

List = [['UserA',5,7],['UserB',7,0],['UserC',8,20]]

I'd like to apply a formula to each user's value for every week and store the results in a third and a fourth column.

The issue is when I try to do the following, I get a 'TypeError: Could not convert ...to numeric':

    return DF.apply(lambda x: (x - x.mean()) / x.std())

The following method works though:

    Python_Sublists = [subli[1:3] for subli in List]
    # [[5, 7], [7, 0], [8, 20]]

    DF = pd.DataFrame(Python_Sublists, columns=['Week1', 'Week2'])

    return DF.apply(lambda x: (x - x.mean()) / x.std())

I could then append these results back to the original list, though I have no idea how to convert a DataFrame back to a list. Is there a more direct way to apply the function only to the numeric columns? Also, how would you convert the pandas DataFrame back to its original list form?


Solution

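Shouldn't df[['Week1','Week2']].apply(lambda x: (x - x.mean()) / x.std()) work? Your first column (User) holds strings, not numbers, and I am sure that is what causes the error. Note that the column names are case-sensitive, so they must match 'Week1' and 'Week2' exactly.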
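If you would rather not name the numeric columns by hand, one possible approach is to let pandas pick them with select_dtypes. This is only a sketch built on the data from the question; the variable names rows, numeric_cols and zscores are my own and not from the original post.

```python
import pandas as pd

# Data from the question
rows = [['UserA', 5, 7], ['UserB', 7, 0], ['UserC', 8, 20]]
df = pd.DataFrame(rows, columns=['User', 'Week1', 'Week2'])

# Select every numeric column automatically; 'User' is skipped
# because it holds strings
numeric_cols = df.select_dtypes(include='number').columns

# Standardize each numeric column.
# Series.std() uses the sample standard deviation (ddof=1) by default.
zscores = df[numeric_cols].apply(lambda x: (x - x.mean()) / x.std())

print(zscores)
```

The result has the same index as df, so it lines up row by row with the User column.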

To 'append' the new columns to the original DataFrame, if needed, you can simply assign them: df[['c3','c4']] = df[['Week1','Week2']].apply(lambda x: (x - x.mean()) / x.std()). The new columns will be named 'c3' and 'c4'.
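As for getting back to a list of lists, df.values.tolist() should do it. Here is a minimal end-to-end sketch, assuming a reasonably recent pandas version; the names rows and as_list are just for illustration.

```python
import pandas as pd

# Data from the question
rows = [['UserA', 5, 7], ['UserB', 7, 0], ['UserC', 8, 20]]
df = pd.DataFrame(rows, columns=['User', 'Week1', 'Week2'])

# Add the standardized values as two new columns, 'c3' and 'c4'
df[['c3', 'c4']] = df[['Week1', 'Week2']].apply(
    lambda x: (x - x.mean()) / x.std()
)

# Convert the DataFrame back to a plain list of lists, one per row
as_list = df.values.tolist()

for row in as_list:
    print(row)
```

Each inner list then holds the user name, the two original weekly values, and the two computed values, in column order.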

Licensed under: CC-BY-SA with attribution