Pandas Groupby makes kernel die in Jupyter notebook/Python
Question
I have a groupby in a Jupyter notebook that takes ages to run. After about 10 minutes of running, the notebook reports 'kernel died...'.
The groupby looks like this:
df1.groupby(['date', 'unit', 'company', 'city'])[
    ['col1', 'col2', 'col3', 'col4', ..., 'col20']
].mean()
All of the 'col' columns are float values. I am running everything locally. Any ideas?
UPDATE:
The shape of df1 is:
(1360, 24)
Memory and dtypes:
dtypes: category(3), datetime64[ns](2), float64(17), int64(2)
memory usage: 266.9 KB
The number of unique values in date, unit, company, and city:
len(df1.date.unique()) = 789
len(df1.unit.unique()) = 76
len(df1.company.unique()) = 205
len(df1.city.unique()) = 237
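For scale, here is a rough back-of-the-envelope sketch of how many key combinations these counts imply. This matters only under an assumption the question does not confirm: that the grouping ends up materializing every combination of key values (pandas can do this when groupby keys are categorical and `observed=False`). The estimate of 20 float64 result columns is taken from the groupby shown above.

```python
import math

# Unique-value counts as reported in the question.
n_unique = {"date": 789, "unit": 76, "company": 205, "city": 237}

# Number of possible key combinations if every combination were materialized.
n_combinations = math.prod(n_unique.values())

# Same figure when grouping by date and unit only.
n_date_unit = n_unique["date"] * n_unique["unit"]

# Rough size of a result with 20 float64 columns (8 bytes each), index excluded.
approx_bytes = n_combinations * 20 * 8

print(f"all four keys:  {n_combinations:,} combinations")
print(f"date and unit:  {n_date_unit:,} combinations")
print(f"approx. result: {approx_bytes / 1e9:,.0f} GB")
```

With only 1,360 rows in df1, at most 1,360 of these combinations can actually occur in the data.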
I have 16 GB of memory on a MacBook Pro.
UPDATE 2:
It works only if date and unit are the only two groupby columns. If I add either company or city, it no longer finishes and keeps running indefinitely.
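One thing that may be worth checking, given the category dtypes listed above: when groupby keys are categorical, pandas can return one row per combination of categories rather than one row per combination that occurs in the data, depending on the `observed` argument. The sketch below uses a small synthetic frame (the data and the single `col1` column are made up for illustration, not taken from df1) to compare the two settings. Whether this is the cause here is an assumption, since the question does not say which columns are categorical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df1: two categorical keys and one float column.
rng = np.random.default_rng(0)
n_rows = 1360
df = pd.DataFrame({
    "unit": pd.Categorical(rng.integers(0, 76, n_rows)),
    "company": pd.Categorical(rng.integers(0, 205, n_rows)),
    "col1": rng.random(n_rows),
})

# observed=True: only key combinations that actually occur in the data.
observed = df.groupby(["unit", "company"], observed=True)[["col1"]].mean()

# observed=False: every combination of categories, filled with NaN where empty.
full = df.groupby(["unit", "company"], observed=False)[["col1"]].mean()

print(len(observed), len(full))
```

If the keys in df1 are categorical, passing `observed=True` to the original groupby, or converting the keys with `astype(str)` before grouping, would be a cheap experiment to run.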