How can I replace outliers with maximum non-outlier value?
02-11-2019
Question
I am doing univariate outlier detection in Python. When I detect outliers for a variable, I know that each outlier should be replaced by the highest non-outlier value (i.e., what the max would be if there were no outliers).
How can I impute this value in Python or sklearn? I guess I could remove the outliers, take the max of what remains, and then put the outliers back with that max as their value, but I am hoping there is already a function for that.
Second, is this a bad idea? I see others remove outliers completely or replace them with the mean or median, so I wonder whether my approach is wrong.
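For reference, the "remove, take the max, replace" idea described above can be sketched in a few lines of NumPy. This is only a minimal sketch: the function name `cap_outliers_at_max_inlier`, the parameter `k`, and the sample data are made up for illustration, and the Tukey IQR rule used to flag outliers is an assumption, since the question does not say which detection method is used. Only high outliers are handled, as in the question.

```python
import numpy as np


def cap_outliers_at_max_inlier(values, k=1.5):
    """Replace high outliers with the largest non-outlier value.

    Assumption: outliers are flagged with the Tukey IQR rule
    (value > Q3 + k * IQR). Any other detector can be used instead,
    as long as it produces a boolean mask like `is_outlier`.
    """
    x = np.asarray(values, dtype=float)

    # Flag high outliers (hypothetical detection rule, see docstring).
    q1, q3 = np.percentile(x, [25, 75])
    upper_fence = q3 + k * (q3 - q1)
    is_outlier = x > upper_fence

    # Largest value among the points that were NOT flagged.
    max_inlier = x[~is_outlier].max()

    # Keep inliers as they are, overwrite outliers with that maximum.
    return np.where(is_outlier, max_inlier, x)


data = [1, 2, 3, 4, 5, 100]  # made-up sample, 100 is the outlier
capped = cap_outliers_at_max_inlier(data)
print(capped)
```

Because the replacement works from a boolean mask, the same last two steps apply unchanged if the mask comes from a different univariate detector.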
No accepted solution
Not affiliated with datascience.stackexchange