Question

I am doing univariate outlier detection in Python. When I detect outliers for a variable, I know that each outlier should be replaced with the highest non-outlier value (i.e., the max if there were no outliers).

How can I impute this value in Python or scikit-learn? I guess I could remove the outliers, take the max of what remains, and then put the outliers back with that value. But I'm hoping there's already a function for this.
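To make the manual approach concrete, here is a minimal sketch of the remove, take the max, replace steps. The outlier rule is an assumption on my part (the common 1.5 × IQR fences), and `cap_to_inlier_range` and its parameter `k` are hypothetical names, not a library function. It caps both tails; to cap only the high side, leave out the lower bound.

```python
from statistics import quantiles


def cap_to_inlier_range(values, k=1.5):
    """Replace outliers with the nearest non-outlier extreme.

    Assumption: outliers are points outside the Q1 - k*IQR, Q3 + k*IQR fences.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr

    # Keep only the non-outliers, then take their min and max as the caps.
    inliers = [v for v in values if low_fence <= v <= high_fence]
    lo, hi = min(inliers), max(inliers)

    # Clamp every value into [lo, hi]; inliers are left unchanged.
    return [min(max(v, lo), hi) for v in values]


data = [10, 12, 11, 13, 12, 14, 11, 95]
capped = cap_to_inlier_range(data)
print(capped)  # [10, 12, 11, 13, 12, 14, 11, 14]
```

If the data is in a pandas Series, the same bounds could be applied with `Series.clip(lower=lo, upper=hi)`, which seems to be the closest thing to a ready-made function once the caps are known.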

Second, is this a bad idea? I see others remove the outlier completely or replace it with the mean or median, so I wonder if my approach is wrong.

No accepted solution

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange