Question

I am doing univariate outlier detection in Python. When I detect outliers for a variable, I know that each one should be replaced with the highest non-outlier value (i.e., what the max would be if there were no outliers).

How can I impute this value in Python or scikit-learn? I suppose I could remove the outliers, take the max of what remains, and substitute that value back in for the outliers, but I am hoping a function for this already exists.
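For reference, the manual approach described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the question does not say how the outliers are detected, so the sketch flags high outliers with the 1.5 × IQR rule, and the function name `cap_high_outliers` and the parameter `k` are hypothetical.

```python
import numpy as np

def cap_high_outliers(values, k=1.5):
    """Replace high outliers with the largest non-outlier value.

    Assumption: outliers are flagged with the IQR rule (x > Q3 + k * IQR).
    Swap in whatever detector you actually use to build the mask.
    """
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    upper_fence = q3 + k * (q3 - q1)

    is_outlier = x > upper_fence      # boolean mask of flagged values
    cap = x[~is_outlier].max()        # highest value that is not an outlier
    return np.where(is_outlier, cap, x)

data = [1, 2, 3, 4, 5, 100]
capped = cap_high_outliers(data)
print(capped)
```

Because every flagged value lies above the cap and every unflagged value lies at or below it, the last step behaves the same as `np.clip(x, None, cap)` for this one-sided case. Capping values at a threshold like this is commonly referred to as capping or winsorizing.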

Second, is this a bad idea? I see others remove outliers completely or replace them with the mean or median, so I wonder whether my approach is wrong.

No accepted solution

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange