Cleaning the univariate dataset with high noise

https://datascience.stackexchange.com/questions/41939

01-11-2019
Question

I have a dataset containing the operating durations of several sensors. It can be considered a univariate dataset because each record has only one dimension.

For example:

[1]: [10, 12, 13, 15, 16] means that sensor [1] recorded operating durations of 10, 12, 13, 15 and 16.

I want to find the typical range of operating durations for each sensor by computing its mean and standard deviation. The problem is that the readings for each sensor contain a lot of noise. For example:

[1]: [1, 1, 1, 1, 1, 2, 2, 2, 10, 12, 13, 15, 16, 200, 400, 500].

Here sensor [1] has noisy values such as 1, 2, 200, 400 and 500. Many sensors in my dataset look like this.

Without removing the noise, the standard deviation comes out larger than the mean (for the example above, roughly 156 against a mean of about 74), so a range of mean ± standard deviation is not meaningful for my duration analysis.
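
To make the effect of the noise concrete, the sketch below computes the mean and sample standard deviation for the sensor [1] readings given above, first with every value and then with only the values described as genuine durations. The `clean` list is picked by hand purely for illustration (an assumption, not an automatic cleaning method), and `summarize` is a hypothetical helper name.

```python
import statistics

# Readings for sensor [1] exactly as listed in the question.
noisy = [1, 1, 1, 1, 1, 2, 2, 2, 10, 12, 13, 15, 16, 200, 400, 500]

# Values the question treats as genuine durations.
# Assumption: selected by hand here, only to show the contrast.
clean = [10, 12, 13, 15, 16]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


noisy_mean, noisy_std = summarize(noisy)
clean_mean, clean_std = summarize(clean)

print(f"with noise:    mean={noisy_mean:.2f}, std={noisy_std:.2f}")
print(f"without noise: mean={clean_mean:.2f}, std={clean_std:.2f}")
```

With the noisy values included, the standard deviation exceeds the mean, so mean minus one standard deviation is negative, which cannot be a duration. With only the hand-picked values, the standard deviation is a small fraction of the mean.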

So my question is: is there a method for removing this kind of noise from my dataset?

Thank you.

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange