Question

I am computing a distance on my data and sorting the results in ascending order. Samples whose distance exceeds a specific threshold are to be marked as outliers and discarded. Below is a plot of all distance values.

[Figure: plot of the sorted distance values]

As the graph shows, beyond a certain point the curve rises rapidly and the data points become sparse. I need to calculate the point where this happens and use it as the threshold value.
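One possible way to locate such a point is sketched below. It is not taken from the question and is only one heuristic among several: it assumes the sorted curve is flat at first and then bends upward, and it picks the point that lies farthest below the straight line joining the first and last points of the curve (a simple "knee" or "elbow" rule). The function name `knee_threshold` and the sample data are hypothetical and used only for illustration.

```python
def knee_threshold(distances):
    """Return the distance value at the knee of the sorted curve.

    Assumption (not from the original question): the knee is the point
    with the largest gap below the chord joining the first and last
    points, after rescaling both axes to the range [0, 1].
    """
    values = sorted(distances)
    n = len(values)
    if n < 3:
        raise ValueError("need at least three distance values")
    lo, hi = values[0], values[-1]
    if hi == lo:
        raise ValueError("all distances are equal; there is no knee")

    best_index, best_gap = 0, 0.0
    for i, v in enumerate(values):
        x = i / (n - 1)            # normalized rank of the sample
        y = (v - lo) / (hi - lo)   # normalized distance value
        gap = x - y                # how far the curve lies below the chord y = x
        if gap > best_gap:
            best_index, best_gap = i, gap
    return values[best_index]


# Hypothetical data: 20 slowly growing distances followed by 4 sparse, large ones.
sample = list(range(20)) + [50, 90, 140, 200]
threshold = knee_threshold(sample)
outliers = [d for d in sample if d > threshold]
print(threshold, outliers)
```

Samples with a distance strictly greater than the returned value are treated as outliers here. Whether this rule matches the intended threshold depends on the shape of the real curve, so the result should be checked against the plot before any samples are discarded.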

No accepted answer

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange