what arguments should I pass to dbscan or optic in order to divid the data in a specific way
-
02-11-2019 - |
题
I have thousands of very similar data set that needs to be divided in diagonal way to two groups. for example: and
I tried to play with the argument of dbscan and optic as eps and minPoints and even metric and none of them helped me to divide the data properly to 2 groups. I only succeed to divide the data using dbscan if I remove the noise between these groups to make them a complete separate 2 groups, I did it using histogram
j = 1
hist, bin_edges = np.histogram(data, bins=500)
max_bin = np.where(np.amax(hist) == hist)[0][0]
max_noise = bin_edges[max_bin+j]
filtered_indicies = data > max_noise
data = data[filtered_indicies]
these lines remove noise from the data, between the groups and also around it when
j > 1
and that causing me to remove necessary data that I need to reprocess later.
so Im going back the my main question, how can I know which eps, minPoints or other argument of dbscan can help me divide this data properly? or is there maybe a better way then what I presented here above (histogram) to remove the noise between these groups without removing necessary data?
没有正确的解决方案