Clustering with geolocation (lat/long pairs) attributes
02-11-2019
Question
I am trying to cluster customers by shopping behavior, using the locations where they shop (given as lat/long pairs). I also have other numeric attributes such as volume and average amount spent. I am considering HDBSCAN for the clustering, but I'm not sure whether I can feed the dataframe directly to the algorithm or whether I need to scale/normalize the data first.
Is it wise to scale the geolocation pairs? Or would important location information be lost?
Any help would be much appreciated.
https://stats.stackexchange.com/questions/89809/is-it-important-to-scale-data-before-clustering
This page explains a lot. However, in his answer, @Anony-Mousse says not to scale lat/long pairs. That's fine, but what about the other continuous variables?
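To make the question concrete, here is a minimal sketch of one possible way to treat the two kinds of columns differently. It is an illustration under assumptions, not a method taken from the linked answer: the lat/long pairs are left unscaled and compared with the haversine (great-circle) distance, while the other continuous attributes are z-scored. The row layout `(lat, lon, volume, avg_spend)`, the helper names, the sample customers, and the `geo_scale_km` weighting parameter are all hypothetical, and the way the two parts are combined is just one choice among many.

```python
import math
from statistics import mean, pstdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def zscore(values):
    """Standardize a column to zero mean and unit (population) std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def mixed_distance_matrix(rows, geo_scale_km=10.0):
    """Pairwise distances for rows of (lat, lon, volume, avg_spend).

    geo_scale_km is a hypothetical tuning knob: the number of kilometres
    treated as equivalent to one standard deviation in the other attributes.
    """
    n = len(rows)
    volume = zscore([r[2] for r in rows])
    spend = zscore([r[3] for r in rows])
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Geographic part: raw lat/long, never z-scored.
            geo = haversine_km(rows[i][0], rows[i][1],
                               rows[j][0], rows[j][1]) / geo_scale_km
            # Attribute part: Euclidean distance on standardized columns.
            attr = math.hypot(volume[i] - volume[j], spend[i] - spend[j])
            dist[i][j] = dist[j][i] = math.hypot(geo, attr)
    return dist


# Made-up sample data: two customers in New York, one in Los Angeles.
customers = [
    (40.7128, -74.0060, 12, 35.0),
    (40.7138, -74.0050, 14, 38.0),
    (34.0522, -118.2437, 13, 36.0),
]
distances = mixed_distance_matrix(customers)
```

A matrix like this could then be handed to a clustering implementation that accepts precomputed distances (the `hdbscan` package documents a `metric='precomputed'` option). Whether this kind of weighting between location and the other attributes is sensible depends on the data, which is essentially what I am asking.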
No correct solution