Choosing Clustering Method based on results

Question

What is "best"?

As some smart people noticed:

the validity of a clustering is often in the eye of the beholder

There is no objectively "better" for clustering, or you are not doing cluster analysis.

Even when a result actually is "better" on some mathematical measure such as separation, silhouette or even when using a supervised evaluation using labels - its still only better at optimizing towards some mathematical goal, not to your use case.

K-means finds a local optimal sum-of-squares assignment for a given k. (And if you increase k, there exists a better assignment!) DBSCAN (it's actually correctly spelled all uppercase) always finds the optimal density-connected components for the given MinPts/Epsilon combination. Yet, both just optimize with respect to some mathematical criterion. Unless this critertion aligns with your requirements, it is worthless. So there is no best, until you know what you need. But if you know what you need, you would not need to do cluster analysis.

So what to do?

Try different algorithms and different parameters and analyze the output with your domain knowledge, if they help you with the problem you are trying to solve. If they help you solving your problem, then they are good. If they do not help, try again.

Over time, you will collect some experience. For example, if the sum-of-squares is meaningless for your domain, don't use k-means. If your data does not have meaningful density, don't use density based clustering such as DBSCAN. It's not that these algorithms fail. They just don't solve your problem, they solve a different problem that you are not interested in. And they might be really good at solving this other problem...