Does MapR have scalable machine learning algorithms like Mahout?
28-10-2019
Question
Specifically, does MapR include k-means clustering, as Mahout does?
Solution
As far as I know, MapR is only a "faster" Hadoop distribution; it does not include any machine learning algorithms of its own.
Since it is Hadoop-compatible, your existing jobs should run on it.
But why not implement your own? K-means is very simple. See my blog post: http://codingwiththomas.blogspot.com/2011/05/k-means-clustering-with-mapreduce.html
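To show how little there is to it, here is a minimal single-machine sketch in Python of the two steps a MapReduce-style k-means repeats: the map step assigns each point to its nearest center, and the reduce step averages the points of each center. This is not the code from the blog post and it uses no Hadoop API; the function names (`map_phase`, `reduce_phase`, `kmeans`) and the sample data are made up for illustration. On a cluster, each loop iteration would typically be one MapReduce job.

```python
import math
from collections import defaultdict


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def map_phase(points, centers):
    """Map step: emit (center index, point) for every input point."""
    for p in points:
        yield nearest(p, centers), p


def reduce_phase(pairs, centers):
    """Reduce step: replace each center by the mean of its assigned points."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)  # a center with no points keeps its old position
    for idx, pts in groups.items():
        new_centers[idx] = tuple(sum(coord) / len(pts) for coord in zip(*pts))
    return new_centers


def kmeans(points, centers, max_iter=100):
    """Repeat map + reduce until the centers stop moving (or max_iter is hit)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centers), centers)
        if updated == centers:
            break
        centers = updated
    return centers


# Hypothetical toy data: two obvious clusters in 2D.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
final_centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(final_centers)
```

The initial centers are passed in explicitly to keep the sketch deterministic; in practice they are often chosen at random from the input.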
I have also implemented k-means clustering with BSP (Bulk Synchronous Parallel) on Apache Hama, which is almost ten times faster than the Mahout benchmark results in this book: http://www.manning.com/ingersoll/ (linked JIRA: https://issues.apache.org/jira/browse/MAHOUT-588). Here is the k-means benchmark with Apache Hama: http://wiki.apache.org/hama/Benchmarks
You can find the BSP implementation here: https://github.com/thomasjungblut/thomasjungblut-common/blob/master/src/de/jungblut/clustering/KMeansBSP.java