Question

GraphX is the Apache Spark library for handling graph data. I was able to find a list of 'graph-parallel' algorithms on these slides (see slide 23). However, I am curious what characteristics of these algorithms make them parallelizable.


Solution

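Two words: associative and commutative.

In other words, the operations the algorithm performs must give the same result no matter how you order your data (commutativity) or group it (associativity). Each worker can then process its own partition independently, and the partial results can be merged in any order. This minimizes the need for cross-talk between workers and makes the algorithm more efficient.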
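
To make this concrete, here is a minimal sketch in plain Python (not GraphX code) that imitates a distributed reduce: each partition is reduced on its own, and then the partial results are combined. The helper `reduce_partitioned` and the sample partitions are hypothetical names made up for illustration. Addition and `max` are both associative and commutative, so the answer does not depend on how the data is split; subtraction is neither, so the answer changes with the partitioning.

```python
from functools import reduce
from operator import add, sub


def reduce_partitioned(partitions, op):
    """Reduce each partition independently, then combine the partial results.

    Hypothetical helper: it mimics workers reducing their own slice of the
    data before a final merge step.
    """
    partials = [reduce(op, part) for part in partitions]
    return reduce(op, partials)


# The same four values, split and ordered in three different ways.
one_partition = [[1, 2, 3, 4]]
two_partitions = [[1, 2], [3, 4]]
reordered = [[4, 3], [2, 1]]
layouts = [one_partition, two_partitions, reordered]

# Associative and commutative: every layout gives the same answer.
add_results = [reduce_partitioned(p, add) for p in layouts]
max_results = [reduce_partitioned(p, max) for p in layouts]

# Neither associative nor commutative: the answer depends on the layout.
sub_results = [reduce_partitioned(p, sub) for p in layouts]

print(add_results)  # [10, 10, 10]
print(max_results)  # [4, 4, 4]
print(sub_results)  # [-8, 0, 0]
```

This is the same property GraphX asks for in practice: the message-merging function passed to its Pregel API is documented as needing to be commutative and associative, since messages destined for a vertex may be combined in any order and grouping.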

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange