Question

We know that we can use an adjacency list or an adjacency matrix for graph algorithms. That is easy and straightforward for small graphs, but when the graph is very large, for example a social network graph, what data structure is best for implementing traditional algorithms like shortest-path finding? An adjacency matrix or list won't work because of the high memory requirements, right? What approach do social network engines use?
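To make the memory concern concrete, here is a back-of-the-envelope comparison, assuming hypothetical social-network-scale figures of roughly one billion vertices and one hundred billion edges (these numbers are illustrative, not taken from any particular site):

```python
# Hypothetical sizes for a social-network-scale graph.
n = 1_000_000_000          # vertices
m = 100_000_000_000        # edges

# Adjacency matrix: one bit per vertex pair.
matrix_bytes = n * n / 8
print(f"matrix: {matrix_bytes / 1e15:.0f} PB")   # ~125,000 TB, i.e. 125 PB

# Adjacency list: roughly 8 bytes (one 64-bit vertex id) per directed edge.
list_bytes = m * 8
print(f"list:   {list_bytes / 1e12:.1f} TB")     # ~0.8 TB
```

The matrix is quadratic in the vertex count regardless of how sparse the graph is, while the list grows only with the edge count, which is why sparse representations are the starting point at this scale.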


Solution

Adjacency lists are what the sources I have found use. For very large data sizes you might end up either holding the data on disk or using multiple machines to solve the problem, so I suggest adding keywords such as "external memory" or Hadoop to your search. Adding Hadoop turned up some material on solving single-source shortest path via parallel breadth-first search:

- http://www.cs.kent.edu/~jin/Cloud12Spring/GraphAlgorithms.pptx
- http://courses.cs.washington.edu/courses/cse490h/08au/lectures/algorithms.pdf
- "Hadoop MapReduce implementation of shortest path in a graph, not just the distance"
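The idea behind the MapReduce approach in those slides can be sketched on a single machine: each round, every vertex with a known distance emits `distance + 1` to its neighbours (the map phase), and each vertex keeps the minimum distance it received (the reduce phase); one round corresponds to one MapReduce job. The graph and function names below are illustrative, not from the sources:

```python
from collections import defaultdict

INF = float("inf")

def mapreduce_bfs(adj, source):
    """Parallel-BFS shortest path, simulated as repeated map/reduce rounds."""
    dist = {v: INF for v in adj}
    dist[source] = 0
    changed = True
    while changed:                       # one MapReduce job per round
        # Map phase: emit (neighbour, candidate distance) pairs.
        emitted = defaultdict(list)
        for v, d in dist.items():
            if d < INF:
                for w in adj[v]:
                    emitted[w].append(d + 1)
        # Reduce phase: keep the minimum candidate per vertex.
        changed = False
        for w, candidates in emitted.items():
            best = min(candidates)
            if best < dist[w]:
                dist[w] = best
                changed = True
    return dist

adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(mapreduce_bfs(adj, "a"))   # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

In a real cluster the `dist` table and the adjacency lists live in files on HDFS and each round is a full job, which is why the number of rounds (the graph diameter for BFS) matters so much for performance.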

In addition, http://researcher.watson.ibm.com/researcher/files/us-heq/Large%20Scale%20Graph%20Processing%20with%20Apache%20Giraph.pdf does not cover shortest path, but it is an interesting example of solving connected components using Giraph, a layer on top of Hadoop that may make life easier.
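The connected-components algorithm that Giraph-style (Pregel) systems typically run is label propagation: every vertex starts with its own id as its label and, in each superstep, adopts the smallest label among itself and its neighbours, halting when nothing changes. A minimal single-machine sketch (illustrative only, not Giraph's API):

```python
def connected_components(adj):
    """Label propagation: vertices in the same component converge to
    the minimum vertex id of that component."""
    label = {v: v for v in adj}
    changed = True
    while changed:                        # one superstep per iteration
        changed = False
        for v, neighbours in adj.items():
            best = min([label[v]] + [label[w] for w in neighbours])
            if best < label[v]:
                label[v] = best
                changed = True
    return label

# Two components: {0, 1, 2} and {3, 4}.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(connected_components(adj))   # {0: 0, 1: 0, 2: 0, 3: 3, 4: 3}
```

Frameworks like Giraph handle the partitioning, message passing, and superstep barriers for you, so the per-vertex logic above is essentially all the code you write.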

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow