Traverse a graph in parallel

Question 1

The natural data structure for a graph is, well, a graph, i.e. a set of graph elements (nodes) which can refer to other elements. For better cache reuse, though, the elements can be allocated in an array or arrays (generally, vectors) so that neighboring elements sit as close together in memory as possible. Generally, each element or group of elements should have a mutex (spin_mutex) to protect access to it; contention means that some other thread is already working on that element, so there is no need to wait for it. If possible, though, an atomic operation on a flag/state field is preferable, since it marks the element as visited without a lock. For example, the simplest data structure can be the following:

struct object {
    vector<object*> references;
    atomic<bool> is_visited; // for simplicity, or epoch counter
                             // if nothing resets it to false
    void inspect();          // processing method
};
vector<object> objects;      // also for simplicity; if the real structure can be
                             // a flat vector, `parallel_for` is a perfect fit

Given this data structure and the way the GC work is described, it is a perfect fit for recursive parallelism such as the divide-and-conquer pattern:

void object::inspect() {
    if( ! is_visited.exchange(true) ) {
        for( object* o : references ) // alternatively it can be `parallel_for` in some variants
            cilk_spawn o->inspect(); // for Cilk or `task_group::run` for TBB or PPL
        // further processing of the object
    }
}
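For readers without Cilk or TBB at hand, the same claim-then-recurse pattern can be approximated with the standard library alone. The following is only a sketch under stated assumptions: the `spawn_depth` parameter and the returned count are not part of the original snippet. They are added here to cap the number of threads that `std::async` creates and to make the result easy to check.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

struct object {
    std::vector<object*> references;
    std::atomic<bool> is_visited{false};
    // Returns how many objects this call (and the calls it made) claimed.
    // spawn_depth is a hypothetical knob: the number of levels that still
    // fork new asynchronous tasks before falling back to plain recursion.
    std::size_t inspect(int spawn_depth);
};

std::size_t object::inspect(int spawn_depth) {
    assert(spawn_depth >= 0);
    if (is_visited.exchange(true))
        return 0;                       // another task already claimed this object
    std::size_t count = 1;              // "further processing" would go here
    std::vector<std::future<std::size_t>> spawned;
    for (object* o : references) {
        if (spawn_depth > 0) {
            // Fork: the child subtree is explored on another thread.
            spawned.push_back(std::async(std::launch::async,
                [o, spawn_depth] { return o->inspect(spawn_depth - 1); }));
        } else {
            count += o->inspect(0);     // deep levels stay sequential
        }
    }
    for (std::future<std::size_t>& f : spawned)
        count += f.get();               // join, like an implicit sync
    return count;
}
```

Because `exchange(true)` returns the previous value, exactly one caller sees `false` for any given object, so cycles and shared children are handled without a lock. Unlike a work-stealing runtime, this sketch creates one thread per spawned call and recurses sequentially below the cap, so it is meant for illustration rather than for large graphs.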

If the data structure in the question refers to how the tasks are organized, I'd recommend a work-stealing scheduler (like TBB or Cilk). There are tons of papers on this subject. To put it simply, each worker thread has its own, but shared, deque of tasks, and when its deque is empty, a thread steals tasks from others' deques.

The scalability comes from the property that each task can add other tasks which can run in parallel.
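To make the work-stealing idea concrete, here is a deliberately naive sketch built on `std::thread`. It is not how TBB or Cilk are implemented: the names `Node`, `WorkDeque`, and `traverse` are hypothetical, each deque is guarded by a plain mutex, and idle workers simply spin with `yield`. It only illustrates the shape of the scheme: the owner takes from one end of its deque, thieves take from the other, and every processed task may enqueue new ones.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

struct Node {
    std::vector<std::size_t> references;   // indices of neighbouring nodes
    std::atomic<bool> is_visited{false};
};

// One deque per worker; a mutex keeps the sketch simple.
class WorkDeque {
    std::deque<std::size_t> tasks_;
    std::mutex mutex_;
public:
    void push(std::size_t task) {
        std::lock_guard<std::mutex> guard(mutex_);
        tasks_.push_back(task);
    }
    // The owner takes the newest task (back); a thief takes the oldest (front).
    bool take(std::size_t& task, bool stealing) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (tasks_.empty()) return false;
        if (stealing) { task = tasks_.front(); tasks_.pop_front(); }
        else          { task = tasks_.back();  tasks_.pop_back();  }
        return true;
    }
};

// Visits every node reachable from root and returns how many were visited.
std::size_t traverse(std::vector<Node>& graph, std::size_t root, unsigned workers) {
    assert(workers > 0 && root < graph.size());
    std::vector<WorkDeque> deques(workers);
    std::atomic<std::size_t> pending{0};   // tasks that are queued or running
    std::atomic<std::size_t> visited{0};
    if (!graph[root].is_visited.exchange(true)) {
        pending = 1;
        deques[0].push(root);
    }
    auto run = [&](unsigned me) {
        while (pending.load() != 0) {
            std::size_t task = 0;
            bool got = deques[me].take(task, false);
            for (unsigned k = 1; !got && k < workers; ++k)
                got = deques[(me + k) % workers].take(task, true);   // steal
            if (!got) { std::this_thread::yield(); continue; }
            ++visited;                     // stand-in for real processing
            for (std::size_t next : graph[task].references) {
                if (!graph[next].is_visited.exchange(true)) {
                    ++pending;             // count the child before queuing it
                    deques[me].push(next);
                }
            }
            --pending;                     // this task is finished
        }
    };
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(run, w);
    for (std::thread& t : pool) t.join();
    return visited.load();
}
```

The `pending` counter is incremented before a child is queued and decremented only after its parent has queued all of its children, so it can reach zero only when no work is left anywhere. That is what lets the workers terminate without a separate barrier.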

Question 2

Your questions:

Think of any_old_process that has to traverse a graph and do some work on the objects it finds, including adding more work.
... what data structure can be parallelised to achieve the goals set out in the question?

Quoted questions:

Some stuff about garbage collection.

Since you are specifically interested in parallelizing graph algorithms, I'll give an example of one kind of graph traversal that can be parallelized well.

Executive Summary

Finding local minima ("basins") or maxima ("peaks") is a useful operation in digital image processing. A concrete example is geological watershed analysis. One approach to the problem treats each pixel or small group of pixels in the image as a node and finds non-overlapping minimum spanning trees (MST) with the local minima as the tree roots.

Gory details

Below is a simplistic example. It's a web interview question from Palantir Technologies brought to Programming Puzzles & Code Golf by AnkitSablok. It's simplified by two assumptions (bolded below):

That a pixel/cell only has 4 neighbors instead of the usual eight.
That a cell either has all uphill neighbors (it's a local minimum) or has a unique downhill neighbor, i.e., plains aren't allowed.

Below that is some JavaScript that solves this problem. It violates every reasonable coding standard against use of side-effects, but illustrates where some of the opportunities for parallelization exist.

In the "Create list of sinks (i.e. roots)" loop, note that each cell can be evaluated completely independently for elevation with respect to its neighbors, as long as the elevation data is static. In a sequential program, one thread of execution examines each cell. In a parallel program, the cells are divvied up so that one, and only one, thread reads and writes the local minima state information for a given cell (sink[] in the program below). If the list of minima/roots is generated in parallel, the queuing operations for the stack have to be synchronized. For a discussion of how to do that for stacks and other queues, see "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms", Michael & Scott, 1996. For modern updates, follow the citation tree on Google Scholar (no mutex required :).
In the "Each root explores it's basin" loop, note that each basin could be explored/enumerated/flooded in parallel, since no cell belongs to more than one basin.
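Both opportunities can be sketched in C++ with plain `std::thread`. Everything below is an illustrative assumption rather than a port of the JavaScript program: the names `downhillOf`, `runOnWorkers`, and `basinSizes` are hypothetical, rows are dealt out round-robin, and each worker collects sinks in its own list that is merged after the join. That last choice sidesteps the synchronized stack mentioned above instead of implementing it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Flat index of the lowest strictly-lower 4-neighbour of (r, c), or of
// (r, c) itself when the cell is a sink. Only reads the elevation data.
std::size_t downhillOf(const Grid& h, std::size_t r, std::size_t c) {
    const std::size_t n = h.size();
    std::size_t bestR = r, bestC = c;
    auto visit = [&](std::size_t nr, std::size_t nc) {
        if (h[nr][nc] < h[bestR][bestC]) { bestR = nr; bestC = nc; }
    };
    if (r > 0)     visit(r - 1, c);
    if (r + 1 < n) visit(r + 1, c);
    if (c > 0)     visit(r, c - 1);
    if (c + 1 < n) visit(r, c + 1);
    return bestR * n + bestC;
}

// Runs body(w) for w = 0..workers-1, one thread each, and waits for all.
void runOnWorkers(unsigned workers, const std::function<void(unsigned)>& body) {
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(body, w);
    for (std::thread& t : pool) t.join();
}

// Basin sizes of a square elevation map, in descending order.
std::vector<std::size_t> basinSizes(const Grid& h, unsigned workers) {
    assert(workers > 0);
    const std::size_t n = h.size();
    std::vector<std::size_t> down(n * n);
    std::vector<std::vector<std::size_t>> sinksPerWorker(workers);

    // Phase 1: each worker owns whole rows, so it writes only its own
    // entries of `down` and its own sink list.
    runOnWorkers(workers, [&](unsigned w) {
        for (std::size_t r = w; r < n; r += workers)
            for (std::size_t c = 0; c < n; ++c) {
                const std::size_t self = r * n + c;
                down[self] = downhillOf(h, r, c);
                if (down[self] == self) sinksPerWorker[w].push_back(self);
            }
    });
    std::vector<std::size_t> sinks;
    for (const std::vector<std::size_t>& s : sinksPerWorker)
        sinks.insert(sinks.end(), s.begin(), s.end());

    // Phase 2: one flood per sink. `down` is now read-only, and each
    // worker writes only the size slots of the sinks it was dealt.
    std::vector<std::size_t> sizes(sinks.size(), 0);
    runOnWorkers(workers, [&](unsigned w) {
        for (std::size_t i = w; i < sinks.size(); i += workers) {
            std::vector<std::size_t> stack{sinks[i]};
            while (!stack.empty()) {
                const std::size_t cell = stack.back();
                stack.pop_back();
                ++sizes[i];
                const std::size_t r = cell / n, c = cell % n;
                auto visit = [&](std::size_t nr, std::size_t nc) {
                    // A neighbour joins the basin if it drains into this cell.
                    if (down[nr * n + nc] == cell) stack.push_back(nr * n + nc);
                };
                if (r > 0)     visit(r - 1, c);
                if (r + 1 < n) visit(r + 1, c);
                if (c > 0)     visit(r, c - 1);
                if (c + 1 < n) visit(r, c + 1);
            }
        }
    });
    std::sort(sizes.begin(), sizes.end(), std::greater<std::size_t>());
    return sizes;
}
```

Since every non-sink cell has exactly one downhill pointer, the floods never touch the same cell, which is why phase 2 needs no visited flags or locks. A real implementation would probably balance the load better, because basins can differ wildly in size.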

If you want to dive deeper into parallelizing MSTs, see "Scalable Parallel Minimum Spanning Forest Computation", Nobari, Cao, Karras, Bressan, 2012. The first two pages contain a clear and concise survey of the field.

Simplified example

A group of farmers has some elevation data, and we’re going to help them understand how rainfall flows over their farmland. We’ll represent the land as a two-dimensional array of altitudes and use the following model, based on the idea that water flows downhill:

If a cell’s four neighboring cells all have higher altitudes, we call this cell a sink; water collects in sinks. Otherwise, water will flow to the neighboring cell with the lowest altitude. If a cell is not a sink, you may assume it has a unique lowest neighbor and that this neighbor will be lower than the cell.

Cells that drain into the same sink – directly or indirectly – are said to be part of the same basin.

Your challenge is to partition the map into basins. In particular, given a map of elevations, your code should partition the map into basins and output the sizes of the basins, in descending order.

Assume the elevation maps are square. Input will begin with a line with one integer, S, the height (and width) of the map. The next S lines will each contain a row of the map, each with S integers – the elevations of the S cells in the row. Some farmers have small land plots such as the examples below, while some have larger plots. However, in no case will a farmer have a plot of land larger than S = 5000.

Your code should output a space-separated list of the basin sizes, in descending order. (Trailing spaces are ignored.)

Here's an example:

Input:
5
1 0 2 5 8
2 3 4 7 9
3 5 7 8 9
1 2 5 4 2
3 3 5 2 1

Output:  11 7 7

The basins, labeled with A’s, B’s, and C’s, are:
A A A A A
A A A A A
B B A C C
B B B C C
B B C C C

// lm.js - find the local minima


//  Globalization of variables.

/*
    The map is a 2 dimensional array. Indices for the elements map as:

    [0,0] ... [0,n]
    ...
    [n,0] ... [n,n]

Each element of the array is a structure. The structure for each element is:

Item    Purpose             Range                   Comment
----    -------             -----                   -------
h       Height of cell      integers
s       Is it a sink?       boolean
x       X of downhill cell  (0..maxIndex)           if s is true, x&y point to self
y       Y of downhill cell  (0..maxIndex)
b       Basin name          ('A'..'A'+# of basins)

Use a separate array-of-arrays for each structure item. The index range is
0..maxIndex.
*/
var height = [];
var sink = [];
var downhillX = [];
var downhillY = [];
var basin = [];
var maxIndex;

//  A list of sinks in the map. Each element is an array of [ x, y ], where
// both x & y are in the range 0..maxIndex.
var basinList = [];

//  An unordered list of basin sizes.
var basinSize = [];


//  Functions.

function isSink(x,y) {
    var myHeight = height[x][y];
    var imaSink = true;
    var bestDownhillHeight = myHeight;
    var bestDownhillX = x;
    var bestDownhillY = y;

    /*
        Visit the neighbors. If this cell is the lowest, then it's the
    sink. If not, find the steepest downhill direction.
    */
    function visit(deltaX,deltaY) {
        var neighborX = x+deltaX;
        var neighborY = y+deltaY;
        if (myHeight > height[neighborX][neighborY]) {
            imaSink = false;
            if (bestDownhillHeight > height[neighborX][neighborY]) {
                bestDownhillHeight = height[neighborX][neighborY];
                bestDownhillX = neighborX;
                bestDownhillY = neighborY;
            }
        }
    }
    if (x !== 0) {
        // upwards neighbor exists
        visit(-1,0);
    }
    if (x !== maxIndex) {
        // downwards neighbor exists
        visit(1,0);
    }
    if (y !== 0) {
        // left-hand neighbor exists
        visit(0,-1);
    }
    if (y !== maxIndex) {
        // right-hand neighbor exists
        visit(0,1);
    }

    downhillX[x][y] = bestDownhillX;
    downhillY[x][y] = bestDownhillY;
    return imaSink;
}

function exploreBasin(x,y,currentSize,basinName) {
    //  This cell is in the basin.
    basin[x][y] = basinName;
    currentSize++;

    /*
        Visit all neighbors that have this cell as the best downhill
    path and add them to the basin.
    */
    function visit(x,deltaX,y,deltaY) {
        if ((downhillX[x+deltaX][y+deltaY] === x) && (downhillY[x+deltaX][y+deltaY] === y)) {
            currentSize = exploreBasin(x+deltaX,y+deltaY,currentSize,basinName);
        }
        return 0;
    }
    if (x !== 0) {
        // upwards neighbor exists
        visit(x,-1,y,0);
    }
    if (x !== maxIndex) {
        // downwards neighbor exists
        visit(x,1,y,0);
    }
    if (y !== 0) {
        // left-hand neighbor exists
        visit(x,0,y,-1);
    }
    if (y !== maxIndex) {
        // right-hand neighbor exists
        visit(x,0,y,1);
    }

    return currentSize;
}

//  Read map from file (1st argument).
var lines = $EXEC('cat "' + $ARG[0] + '"').split('\n');
maxIndex = lines.shift() - 1;
for (var i = 0; i<=maxIndex; i++) {
    height[i] = lines.shift().trim().split(/\s+/).map(Number); // numbers, so heights compare numerically rather than as strings
    //  Create all other 2D arrays.
    sink[i] = [];
    downhillX[i] = [];
    downhillY[i] = [];
    basin[i] = [];
}
for (var i = 0; i<=maxIndex; i++) { print(height[i]); }

//  Everyone decides if they are a sink. Create list of sinks (i.e. roots).
for (var x=0; x<=maxIndex; x++) {
    for (var y=0; y<=maxIndex; y++) {
        if (sink[x][y] = isSink(x,y)) {
            //  This node is a root (AKA sink).
            basinList.push([x,y]);
        }
    }
}
//for (var i = 0; i<=maxIndex; i++) { print(sink[i]); }

//  Each root explores it's basin.
var basinName = 'A';
for (var i=basinList.length-1; i>=0; --i) { // i-- makes Closure Compiler sad
    var x = basinList[i][0];
    var y = basinList[i][1];
    basinSize.push(exploreBasin(x,y,0,basinName));
    basinName = String.fromCharCode(basinName.charCodeAt() + 1);
}
for (var i = 0; i<=maxIndex; i++) { print(basin[i]); }

//  Done.
print(basinSize.sort(function(a, b){return b-a}).join(' '));