Appropriate graph representation for network flow algorithms

Question 1

I would describe Ivaylo's representation as "edge-contiguous". There's also an "endpoint-contiguous" representation, which I believe from an extremely unscientific sample to be in wider use. I have implemented both at various times.

Absent hard data, my hypothesis would be that the endpoint-contiguous representation is better for the usual suspects network flow algorithms than the edge-contiguous representation, since edge-contiguous incurs a random access each time an arc is scanned, and endpoint-contiguous, each time flow is pushed on an arc (which presumably was preceded by scanning the arc). The clear upside of this representation is that it supports unsymmetric graphs (not so relevant for network flow). The clear downside of this representation is that changing the topology of the graph is much slower.

The representation consists of two arrays. first, with n+1 elements, stores the index of the first arc with the specified tail. The extra entry is m, the total number of arcs, so that the indexes of arcs with tail v are first[v] inclusive through first[v+1] exclusive. On Ivaylo's graph,

[0] = 0->1, [1] = 0->2,
[2] = 1->0, [3] = 1->2, [4] = 1->3,
[5] = 2->0, [6] = 2->1, [7] = 2->3, [8] = 2->4,
[9] = 3->1, [10] = 3->2, [11] = 3->5,
[12] = 4->2, [13] = 4->5,
[14] = 5->3, [15] = 5->4,

the array first is

0, 2, 5, 9, 12, 14, 16.

The arcs themselves are stored in an m-element array of the following structure type.

struct Arc {
    int head;
    int capacity;
    int symmetric;
};

symmetric is the index of the symmetric arc.

Question 2

The representation I use is somewhat a mixture between edges list and neighbours list. It has no official name that I know of so I will not name it. It manages to meet all the requirements above and requires only using arrays - a structure that is present in most(if not all) popular programming languages. I will be using c++ for the illustration but the code should be easily translatable to other languages. For this answer I will assume the vertices are numbered 0 to N-1 and that our graph has M edges.

The graph that we store will be directed as when dealing with network flows usually the edge and its reverse edge have different capacities(and these capacities sum up to the initial capacity of the edge).

An edge

As we are dealing with network flow algorithms each edge will have capacity(cap). In addition for each edge I will store its destination vertex(to) and another value that I will call next. We can also optionally add a source vertex, but it is not required because of the way the graph will be represented. I will assume all these values fit into an int type:

struct edge {
  // destination vertex
  int to;
  // capacity
  int cap;
  // next edge
  int next;
};

Storing the graph

I will store all edges in an array and in addition I will have one more array where I store "head" elements for the neighbours list of each vertex. I will name the array with the "head" elements first. first should be initialized with some value that is not a valid vertex number e.g. -1:

int first[N];
// in c++ we can also use memset
for (int i = 0; i < N; ++i) {
  first[i] = -1;
}

Because of the way max network flow algorithms are implemented, for each edge we should add a reverse edge with 0 capacity. For that reason the size of the array where we store the edges is in fact 2*M:

edge edges[M * 2];

Now there are two key things to the representation I suggest:

We form a (single)linked list of the neighbours of a given vertex and the index of the head(first) element of each linked list is stored in the array first.
For each edge we add its reverse edge right after it in the array of edges

Adding an element to a single linked list so there is only a small caveat in the add_edge function - we should also add the reverse edge. To simplify the code I will assume that we have a variable edges_num that represents the number of edges we have already added and I will use it as if it is a global variable. I implement an add_edge function that takes three arguments - the source vertex, the destination vertex and the capacity of the edge:

int edges_num = 0;
inline void add_edge(int from, int to, int cap) {
  edges[edges_num].to = to;
  edges[edges_num].cap = cap;
  edges[edges_num].next = first[from];
  first[from] = edges_num++;

  edges[edges_num].to = from;
  edges[edges_num].cap = 0;
  edges[edges_num].next = first[to];
  first[to] = edges_num++;
}

Note that the capacity of the reverse edge is 0 as this is usually the way it is initialized. That is pretty much all we need to store the graph using this representation.

An example

enter image description here

Let's see how the contents of the two arrays first and edges will changes:

Before adding any edge:

first:                edges:
 0  1  2  3  4  5     []
-1 -1 -1 -1 -1 -1

Let's add the edge 0 -> 2 with capacity 7. I will separate the two steps - adding the straight and the reverse edge:

first:                edges:
 0  1  2  3  4  5     [{to: 2, cap: 7, next: -1}]
 0 -1 -1 -1 -1 -1

And now the reverse edge:

first:                edges:
 0  1  2  3  4  5     [{to: 2, cap: 7, next: -1}, {to: 0, cap: 0, next: -1}]
 0 -1  1 -1 -1 -1

And now let's add 0->1 (capacity 5):

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}]
 2 -1  1 -1 -1 -1

Note that the edge with index 2 has a next value of 0 indicating that 0 is the next edge that has source 0. I will continue adding edges:

2->1 capacity 1:

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}, {1, 1, 1},
 2  5  4 -1 -1 -1      {2, 0, -1}]

And now fast forward adding 2->3(capacity 11), 2->4(capacity 8), 1->3(capacity 4), 4->5(capacity 3) and 3->5(capacity 6) in the same order:

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}, {1, 1, 1},
 2 10  8 14 12 15      {2, 0, -1}, {3, 11, 4}, {2, 0, -1}, {4, 8, 6}, {2, 0, -1},
                       {3, 4, 5}, {1, 0, 7}, {5, 3, 9}, {4, 0, -1}, {5, 6, 11}, 
                       {3, 0, 13}]

Hope this example makes it clear how the representation works.

Iterating over all neighbours

The iteration over all neighbours of a given vertex v is simple - just an iteration over a single linked list:

for (int cv = first[v]; cv != -1; cv = edges[cv].next) {
   // do something
}

It is apparent that this operation is linear over the number of neighbours.

Accessing the reverse edge

Using the fact that the reverse edge is always added right after the straight one, the formula for the index of the reverse edge is really simple. The reverse edge of edge with index e in edges is the edge with index e ^ 1. This works for both accessing the reverse of a straight edge and the reverse of a reverse edge. Again this is apparently constant and very easy to code.

Memory consumption

The required memory is O(M + N) - we have edges that is of size M*2 and first that is of size N. Of course N < M for any sensible graph so the overall memory complexity is O(M). Also the memory consumption will be much(at least twice) less than the memory consumption of the solution that uses hash maps for the neighbours list.

Summary

This graph representation implements all the required operations with the best possible complexity and also it has little memory overhead. An addition advantage of the representation is that it only uses a very basic structure that is built-in in most languages - the array. This structure may be used for other algorithms too, but the fast access to the reverse edge is particularly useful for graph algorithms.

Appropriate graph representation for network flow algorithms

Adjacency Matrix

Edges list

Neighbours list

The question