Use a Graph Library/Node Network Library or Write My Own?

https://stackoverflow.com/questions/1499878

19-09-2019
|

Question

I'm trying to decide between going with a pre-made graph/node network library or to roll my own.

I'm implementing some graph search algorithms which might require some significant customization to the class structure of the node and/or edges.

The reason I'm not sure what to do is that I'm unsure if customization of a pre-made might be more expensive/trouble than making my own in the first place. I'm also curious, but less so, of the performance trade-offs.

Is there anyone out there have direct experience with using one of the libraries out there and have advice based on a success or failure story? I want to hear the worst so that whatever I chose, I know what I'm getting into.

There are only two that I've found in my search so far: The Boost Graph Library (BGL) and GOBLIN. Specific advice on either of these, or suggestions for others is greatly appreciated as well. BGL seems pretty damn arcane. Is it worth struggling through?

Solution

I can perhaps provide a little guidance on the BGL.

The library is very flexible. The cost of this is that the syntax can be very baroque, in order to accommodate all the possibilities. However, it is sufficiently flexible that simple things can be done simply.

Unfortunately the boost documentation goes at things full tilt, providing a description only of the full complexity, without a hint of how simple things can be.

( "Any sufficiently advanced technology is indistinguishable from magic" - Arthur C. Clarke. What he should have said is "Any advanced technology, sufficiently badly documented, is indistinguishable from magic )

Consider:

typedef property_map<Graph, vertex_index_t>::type IndexMap;
IndexMap index = get(vertex_index, g);
typedef graph_traits<Graph>::vertex_iterator vertex_iter;
std::pair<vertex_iter, vertex_iter> vp;
for (vp = vertices(g); vp.first != vp.second; ++vp.first) {
   std::cout << index[*vp.first] <<  " ";
}

This is how the "quick tour" suggests we print out a list of the graph vertices. However, a little study shows that a vertex is no more than an integer index, and the code can be greatly simplified to

for (int v = *vertices(g).first; v != *vertices(g).second; ++v)
    std::cout << v << " ";

Perhaps there are some magical things that cannot be achieved with this simplified code, but for every day use it reasonable to drastically prune the syntax that encrust BGL so you can see what your are coding.

Sometimes the elaborate syntax cannot be removed. ( Or perhaps I have just not noticed the underlying truth ). Then I usually use a little utility function to encapsulate the complexity abd keep it away from the algorithm I am working on.

For example, I often need to loop over the children of a vertex

vector<int> getVertexChildren( int v )
{
    vector<int> vc;
    typedef std::pair<graph_traits<graph_t>::out_edge_iterator, graph_traits<graph_t>::out_edge_iterator> out_edge_iter_pair_t;
    for( out_edge_iter_pair_t ep = out_edges(v,m_tree);
        ep.first != ep.second; ++(ep.first))
    {
        vc.push_back( target( *ep.first, m_tree ) );
    }
    return vc;
}
#define FOR_ALL_CHILDREN( v ) vector<int> vc=getVertexChildren(v); BOOST_FOR_EACH( int child, vc )

The bottom line is: go ahead and use BGL. It can be simplified to do simple things, but once you have learned to use it, all the immense flexibility will be available whenever you do need it.

OTHER TIPS

“require some significant customization to the class structure of the node and/or edges.”

Have you looked at the “bundled properties” feature of the BGL?

http://www.boost.org/doc/libs/1_40_0/libs/graph/doc/bundles.html

This is both powerful and not the least bit acrcane.

You define a classes for your edges and your vertices

class cMyVertex
{
…
};
class cMyEdge
{
…
};

Now you define a graph type which will use your classes

typedef boost::adjacency_list<
    boost::listS, boost::vecS, boost::bidirectionalS,
    cMyVertex, cMyEdge > graph_t;

Now you can create graphs of this type that will use your classes, whatever they are, for the vertices and edges.

You can access the attributes and methods of your classes in a perfectly natural way

Graph_t g(10)       // create a graph with 10 vertices
g[1].setA1( “alpha” );  // invoke an accessor on node #1

I recently gave the boost graph library a trial for Dijkstras shortest path problem. Results:

Very High performance
Very little code needed
Very flexible and extensible
Hard to understand or debug

Advice: Use it but before you do so read The Boost Graph Library: User Guide and Reference Manual

I think if you can leverage the graph traversal algorithms, it is worth using the Boost Graph. Creating and using custom vertices is pretty easy. The examples included with BGL were useful for learning how to use it.

I agree with Clifford, I've used GraphViz to help visualize the graph during development and found it very useful.

You may also take a try with LEMON graph library.

I could use it to perform Dijsktra shortest path search after reading the graph from a configuration file.

A new version has just come out.

This really isn't a concise answer, its more of adding to what has already been said.

The solution to this problem depends on the context of the problem. If you wish to get a great understanding how graph algorithms, and software works. Write your own. If you want to get an advanced knowledge of graph algorithms and structures or to implement them into your own program then learn a standard graph library. (See the reasons for using reusable code)

The best of both worlds. Do both. If you are stretched for time, get a book on this or follow tutorials and the examples.

Another suggestion: Ask a new pointed question about what is the {best, most reliable, easiest to learn} graph library for {describe a very general problem}? This will help you choose what to learn, rather than trying at random to find the best one that suits your needs. Someone has already seen this problem, ask for advice.

Using Reusable Code:

Code that someone else has written including test cases will generally be more reliable than your own.
Fixes are easer to implement (with updates to the component you are reusing), this is true in Boost (since they do frequent updates, and that Boost is peer reviewed).
Once you learn one framework you can apply that to other applications... who knows you might get a job later in life using the framework. Companies like reusing rather than reinventing the wheel.

I rolled my own. You shouldn't follow that example, unless you're absolutely certain you need to.

Still, it's a great learning experience, if you're graph theory is rusty.

I had lots of "fun" with combining digraphs using EBNF operators, and doing epsilon elimination and Hopcroft-based minimisation. Especially with the minimisation, if you can call desperately trying to find good information and figure out how it works "fun" :-(

If you do roll your own, I recommend separating high level operations from the low level digraph data structures - different classes, not just different methods. I didn't - and pretty soon I'll be refactoring heavily because of that poor decision.

Some graphs annotate nodes, some annotate edges, some both. Sometimes two annotations are useful, even if they're just foreign keys for some external data. For example in finite state machines, an edge may be annotated with an input and an output. You're likely to be interested in determinism WRT the input but not the output, so having them both hidden behind a single ID is a pain - and that's besides the whole issue of secondary containers for the what-does-this-ID-refer-to lookups.

Also, sometimes you just want to work with things that weren't designed explicitly as a digraph IYSWIM - e.g. a bunch of data structure nodes that link to each other.

IOW, it's useful to be able to plug in different digraph representations, yet still use the same high-level tools for many operations.

IMO I got it right when I wrote a 'tree tool' class which does things like iterating and subscripting in treeview order and XML element tag order. The underlying tree representation is pluggable (policy based template). With digraphs, I messed up.

Anyway, I don't know what Boost or any other graph library actually provides, but if it provides what you need I say use it. It'll save a lot of headaches.

Before deciding to build your own graph library, I would have a good look at igraph (http://igraph.sourceforge.net/). It is written in C and is extremely fast and offers a wider range of basic and advanced graph algorithms (including visualization). In addition, it has a python wrapper for fast coding so I think this solution combines speed of coding and speed of execution.

I use the BGL very much, but what disturbs me with BGL is the lack of basic algorithms like edge and vertex connectivity, min cost flow and, general maximum weight perfect matching, just to name those which I miss the most.

LEMON offers all of that and also has a simpler syntax, what bugs me with LEMON is the installation and compilation problems on WINDOWS platforms, but I will probably switch to LEMON despite those problems.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow