Question

I am given a string of characters, in which every consequent pair of characters comprises an edge. What I mean by that is this is the string: ABBCAD. Edges of the string are:

A->B
B->C
A->D

Shortest path distance is A->D

The task at hand is to build up a Directed Acyclic Graph in memory from the string using the above rule and find the shortest path staring at the root node(in the example given it's A label) ending at terminal node.

NJKUUGHBNNJHYAPOYJHNRMNIKAIILFGJSNAICZQRNM

I gather one of the approaches that suites the task is to use the Depth First Search algo.

This is not homework...

Was it helpful?

Solution

This is a job for Djikstra's Algorithm. Once you build a representation of your graph it should be easy enough to produce the lowest cost traversal ... since in your case it seems that all paths have an equal cost (1).

You can look here on CodeProject for a reasonably decent implementation of Djikstra in C#.

could you present me with a pseudo code of your version of the graph representation for this case?

There are multiple ways to represent a graph in such a problem. If the number of vertices in your graph are likely to be small - it would be sufficient to use an adjacency matrix for the representation of edges. In which case, you could just use a .NET multidimensional array. The trick here is that you need to convert vertices labelled with characters to ordinal values. One approach would be to write a wrapper class that maps character codes to indices into the adjacency matrix:

class AdjacencyMatrix
{
    // representation of the adjacency matrix (AM)
    private readonly int[,] m_Matrix;
    // mapping of character codes to indices in the AM
    private readonly Dictionary<char,int> m_Mapping;

    public AdjacencyMatrix( string edgeVector )
    {
        // using LINQ we can identify and order the distinct characters
        char[] distinctChars = edgeVector.Distinct().OrderBy(x => x);

        m_Mapping = distinctChars.Select((c,i)=>new { Vertex = c, Index = i })
                                 .ToDictionary(x=>x.Vertex, x=>x.Index);

        // build the adjacency cost matrix here...
        // TODO: I leave this up to the reader to complete ... :-D
    }

    // retrieves an entry from within the adjacency matrix given two characters
    public int this[char v1, char v2]
    {
        get { return m_Matrix[m_Mapping[v1], m_Mapping[v2]];
        private set { m_Matrix[m_Mapping[v1], m_Mapping[v2]] = value; }
    }
}

OTHER TIPS

For this particular case, Dijkstra's could be greatly simplified. I imagine something like this would do the trick.

class Node {
    public object Value;

    // Connected nodes (directed)
    public Node[] Connections;

    // Path back to the start
    public Node Route;
}

Node BreadthFirstRoute(Node[] theStarts, Node[] theEnds) {
    Set<Node> visited = new Set<Node>();

    Set<Node> frontier = new Set<Node>();
    frontier.AddRange(theStarts);

    Set<Node> ends = new Set<Node>();
    ends.AddRange(theEnds);

    // Visit nodes one frontier at a time - Breadth first.
    while (frontier.Count > 0) {

        // Move frontier into visiting, reset frontier.
        Set<Node> visiting = frontier;
        frontier = new Set<Node>();

        // Prevent adding nodes being visited to frontier
        visited.AddRange(visiting);

        // Add all connected nodes to frontier.
        foreach (Node node in visiting) {               
            foreach (Node other in node.Connections) {
                if (!visited.Contains(other)) {
                    other.Route = other;
                    if (ends.Contains(other)) {
                        return other;
                    }
                    frontier.Add(other);
                }
            }
        }
    }

    return null;
}

Just for the record. This is your graph example:

alt text

Where the shortest path from A to M is marked in blue.

It is a very short program in Mathematica:

a = StringSplit["NJKUUGHBNNJHYAPOYJHNRMNIKAIILFGJSNAICZQRNM", ""]
b = Table[a[[i]] -> a[[i + 1]], {i, Length@a - 1}]
vertexNumber[g_, vName_] := Position[ Vertices[g, All], vName][[1, 1]];
Needs["Combinatorica`"]  

c     = ToCombinatoricaGraph[b]
sp    = ShortestPath[c, vertexNumber[c, "A"], vertexNumber[c, "M"]]
vlsp  = GetVertexLabels[c, sp]
vlspt = Table[{vlsp[[i]], vlsp[[i + 1]]}, {i, Length@vlsp - 1}] 

GraphPlot[b, VertexLabeling -> True, ImageSize -> 250, 
         DirectedEdges -> True, Method -> {"SpringEmbedding"}, 
         EdgeRenderingFunction ->
           (If[Cases[vlspt, {First[#2], Last[#2]}] == {}, 
                {Red, Arrowheads[Medium], Arrow[#1, .1]}, 
                {Blue, Arrowheads[Medium], Arrow[#1, .1]}] &)]

LBushkin's answer is correct. Eric Lippert has a series about implementing the A* algorithm in C#. A* is a more general case of Dijkstra's algorithm : if your cost estimation function always returns 0, it is exactly equivalent to Dijkstra.

Another option would be to use a graph library that has various shortest path algorithms implemented. One I have used in the past and found to be good is QuickGraph. The documentation is quite good. For example, here is a page in the documentation that explains how to use Dijkstra's algorithm.

Because your graph is acyclic the Viterbi algorithm can be used, visit the states in topological order and update the costs in preceding vertices (states).

The below code implements the search algorithm and data structure. I wasn't 100% sure of the definition of the valid based on the original question. But it should be straight forward to modify the code to construct other topologies and solve various dynamic programming task using a DAG.

Changing the outer for loop when computing the state potentials to while loop with a queue will allow for different shortestpath algorithms to used easily changed by changing the queue discipline. For example a binary heap based queue will give Dijkstra algorithm or a FIFO queue will Bellman-Ford algorithm.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace DAGShortestPath
{
  class Arc
  {
    public Arc(int nextstate, float cost)
    {
      Nextstate = nextstate;
      Cost = cost;
    }
    public int Nextstate { get; set; }
    public float Cost { get; set; }
  };

  class State
  {
    public bool Final { get; set; }
    public List<Arc> Arcs { get; set; }

    public void AddArc(int nextstate, float cost)
    {
      if (Arcs == null)
        Arcs = new List<Arc>();
      Arcs.Add(new Arc(nextstate, cost));
    }
  }
  class Graph
  {
    List< State > _states  = new List< State >();
    int _start = -1;

    public void AddArc(int state, int nextstate, float cost)
    {
      while (_states.Count <= state)
        AddState();
      while (_states.Count <= nextstate)
        AddState();
      _states[state].AddArc(nextstate, cost);
    }

    public List<Arc> Arcs(int state)
    {
      return _states[state].Arcs;
    }

    public int AddState()
    {
      _states.Add(new State());
      return _states.Count - 1;
    }

    public bool IsFinal(int state)
    {
      return _states[state].Final;
    }

    public void SetFinal(int state)
    {
      _states[state].Final = true;
    }

    public void SetStart(int start)
    {
      _start = -1;
    }

    public int NumStates { get { return _states.Count; } }

    public void Print()
    {
      for (int i = 0; i < _states.Count; i++)
      { 
        var arcs = _states[i].Arcs;
        for (int j = 0; j < arcs.Count; j++)
        {
          var arc = arcs[j];          
          Console.WriteLine("{0}\t{1}\t{2}", i, j, arc.Cost);
        }
      }
    }
  }

  class Program
  {
    static List<int> ShortertPath(Graph graph)
    {
      float[] d = new float[graph.NumStates];
      int[] tb = new int[graph.NumStates];
      for (int i = 0; i < d.Length; i++)
      {
        d[i] = float.PositiveInfinity;
        tb[i] = -1;
      }      
      d[0] = 0.0f;
      float bestscore = float.PositiveInfinity;
      int beststate = -1;

      for (int i = 0; i < graph.NumStates; i++)
      {
        if (graph.Arcs(i) != null)
        {
          foreach (var arc in graph.Arcs(i))
          {
            if (arc.Nextstate < i)
              throw new Exception("Graph is not topologically sorted");

            float e = d[i] + arc.Cost;
            if (e < d[arc.Nextstate])
            {
              d[arc.Nextstate] = e;
              tb[arc.Nextstate] = i;

              if (graph.IsFinal(arc.Nextstate) && e < bestscore)
              {
                bestscore = e;
                beststate = arc.Nextstate;
              }
            }
          }
        }
      }

      //Traceback and recover the best final sequence

      if (bestscore == float.NegativeInfinity)
        throw new Exception("No valid terminal state found");
      Console.WriteLine("Best state {0} and score {1}", beststate, bestscore);
      List<int> best = new List<int>();
      while (beststate != -1)
      {
        best.Add(beststate);
        beststate = tb[beststate];
      }

      return best;
    }

    static void Main(string[] args)
    {
      Graph g = new Graph();
      String input = "ABBCAD";      
      for (int i = 0; i < input.Length - 1; i++)
        for (int j = i + 1; j < input.Length; j++)
        {
          //Modify this of different constraints on-the arcs
          //or graph topologies
          //if (input[i] < input[j])
          g.AddArc(i, j, 1.0f);
        }
      g.SetStart(0);
      //Modify this to make all states final for example
      //To compute longest sub-sequences and so on...
      g.SetFinal(g.NumStates - 1);
      var bestpath = ShortertPath(g);

      //Print the best state sequence in reverse
      foreach (var v in bestpath)
      {
        Console.WriteLine(v);        
      }
    }
  }
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top