Question

Back in 2000 when I was a student I took a course in automata theory. In the exercises to this course we essentially re-programmed a unix command-line tool called Grail (http://www.csd.uwo.ca/Research/grail/). Grail allows you to read in files with regular expressions or deterministic/nondeterministic finite state machines and apply the typical theoretical operations on them: minimizing the FSM, checking for emptiness, reversing, product of FSMs, FSM to RegEx and RegEx to FSMs, applying an input string and simulating the machine and so on.

Grail seems to be available but apparently has not been developed since 2002. Therefore my question: does anybody know about similar tools that are still under active development? (i.e., the modern Grail?) What is used in classes today?

What I am looking for is a command-line tool that reads FSMs or RegExes from stdin, applies an operation, and outputs the result to stdout, the Unix way, so that you can create your own pipes. Simple FSMs and RegExes are enough, so things like pushdown automatas or Büchi automatas are not really necessary.

If no command-line tool, are there good libraries or graphical tools out there?

Thanks!

Was it helpful?

Solution

OpenFst is a set of command line tools for manipulating Finite state Transducers, there is a companion grammar toolkit OpenGrm. The machines can be visualized with GraphViz.

foma is another transducer toolkit with regex support.

OTHER TIPS

The Vaucanson platform, like OpenFst, provides a command line tool, called TAFKit that does exactly what you are looking for: it reads FSM or RegExp from stdin or from stored files for already computed objects, performs operations on them (about 70 functions available), and outputs the result on stdout or in files for further treatment. You can pipe, and create shell scripts.

For downloading Vaucanson 1.4.1, go to

http://www.lrde.epita.fr/cgi-bin/twiki/view/Vaucanson/Vaucanson141

You'll find there a rather complete user's manual.

For further information, do not hesitate to mail me.

                   Jacques Sakarovitch
                   sakarovitch@enst.fr

jFlap is a similar program written in Java. The last update was May 2011

Take a look at the tokenize and lookup command line tools at http://www.stanford.edu/~laurik/fsmbook/home.html. These are designed for tokenizing text (dividing it into words) and looking up morphological readings of words, but are more generally applicable.

For the Thrax language built on OpenFst, there is a command line tool usable as follows:

cat input1 bab babbab baba

cat input1 | thraxrewrite-tester --far=testrewrite.far --rule=R1 | cut -f5 -d' ' bib bibbib biba

R1 substitutes 'i' for 'a' in the context b _ b.

For java, I personally use the http://www.brics.dk/automaton/ which is not really under active development anymore, but it is still widely used, very stable and pretty quick. One could tranform a regular expression into an automaton and apply different operstions such as concatenation, intersection, union, etc.. The only missing bit, however, was the backward translation of an automaton to a regular expression string.

I have implemented a library on top of dk.brics that allows you to derive the regular expression string from their automaton representation by means of state elimination. By using this library, it should be possible to develop the command line tool you've described above.

If you are interested, you could have a look at https://github.com/julianthome/autorex .

Best wishes and kind regards, Julian

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top