Visualize LALR grammar

https://stackoverflow.com/questions/8154790

02-03-2021

Question

I'd like to visualize a grammar file (actually the Jison grammar for coffee-script). So the input file is a grammar file of Bison/Yacc style. The expected output could be a Graphviz dot file or something similar.

I'm not necessarily looking for a complete IDE like GOLD, but the tool must be able to handle LALR input, which is why the excellent ANTLRWorks is not an option.

I also checked the comparison of parsers on Wikipedia, but it only covers IDE support, not visualization.

This is the coffeescript grammar file I actually want to visualize.

Solution

Here are the instructions for creating a syntax diagram.

The content of grammar.coffee is executable code, which must be run to obtain the actual Jison grammar. I used the Try CoffeeScript page to compile it, after replacing the Jison call with a JavaScript alert. I then ran the resulting JavaScript to obtain the grammar, which looks like this:

{
  "tokens":" TERMINATOR TERMINATOR TERMINATOR STATEMENT INDENT OUTDENT INDENT OUTDENT IDENTIFIER NUMBER STRING JS REGEX BOOL = = INDENT OUTDENT : : INDENT OUTDENT RETURN RETURN HERECOMMENT PARAM_START PARAM_END -> =>  ,  , ... = ... . ?. :: :: INDEX_START INDEX_END INDEX_SOAK { }  , TERMINATOR INDENT OUTDENT CLASS CLASS CLASS EXTENDS CLASS EXTENDS CLASS CLASS CLASS EXTENDS CLASS EXTENDS SUPER SUPER  FUNC_EXIST CALL_START CALL_END CALL_START CALL_END THIS @ @ [ ] [ ] .. ... [ ] , TERMINATOR INDENT OUTDENT INDENT OUTDENT , TRY TRY TRY FINALLY TRY FINALLY CATCH THROW ( ) ( INDENT OUTDENT ) WHILE WHILE WHEN UNTIL UNTIL WHEN LOOP LOOP FOR FOR FOR OWN , FORIN FOROF FORIN WHEN FOROF WHEN FORIN BY FORIN WHEN BY FORIN BY WHEN SWITCH INDENT OUTDENT SWITCH INDENT ELSE OUTDENT SWITCH INDENT OUTDENT SWITCH INDENT ELSE OUTDENT LEADING_WHEN LEADING_WHEN TERMINATOR IF ELSE IF ELSE POST_IF POST_IF UNARY - + -- ++ -- ++ ? + - MATH SHIFT COMPARE LOGIC RELATION COMPOUND_ASSIGN COMPOUND_ASSIGN INDENT OUTDENT EXTENDS",
  "bnf":
  {
    "Root":
    [
      ["","return $$ = new yy.Block;",null],
      ["Body","return $$ = $1;",null],
      ["Block TERMINATOR","return $$ = $1;",null]
    ],
    "Body":
    [
      ["Line","$$ = yy.Block.wrap([$1]);",null],
      ["Body TERMINATOR Line","$$ = $1.push($3);",null],
      ["Body TERMINATOR","$$ = $1;",null]
    ],
    "Line":
    [
      ["Expression","$$ = $1;",null],
      ["Statement","$$ = $1;",null]
    ],
    ...

The above can be fed to the Jison-to-W3C grammar converter, resulting in a grammar like this:

Root     ::= ( Body | Block TERMINATOR )?
Body     ::= Line ( TERMINATOR Line | TERMINATOR )*
Line     ::= Expression
           | Statement
...

From here we can have the Railroad Diagram Generator create a syntax diagram:

[Image: CoffeeScript syntax diagram]

. . .

Note that the converter only evaluates the "bnf" part of the grammar, so it does not take the token definitions into account. This could be improved by doing some manual postprocessing of the W3C-style grammar.
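
To make the "bnf"-only behaviour concrete, here is a minimal JavaScript sketch of a naive conversion from the Jison "bnf" object to W3C-style rules. It is not the converter's actual implementation: the function names (formatSymbol, formatRule, bnfToEbnf) are made up for illustration, and unlike the real converter it does no factoring and leaves left recursion in place. It assumes each alternative is either a string or an array whose first element is the right-hand side, as in the excerpt shown earlier.

```javascript
// Sketch only: naive Jison "bnf" -> W3C-style EBNF text.
// No factoring and no left-recursion removal is attempted.

// Identifier-like symbols are printed as names; anything else
// (for example "=" or "->") is printed as a quoted literal.
function formatSymbol(symbol) {
  if (/^[A-Za-z_][A-Za-z0-9_]*$/.test(symbol)) return symbol;
  return symbol.includes("'") ? `"${symbol}"` : `'${symbol}'`;
}

// The semantic action and options of each alternative are ignored.
function formatRule(name, alternatives) {
  const sides = alternatives.map(alt =>
    (Array.isArray(alt) ? alt[0] : alt).trim());
  const nonEmpty = sides
    .filter(rhs => rhs !== "")
    .map(rhs => rhs.split(/\s+/).map(formatSymbol).join(" "));
  let body = nonEmpty.join(" | ");
  // An empty alternative makes the whole right-hand side optional.
  if (nonEmpty.length < sides.length && nonEmpty.length > 0) {
    body = `( ${body} )?`;
  }
  return `${name} ::= ${body}`;
}

function bnfToEbnf(grammar) {
  return Object.entries(grammar.bnf)
    .map(([name, alternatives]) => formatRule(name, alternatives))
    .join("\n");
}

// Excerpt of the grammar shown earlier in this answer.
const sample = {
  bnf: {
    Root: [
      ["", "return $$ = new yy.Block;", null],
      ["Body", "return $$ = $1;", null],
      ["Block TERMINATOR", "return $$ = $1;", null]
    ],
    Body: [
      ["Line", "$$ = yy.Block.wrap([$1]);", null],
      ["Body TERMINATOR Line", "$$ = $1.push($3);", null],
      ["Body TERMINATOR", "$$ = $1;", null]
    ],
    Line: [
      ["Expression", "$$ = $1;", null],
      ["Statement", "$$ = $1;", null]
    ]
  }
};

console.log(bnfToEbnf(sample));
```

For the excerpt this prints the same Root rule as the converter, while Body stays left-recursive (Body ::= Line | Body TERMINATOR Line | Body TERMINATOR) because the sketch does not rewrite recursion into repetition. The "tokens" string is never consulted, which mirrors the limitation noted above.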

OTHER TIPS

so i tried again and found my most blatant mistake right away—the json i had posted was incorrectly using single instead of double quotes. let me detail the workflow; it's simple enough, and if you're already running CoffeeScript on NodeJS you're ready to go:

locate the node_modules/coffee-script/lib/coffee-script/grammar.js module in your file system;
copy & paste the code of that file into the source pane of the js->coffee pane on the js2coffee site (you could skip that, but i find it much more agreeable to edit CS than to fiddle with JS).
save the translated code to node_modules/coffee-script/lib/coffee-script/grammar.coffee;

go and locate

exports.parser = new Parser(
  tokens: tokens.join(" ")
  bnf: grammar
  operators: operators.reverse()
  startSymbol: "Root"
)

in the code; replace it with

console.log JSON.stringify
  tokens: tokens.join " "
  bnf: grammar
  operators: operators.reverse()
  startSymbol: "Root"

while taking care to use the exact same indentation (two spaces for the first line, four for the rest).

from the command line, run something like coffee node_modules/coffee-script/lib/coffee-script/grammar.coffee > /tmp/coffee.grammar;
copy and paste the code of the resulting file into the grammar converter;
copy and paste the resulting EBNF grammar from the converter into the grammar editor over at the railroad diagram generator;
go over to the View Diagram tab and — rejoice!
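
If you skip the js2coffee step and edit grammar.js directly, the replacement might look roughly like the following in plain JavaScript. This is a sketch, not the contents of the real file: tokens, grammar and operators below are tiny hypothetical stand-ins for the much larger values that grammar.js builds, and isValidJson is a helper invented here. Because JSON.stringify always emits double-quoted strings, dumping the grammar this way avoids the single-quote mistake mentioned at the top.

```javascript
// Hypothetical stand-ins for the variables built in grammar.js;
// the real values are far larger.
const tokens = ["TERMINATOR", "INDENT", "OUTDENT"];
const grammar = {
  Line: [
    ["Expression", "$$ = $1;", null],
    ["Statement", "$$ = $1;", null]
  ]
};
const operators = [["left", "."], ["right", "="]];

// Instead of constructing the parser, serialize what would be its input.
// slice() keeps the original array untouched; the snippet above calls
// reverse() on it directly.
const dump = JSON.stringify({
  tokens: tokens.join(" "),
  bnf: grammar,
  operators: operators.slice().reverse(),
  startSymbol: "Root"
});
console.log(dump);

// A strict JSON parser rejects single-quoted strings.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidJson(dump));                       // true
console.log(isValidJson("{'startSymbol': 'Root'}"));  // false
```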

it's sort of a chore to do all of this copy'n'pastish stuff, but certainly good enough for any one-off visualization. i've been searching the web a lot for a reasonable RR diagram generator, and this particular one is definitely among the ones with the prettiest output. sort of surprising when you think of how simple railroad diagrams really are.

Licensed under: CC-BY-SA with attribution
