Question

I have the following grammar and I want to parse inputs to get the associated ASTs. This is easy with ANTLR for Java: since ANTLR 4, you no longer have to specify the option `output=AST;` in the grammar file to get tree information.

Hello.g

grammar Hello; // Define a grammar called Hello

stat    :   expr NEWLINE
        |   ID '=' expr NEWLINE
        |   NEWLINE
        |   expr
        ;

expr:   atom (op atom)* ;

op  : '+'|'-' ;

atom    :   INT |   ID;

ID  :   [a-zA-Z]+ ;

INT :  [0-9]+ ;

NEWLINE :   '\r' ? '\n' ;

WS  :   [ \t\r\n]+ -> skip ;

Test.java

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
import java.io.*;
import lib.HelloLexer;
import lib.HelloParser;

public class Test {
    public static void main(String[] args) throws Exception {
        ANTLRInputStream input = new ANTLRInputStream("5 + 3");
        // create a lexer that feeds off of input CharStream
        HelloLexer lexer = new HelloLexer(input);
        // create a buffer of tokens pulled from the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        // create a parser that feeds off the tokens buffer
        HelloParser parser = new HelloParser(tokens);
        ParseTree tree = parser.expr(); // begin parsing at the expr rule
        // print LISP-style tree
        System.out.println(tree.toStringTree(parser));
    }
}

The output will be:

(expr (atom 5) (op +) (atom 3))

How can I obtain the same result with a Python implementation? I'm currently using the ANTLR 3.1.3 runtime for Python, and the following code only prints "(+ 5 3)":

Test.py

import sys
import antlr3
import antlr3.tree
from antlr3.tree import Tree
from HelloLexer import *
from HelloParser import *

char_stream = antlr3.ANTLRStringStream('5 + 3')
lexer = HelloLexer(char_stream)
tokens = antlr3.CommonTokenStream(lexer)
parser = HelloParser(tokens)
r = parser.stat()

print r.tree.toStringTree()

Solution

There is now an ANTLR 4 runtime for Python (https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Python+Target). In the Python runtime, toStringTree is available as a class method on Trees, so you can call it like this to get the LISP-style parse tree, including the stringified tokens:

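from antlr4 import InputStream, CommonTokenStream
from antlr4.tree.Trees import Trees
# lexer and parser generated from the Hello grammar for the Python target
from HelloLexer import HelloLexer
from HelloParser import HelloParser

# set up the stream, lexer, token stream and parser as in the Java version
lexer = HelloLexer(InputStream("5 + 3"))
tokens = CommonTokenStream(lexer)
parser = HelloParser(tokens)
tree = parser.expr()  # begin parsing at the expr rule

# the None is an optional list of rule names; when it is omitted,
# the names are taken from the parser passed as the third argument
print(Trees.toStringTree(tree, None, parser))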

OTHER TIPS

When this answer was written, there was no Python target for ANTLR 4 (one has since been released, as the solution above notes), and ANTLR 3 did not support the automatic generation of parse trees needed to produce the output you are looking for.

You might be able to use the AST creation functionality in ANTLR 3 to produce a tree, but it will not have the same form as the ANTLR 4 parse tree (and certainly not its simplicity).
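To make the difference in form concrete, here is a small stand-alone Python sketch. It does not use ANTLR at all: the tokenizer, `parse_expr`, `to_ast` and the nested-tuple layout are hypothetical stand-ins for the `expr` rule of the grammar in the question. It builds a parse tree with one node per rule, as ANTLR 4 does, and then folds it into an operator-rooted tree like the "(+ 5 3)" that the ANTLR 3 code printed. The left-nested folding of longer chains is an assumption about how the AST grammar would be written.

```python
import re

# Hypothetical stand-in for the lexer: INT, ID and the two operators.
# Leading whitespace before each token is skipped.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z]+)|([+-]))")


def tokenize(text):
    """Split text into INT, ID and operator tokens; reject anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(match.lastindex))
        pos = match.end()
    return tokens


def parse_expr(tokens):
    """expr: atom (op atom)* ; returns the parse tree as nested tuples."""
    if not tokens:
        raise ValueError("empty input")
    children = []
    for index, token in enumerate(tokens):
        is_op = token in ("+", "-")
        # atoms sit at even positions, operators at odd positions
        if is_op != (index % 2 == 1):
            raise ValueError(f"unexpected token {token!r}")
        children.append(("op" if is_op else "atom", token))
    if len(tokens) % 2 == 0:
        raise ValueError("expression ends with an operator")
    return ("expr", *children)


def to_ast(parse_tree):
    """Fold the flat expr node into a left-nested, operator-rooted tree."""
    _, first, *rest = parse_tree
    ast = first[1]
    for (_, op), (_, operand) in zip(rest[0::2], rest[1::2]):
        ast = (op, ast, operand)
    return ast


def to_lisp(node):
    """Render nested tuples LISP-style; plain strings are leaf tokens."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(to_lisp(child) for child in node) + ")"


tree = parse_expr(tokenize("5 + 3"))
print(to_lisp(tree))          # (expr (atom 5) (op +) (atom 3))
print(to_lisp(to_ast(tree)))  # (+ 5 3)
```

The first line printed keeps every rule name from the grammar, which is what the ANTLR 4 parse tree gives you for free. The second keeps only operators and operands, which is the kind of tree you have to describe yourself when building ASTs in ANTLR 3.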

Licensed under: CC-BY-SA with attribution