Is it possible to use JavaCC parser to parse String and apply distributive law on it? [closed]

StackOverflow https://stackoverflow.com/questions/21186553

  •  29-09-2022
  •  | 
  •  

Question

For example in my case,

Given String is keyword1 AND (keyword2 OR keyword3) must be expanded to (keyword1 AND Keyword2) OR (keyword1 AND keyword3).

If possible please suggest me the way I can achieve this one. I am newbie to JavaCC.

Was it helpful?

Solution

To do your task, you need:

  • A grammar for your boolean expressions
  • A parser for your boolean language
  • An AST build from the parsing steps
  • Some machinery (tree matching/hacking) to implement your rewriting rules.

JavaCC provides you with just a parser; you have to provide the grammar. JavaCC itself doesn't provide any specific help with building trees, but a related package, JJTree, makes this easier. It provides zero help with the rewriting step; you'd have to implement this procedurally, which means climbing up and down the tree to match AND within OR (your fundamantal starting place for the distributive law, and then hacking at the tree links to transform the tree. Finally, you want to print the revised tree back out; JavaCC provides no help here, either.

ANTLR4 is bit closer. You provide it with a grammar; apparantly it will automatically build a tree. You have to implement the rewriting step procedurally, as with JavaCC. It won't print the revised tree by itself, but it so-called StringTemplates that help with the task.

In general, what you really want is a program transformation system, which lets you do all these tasks in an organized way.

Our DMS is one of these, and can solve this problem easily. You provide DMS with a grammar. It automatically builds trees as it parses. You provide DMS with pattern-directed re-writing rules stated in terms of your grammar's surface language; DMS will apply them and revise the trees. As part of specifying the parser, you also specify (easily) pretty-printing rules, thus it is easy to get the revised result back.

With this DMS grammar (this skips the lexer step, but that's really easy):

    BOOLEXP = disjunction ;
    disjunction = conjunction ;
    [associative-commutative] disjunction = disjunction 'OR' conjunction ;
    conjunction = term ;
    [associative-commutative] conjunction = conjunction 'AND' term ;
    term = VARIABLENAME ;
    term = '(' disjunction ')' ;
    term = 'NOT' term ;

The "[associative-commutative]" part tells DMS that these rules have those algebraic properties ("AC").

Then this DMS rewrite rule implements your request directly:

     rule apply_distributive_law(t1:term,t2:disjunction,t3:conjunction):disjunction->disjunctions
          = "\t1 AND ( \t2 OR \t3 ) " ->  " \t1 AND \t2 OR \t1 AND \t3 "; 

Notice how we didn't talk much about the trees, let alone manipulate them. What isn't obvious is this will not only match the specific pattern provided, but any AC rearrangement of these terms. This is extremely hard to code procedurally.

Because rewrites are easy, we can also pick the case when somebody has written your condition in negated form:

     rule  apply_demorgan(t1:term, t2: conjunction): disjunction -> disjunction
          = " NOT ( \t1 AND \t2 ) " ->   " NOT \t1 OR NOT (\t2) ";

It should be easy to see how to write DeMorgan's law for NOT-OR from this. DMS by default applies all the rules you give it until no further rule applies. If you hand it both of the above rules, it will convert a negated form using apply_demorgan, and then the distributive law. As you add more rules, the effects can be remarkable, and really hard to replicate with procedural code.

You can see a fully worked example on conventional algebra (rather than boolean algebra). All the same issues arise.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top