Question

I need to quickly build a parser for a very simplified version of an HTML-like markup language in Java. In Python, I would use the pyparsing library for this. Is there something similar for Java? Please don't suggest existing HTML-parsing libraries: my application is a school assignment that will demonstrate walking a tree of objects and serializing it to text using the visitor pattern, so I'm not thinking in real-world terms here. Basically, all I need is tags, attributes and text nodes.


Solution

Another good parser generator is ANTLR; it might be what you're looking for.

OTHER TIPS

This may be overkill for your use, but JavaCC is an excellent industrial-strength parser generator. I've used it several times; it's reliable and worth learning, particularly if you are going to work with languages and compilers. Here's the description of the program from its website:

Java Compiler Compiler [tm] (JavaCC [tm]) is the most popular parser generator for use with Java [tm] applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation such as tree building (via a tool called JJTree included with JavaCC), actions, debugging, etc.

A quick search for parser generators in Java yields JParsec. I've never used it - but it's inspired by a Haskell library, so by definition it must be good:-)

I like JParsec (which I just discovered thanks to Torsten) because it doesn't generate code... :-) Perhaps less efficient, but enough for small tasks.
I found a similar library, JTopas.

There is a good list of parsers (generators or not) at Java Source.

There are quite a number of choices for string handling in Java. Maybe the very basic java.util.Scanner and java.util.StringTokenizer classes would be helpful for you?
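To give an idea of how far the basic classes go, here is a minimal sketch that uses java.util.StringTokenizer to split a tiny markup string into tag and text tokens. The class name MarkupTokenizer and the OPEN:/CLOSE:/TEXT: labels are made up for illustration, and the sketch assumes that "<" and ">" never appear inside attribute values or text. Attributes are left as part of the OPEN token; splitting them out and building the tree would be the next step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper for illustration; not part of any library.
class MarkupTokenizer {

    /** Splits the input into tokens labelled OPEN:, CLOSE: or TEXT:. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true, so "<" and ">" come back as their own tokens
        StringTokenizer st = new StringTokenizer(input, "<>", true);
        boolean insideTag = false;
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            if (t.equals("<")) {
                insideTag = true;
            } else if (t.equals(">")) {
                insideTag = false;
            } else if (insideTag) {
                // Tag body: name plus raw attributes, or "/name" for a closing tag
                String body = t.trim();
                if (body.startsWith("/")) {
                    tokens.add("CLOSE:" + body.substring(1).trim());
                } else {
                    tokens.add("OPEN:" + body);
                }
            } else if (!t.trim().isEmpty()) {
                // Text node; whitespace-only runs are skipped
                tokens.add("TEXT:" + t.trim());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("<p class=\"x\">Hello <b>world</b></p>")) {
            System.out.println(token);
        }
    }
}
```

A stack of open elements driven by the OPEN and CLOSE tokens is probably enough to build the object tree that the visitor then walks.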

Another good choice might be the org.apache.commons.lang.text package: http://commons.apache.org/lang/apidocs/org/apache/commons/lang/text/package-summary.html

Licensed under: CC-BY-SA with attribution