Question

I have a system where I query a REST / Atom server for documents. The queries are inspired by GData and look like this:

http://server/base/feeds/documents?bq=[type in {'news'}]

I have to parse the "bq" parameter to know which type of documents will be returned without actually doing the query. So for example,

bq=[type = 'news']                      ->  return ["news"]
bq=[type in {'news'}]                   ->  return ["news"]
bq=[type in {'news', 'article'}]        ->  return ["news", "article"]
bq=[type = 'news']|[type = 'article']   ->  return ["news", "article"]
bq=[type = 'news']|[title = 'My Title'] ->  return ["news"]

Basically, the query language is a list of predicates that can be combined with OR ("|") or AND (no separator). Each predicate is a constraint on a field. The constraint can be =, <, >, <=, >=, in, etc. Spaces are allowed anywhere they make sense.

I'm a bit lost between Regexp, StringTokenizer, StreamTokenizer, etc., and I am stuck with Java 1.4, so no Parser...

Who can point me in the right direction ?

Thanks !


Solution

The right way would be to use a parser or lexer generator such as ANTLR, JavaCC or JFlex.

A quick and dirty way would be:

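import java.util.ArrayList;
import java.util.List;

// "|" is a regex metacharacter, so it is escaped; this yields the OR-ed groups.
String[] disjunctedPredicateGroups = query.split("\\|");
// Raw List and an indexed loop keep this usable on Java 1.4 (no generics, no for-each).
List normalizedPredicates = new ArrayList();
for (int i = 0; i < disjunctedPredicateGroups.length; i++) {
   // Split on "[" or "]" to isolate the AND-ed predicates (empty strings may appear).
   normalizedPredicates.add(disjunctedPredicateGroups[i].split("\\[|\\]"));
}
// process each predicate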
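
If all you need is the list of types, the "process each predicate" step can be sketched with java.util.regex, which is available in Java 1.4. This is only a rough sketch under several assumptions: the class name BqTypeExtractor and the method extractTypes are made up for illustration, values are single-quoted with no escaped quotes, values never contain "[" or "]", and only "=" and "in" constraints on the type field are collected, which matches the examples in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library. Written without generics or
// for-each loops so that it should also compile on Java 1.4.
public class BqTypeExtractor {

    // One bracketed predicate, e.g. [type in {'news', 'article'}].
    // Assumption: values never contain '[' or ']'.
    private static final Pattern PREDICATE = Pattern.compile("\\[([^\\]]*)\\]");

    // A constraint on the "type" field using "=" or "in".
    // Group 1 is everything to the right of the operator.
    private static final Pattern TYPE_CONSTRAINT =
            Pattern.compile("^\\s*type(?:\\s*=|\\s+in\\b)\\s*(.*)$");

    // A single-quoted value. Assumption: no escaped quotes inside values.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    public static List extractTypes(String bq) {
        List types = new ArrayList();
        Matcher predicate = PREDICATE.matcher(bq);
        while (predicate.find()) {
            Matcher constraint = TYPE_CONSTRAINT.matcher(predicate.group(1));
            if (!constraint.matches()) {
                continue; // another field or another operator: ignore it
            }
            Matcher value = QUOTED.matcher(constraint.group(1));
            while (value.find()) {
                String type = value.group(1);
                if (!types.contains(type)) {
                    types.add(type); // keep first-seen order, skip duplicates
                }
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // prints [news, article]
        System.out.println(extractTypes("[type in {'news', 'article'}]"));
        // prints [news]
        System.out.println(extractTypes("[type = 'news']|[title = 'My Title']"));
    }
}
```

Because it only looks inside the brackets, this version does not need to split on "|" at all. It will give wrong answers if a quoted value itself contains brackets, which is the kind of case where a real tokenizer or a generated parser is the safer choice.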
Licensed under: CC-BY-SA with attribution