Question

I'm working on a parser that parses a few terms that have an optional initial character that is the same for all the terms. My problem is, some of the terms first character is the same as the initial character. I'm trying to avoid rewriting each term with and without the initial character. I figured a packrat parser sounded like the solution but I've yet to get it working the way I want. Below is a minimal example of what I'm trying to accomplish.

import scala.util.parsing.combinator._
import scala.util.parsing.input._

object test extends App {
  object MyParser extends RegexParsers with PackratParsers {
    lazy val p1:Parser[String] = "ab"
    lazy val p2:Parser[String] = "cd"
    lazy val parser:PackratParser[String] = ("a".?) ~> ( p2 | p1 )
  }

    def parsing[T](s: String)(implicit p: MyParser.Parser[T]): T = {
      import MyParser._
      val phraseParser = phrase(p)
      val input = new PackratReader(new CharSequenceReader(s))
      phraseParser(input) match {
        case Success(t, _) => t
        case NoSuccess(msg, _) => throw new IllegalArgumentException(
          "Could not parse '"+s+"': "+msg)
      }
    }                                 //> parsing: [T](s: String)(implicit p: test.MyParser.Parser[T])T

  implicit val parser = MyParser.log(MyParser.parser)("parser")
  println(parsing("cd"))  /* works */
  println(parsing("acd")) /* works */
  println(parsing("aab")) /* works */

  // Should work but doesn't.
  println(parsing("ab"))  /* expected "ab" but found "b" */
}
Was it helpful?

Solution

After looking at @ishaaq's answer I came up with a solution that I believe is a little more elegant.

implicit class ParserExtension[T](val p: Parser[T]) extends AnyVal {
  def ?~[U](q: => Parser[U]): Parser[Option[T] ~ U] = {
    ((p ^^ Some.apply) ~ q) | (success(None) ~ q)
  }
}

This implicit class adds a method to all my parsers so I can have an optional parser with back tracking. You can use it like so:

val a:Parser[String] = "a"
val parser:Parser[Option[String] ~ String] = a ?~ ( p2 | p1 )

Here is a gist with the full source for my example

OTHER TIPS

I believe the problem is that it isn't easy to get RegexParsers to backtrack and retry alternate paths in partially successful matches, i.e. the "a" in "ab" matches the optional "a".? part and then the "b" fails the subsequent match, but then after that failure, no further attempt is made to try matching with the optional prefix.

See this answer for more detail and a (non-trivial) solution.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top