However, the parser gets stuck (adding log() reveals that it is repeatedly trying the word and character parser).
The rep combinator corresponds to * in Perl-style regex notation: it matches zero or more repetitions of the parser it wraps (here, characters), so it also succeeds when nothing matches at all. I think you want it to match one or more characters. Changing it to rep1 (corresponding to + in Perl-style regex notation) should fix the problem.
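To make the difference concrete, here is a minimal sketch. RepDemo, letter, zeroOrMore and oneOrMore are hypothetical names chosen for illustration, not taken from your grammar, and the snippet assumes the scala-parser-combinators library is on the classpath. My guess, since I can't see your full definition, is that a parser which can succeed without consuming input keeps getting re-applied at the same position by an enclosing repetition, which would explain the loop you see in the log.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical demo object; not the asker's original grammar.
object RepDemo extends RegexParsers {
  override def skipWhitespace = false

  // A single lowercase letter.
  def letter: Parser[String] = """[a-z]""".r

  // rep succeeds even when there is nothing to match, yielding an empty list.
  def zeroOrMore: Parser[List[String]] = rep(letter)

  // rep1 requires at least one match and fails otherwise.
  def oneOrMore: Parser[List[String]] = rep1(letter)
}
```

With these definitions, RepDemo.parseAll(RepDemo.zeroOrMore, "") succeeds with an empty list, while RepDemo.parseAll(RepDemo.oneOrMore, "") fails. Both accept "abc" and produce one list element per letter.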
However, your definition still seems a little verbose to me. Why are you parsing individual characters instead of just using \w+ as the pattern for a word? Here's how I'd write it:
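import scala.util.parsing.combinator.RegexParsers

object Example extends RegexParsers {
  // Treat whitespace as significant so it can act as the word separator.
  override def skipWhitespace = false
  // A word is one or more word characters.
  def word: Parser[String] = """\w+""".r
  // One or more words separated by whitespace, followed by a period.
  def sentence: Parser[List[String]] = rep1sep(word, whiteSpace) <~ "."
}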
Notice that I use rep1sep to parse a non-empty list of words separated by whitespace. There's a repsep combinator as well, but I think you'd want at least one word per sentence.
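Here is a quick way to try it out, as a sketch rather than a definitive test. The grammar is repeated so the snippet compiles on its own; maybeEmptySentence and ExampleDemo are hypothetical additions of mine, included only to show how repsep behaves differently from rep1sep.

```scala
import scala.util.parsing.combinator.RegexParsers

// Same grammar as above, repeated so the snippet is self-contained.
object Example extends RegexParsers {
  override def skipWhitespace = false
  def word: Parser[String] = """\w+""".r
  def sentence: Parser[List[String]] = rep1sep(word, whiteSpace) <~ "."

  // Hypothetical variant: repsep also accepts a sentence with zero words.
  def maybeEmptySentence: Parser[List[String]] = repsep(word, whiteSpace) <~ "."
}

object ExampleDemo {
  def main(args: Array[String]): Unit = {
    // Three words followed by a period: parses into a three-element list.
    println(Example.parseAll(Example.sentence, "hello big world."))
    // A lone period has no words, so rep1sep rejects it.
    println(Example.parseAll(Example.sentence, "."))
    // repsep accepts the same input and yields an empty list.
    println(Example.parseAll(Example.maybeEmptySentence, "."))
  }
}
```

The first call succeeds with the words hello, big and world; the second fails because rep1sep needs at least one word; the third succeeds with an empty list.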