The problem is that `anyChar*` parses a `List[String]` (where in this case each string is a single character), and the result of calling `toString` on a list of strings is `"List(...)"`, not the string you'd get by concatenating the contents. In addition, the `case text =>` pattern is matching on the entire `letter ~ (anyChar*)`, not just the `anyChar*` part.
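To make the first point concrete, here is a minimal sketch that uses only the standard library. The object and value names are made up for illustration, and the list is a hand-written stand-in for what `anyChar*` would produce:

```scala
object ToStringVsMkString {
  // Stand-in for the result of `anyChar*`: one single-character string per match.
  val chars: List[String] = List("e", "l", "l", "o")

  // toString renders the collection itself, wrapper and separators included.
  val viaToString: String = chars.toString

  // mkString with no arguments concatenates the elements with no separator.
  val viaMkString: String = chars.mkString
}
```

Here `viaToString` is `"List(e, l, l, o)"`, while `viaMkString` is `"ello"`.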
It's possible to address both of these issues pretty straightforwardly:
    case class Model(name: String) {
      override def toString: String = "[model " + name + "]"
    }

    import scala.util.parsing.combinator._

    object ModelParser extends RegexParsers {
      def model: Parser[Model] = "[model" ~> "[name" ~> name <~ "]]" ^^ (Model(_))

      // letter yields a String and anyChar* a List[String], so bind both halves of the ~.
      def name: Parser[String] = letter ~ (anyChar*) ^^ {
        case first ~ rest => (first :: rest).mkString
      }

      def anyChar = letter | digit | "_".r | "-".r
      def letter = """[a-zA-Z]""".r
      def digit = """\d""".r
    }
We just prepend the first character string to the list of the rest, and then call `mkString` on the entire list, which will concatenate the contents. This works as expected:
    scala> ModelParser.parseAll(ModelParser.model, "[model [name helloWorld]]")
    res0: ModelParser.ParseResult[Model] = [1.26] parsed: [model helloWorld]
As you note, it would be possible (and possibly clearer and more performant) to let the regular expressions do more of the work:
    object ModelParser extends RegexParsers {
      def model: Parser[Model] = "[model" ~> "[name" ~> name <~ "]]" ^^ (Model(_))

      // Still requires a leading letter, as in the combinator version above.
      def name: Parser[String] = """[a-zA-Z][a-zA-Z\d_-]*""".r
    }
This example also illustrates the way that the parsing combinator library uses implicit conversions to cut down on some of the verbosity of writing parsers. As you say, `def hello = "hello"` defines a string, and `"[a-zA-Z]+".r` defines a `Regex` (via the `r` method on `StringOps`), but either can be used as a parser because `RegexParsers` defines implicit conversions from `String` (this one's named `literal`) and `Regex` (`regex`) to `Parser[String]`.
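If the mechanism is unfamiliar, the following toy sketch may help. It is not the library's implementation: `MiniParser` is a hypothetical stand-in for `Parser[String]`, and the two conversions only borrow the names `literal` and `regex` to mirror the ones described above.

```scala
import scala.language.implicitConversions
import scala.util.matching.Regex

object ImplicitSketch {
  // Hypothetical stand-in for Parser[String]: returns the matched prefix, if any.
  final case class MiniParser(run: String => Option[String])

  // Toy counterpart of the String conversion: matches the literal as a prefix.
  implicit def literal(s: String): MiniParser =
    MiniParser(in => if (in.startsWith(s)) Some(s) else None)

  // Toy counterpart of the Regex conversion: matches the regex as a prefix.
  implicit def regex(r: Regex): MiniParser =
    MiniParser(in => r.findPrefixOf(in))

  // The declared return type is what triggers each conversion.
  def hello: MiniParser = "hello"
  def word: MiniParser = "[a-zA-Z]+".r
}
```

With these definitions, `ImplicitSketch.hello.run("hello world")` gives `Some("hello")` and `ImplicitSketch.word.run("abc123")` gives `Some("abc")`. In both cases the compiler inserts the conversion because a `MiniParser` is expected where a `String` or `Regex` was written.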