You almost had it. Parse the text you don't care about, and then do nothing with it.
I added dontCareText and skipDontCare, then made skipDontCare optional in your document parser.
import scala.util.parsing.combinator.RegexParsers
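object MyParser extends RegexParsers {
  val beginToken: Parser[String] = "begin"
  val dontCareToken: Parser[String] = "DONT CARE"
  // A line to keep: anything that does not start with the DONT CARE marker.
  val text: Parser[String] = not(dontCareToken) ~> """([^\n]+)""".r
  // The line after the DONT CARE marker: anything that does not start a new document.
  val dontCareText: Parser[String] = not(beginToken) ~> """([^\n]+)""".r
  // Match the marker and the line after it, then discard both.
  val skipDontCare: Parser[String] = dontCareToken ~ dontCareText ^^ { _ => "" }
  // The DONT CARE section is optional, and <~ drops its result.
  val document: Parser[String] =
    beginToken ~> text.+ <~ opt(skipDontCare) ^^ {
      _.mkString("\n")
    }
  val documents: Parser[Iterable[String]] = document.+
}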
val s = """begin
Text I care about
Text I care about
DONT CARE
Text I don't care about
begin
More text I care about
"""
MyParser.parseAll(MyParser.documents, s)
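
Two limits of the parser above are worth knowing: dontCareText matches exactly one line, and text only rejects DONT CARE, so a document with no DONT CARE section would swallow the next begin as ordinary text. If your input can look like that, here is a minimal sketch of a stricter variant. The names StrictParser, line, parseDocs and StrictParserDemo are made up for illustration, and I am assuming that no line you want to keep starts with the word begin.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical stricter variant of MyParser (all names here are illustrative).
object StrictParser extends RegexParsers {
  val beginToken: Parser[String] = "begin"
  val dontCareToken: Parser[String] = "DONT CARE"
  // One non-empty line; RegexParsers skips leading whitespace, newlines included.
  val line: Parser[String] = """[^\n]+""".r

  // A kept line must start neither a DONT CARE section nor the next document.
  val text: Parser[String] = not(dontCareToken | beginToken) ~> line
  // A skipped line only has to stop before the next document.
  val dontCareText: Parser[String] = not(beginToken) ~> line
  // The marker followed by zero or more skipped lines; the match is discarded.
  val skipDontCare: Parser[Unit] = dontCareToken ~ rep(dontCareText) ^^^ (())

  val document: Parser[String] =
    beginToken ~> rep1(text) <~ opt(skipDontCare) ^^ { _.mkString("\n") }
  val documents: Parser[List[String]] = rep1(document)

  // Turn the ParseResult into an Either so callers can handle failures.
  def parseDocs(input: String): Either[String, List[String]] =
    parseAll(documents, input) match {
      case Success(docs, _)   => Right(docs)
      case failure: NoSuccess => Left(failure.msg)
    }
}

object StrictParserDemo {
  def main(args: Array[String]): Unit = {
    val input =
      """begin
        |Keep A
        |begin
        |Keep B
        |DONT CARE
        |Skip 1
        |Skip 2
        |""".stripMargin
    // Prints: Right(List(Keep A, Keep B))
    println(StrictParser.parseDocs(input))
  }
}
```

The first document has no DONT CARE section and the second has two skipped lines, which are the two cases the original parser does not cover. Matching on Success and NoSuccess is also a handy way to inspect the result of the parseAll call above instead of relying on its toString.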