Parse a date from a CharSequence with standard patterns…
28-10-2019
Question
I'm writing a parser for a command line interface of an external tool and I'm using Scala's parser combinators library. As part of this I need to parse a standard date of the format EEE MMM d HH:mm:ss yyyy Z.
Scala's parser combinators are "stream-based" and work with CharSequences instead of Strings. That makes it hard for me to use either java.text.DateFormat or DateTimeFormat from JodaTime, since they both work with Strings.
As of now, I have had to write my own regex parser like this to parse the date, but I would much rather incorporate the work that has been done in JodaTime into my parser. I really don't want to reinvent the wheel. I've been looking at the source code of JodaTime and I'm not really sure why it needs to work with Strings instead of just CharSequences. Am I missing some aspect?
Solution
This is my solution right now:
I forked joda-time and made small changes for it to work on CharSequences instead of Strings. It's over here: https://github.com/hedefalk/joda-time/commit/ef3bdafd89b334fb052ce0dd192613683b3486a4
Then I could write a DateParser like this:
    import scala.util.parsing.combinator.RegexParsers
    import org.joda.time.{DateTime, MutableDateTime}
    import org.joda.time.format.DateTimeFormat

    trait DateParsers extends RegexParsers {
      def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
        val dateFormat = DateTimeFormat.forPattern(pattern)

        // Relies on the forked joda-time, where parseInto accepts a CharSequence.
        def jodaParse(text: CharSequence, offset: Int) = {
          val mutableDateTime = new MutableDateTime
          val newPos = dateFormat.parseInto(mutableDateTime, text, offset)
          (mutableDateTime.toDateTime, newPos)
        }

        def apply(in: Input) = {
          val source = in.source
          val offset = in.offset
          val start = handleWhiteSpace(source, offset)
          val (dateTime, endPos) = jodaParse(source, start)
          // parseInto returns a negative value when parsing fails.
          if (endPos >= 0)
            Success(dateTime, in.drop(endPos - offset))
          else
            Failure("Failed to parse date", in.drop(start - offset))
        }
      }
    }
Then I can use this trait to have production rules like:
    private[this] def dateRow = "date:" ~> dateTime("EEE MMM d HH:mm:ss yyyy Z")
Am I overworking this? I'm really tired right now…
Other tips
Got it, now. Ok, there's a simpler solution than forking. Here:
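    import scala.util.parsing.combinator.RegexParsers
    import org.joda.time.{DateTime, MutableDateTime}
    import org.joda.time.format.DateTimeFormat

    trait DateParsers extends RegexParsers {
      def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
        val dateFormat = DateTimeFormat.forPattern(pattern)

        def jodaParse(text: CharSequence, offset: Int) = {
          val mutableDateTime = new MutableDateTime
          // Copy only a bounded window into a String. The length is an estimate,
          // so it is clamped to the end of the input.
          val end = math.min(text.length, offset + dateFormat.getParser.estimateParsedLength)
          val maxInput = text.subSequence(offset, end).toString
          val newPos = dateFormat.parseInto(mutableDateTime, maxInput, 0)
          // parseInto signals failure with a negative value; keep it negative.
          (mutableDateTime.toDateTime, if (newPos >= 0) newPos + offset else newPos)
        }

        def apply(in: Input) = {
          val source = in.source
          val offset = in.offset
          val start = handleWhiteSpace(source, offset)
          val (dateTime, endPos) = jodaParse(source, start)
          if (endPos >= 0)
            Success(dateTime, in.drop(endPos - offset))
          else
            Failure("Failed to parse date", in.drop(start - offset))
        }
      }
    }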
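The windowing idea above can also be tried with nothing but the JDK. The following is a minimal sketch, not the code from this answer: it uses java.text.SimpleDateFormat (a String-only API, like the ones mentioned in the question) in place of Joda-Time, and the names WindowedDateParse and parseWindow, as well as the maxLen parameter, are made up for illustration.

```scala
import java.text.{ParsePosition, SimpleDateFormat}
import java.util.{Date, Locale}

object WindowedDateParse {
  /** Parses a date starting at `offset` in `text`.
    *
    * `maxLen` is a hypothetical, caller-chosen upper bound on how many
    * characters the date can occupy. Only that window is copied to a String.
    * Returns the date and the index just past it in `text`, or None on failure.
    */
  def parseWindow(text: CharSequence, offset: Int, pattern: String, maxLen: Int): Option[(Date, Int)] = {
    // Clamp the window so subSequence never runs past the end of the input.
    val end = math.min(text.length, offset + maxLen)
    val window = text.subSequence(offset, end).toString

    // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
    val format = new SimpleDateFormat(pattern, Locale.ENGLISH)
    val pos = new ParsePosition(0)
    val date = format.parse(window, pos) // returns null when parsing fails

    // Translate the position inside the window back to a position in `text`.
    if (date == null) None else Some((date, offset + pos.getIndex))
  }
}
```

Called as WindowedDateParse.parseWindow(source, start, "EEE MMM d HH:mm:ss yyyy Z", 40), this hands back the position at which the surrounding parser should continue. The bound of 40 is an assumption; too small a window would truncate the date and make the parse fail.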
I'm not sure what you are asking. Are you asking why the in parameter of RegexParsers.parse() takes a CharSequence? If so, there's another overloaded RegexParsers.parse() that takes a Reader, for which you can write a simple conversion function like so:
    import java.io.{Reader, StringReader}
    def stringToReader(str: String): Reader = new StringReader(str)
As to the date format, I find it perfectly fine to define it as a token in the parser.
Hope this helps.
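For what it's worth, on Java 8 or later the JDK's own java.time.format.DateTimeFormatter has a parse(CharSequence, ParsePosition) overload, so something along these lines may avoid both the fork and the String copy. This is only a sketch that swaps Joda-Time for java.time, which the question does not use; CharSequenceDateDemo and parseAt are hypothetical names.

```scala
import java.text.ParsePosition
import java.time.{DateTimeException, OffsetDateTime}
import java.time.format.DateTimeFormatter
import java.util.Locale

object CharSequenceDateDemo {
  // Same pattern as in the question; an English locale is assumed for day and month names.
  private val format =
    DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss yyyy Z", Locale.ENGLISH)

  /** Parses a date starting at `offset` without converting `text` to a String.
    * Returns the value and the index just past it, or None on failure.
    */
  def parseAt(text: CharSequence, offset: Int): Option[(OffsetDateTime, Int)] = {
    val pos = new ParsePosition(offset)
    try {
      // Unlike java.text formats, this overload throws on error instead of
      // returning null; trailing input after the date is left unconsumed.
      val parsed = OffsetDateTime.from(format.parse(text, pos))
      Some((parsed, pos.getIndex))
    } catch {
      // DateTimeParseException is a subclass of DateTimeException.
      case _: DateTimeException => None
    }
  }
}
```

Inside a Parser's apply method, the returned index would play the same role as endPos in the Joda-based versions: on Some, drop up to that index from the input; on None, report a Failure.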