Question

I'm a little confused by this behaviour of attoparsec.

$ ghci
> :m Data.Attoparsec.Text
> :m + Data.Text
> parse (string (pack "module")) (pack "mox")
Partial _
> parse (string (pack "module")) (pack "moxxxx")
Fail "moxxxx" [] "Failed reading: takeWith"
> 

Why do I need additional characters present to trigger the Fail?

Shouldn't it Fail as soon as the first "x" is encountered?


Solution

It's an implementation detail: a string parser doesn't finish before it knows whether there is enough input remaining for it to possibly succeed. That is a consequence of the all-or-nothing behaviour of these parsers (which, I think, is generally good for performance).

string :: Text -> Parser Text
string s = takeWith (T.length s) (==s)

string s tries to take length s units of Text, and then compare them with s.

takeWith :: Int -> (Text -> Bool) -> Parser Text
takeWith n p = do
  s <- ensure n
  let h = unsafeTake n s
      t = unsafeDrop n s
  if p h
    then put t >> return h
    else fail "takeWith"

takeWith n p first tries to ensure that n units of Text are available, and only then applies p to them:

ensure :: Int -> Parser Text
ensure !n = T.Parser $ \i0 a0 m0 kf ks ->
    if lengthAtLeast (unI i0) n
    then ks i0 a0 m0 (unI i0)
    else runParser (demandInput >> go n) i0 a0 m0 kf ks
  where
    go n' = T.Parser $ \i0 a0 m0 kf ks ->
        if lengthAtLeast (unI i0) n'
        then ks i0 a0 m0 (unI i0)
        else runParser (demandInput >> go n') i0 a0 m0 kf ks

ensure n creates a continuation asking for more input (a Partial result) if it doesn't find enough input immediately.
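To see the mechanism in isolation, here is a toy model of it that needs nothing but base. It is only a sketch: Result, toyString, feedToy and describe are names I made up for illustration, and it works on plain String rather than Text. It is not attoparsec's code; it just copies the "buffer n characters first, compare afterwards" strategy described above.

```haskell
module Main where

-- Stand-in for attoparsec's result type (hypothetical, simplified).
data Result
  = Fail String String          -- unconsumed input, error message
  | Partial (String -> Result)  -- continuation waiting for more input
  | Done String String          -- unconsumed input, matched value

-- Like `string`: compare only once (length s) characters are buffered.
toyString :: String -> String -> Result
toyString s input
  | length input >= n =
      let (h, t) = splitAt n input
      in if h == s then Done t h else Fail input "takeWith"
  | otherwise = Partial more
  where
    n = length s
    -- An empty chunk means "no more input is coming".
    more ""    = Fail input "not enough input"
    more chunk = toyString s (input ++ chunk)

-- Hand another chunk to a suspended parser; finished results are
-- returned unchanged in this simplified model.
feedToy :: Result -> String -> Result
feedToy (Partial k) chunk = k chunk
feedToy r _ = r

describe :: Result -> String
describe (Fail rest msg) = "Fail " ++ show rest ++ " " ++ show msg
describe (Partial _)     = "Partial _"
describe (Done rest v)   = "Done " ++ show rest ++ " " ++ show v

main :: IO ()
main = do
  putStrLn (describe (toyString "module" "mox"))
  -- prints: Partial _
  putStrLn (describe (toyString "module" "moxxxx"))
  -- prints: Fail "moxxxx" "takeWith"
  putStrLn (describe (feedToy (toyString "module" "mox") ""))
  -- prints: Fail "mox" "not enough input"
  putStrLn (describe (feedToy (toyString "module" "mod") "ule rest"))
  -- prints: Done " rest" "module"
```

With only three characters buffered, the toy parser never looks at the "x" at all, which is why it can't reject "mox" on its own.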

You can get a failure with

Prelude Data.Attoparsec.Text Data.Text> parseOnly (string (pack "module")) (pack "mox")
Left "not enough input"

telling the parser up front that it won't get any more input (then the demandInput from ensure makes it fail), or later

Prelude Data.Attoparsec.Text Data.Text> parse (string (pack "module")) (pack "mox")
Partial _
Prelude Data.Attoparsec.Text Data.Text> feed it (pack "")
Fail "mox" ["demandInput"] "not enough input"

by telling the Partial result that that was it, feeding it an empty Text.
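If your input arrives in chunks, you can wrap that last step in a small helper. This is a sketch, and parseAll is a name I made up (it is not part of attoparsec); it uses only parse, feed and the IResult constructors exported by Data.Attoparsec.Text. Empty chunks are filtered out along the way, because feeding an empty Text is exactly the end-of-input signal.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Attoparsec.Text (IResult (..), Parser, Result, feed, parse, string)
import Data.List (foldl')
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper: feed every non-empty chunk, then signal end of
-- input so that a pending Partial is forced to become Done or Fail.
parseAll :: Parser a -> [Text] -> Result a
parseAll p chunks = finish (foldl' feed (parse p T.empty) nonEmpty)
  where
    nonEmpty = filter (not . T.null) chunks
    finish r@(Partial _) = feed r T.empty  -- empty Text means "that was it"
    finish r             = r

main :: IO ()
main = do
  -- The exact Fail message depends on the attoparsec version,
  -- so it is printed here rather than assumed.
  print (parseAll (string "module") ["mox"])
  print (parseAll (string "module") ["mod", "ule"])
```

The first call should end in a Fail and the second in a Done, since neither can stay Partial once the empty Text has been fed.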

Licensed under: CC-BY-SA with attribution