Question

I'm a beginner with Haskell, so it might be very obvious what I'm doing wrong...

While trying to parse "1:1,2, 2:18, 3:100" into [(1,1), (1,2), (2,18), (3,100)] I got stuck on a lookahead.

To know whether a number is a verse number, the parser has to look ahead for a colon: if one follows, the number is a chapter number instead.

The problem lies in the last function, verseNr. It should parse and consume the number if it is not followed by a colon; otherwise it should fail without consuming anything, leaving the number to be parsed as a chapter number by refGroupByChapter.

Except for this issue it seems to work nicely :)

import Text.ParserCombinators.Parsec

main = do
  case (parse refString "(unknown)" "1:1,2, 2:18, 3:100") of
    Left  e -> do putStr "parse error at "; print e
    Right x -> print x  -- expecting: [(1,1), (1,2), (2,18), (3,100)]

refString :: GenParser Char st [(Int, Int)]
refString = do
  refGroups <- many refGroupByChapter
  eof
  return $ concat $ map flatten refGroups
  where flatten (_, [])   = []
        flatten (c, v:vs) = (c, v):(flatten (c, vs))

refGroupByChapter :: GenParser Char st (Int, [Int])
refGroupByChapter = do
  chapterNum <- many digit
  char ':'
  verseNums <- verseNrs
  return ((read chapterNum :: Int), verseNums)

verseNrs :: GenParser Char st [Int]
verseNrs = do
  first <- verseNr
  remaining <- remainingVerseNrs
  return (first:remaining)
  where
    remainingVerseNrs = do  -- allow for spaces around the commas
      (spaces >> oneOf "," >> spaces >> verseNrs) <|> (return [])
    verseNr = try $ do
      n <- many1 digit
      notFollowedBy $ char ':'  -- if followed by a ':' it's a chapter number
      return (read n :: Int)

Solution

The trick for your particular problem is to use the sepBy family of functions. You're parsing lists of numbers separated by commas, which is exactly what sepBy is for. A list of verses has two properties: it contains at least one verse number, and it may end with a trailing comma (the one that separates it from the next chapter group). Combining the two, we see that sepEndBy1 is the function we need. These functions are usually written in infix position, so your code would look something like this:

verseNrs = verseNr `sepEndBy1` (spaces >> char ',' >> spaces)

I don't think you need to change anything else to get the code to work.
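For reference, here is a sketch of how the whole program might look with that one-line change applied. It assumes the question's definitions are otherwise kept. The differences are mine and not part of the original post: verseNr is moved to the top level, the chapter number uses many1 instead of many so that an empty chapter number is rejected, and parseRefs is a hypothetical helper added only to make the result easy to inspect.

```haskell
import Text.ParserCombinators.Parsec

-- A whole reference string: zero or more chapter groups, then end of input.
refString :: GenParser Char st [(Int, Int)]
refString = do
  refGroups <- many refGroupByChapter
  eof
  return $ concat $ map flatten refGroups
  where flatten (_, [])   = []
        flatten (c, v:vs) = (c, v) : flatten (c, vs)

-- One group such as "1:1,2, ": a chapter number, a colon, then its verses.
-- many1 (my change, the question uses many) rejects an empty chapter number.
refGroupByChapter :: GenParser Char st (Int, [Int])
refGroupByChapter = do
  chapterNum <- many1 digit
  _ <- char ':'
  verseNums <- verseNrs
  return (read chapterNum, verseNums)

-- One or more verse numbers, separated and optionally ended by a comma.
verseNrs :: GenParser Char st [Int]
verseNrs = verseNr `sepEndBy1` (spaces >> char ',' >> spaces)

-- A number is a verse only if no ':' follows; 'try' rewinds on failure,
-- so the digits stay available to be read as a chapter number.
verseNr :: GenParser Char st Int
verseNr = try $ do
  n <- many1 digit
  notFollowedBy (char ':')
  return (read n)

-- Hypothetical helper (not in the original post): run the parser on a string.
parseRefs :: String -> Either ParseError [(Int, Int)]
parseRefs = parse refString "(unknown)"

main :: IO ()
main = print (parseRefs "1:1,2, 2:18, 3:100")
-- prints: Right [(1,1),(1,2),(2,18),(3,100)]
```

The trailing separator matters here: after "1:1,2" the input continues with ", 2:18", so the comma and space are consumed as an ending separator, verseNr then rejects "2" because a colon follows, and the list of verses for chapter 1 ends cleanly.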

A couple of other minor style notes: you have some unnecessary parentheses. This isn't important, it just annoys me personally. E.g. in case ... of you do not need parens around the ... bit. Also, you do not need the type annotation when you use read, because the compiler can infer the type. That is, since verseNrs returns [Int], it's completely clear both to the compiler and to me that read n produces an Int. There is no need to say it explicitly.

Other tips

There are two problems. First, the function verseNr does not always succeed in parsing a number, because the number might be followed by a :, yet the verseNrs function always assumes that verseNr succeeds, since it binds the result as first unconditionally. Second, the function verseNrs does not handle the case of the last number in the string, which is not followed by a ,.
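The first problem can be reproduced in isolation. In Parsec, p <|> q only tries q when p fails without having consumed any input. In the question, the comma and spaces are already consumed by the time verseNr rejects the chapter number, so the return [] branch never runs. The following is a minimal sketch of that behaviour; the names noBacktrack, withTry, verse and succeeded are hypothetical and exist only for this demonstration.

```haskell
import Text.ParserCombinators.Parsec

-- A digit that must not be followed by ':' (same idea as verseNr).
verse :: GenParser Char st Char
verse = try $ do
  d <- digit
  notFollowedBy (char ':')
  return d

-- Shaped like remainingVerseNrs in the question: once the comma has been
-- consumed, a failure in 'verse' does not fall through to the right branch.
noBacktrack :: GenParser Char st (Maybe Char)
noBacktrack = (char ',' >> fmap Just verse) <|> return Nothing

-- Same parser with the whole left branch wrapped in 'try': a failure after
-- the comma rewinds the input, so the fallback branch gets its turn.
withTry :: GenParser Char st (Maybe Char)
withTry = try (char ',' >> fmap Just verse) <|> return Nothing

-- Did the parser accept the input at all?
succeeded :: GenParser Char () a -> String -> Bool
succeeded p = either (const False) (const True) . parse p "(demo)"

main :: IO ()
main = do
  print (succeeded noBacktrack ",2:18")  -- False: comma consumed, no fallback
  print (succeeded withTry ",2:18")      -- True: rewound, falls back to Nothing
```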

I believe Tikhon's suggestion is the best. However, if you insist on implementing it manually, here is how I would do it.

import Control.Monad (void)
import Control.Applicative ((<*))

verseNrs :: GenParser Char st [Int]
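verseNrs = do
    -- Try to read one verse number followed by a comma or the end of input.
    -- If that fails, 'try' rewinds, so a chapter number is left unconsumed.
    first <- fmap Just (try (many1 digit
                          <* spaces
                          <* (eof <|> void (char ','))
                          <* spaces))
             <|> return Nothing
    case first of
      Just n  -> fmap (read n:) verseNrs
      Nothing -> return []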

The rest of the code is the same.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow