Say I have a Parser p in Parsec, and I want p to ignore all superfluous/redundant whitespace. For example, say I define a list as starting with "[", ending with "]", and containing integers separated by whitespace. I don't want any errors if there is whitespace in front of the "[", after the "]", between the "[" and the first integer, and so on.

In my case, I want this to work for my parser for a toy programming language.

I will update with code if that is requested/necessary.


Solution

Use combinators to say what you mean:

import Control.Applicative ((<*), (*>))
import Text.Parsec
import Text.Parsec.String

program :: Parser [[Int]]
program = spaces *> many1 term <* eof

term :: Parser [Int]
term = list

list :: Parser [Int]
list = between listBegin listEnd (number `sepBy` listSeparator)

listBegin, listEnd, listSeparator :: Parser Char
listBegin = lexeme (char '[')
listEnd = lexeme (char ']')
listSeparator = lexeme (char ',')

lexeme :: Parser a -> Parser a
lexeme parser = parser <* spaces

number :: Parser Int
number = lexeme $ do
  digits <- many1 digit
  return (read digits :: Int)

Try it out:

λ :l Parse.hs
Ok, modules loaded: Main.
λ parseTest program " [1, 2, 3] [4, 5, 6] "
[[1,2,3],[4,5,6]]

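This lexeme combinator takes a parser and allows arbitrary whitespace after it. Because every token consumes the whitespace that follows it, the only leading whitespace left to handle is at the very start of the input, which program skips with spaces. You then only need to wrap lexeme around the primitive tokens in your language, such as listSeparator and number.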
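The question asks for integers separated only by whitespace, while the example above uses commas. A minimal sketch of the whitespace-separated variant, built on the same lexeme idea, might look like this. The names spaceList and spaceProgram are hypothetical and chosen only for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same idea as the lexeme helper above: run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- One integer token, followed by any amount of whitespace.
number :: Parser Int
number = lexeme (read <$> many1 digit)

-- Hypothetical name: a list whose elements are separated only by whitespace.
-- No explicit separator is needed, because number already eats what follows it.
spaceList :: Parser [Int]
spaceList = between (lexeme (char '[')) (lexeme (char ']')) (many number)

-- Leading whitespace is skipped once, up front; eof rejects trailing junk.
spaceProgram :: Parser [[Int]]
spaceProgram = spaces *> many1 spaceList <* eof
```

Loaded into GHCi, this can be tried the same way as the example above:

λ parseTest spaceProgram " [1 2 3] [ ] "
[[1,2,3],[]]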

Alternatively, you can parse the stream of characters into a stream of tokens, then parse the stream of tokens into a parse tree. That way, both the lexer and parser can be greatly simplified. It’s only worth doing for larger grammars, though, where maintainability is a concern; and you have to use some of the lower-level Parsec API such as tokenPrim.
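To make the two-stage idea concrete, here is a rough sketch under some assumptions: the Token type, tokenize, satisfyTok, and the other names are hypothetical, the grammar is the whitespace-separated list from the question, and source positions are simply advanced by one column per token rather than tracked properly. Only tokenPrim, incSourceColumn, and the usual combinators come from Parsec itself.

```haskell
import Text.Parsec
import Text.Parsec.Pos (incSourceColumn)
import Text.Parsec.String (Parser)

-- Hypothetical token type for the toy list grammar.
data Token = TOpen | TClose | TInt Int
  deriving (Eq, Show)

-- Stage 1: characters to tokens. All whitespace handling lives here.
tokenize :: Parser [Token]
tokenize = spaces *> many (tok <* spaces) <* eof
  where
    tok =   TOpen  <$ char '['
        <|> TClose <$ char ']'
        <|> TInt . read <$> many1 digit

-- Stage 2 works on a list of tokens instead of a String.
type TokParser a = Parsec [Token] () a

-- Accept one token if the given function maps it to a result.
-- Assumption: positions just advance one column per token.
satisfyTok :: (Token -> Maybe a) -> TokParser a
satisfyTok = tokenPrim show nextPos
  where
    nextPos pos _ _ = incSourceColumn pos 1

-- Match one specific token and discard it.
exactly :: Token -> TokParser ()
exactly expected =
  satisfyTok (\t -> if t == expected then Just () else Nothing)

-- Match an integer token and return its value.
int :: TokParser Int
int = satisfyTok fromInt
  where
    fromInt (TInt n) = Just n
    fromInt _        = Nothing

-- The grammar itself no longer mentions whitespace at all.
listP :: TokParser [Int]
listP = between (exactly TOpen) (exactly TClose) (many int)

programP :: TokParser [[Int]]
programP = many1 listP <* eof

-- Run both stages; a failure in either one is reported as a ParseError.
parseProgram :: String -> Either ParseError [[Int]]
parseProgram src = parse tokenize "<input>" src >>= parse programP "<tokens>"
```

The token-level parsers never deal with whitespace, which is the simplification described above. The cost is the extra plumbing around tokenPrim, and error positions that are only as good as the position function you supply.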

Other tips

Just surround everything with spaces:

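import Text.Parsec

parseIntList :: Parsec String u [Int]
parseIntList = do
    spaces
    _ <- char '['
    spaces
    first <- many1 digit
    -- try lets the parser back out of whitespace consumed before a ']'
    rest <- many $ try $ do
        spaces
        _ <- char ','
        spaces
        many1 digit
    spaces
    _ <- char ']'
    spaces  -- also allow whitespace after the closing bracket
    return $ map read $ first : rest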

This is a very basic version, and there are cases where it will fail (such as an empty list), but it's a good start towards getting something to work.
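One possible way to cover the empty-list case is to let sepBy handle the elements, since it accepts zero items. This is only a sketch: the name intList is hypothetical, and it still rejects input such as a trailing comma.

```haskell
import Text.Parsec

-- Hypothetical variant of the parser above that also accepts "[ ]".
intList :: Parsec String u [Int]
intList = do
    spaces
    _ <- char '['
    spaces
    -- Each element and each comma skips the whitespace after itself,
    -- so no try is needed before the closing bracket.
    digits <- (many1 digit <* spaces) `sepBy` (char ',' <* spaces)
    _ <- char ']'
    spaces
    return (map read digits)
```

Because every piece consumes its own trailing whitespace, the parser never has to backtrack over spaces it has already skipped.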

@Joehillen's suggestion will also work, but it requires some more type magic to use the token features of Parsec. Parsec's spaces matches zero or more characters that satisfy Data.Char.isSpace, which covers the standard ASCII whitespace characters (space, tab, newline, carriage return, form feed, vertical tab) as well as Unicode space characters.
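To see what spaces accepts, it can help to write an equivalent by hand. The sketch below assumes nothing beyond Data.Char.isSpace and Parsec's satisfy and skipMany; the names spaces' and asciiWhitespace are hypothetical, and spaces' omits the error label that the library version attaches.

```haskell
import Data.Char (isSpace)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical name: skips zero or more isSpace characters, like spaces does.
spaces' :: Parser ()
spaces' = skipMany (satisfy isSpace)

-- isSpace applied to space, tab, newline, carriage return, form feed
-- and vertical tab; each of these counts as whitespace.
asciiWhitespace :: [Bool]
asciiWhitespace = map isSpace " \t\n\r\f\v"
```

If your toy language treats newlines as significant, this is the place to substitute a narrower predicate than isSpace.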

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow