Question

I'm trying to write a parser for a simple language; basically right now it has literals, ifs, function application and not much else.

Here's the code I've got:

import Text.ParserCombinators.Parsec
import Control.Monad (liftM)

data Expr = Term Term
          | Apply Expr Expr
          | If Expr Expr Expr
          deriving (Show)

data Term = Bool Bool
          | Num Double
          | String String
          | Identifier String
          | Parens Expr
          deriving (Show)

sstring s = spaces >> string s
schar c = spaces >> char c

keyword k = do
  kw <- try (sstring k)
  notFollowedBy alphaNum
  return kw

pBool :: Parser Bool
pBool = do
  bool <- keyword "True" <|> keyword "False"
  case bool of
    "True" -> return True
    "False" -> return False

pDouble :: Parser Double
pDouble = do
  ds <- many1 digit
  dot <- optionMaybe $ char '.'
  case dot of
    Nothing -> return $ read ds
    _ -> do
      ds' <- many1 digit
      return $ read (ds ++ "." ++ ds')

pString :: Parser String
pString = do
  char '"'
  str <- many1 $ noneOf "\""
  char '"'
  return str

pIdentifier :: Parser String
pIdentifier = spaces >> many1 letter

pParens :: Parser Expr
pParens = do
  schar '('
  expr <- pExpr
  schar ')'
  return expr

pTerm :: Parser Term
pTerm = try (liftM Bool pBool)
  <|> try (liftM Num pDouble)
  <|> try (liftM String pString)
  <|> try (liftM Identifier pIdentifier)
  <|> try (liftM Parens pParens)

-- TODO: make this left-associative
pApply :: Parser Expr
pApply = do
  term <- pTerm'
  mApp <- spaces >> optionMaybe pApply
  return $ case mApp of
    Just app -> Apply term app
    Nothing -> term

-- pulls "parens" expressions out of terms
pTerm' :: Parser Expr
pTerm' = do
  term <- pTerm
  case term of 
    Parens expr -> return expr
    otherwise -> return $ Term term

pIf :: Parser Expr
pIf = do
  keyword "if"
  cond <- pExpr
  keyword "then"
  ifTrue <- pExpr
  keyword "else"
  ifFalse <- pExpr
  return $ If cond ifTrue ifFalse

pExpr :: Parser Expr
pExpr = try pIf <|> pApply

test parser = parse parser ""

Now, if I try to parse a single number expression in ghci, all is well:

> test pExpr "1"
Right (Term (Num 1.0))

Great! And many other things work too:

> test pExpr "1.234"
Right (Term (Num 1.234))
> test pApply "neg 1"
Right (Apply (Term (Identifier "neg")) (Term (Num 1.0)))
> test pExpr "f g 1"
Right (Apply (Term (Identifier "f")) (Apply (Term (Identifier "g")) (Term (Num 1.0))))

But now, if I try to parse an if statement, I get an error:

> test pIf "if 1 then 2 else 3"
Left (line 1, column 4):
unexpected "1"
expecting space, "if", "True", "False", letter or "("

This doesn't make sense to me! Let's step through this, looking at the rule for parsing an if statement:

We parse an "if" keyword (no problem). Then for the next parse (the 1), we need to parse pExpr, which itself can be an pIf or a pApply. Well it's not an if, so we try the apply, which itself tries pTerm', which tries pTerm, which tries a pBool, which fails, and then a pNum, which succeeds! Then pTerm succeeds with a Num 1.0, so pTerm' succeeds with a Term (Num 1.0), which means pExpr succeeds with a Term (Num 1.0), and that gets passed into the cond variable... right? Well, clearly not, because it's failing! I don't see why it should fail here.

Était-ce utile?

La solution

You have problems with not eating all the spaces, and the then and else are being interpreted as identifiers. A lexeme rule is handy for eating spaces after any token. Your pIdentifier needs to explicitly check that it hasn't gobbled up a reserved word. I fixed these problems, and took the liberty of using some of the existing combinators, and changed to applicative style...

import Text.ParserCombinators.Parsec
import Control.Applicative hiding ((<|>))

data Expr = Term Term
          | Apply Expr Expr
          | If Expr Expr Expr
          deriving (Show)

data Term = Bool Bool
          | Num Double
          | String String
          | Identifier String
          | Parens Expr
          deriving (Show)

keywords = ["if", "then", "else", "True", "False"]
lexeme p = p <* spaces
schar = lexeme . char

keyword k = lexeme . try $
  string k <* notFollowedBy alphaNum

pBool :: Parser Bool
pBool = (True <$ keyword "True") <|> (False <$ keyword "False")

pDouble :: Parser Double
pDouble = lexeme $ do
  ds <- many1 digit
  option (read ds) $ do
    char '.'
    ds' <- many1 digit
    return $ read (ds ++ "." ++ ds')

pString :: Parser String
pString = lexeme . between (char '"') (char '"') . many1 $ noneOf "\""

pIdentifier :: Parser String
pIdentifier = lexeme . try $ do
  ident <- many1 letter
  if ident `elem` keywords
    then unexpected $ "reserved word " ++ show ident
    else return ident

pParens :: Parser Expr
pParens = between (schar '(') (schar ')') pExpr

pTerm :: Parser Term
pTerm = choice [ Bool       <$> pBool
               , Num        <$> pDouble
               , String     <$> pString
               , Identifier <$> pIdentifier
               , Parens     <$> pParens
               ]

-- TODO: make this left-associative
pApply :: Parser Expr
pApply = do
  term <- pTerm'
  option term $
    Apply term <$> pApply

-- pulls "parens" expressions out of terms
pTerm' :: Parser Expr
pTerm' = do
  term <- pTerm
  case term of
    Parens expr -> return expr
    _ -> return $ Term term

pIf :: Parser Expr
pIf = If <$ keyword "if"   <*> pExpr 
         <* keyword "then" <*> pExpr
         <* keyword "else" <*> pExpr

pExpr :: Parser Expr
pExpr = pIf <|> pApply

test parser = parse (spaces *> parser <* eof) ""

Autres conseils

You need to make few changes.

pExpr :: Parser Expr
pExpr = try pIf <|> pTerm'


pIf :: Parser Expr
pIf = do
  keyword "if"
  spaces
  cond <- pExpr
  keyword "then"
  spaces
  ifTrue <- pExpr
  keyword "else"
  spaces
  ifFalse <- pExpr
  return $ If cond ifTrue ifFalse
Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top