Question

I want to parse assembly programs. I have a fixed format for parsing an assembly address: [ register + offset + label ] I implemented parsers for registers, offsets and labels. Now I want to create a parser which parses the whole address.

The combinations I want to accept:

[register]
[offset]
[label]
[register + offset]
[register + label]
[offset + label]
[register + offset + label]

And what I don't want to accept:

[]
[register offset]
[register + ]
...

Of course the simple solution is to have something like:

choice $ try (parseRegister >>= \r -> Address (Just r) Nothing Nothing)
       <|> try ...

But it is ugly and does not scale well with more types of elements. So I'm looking for a cleaner solution.

Était-ce utile?

La solution

If you reorder your table, you see it’s a series of choices:

[register + offset + label]
[register + offset        ]
[register          + label]
[register                 ]
[           offset + label]
[           offset        ]
[                    label]

The grammar for which might be written:

address      = '[' (register ('+' offset-label)? | offset-label) ']'
offset-label = offset ('+' label)? | label

Which in Applicative style is pretty straightforward, made only slightly noisy by wrapping everything in constructors:

parseAddress :: Parser Address
parseAddress = do
  (register, (offset, label)) <- between (char '[') (char ']') parseRegisterOffsetLabel
  return $ Address register offset label

parseRegisterOffsetLabel :: Parser (Maybe Register, (Maybe Offset, Maybe Label))
parseRegisterOffsetLabel = choice
  [ (,)
    <$> (Just <$> parseRegister)
    <*> option (Nothing, Nothing) (char '+' *> parseOffsetLabel)
  , (,) Nothing <$> parseOffsetLabel
  ]

parseOffsetLabel :: Parser (Maybe Offset, Maybe Label)
parseOffsetLabel = choice
  [ (,)
    <$> (Just <$> parseOffset)
    <*> option Nothing (char '+' *> (Just <$> parseLabel))
  , (,) Nothing . Just <$> parseLabel
  ]

If we add a couple of utility functions:

plus :: Parser a -> Parser a
plus x = char '+' *> x

just :: Parser a -> Parser (Maybe a)
just = fmap Just

We can clean up these implementations a bit:

parseRegisterOffsetLabel = choice
  [ (,)
    <$> just parseRegister
    <*> option (Nothing, Nothing) (plus parseOffsetLabel)
  , (,) Nothing <$> parseOffsetLabel
  ]

parseOffsetLabel = choice
  [ (,)
    <$> just parseOffset
    <*> option Nothing (plus (just parseLabel))
  , (,) Nothing <$> just parseLabel
  ]

Then factor out the repetition, giving us a decent final solution:

parseChain begin def rest = choice
  [ (,) <$> just begin <*> option def (plus rest)
  , (,) Nothing <$> rest
  ]

parseRegisterOffsetLabel = parseChain
  parseRegister (Nothing, Nothing) parseOffsetLabel

parseOffsetLabel = parseChain
  parseOffset Nothing (just parseLabel)

I’ll let you take care of whitespace around + and inside [].

Autres conseils

Something like that:

parsePlus = many1 (char ' ') >> char '+' >> many1 (char ' ')

parseRegisterModified = parsePlus >> parseOffsetLabel

parseOffsetModified = parsePlus >> parseLabel

parseRegister' = do
    Address r _ _ <- parseRegister 
    optionMaybe parseRegisterModified >>=
    return $ maybe 
           (Address r Nothing Nothing) 
           (\Address _ o l -> Address r o l) 

parseOffset' = do
    Address _ o _ <- parseOffset 
    optionMaybe parseOffsetModified >>=
    return $ maybe 
           (Address Nothing o Nothing) 
           (\Address _ _ l -> Address Nothing o l)

parseOffsetLabel = try parseOffset' <|> parseLabel

parseAddress = 
     try parseRegister'
     <|> parseOffset'
     <|> parseLabel

I've been looking for something like that and found Control.Applicative.Permutation from action-permutations. Though my case may scale independently from low-level platform.

In your case might look like

operand = do
    (r, o, l) <- runPermsSep (char '+') $ (,,)
        <$> maybeAtom register
        <*> maybeAtom offset
        <*> maybeAtom label
    -- backtrack on inappropriate combination
    when (null $ catMaybes [r, o, l]) . fail $ "operand expected"
    return (r, o, l)

Note that you actually want optional permutation parser that requires at least one optional element to be present which makes your wanted parsers combinator pretty specific.

You could have more elegant solution using Monoids and sepBy1.

But it allows to write [register + register] (in our case adding them both)

parsePlus = many1 (char ' ') >> char '+' >> many1 (char ' ')

parseAddress1 = 
     try parseRegister
     <|> parseOffset
     <|> parseLabel

parseAddress = sepBy1 parsePlus parseAddress1 >>= return . mconcat

instance Monoid Address where
   mempty  = Address Nothing Nothing Nothing
   Address r o l `mappend` Address r' o' l' = 
           Address (r `mappendA` r') (o `mappendA` o') (l `mappendA` l')
   where
         a `mappendA` a' = fmap getSum $ fmap Sum a `mappend` fmap Sum a'

Choosing Monoid (Sum a, First a, Last a) for r o l, we change the behavior:

Sum adds each other, First chooses first one, Last chooses the last one

   ... where
      a `mappendA` a' = getFirst $ First a `mappend` First a'
Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top