Question

I am trying to split a string on a delimiter consisting of multiple characters, but each of those characters can also occur on its own in the non-delimiter text. For example, I have foo*X*bar*X*baz, where the delimiter is *X*, so I want to get [foo, bar, baz], but each of those pieces can itself contain * or X.

I have tried

sepBy (many anyChar) delimiter

but that just swallows the whole string, giving "foo*X*bar*X*baz". If I do

sepBy anyChar (optional delimiter)

it filters out the delimiters correctly, but doesn't partition the list, returning "foobarbaz". I don't know which other combination I could try.


Solution

Perhaps you want something like this:

tok = (:) <$> anyToken <*> manyTill anyChar (try (() <$ string sep) <|> eof)

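The anyToken makes every token consume at least one character, which prevents us from looping forever at the end of input. The try makes a partially matched separator backtrack, so we avoid being over-eager in consuming separator characters such as a lone * or *X inside a token.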
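To make the role of try concrete, here is a minimal sketch that runs the same parser with and without it. The names tokWithTry, tokNoTry, and sample are made up for this illustration; the comparison relies on Parsec's string consuming input when it matches only a prefix of the separator.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

sep :: String
sep = "*X*"

-- Same shape as tok: try rewinds a partially matched separator.
tokWithTry :: Parser String
tokWithTry = (:) <$> anyToken <*> manyTill anyChar (try (() <$ string sep) <|> eof)

-- Hypothetical variant without try, for comparison only.
tokNoTry :: Parser String
tokNoTry = (:) <$> anyToken <*> manyTill anyChar ((() <$ string sep) <|> eof)

-- "*Xr" starts like the separator but is not one.
sample :: String
sample = "ba*Xr*X*baz"

withTry, noTry :: Either ParseError [String]
withTry = parse (many tokWithTry) "" sample  -- Right ["ba*Xr","baz"]
noTry   = parse (many tokNoTry) "" sample    -- Left: fails at the 'r' in "*Xr"
```

Without try, string sep has already consumed "*X" when it fails on the r, so neither eof nor anyChar is attempted and the whole parse is rejected.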

Full code for a test:

module ParsecTest where
import Control.Applicative ((<$), (<$>), (<*>))
import Data.List (intercalate)
import Text.Parsec
import Text.Parsec.String

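sep, msg :: String
sep = "*X*"
-- Sample input: every field contains a lone * or X that is not a separator.
msg = intercalate sep ["foXo", "ba*Xr", "bX*az"]

-- One field: at least one character, then everything up to the next
-- separator or the end of input.
tok :: Parser String
tok = (:) <$> anyToken <*> manyTill anyChar (try (() <$ string sep) <|> eof)

toks :: Parser [String]
toks = many tok

-- Evaluates to Right ["foXo","ba*Xr","bX*az"].
test :: Either ParseError [String]
test = runP toks () "" msg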
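
The same split can also be phrased with sepBy, which is closer to the attempts in the question. This is only a sketch of an alternative, not the approach of the answer above: the names field, fields, and splitOnSep are invented here, and it assumes that empty fields (for example between two adjacent separators) should be kept.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

sep :: String
sep = "*X*"

-- A field is any run of characters that does not begin a full separator.
-- notFollowedBy only looks ahead; it never consumes input.
field :: Parser String
field = many (notFollowedBy (string sep) *> anyChar)

-- Fields separated by the complete delimiter; empty fields are kept.
fields :: Parser [String]
fields = field `sepBy` string sep

-- Hypothetical helper: split a whole string, requiring all input to be used.
splitOnSep :: String -> Either ParseError [String]
splitOnSep = parse (fields <* eof) ""
```

With these definitions, splitOnSep "foo*X*bar*X*baz" yields Right ["foo","bar","baz"]. Because field stops only where the full separator matches or the input ends, the separator parser itself does not need a try here.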
Licensed under: CC-BY-SA with attribution