How do I use takeTill until a tab or newline in Haskell Attoparsec? (Problems with Boolean expressions)

StackOverflow https://stackoverflow.com/questions/22385456

Question

I'm writing my first Haskell program. The program parses ordinary CSV files, but I'm running into many issues that no doubt stem from my inexperience with the syntax.

Currently, the code parses one record successfully, but on the final field of a record the parser consumes the newline as well, so records on subsequent lines are not processed.

My proposed solution is to make fieldData stop at either a tab or a newline ('takeTill tab or newline'), but I don't know how to do this.

Current code:

fieldData = takeTill (== '\t')

Attempts:

fieldData = takeTill (== '\t' || '\n') -- wrong, something about infix precedence
fieldData = takeTill (== ('\t' || '\n')) -- wrong, type error
fieldData = takeTill ((== '\t') || (== '\n')) -- wrong, type error
fieldData x = takeTill ((x == '\t') || (x == '\n')) -- wrong, type error
fieldData x = takeTill x ((x == '\t') || (x == '\n')) -- wrong, not enough arguments

I feel that I have some fundamental misunderstanding of how to construct Boolean conditions in Haskell and would like help. For example, in ghci I can do let fun x = (x == 'a' || x == 'b') and it'll match different characters fine, so I'm clearly missing something when it comes to using it with a function.

Alternatively, is this even the correct approach? If this is not the right way to approach the problem I would appreciate pointers to the "correct" way.

Complete code below:

{- Parsing a tab-separated file using Attoparsec.
A record contains:
number\tname\tgenre\tabilities\tweapon\n

-}
import System.FilePath.Posix
import Data.Attoparsec.Char8
import Control.Applicative
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

data AbilitiesList = AbilitiesList String deriving Show

data PlayerCharacter = PlayerCharacter {
    id :: Integer,
    name :: String,
    genre :: String,
    abilities :: AbilitiesList,
    weapon :: String
} deriving Show

type Players = [PlayerCharacter]

fieldData = takeTill (== '\t')
tab = char '\t'

parseCharacter :: Parser PlayerCharacter
parseCharacter = do
    id <- decimal
    tab
    name <- fieldData
    tab
    genre <- fieldData
    tab
    abilities <- fieldData
    tab
    weapon <- fieldData
    return $ PlayerCharacter id (C.unpack name) (C.unpack genre) (AbilitiesList (C.unpack abilities)) (C.unpack weapon)

abilitiesFile :: FilePath
abilitiesFile = joinPath ["data", "ff_abilities.txt"]

playerParser :: Parser Players
playerParser = many $ parseCharacter <* endOfLine

main :: IO ()
main = B.readFile abilitiesFile >>= print . parseOnly playerParser

Solution

For this you probably want to use a lambda:

takeTill (\x -> x == '\t' || x == '\n')

A lambda function is an anonymous, one-use, inline function. You can use them just like normal functions, except they aren't bound to a name.
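To see the lambda in context, here is a minimal sketch of a parser that reads several records. It assumes a cut-down two-field record (number, tab, name, newline) rather than the five fields in the question, and it imports Data.Attoparsec.ByteString.Char8, which as far as I know is the current name for the module the question imports as Data.Attoparsec.Char8. The names record and records are made up for this example.

```haskell
import Control.Applicative (many)
import Data.Attoparsec.ByteString.Char8
    (Parser, char, decimal, endOfLine, parseOnly, takeTill)
import qualified Data.ByteString.Char8 as C

-- Stop at a tab (field separator) or a newline (record separator),
-- leaving the separator itself unconsumed.
fieldData :: Parser C.ByteString
fieldData = takeTill (\x -> x == '\t' || x == '\n')

-- Hypothetical cut-down record: number\tname
record :: Parser (Integer, String)
record = do
    n    <- decimal
    _    <- char '\t'
    name <- fieldData
    return (n, C.unpack name)

-- Each record is followed by a newline, as in the question's playerParser.
records :: Parser [(Integer, String)]
records = many (record <* endOfLine)

main :: IO ()
main = print (parseOnly records (C.pack "1\tCloud\n2\tTifa\n"))
-- prints: Right [(1,"Cloud"),(2,"Tifa")]
```

Because fieldData now stops before the newline, endOfLine can consume it and many moves on to the next record.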

You could also define a function

tabOrNL :: Char -> Bool
tabOrNL '\t' = True
tabOrNL '\n' = True
tabOrNL _    = False

-- Or equivalently

tabOrNL :: Char -> Bool
tabOrNL x = x == '\t' || x == '\n'

Then you could just do

takeTill tabOrNL

If you wanted to get really fancy, the Applicative instance for functions can come in handy here:

(<||>) :: Applicative f => f Bool -> f Bool -> f Bool
(<||>) = liftA2 (||)
infixr 2 <||>

Then you can just do

takeTill ((== '\t') <||> (== '\n'))

Or even

takeTill ((== '\t') <||> (== '\n') <||> (== ','))

That way you avoid the lambda or helper function entirely: <||> lets you "or together" several predicates as if they were values. You can do the same with (<&&>) = liftA2 (&&), but it's probably not as useful for you here.
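If it isn't obvious why this works: for functions, liftA2 applies both predicates to the same argument and combines the results, so (== '\t') <||> (== '\n') behaves like \x -> x == '\t' || x == '\n'. The sketch below checks that with plain base only. It uses break from the Prelude as a stand-in for takeTill so that it runs without attoparsec, and isSeparator is just a name picked for the example.

```haskell
import Control.Applicative (liftA2)

-- Same operator as above, repeated so the snippet compiles on its own.
(<||>) :: Applicative f => f Bool -> f Bool -> f Bool
(<||>) = liftA2 (||)
infixr 2 <||>

-- Here f is ((->) Char), so both predicates receive the same character
-- and their results are combined with (||).
isSeparator :: Char -> Bool
isSeparator = (== '\t') <||> (== '\n')

main :: IO ()
main = do
    print (map isSeparator "a\t\n")
    -- prints: [False,True,True]

    -- break splits a String at the first character satisfying the predicate;
    -- it is only a base-only stand-in for takeTill here.
    print (break isSeparator "Cloud\tSoldier\n")
    -- prints: ("Cloud","\tSoldier\n")
```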

OTHER TIPS

Another solution is to use elem to check if the character is in a list:

takeTill (`elem` "\t\n")

although I would only recommend it over the solutions above (by @bheklilr) when there are more values to check.
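As a rough illustration of the "more values" case, the sketch below compares an elem-based predicate with the equivalent chain of || comparisons. The separator set (tab, newline, carriage return, comma) and the names separators, viaElem and viaOr are assumptions made up for this example. Only base is needed.

```haskell
-- Hypothetical separator set, chosen only for this example.
separators :: String
separators = "\t\n\r,"

-- Membership test: adding a separator means editing one string.
viaElem :: Char -> Bool
viaElem = (`elem` separators)

-- The same test written out by hand.
viaOr :: Char -> Bool
viaOr x = x == '\t' || x == '\n' || x == '\r' || x == ','

main :: IO ()
main = do
    -- Both predicates agree on every character of the sample.
    print (all (\c -> viaElem c == viaOr c) "a b\t\n\r,;")
    -- prints: True
    print (filter viaElem "a,b\tc\n")
    -- prints: ",\t\n"
```

Either predicate can be passed straight to takeTill, since both have type Char -> Bool.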

Licensed under: CC-BY-SA with attribution