Question

I'm a Haskell beginner trying to wrap my head around the conduit library.

I've tried something like this, but it does not compile:

import Data.Conduit
import Data.Conduit.Binary as CB
import Data.ByteString.Char8 as BS

numberLine :: Monad m => Conduit BS.ByteString m BS.ByteString
numberLine = conduitState 0 push close
  where
    push lno input = return $ StateProducing (lno + 1) [BS.pack (show lno ++ BS.unpack input)]
    close state = return state

main = do
  runResourceT $ CB.sourceFile "wp.txt" $= CB.lines $= numberLine $$ CB.sinkFile "test.txt"

It seems that the state in conduitState must be of the same type as the conduit's input type. Or at least that's what I understand from the error message:

$ ghc --make exp.hs
[1 of 1] Compiling Main             ( exp.hs, exp.o )

exp.hs:8:27:
    Could not deduce (Num [ByteString]) arising from the literal `0'
    from the context (Monad m)
      bound by the type signature for
                 numberLine :: Monad m => Conduit ByteString m ByteString
      at exp.hs:(8,1)-(11,30)
    Possible fix:
      add (Num [ByteString]) to the context of
        the type signature for
          numberLine :: Monad m => Conduit ByteString m ByteString
      or add an instance declaration for (Num [ByteString])
    In the first argument of `conduitState', namely `0'
    In the expression: conduitState 0 push close
    In an equation for `numberLine':
        numberLine
          = conduitState 0 push close
          where
              push lno input
                = return
                  $ StateProducing (lno + 1) [pack (show lno ++ unpack input)]
              close state = return state

How can this be done using conduits? I want to read lines from a file and append a line number to each line.


Solution

close state = return state

Herein lies the type error. Your close function should have type (state -> m [output]), as per the docs. Because your close returns the state itself, the type checker unifies state with [BS.ByteString], and the literal 0 then requires a Num [ByteString] instance, which is exactly what the error message reports. In your case state = Int (you may want to add a type annotation to make sure Int is selected) and output = BS.ByteString. At the point of closing the conduit you have not saved any ByteStrings to produce, so just return the empty list:

close _ = return []

Especially note from the docs for that argument:

The state need not be returned, since it will not be used again
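To see why close has that shape, here is a minimal list-based model of the push/close contract. It is only a sketch and not the conduit library's API: runStateful and numberLines are hypothetical names, String stands in for ByteString so that only base is needed, and a space is inserted between the number and the line for readability.

```haskell
-- A list-based model of a stateful conduit (illustrative only, not conduit's API).
-- push:  consumes one input, returns the new state plus any outputs.
-- close: called once when the input ends; it returns only final outputs,
--        because the state is never used again.
runStateful :: s -> (s -> i -> (s, [o])) -> (s -> [o]) -> [i] -> [o]
runStateful st _    close []       = close st
runStateful st push close (x : xs) =
  let (st', outs) = push st x
  in  outs ++ runStateful st' push close xs

-- Number each line, starting at 0 as in the question.
numberLines :: [String] -> [String]
numberLines = runStateful (0 :: Int) push close
  where
    push lno input = (lno + 1, [show lno ++ " " ++ input])
    close _        = []  -- nothing is buffered, so nothing is left to emit
```

In this model, returning the state from close would force the state type to equal the output list type, which mirrors the error in the question.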

OTHER TIPS

Yes, it can be done. I prefer to use the helper functions in Data.Conduit.List and also avoid Data.ByteString.Char8 if at all possible. I'm assuming your file is UTF-8 encoded.

import Data.Conduit
import Data.Conduit.Binary as CB
import Data.Conduit.List as Cl
import Data.Conduit.Text as Ct
import Data.Monoid ((<>))
import Data.Text as T

numberLine :: Monad m => Conduit Text m Text
numberLine = Cl.concatMapAccum step 0 where
  format input lno = T.pack (show lno) <> T.pack " " <> input <> T.pack "\n"
  step input lno = (lno+1, [format input lno])

main :: IO ()
main =
  runResourceT
     $ CB.sourceFile "wp.txt"
    $$ Ct.decode Ct.utf8
    =$ Ct.lines
    =$ numberLine
    =$ Ct.encode Ct.utf8
    =$ CB.sinkFile "test.txt"
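The accumulator pattern behind step can be tried out without conduit at all. The sketch below uses mapAccumL from Data.List on a plain list of Strings; numberFrom is a hypothetical helper, and the starting line number is made a parameter as an assumption for illustration. Note that mapAccumL passes the accumulator first and the element second, the reverse of the argument order that step uses above.

```haskell
import Data.List (mapAccumL)

-- Prefix each line with its number, counting up from `start`.
-- The accumulator is the next line number to use.
numberFrom :: Int -> [String] -> [String]
numberFrom start = snd . mapAccumL step start
  where
    -- accumulator first, element second (unlike the conduit step above)
    step lno input = (lno + 1, show lno ++ " " ++ input)
```

For example, numberFrom 1 applied to the lines of a file numbers them from 1 instead of 0.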

An alternative solution with pipes 3.0, though it uses String instead of ByteString. The main advantage, in my mind, is being able to use the normal state monad methods get and put. Another benefit is that the starting line number is not hidden inside addLineNumber (the counterpart of numberLine), so it is easy to start at any given line number.

import System.IO
import Data.Monoid ((<>))
import Control.Proxy
import qualified Control.Proxy.Trans.State as S

addLineNumber r = forever $ do
    n <- S.get
    line <- request r -- request line from file
    respond $ show n <> " " <> line
    S.put (n + 1) -- increments line counter

main = 
    withFile "wp.txt" ReadMode    $ \fin  ->
    withFile "test.txt" WriteMode $ \fout ->
    runProxy $ S.execStateK 1 -- start line numbering at 1
             $ hGetLineS fin >-> addLineNumber >-> hPutStrLnD fout

For more fine-grained resource management, see the announcement blog post for pipes-safe.

Licensed under: CC-BY-SA with attribution