Question

Say I'm going to open a file and parse its contents, and I want to do that lazily:

parseFile :: FilePath -> IO [SomeData]
parseFile path = openBinaryFile path ReadMode >>= parse' where
    parse' handle = hIsEOF handle >>= \eof -> do
        if eof then hClose handle >> return []
               else do
                   first <- parseFirst handle
                   rest  <- unsafeInterleaveIO $ parse' handle
                   return (first : rest)

The above code is fine if no error occurs during the whole reading process. But if an exception is thrown, hClose never runs, and the handle is not closed properly.

Usually, if the IO process isn't lazy, exception handling is easily solved with catch or bracket. In this case, however, the normal exception handling methods will close the file handle before the actual reading starts, which is of course not acceptable.

So what is the common way to release a resource that has to outlive its scope because of laziness, as in my case, while still ensuring exception safety?


Solution

Instead of using openBinaryFile, you could use withBinaryFile:

parseFile :: FilePath -> ([SomeData] -> IO a) -> IO a
parseFile path f = withBinaryFile path ReadMode $ \h -> do
    values <- parse' h
    f values
  where
    parse' = ... -- same as now
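
To make the continuation-passing version concrete, here is a minimal sketch of how it might be used. The record type (one line of text per record) and the parseFirst stand-in are assumptions for illustration only, not the question's real parser. The point it tries to show is that everything the callback needs must be forced before the callback returns, because withBinaryFile closes the handle right after.

```haskell
import Control.Exception (evaluate)
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for the question's SomeData: one line per record.
type SomeData = String

-- Hypothetical stand-in for the question's parseFirst.
parseFirst :: Handle -> IO SomeData
parseFirst = hGetLine

parseFile :: FilePath -> ([SomeData] -> IO a) -> IO a
parseFile path f = withBinaryFile path ReadMode $ \h -> parse' h >>= f
  where
    parse' h = do
        eof <- hIsEOF h
        if eof
            then return []  -- no hClose needed: withBinaryFile closes the handle
            else do
                first <- parseFirst h
                rest  <- unsafeInterleaveIO (parse' h)
                return (first : rest)

-- Counts the records. evaluate forces the length while the handle is
-- still open, so no unevaluated part of the list escapes the callback.
countRecords :: FilePath -> IO Int
countRecords path = parseFile path (evaluate . length)
```

If the callback returned the list itself without forcing it (for example parseFile path return), the remaining records would be read only after the handle is closed, and forcing them would fail with an IOException.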

However, I'd strongly recommend you consider using a streaming data library instead, as they are designed to work with this kind of situation and handle exceptions properly. For example, with conduit, your code would look something like:

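import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Resource (MonadResource)
import Data.Conduit (Producer, bracketP, yield)
import System.IO

-- Newer conduit releases usually spell this type ConduitT i SomeData m ().
parseFile :: MonadResource m => FilePath -> Producer m SomeData
parseFile path = bracketP
    (openBinaryFile path ReadMode)
    hClose
    loop
  where
    -- Handle operations live in IO, so they are lifted into the conduit monad.
    loop handle = do
        eof <- liftIO $ hIsEOF handle
        if eof
            then return ()
            else liftIO (parseFirst handle) >>= yield >> loop handle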

And if you instead rewrite your parseFirst function to use conduit itself and not drop down to the Handle API, this glue code would be shorter, and you wouldn't be tied directly to Handle, which makes it easier to use other data sources and perform testing.
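
As a rough sketch of what that could look like, assume a made-up format where each record is a single decimal integer on its own line. The SomeData definition, parseRecord, parseRecords and firstRecords below are hypothetical names for illustration; the conduit functions come from the Conduit module of the conduit package.

```haskell
import Conduit
import qualified Data.ByteString.Char8 as B8

-- Hypothetical record type: one integer per line.
newtype SomeData = SomeData Int deriving (Show, Eq)

-- Hypothetical per-record parser; a line that is not exactly an Int is skipped.
parseRecord :: B8.ByteString -> Maybe SomeData
parseRecord line = case B8.readInt line of
    Just (n, rest) | B8.null rest -> Just (SomeData n)
    _                             -> Nothing

-- Pure parsing stage: knows nothing about files or Handles.
parseRecords :: Monad m => ConduitT B8.ByteString SomeData m ()
parseRecords = linesUnboundedAsciiC .| concatMapC parseRecord

-- File-backed source: sourceFile registers the cleanup that closes the file.
parseFile :: MonadResource m => FilePath -> ConduitT i SomeData m ()
parseFile path = sourceFile path .| parseRecords

-- Takes only the first record; the file is still released by runConduitRes.
firstRecords :: FilePath -> IO [SomeData]
firstRecords path = runConduitRes (parseFile path .| takeC 1 .| sinkList)
```

Because parseRecords is independent of the data source, it can be fed from an in-memory list in tests, for example with yieldMany and runConduitPure.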

The conduit tutorial is available on the School of Haskell.

UPDATE: One thing I forgot to mention is that, while the question focuses on exceptions preventing the file from being closed, even non-exceptional situations lead to the same result if you don't completely consume the input. For example, if your file has more than one record and you only force evaluation of the first one, the file will not be closed until the garbage collector is able to reclaim the handle. That is yet another reason to use either withBinaryFile or a streaming data library.
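
A small experiment can make this visible. The sketch below uses a lazy reader in the style of the question, with the assumption (for illustration only) that each record is one line of text, and checks hIsOpen after forcing just the first record and again after forcing the whole list. The names lazyRecords and openAfterPartialRead are made up for this example.

```haskell
import Control.Exception (evaluate)
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Lazy reader in the style of the question; hypothetical format: one line per record.
lazyRecords :: Handle -> IO [String]
lazyRecords h = do
    eof <- hIsEOF h
    if eof
        then hClose h >> return []
        else do
            first <- hGetLine h
            rest  <- unsafeInterleaveIO (lazyRecords h)
            return (first : rest)

-- Reports whether the handle is still open after forcing only the first
-- record, and again after forcing the whole list.
openAfterPartialRead :: FilePath -> IO (Bool, Bool)
openAfterPartialRead path = do
    h <- openBinaryFile path ReadMode
    records <- lazyRecords h
    _ <- evaluate (length (take 1 records))  -- forces only the first cons cell
    afterFirst <- hIsOpen h
    _ <- evaluate (length records)           -- forces the rest, reaching EOF and hClose
    afterAll <- hIsOpen h
    return (afterFirst, afterAll)
```

For a non-empty file, the handle is still open after the partial read and is closed only once the list has been forced to the end.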

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow