Question

I'm trying to write a program that takes two integers on the command line and does something interesting with them. I would like to write the reading and parsing of the integers as simply and imperatively as possible, since it should be relatively simple code.

The problem that I'm facing is that in Haskell handling errors is not so simple. It seems that in Haskell pattern matching is often used. This seems to make the code slightly harder to follow than the imperative version.

The program would be run like this (in this example it just adds together the two numbers):

$ ./my_prog
ERROR: Must run like `./my_prog NUM_A NUM_B`.
$ ./my_prog cat 1
ERROR: Could not parse NUM_A "cat" as integer
$ ./my_prog 10 dog
ERROR: Could not parse NUM_B "dog" as integer
$ ./my_prog 10 1
11

Here is what I would like to do in imperative pseudo-Python:

import sys

def die(errorMessage):
    print("ERROR: %s" % errorMessage)
    sys.exit(1)

def main():
    # sys.argv[0] is the program name, so two arguments give a length of 3.
    if len(sys.argv) != 3:
        die("Must run program like `%s NUM_A NUM_B`" % sys.argv[0])

    num_a = 0
    num_b = 0

    try:
        num_a = int(sys.argv[1])
    except ValueError:
        die("Could not parse NUM_A \"%s\" as integer" % sys.argv[1])

    try:
        num_b = int(sys.argv[2])
    except ValueError:
        die("Could not parse NUM_B \"%s\" as integer" % sys.argv[2])

    doSomethingInteresting(num_a, num_b)

def doSomethingInteresting(num_a, num_b):
    print(num_a + num_b)

main()

In Python you can basically read the main function from top to bottom, and all the error handling is straightforward. Is there a way to implement this simple, straightforward error handling in Haskell without doing multiple pattern matches?

Here is the Haskell code I came up with that does this same task, but it seems much more complicated than the Python code because of the multiple pattern matching sections.

module Main ( main
            )
            where

import System.Environment (getArgs, getProgName)
import System.Exit (ExitCode(..), exitWith)
import Text.Read (readMaybe)

die :: String -> IO a
die err = do putStrLn $ "ERROR: " ++ err
             exitWith (ExitFailure 1)

main :: IO ()
main = do
        args <- getArgs
        progName <- getProgName
        case args of
            [strNumA, strNumB] -> do
                let maybeNumA = readMaybe strNumA :: Maybe Int
                    maybeNumB = readMaybe strNumB :: Maybe Int
                checkMaybeArgs strNumA maybeNumA strNumB maybeNumB
            _ -> die ("Must run like `" ++ progName ++ " NUM_A NUM_B`.")
    where
        checkMaybeArgs :: String -> Maybe Int -> String -> Maybe Int -> IO ()
        checkMaybeArgs badStrNumA Nothing _ _ =
            die ("Could not parse NUM_A \"" ++ badStrNumA ++ "\" as integer")
        checkMaybeArgs _ _ badStrNumB Nothing =
            die ("Could not parse NUM_B \"" ++ badStrNumB ++ "\" as integer")
        checkMaybeArgs _ (Just numA) _ (Just numB) = doSomethingInteresting numA numB

        doSomethingInteresting :: Int -> Int -> IO ()
        doSomethingInteresting numA numB = print $ numA + numB

(Also if there is anything else wrong with my Haskell style I would be very grateful for any corrections.)

edit: I recently found a blog post talking about the many different ways to handle exceptions in Haskell. It is somewhat related:

http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors


Solution

Here's how I'd write this (without using any external libraries):

import Text.Read
import Text.Printf
import System.Environment
import Control.Monad
import System.Exit


parsingFailure :: String -> String -> IO a
parsingFailure name val = printf "ERROR: Couldn't parse %s : %s as an integer\n" 
                                  name val >> exitWith (ExitFailure 1)

validateArgs :: [String] -> IO (Integer, Integer)
validateArgs [a, b] = liftM2 (,) (parse "VAL A" a) (parse "VAL B" b)
  where parse name s = maybe (parsingFailure name s) return $ readMaybe s
validateArgs _      = putStrLn "Wrong number of args" >> exitWith (ExitFailure 1)

main :: IO ()
main = do
  (a, b) <- getArgs >>= validateArgs
  print $ a + b

The interesting bit of course being validateArgs. First we do a single pattern match, but from then on we just use the maybe combinator to nicely abstract away our pattern matching. This results in far cleaner code IMO. The maybe combinator takes a default value b and a continuation a -> b and unwraps Maybe a to b. In this case, our default value is a parsing failure, and our continuation injects a into the IO monad.
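To make the role of maybe more concrete, here is a minimal sketch that writes the same logic twice, once with an explicit case expression and once with maybe. The names describeCase and describeMaybe are invented for this illustration and are not part of the answer above.

```haskell
import Text.Read (readMaybe)

-- Explicit pattern match on the Maybe value.
describeCase :: String -> String
describeCase s = case readMaybe s :: Maybe Int of
  Nothing -> "no parse"
  Just n  -> "parsed " ++ show n

-- The same logic with the maybe combinator: the first argument is the
-- result for Nothing, the second is applied to the value inside Just.
describeMaybe :: String -> String
describeMaybe s =
  maybe "no parse" (\n -> "parsed " ++ show n) (readMaybe s :: Maybe Int)

main :: IO ()
main = do
  putStrLn (describeCase "10")
  putStrLn (describeMaybe "10")
  putStrLn (describeMaybe "cat")
```

Both functions return the same string for every input; maybe just hides the case expression behind a function call.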

Other tips

Although I like @jozefg's solution for its brevity, I dislike that it uses exitWith. Sure, it works fine for an example as small as this, but in general I think it's a bad idea to terminate the program "prematurely" like that. It's a property I dislike in the Python version as well. In any case, my solution follows, and I'll explain a few parts of it below.

import System.Environment (getProgName, getArgs)
import Text.Read (readMaybe)

doSomethingInteresting :: Int -> Int -> IO ()
doSomethingInteresting a b = print (a + b)

readWith :: Read a => (String -> e) -> String -> Either e a
readWith err str = maybe (Left $ err str) return $ readMaybe str


main = do
  progName <- getProgName
  args     <- getArgs

  either putStrLn id $ do
    (a,b) <- case args of
               [a,b] -> return (a,b)
               _     -> Left ("Must run program like `" ++ progName ++ " NUM_A NUM_B`")

    num_a <- readWith (\s -> "Could not parse NUM_A \"" ++ s ++ "\" as integer") a
    num_b <- readWith (\s -> "Could not parse NUM_B \"" ++ s ++ "\" as integer") b

    return $ doSomethingInteresting num_a num_b

I know there's a pattern match in this, but that is really the cleanest way to express what I want. If the list has two elements, I want them out of it. If it doesn't, I want an error message. There is no neater way of expressing that.

But there are a few other things going on here that I want to highlight.

Single exit point

First, as I've already mentioned, the program does not have several "exit points", i.e. it doesn't terminate in the middle of something. Imagine that the main function was not the main function but something deeper down, and a function above it had some files open. You would want to exit the proper way so that the files get closed and so on. That's why I chose not to terminate the program, but rather to let it run its course and simply skip the calculation if it doesn't have the numbers.

Just terminating the program is rarely a good idea. Often, you have things you want to save, files you want to close, state you want to restore and so on.

Either annotates errors

It might not be very evident for a beginner, but what readWith basically does is try to read something, and if it succeeds, it returns Right <read value>. If it fails, it will return Left <error message>, where the error message can depend on what string it tried to read.

So, given a suitable errorMessage function,

λ> readWith errorMessage "15" :: Either String Int
Right 15

while

λ> readWith errorMessage "crocodile" :: Either String Int
Left "Cannot read \"crocodile\" as an integer, dummy!"
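The session above does not define errorMessage. A minimal sketch that reproduces it, assuming a hypothetical errorMessage with the wording shown, could look like this. The result type is spelled out explicitly because, without an annotation, GHCi's defaulting rules would not pick an integer type for the value being read.

```haskell
import Text.Read (readMaybe)

-- readWith as defined in the answer above.
readWith :: Read a => (String -> e) -> String -> Either e a
readWith err str = maybe (Left $ err str) return $ readMaybe str

-- Hypothetical error-message builder; the wording is only an illustration.
errorMessage :: String -> String
errorMessage s = "Cannot read " ++ show s ++ " as an integer, dummy!"

main :: IO ()
main = do
  print (readWith errorMessage "15" :: Either String Int)
  print (readWith errorMessage "crocodile" :: Either String Int)
```

Because the error builder receives the offending string, each call site can decide how much context to put into its message.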

The Either type is great when you want to transport a value but an error might occur along the way, and you want to keep the error message around until you know what to do about it. The community consensus is that Right should indicate the "correct" value and Left should indicate that some error occurred.

Either values are a sort of more controlled (read: better) exception system.

do syntax is your friend

The do syntax is very handy for handling errors. As soon as any computation results in a Left value, Haskell will break out of the do block with the error message carried by the Left. This means that the entire inner do block will in this case either result in something similar to

Right (print 41)

or something like

Left "Could not parse NUM_A \"crocodile\" as integer"

The either function then makes sure to either print the error message or just return the IO action that is the Right value. It might be weird coming from Python that you can store a print action as a value, but that's normal in Haskell. We say that I/O actions are first class citizens in Haskell. In other words, we can pass them around and store them in data structures, and then we decide ourselves when we want them executed.
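As a rough sketch of both ideas, the following stores I/O actions in a list and returns one from an Either computation, then decides separately when to run them. The names actions, safeDiv and plan are made up for this example.

```haskell
-- A list of IO actions: nothing is printed when the list is built.
actions :: [IO ()]
actions = [putStrLn "first", putStrLn "second", print (40 + 1 :: Int)]

-- Hypothetical helper that can fail with an error message.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- An Either computation whose successful result is an IO action.
plan :: Int -> Int -> Either String (IO ())
plan x y = do
  q <- safeDiv x y   -- a Left here skips the rest of the block
  return (print q)

main :: IO ()
main = do
  sequence_ (reverse actions)     -- we choose the order of execution
  either putStrLn id (plan 10 2)  -- runs the stored action
  either putStrLn id (plan 10 0)  -- prints the error message instead
```

The last two lines have the same shape as the main function in the answer: either picks between printing the Left message and running the action carried by Right.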

Here's my go at it:

module Main where

import Text.Read (readMaybe)
import System.Environment (getProgName, getArgs)

main = getArgs >>= \argv -> case argv of
  [x, y] -> case (readMaybe x, readMaybe y) of
    (Nothing, _      ) -> error $
      "ERROR: Could not parse NUM_A " ++ show x ++ " as integer"
    (_      , Nothing) -> error $
      "ERROR: Could not parse NUM_B " ++ show y ++ " as integer"
    (Just a , Just b ) -> print $ a + b
  _ -> do
    pname <- getProgName
    error $ "ERROR: Must run like `" ++ pname ++ " NUM_A NUM_B`."

Notice how you can pattern match on a tuple to match on multiple expressions at once, saving you from needing multiple embedded case expressions.

Here's my take on it:

import System.Environment
import Text.Printf
import Text.Read


main :: IO ()
main = do
  args <- getArgs
  either putStrLn (uncurry doSomethingInteresting) $ parseArgs args


doSomethingInteresting :: Int -> Int -> IO ()
doSomethingInteresting a b = print $ a + b


parseArgs :: [String] -> Either String (Int, Int)
parseArgs [a, b] = do
  a' <- parseArg "A" a
  b' <- parseArg "B" b
  return (a', b')
-- Left rather than fail: Either has no MonadFail instance in recent GHC.
parseArgs _ = Left "Wrong number of args"


parseArg :: String -> String -> Either String Int
parseArg name arg
  | Just x <- readMaybe arg = return x
  | otherwise = Left $ printf "Could not parse NUM_%s \"%s\" as integer" name arg

Or for Applicative fans, another way of writing parseArgs:

parseArgs :: [String] -> Either String (Int, Int)
parseArgs [a, b] = (,) <$> parseArg "A" a <*> parseArg "B" b
parseArgs _ = Left "Wrong number of args"
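To check that the two styles behave the same, here is a small self-contained sketch that puts them side by side. The names parseArgsDo and parseArgsAp are invented for the comparison, and errors are reported with Left, since Either has no MonadFail instance in recent versions of base.

```haskell
import Text.Read (readMaybe)

parseArg :: String -> String -> Either String Int
parseArg name arg =
  maybe (Left ("Could not parse NUM_" ++ name ++ " " ++ show arg ++ " as integer"))
        Right
        (readMaybe arg)

-- Monadic (do-notation) version.
parseArgsDo :: [String] -> Either String (Int, Int)
parseArgsDo [a, b] = do
  a' <- parseArg "A" a
  b' <- parseArg "B" b
  return (a', b')
parseArgsDo _ = Left "Wrong number of args"

-- Applicative version.
parseArgsAp :: [String] -> Either String (Int, Int)
parseArgsAp [a, b] = (,) <$> parseArg "A" a <*> parseArg "B" b
parseArgsAp _ = Left "Wrong number of args"

main :: IO ()
main = mapM_ compareBoth [["10", "1"], ["cat", "1"], ["10", "dog"], ["cat", "dog"], []]
  where
    -- Print the result of each version for the same input.
    compareBoth args = do
      print (parseArgsDo args)
      print (parseArgsAp args)
```

For Either, both versions stop at the first Left, so when both arguments are bad only the NUM_A error is reported.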
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow