Question

Sorry for the somewhat general question. I am new to Haskell, and I am the type of person who learns by diving into a problem and figuring out the required pieces as I go along. So, I've started developing a Haskell module that aims to do simple linear regression. Here is my directory setup:

mymod/
- mymod.cabal
- src/
-- MyMod/
--- Linear.hs
--- Linear/
---- Regression.hs
--- Data.hs
--- Data/
---- Examples.hs
- tst/

My cabal file looks like this:

library
  exposed-modules:     MyLib.Linear, MyLib.Linear.Classifier, 
                       MyLib.Data, MyLib.Data.Examples
  build-depends:       base == 4.6.*
  hs-source-dirs:      src

Right now, I am writing the Examples module, which is essentially a CSV file parser. That looks like this:

module Examples (load) where

import Text.ParserCombinators.Parsec
import Control.Applicative

examples = line `endBy` eol
line = cell `sepBy` (char ',')

cell :: GenParser Char st Double
cell = rd <$> many1 (noneOf ",\n")
    where rd = read :: String -> Double

eol = char '\n'

load :: String -> Either ParseError [[Double]]
load input = parse examples "(unknown)" input

This was the first piece of the system that I wrote. I tested this by using ghci and :l Examples.hs from mylib/src/MyLib/Data/ followed by load "5\n" and verified the result. Now I want to start writing the regression logic, but I want to test this code in conjunction with the CSV parser I already wrote. How do people typically go about testing code like this?

For instance, in Java I would typically create a new package with a class that has a main method in it. With Java, this is straightforward to me because I understand how the classpath works and can direct the compiler to look for the classes that I want to run. How do I do this in Haskell?

Thanks!


Solution

Essentially there are four approaches: writing tests, writing executables, experimenting in the REPL (GHCi), and writing benchmarks. Luckily, the latest Cabal (1.18) supports all of them. For reference, I also have a project that exhibits some of them.

Tests

Probably the best approach when you have a feature you need to test is to write a unit test. Accumulating tests as your project grows is key to its reliability.

There are three main frameworks out there: HUnit for unit testing, QuickCheck for property testing, and doctest for testing examples from doc comments. There are also some frameworks, such as HTF, that unite HUnit and QuickCheck and relieve you of some of their boilerplate.

In Cabal you can define test-suites as separate compilation units with their own settings. Here is an example. You can then run them with cabal test.
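As a rough sketch of what such a test suite might look like for the CSV parser in the question, the file below is a plain exit-code test driver that needs no test framework. The file path, the stanza name examples-test, and the check and parsed helpers are all assumptions made for illustration. The parser is copied inline from the question only so the file compiles on its own; inside the project you would import it from the library instead.

```haskell
-- tst/Main.hs (hypothetical path; the question only shows an empty tst/ directory)
--
-- Assumed stanza in mymod.cabal (the name "examples-test" is made up):
--   test-suite examples-test
--     type:           exitcode-stdio-1.0
--     hs-source-dirs: tst
--     main-is:        Main.hs
--     build-depends:  base, parsec
module Main (main) where

import Control.Monad (unless)
import System.Exit (exitFailure)
import Text.ParserCombinators.Parsec

-- Inlined copy of the parser from the question, so this sketch stands alone.
-- In the real project, add the library to build-depends and write
--   import MyLib.Data.Examples (load)
-- instead of these definitions.
examples :: GenParser Char st [[Double]]
examples = line `endBy` eol

line :: GenParser Char st [Double]
line = cell `sepBy` char ','

cell :: GenParser Char st Double
cell = fmap read (many1 (noneOf ",\n"))

eol :: GenParser Char st Char
eol = char '\n'

load :: String -> Either ParseError [[Double]]
load input = parse examples "(unknown)" input

-- Hypothetical helper: drop the error details so results compare with (==).
parsed :: String -> Maybe [[Double]]
parsed = either (const Nothing) Just . load

-- Hypothetical helper: report one named check and return whether it passed.
check :: String -> Bool -> IO Bool
check name ok = do
  putStrLn ((if ok then "PASS " else "FAIL ") ++ name)
  return ok

main :: IO ()
main = do
  results <- sequence
    [ check "single cell"      (parsed "5\n" == Just [[5.0]])
    , check "two rows"         (parsed "1,2\n3,4\n" == Just [[1.0, 2.0], [3.0, 4.0]])
    , check "missing newline"  (parsed "1,2" == Nothing)
    ]
  -- With type exitcode-stdio-1.0, a non-zero exit code marks the suite as failed.
  unless (and results) exitFailure
```

Once the checks grow beyond a handful, swapping the hand-rolled check helper for HUnit or QuickCheck is the usual next step; the stanza stays the same apart from the extra build-depends entry.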

Executables

There are some cases when testing does not really cover the requirements. A standard case is a program that demonstrates how the library is supposed to be used. Here is an example.

You may also use executables as sandboxes for testing the APIs of your library, but again, a wiser approach is to write tests.

You can run executables with cabal run [name], where the "name" specifies the executable name if there's a need to disambiguate (i.e., when you have more than one).
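To make the sandbox idea concrete, here is a minimal sketch of an executable that fits a line to a few rows of numbers, roughly the Haskell counterpart of a Java class with a main method. Everything in it is an assumption for illustration: the stanza name mymod-demo, the demo directory, and the toPoints and fitLine functions (an ordinary least-squares fit, not the asker's actual regression code). The rows are hard-coded so the file runs on its own; in the project they would come from load.

```haskell
-- demo/Main.hs (hypothetical path)
--
-- Assumed stanza in mymod.cabal (the name "mymod-demo" is made up):
--   executable mymod-demo
--     hs-source-dirs: demo
--     main-is:        Main.hs
--     build-depends:  base
module Main (main) where

-- Turn rows of cells into (x, y) pairs, skipping rows that do not
-- have exactly two cells.
toPoints :: [[Double]] -> [(Double, Double)]
toPoints rows = [ (x, y) | [x, y] <- rows ]

-- Ordinary least-squares fit of y = slope * x + intercept.
-- Returns Nothing when the slope is undefined (no points, or all x equal).
fitLine :: [(Double, Double)] -> Maybe (Double, Double)
fitLine pts
  | denom == 0 = Nothing
  | otherwise  = Just (slope, intercept)
  where
    n         = fromIntegral (length pts)
    sx        = sum (map fst pts)
    sy        = sum (map snd pts)
    sxy       = sum [ x * y | (x, y) <- pts ]
    sxx       = sum [ x * x | (x, _) <- pts ]
    denom     = n * sxx - sx * sx
    slope     = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n

main :: IO ()
main = do
  -- Stand-in for the result of the question's load function.
  let rows = [[1, 3], [2, 5], [3, 7]]
  print (fitLine (toPoints rows))  -- prints: Just (2.0,1.0)
```

With a stanza like the one in the header comment, cabal run mymod-demo would build and run it. Adding the library to build-depends lets the executable import MyLib.Data.Examples and feed real parsed data to the fit.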

REPL (GHCi)

The main benefit of the REPL is that it lets you experiment with the APIs of your project's modules in a "live" mode: you load internal modules, run their functions, and reload them after changes. This can be useful for analysing an API, but I personally find that the two approaches above cover most of what I need from it.

You can run GHCi on your project with cabal repl [name].

Benchmarks

Criterion is the dominant library for benchmarking. As with the above, you can declare your benchmark executables in Cabal with a benchmark [name] block. You can then run them with cabal bench.
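A benchmark file for Criterion could look something like the sketch below. It measures the read-based conversion that the question's cell parser relies on. The stanza name, the file path, and the parseCells and sampleCells helpers are assumptions for illustration, and the file needs the criterion package in build-depends to compile.

```haskell
-- bench/Main.hs (hypothetical path)
--
-- Assumed stanza in mymod.cabal (the name "mymod-bench" is made up):
--   benchmark mymod-bench
--     type:           exitcode-stdio-1.0
--     hs-source-dirs: bench
--     main-is:        Main.hs
--     build-depends:  base, criterion
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, nf)

-- Hypothetical function under test: the same String -> Double conversion
-- the question's cell parser performs with read.
parseCells :: [String] -> [Double]
parseCells = map read

-- Hypothetical input generator: n textual cells such as "0.5", "1.5", ...
sampleCells :: Int -> [String]
sampleCells n = map show (take n ([0.5, 1.5 ..] :: [Double]))

main :: IO ()
main = defaultMain
  [ bgroup "parseCells"
      -- nf forces the whole result list, so the parsing work is really measured.
      [ bench "100 cells"  (nf parseCells (sampleCells 100))
      , bench "1000 cells" (nf parseCells (sampleCells 1000))
      ]
  ]
```

Using nf rather than whnf matters here: whnf would only evaluate the list to its first constructor and leave most of the parsing unevaluated.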

Licensed under: CC-BY-SA with attribution