How to use length annotated lists in Haskell

Question 1

You don't have to hard-code the length of the list; instead, you can define the following type:

data UList a where
    UList :: Nat n => List n a -> UList a

where

class Nat n where
    asInt :: n -> Int

instance Nat Z where
    asInt _ = 0

instance Nat n => Nat (S n) where
    asInt x = 1 + asInt (pred x)
      where
        pred = undefined :: S n -> n

and we also have

fromList :: [a] -> UList a
fromList [] = UList Nil
fromList (x:rest) =
    case fromList rest of
        UList xs -> UList (Cons x xs)

This setup allows you to create lists whose lengths are not known at compile time; you can access the length by doing a case pattern match to extract the type from the existential wrapper, and then use the Nat class to turn the type into an integer.

You might wonder what the advantage is of having a type that you don't know the value of at compile time; the answer is that although you don't know what the type will be, you can still enforce invariants. For example, the following code is guaranteed to not change the length of the list:

mapList :: (a -> b) -> List n a -> List n b

And if we have type addition using a type family called, say, Add, then we can write

concatList :: List m a -> List n a -> List (Add m n) a

which enforces the invariant that concatenating two lists gets you a new list with the sum of the two lengths.

Question 2

You pretty much need to hard-code it, since the type is of course fixed at compile-time and the GHC type checker's Turing completeness can't feasibly be abused to generate them "on its own"¹. However, that's not as dramatic as it sounds: you basically just need to write the length annotation type once. The rest can be done without mention of particular length, albeit with some strange-looking class around:

class LOL l where
  lol :: [a] -> l a

instance LOL (List Z) where
  lol _ = Nil

instance (LOL (List n)) => LOL (List (S n)) where
  lol (x:xs) = Cons a $ lol xs
  lol [] = error "Not enough elements given to make requested type length."

Then you can just use something like

type Four = S(S(S(S Z)))

get4Vect :: Read a => IO (List Four a)
get4Vect = lol . read <$> getLine    -- For input format [1,2,3,4].

¹I shan't discuss Template Haskell here, which can of course generate anything automatically at compile-time quite easily.

Question 3

The length encoding works during the compile time so obviously the type-checker cannot verify the lengths of lists that are created at run-time from e.g. user input. The idea is that you wrap any run-time lists in an existential type that hides the length parameter and then you have to supply proofs about the length in order to use the list.

For example:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Lists where

data Nat = Z | S Nat

data List l a where
    Nil  :: List Z a
    Cons :: a -> List n a -> List (S n) a

data DynList a where
    DynList :: List l a -> DynList a

data Refl a b where
    Refl :: Refl a a

fromList :: [a] -> DynList a
fromList []     = DynList Nil
fromList (x:xs) = cons (fromList xs) where
    cons (DynList rest) = DynList $ Cons x rest

toList :: List l a -> [a]
toList Nil = []
toList (Cons x xs) = x : toList xs

dot :: Num a => List l a -> List l a -> List l a
dot Nil Nil = Nil
dot (Cons x xs) (Cons y ys) = Cons (x*y) (dot xs ys)

haveSameLength :: List l a -> List l' b -> Maybe (Refl l l')
haveSameLength Nil Nil                 = Just Refl
haveSameLength (Cons _ xs) (Cons _ ys) = case haveSameLength xs ys of
    Just Refl -> Just Refl
    Nothing   -> Nothing
haveSameLength _ _                     = Nothing

main :: IO ()
main = do
    dlx :: DynList Double <- fmap (fromList . read) getLine
    dly :: DynList Double <- fmap (fromList . read) getLine

    case (dlx, dly) of
        (DynList xs, DynList ys) -> case haveSameLength xs ys of
            Just Refl -> print $ toList $ dot xs ys
            Nothing   -> putStrLn "list lengths do not match"

Here DynList is a list of dynamic length (i.e. the length is only known at run-time) which wraps a properly typed List. Now, we have a dot function that calculates the dot product for two lists that have the same length so if we read the lists from stdin like we do in the example, we have to supply proof that the lists, in fact, have identical lengths.

The "proof" here is the Refl constructor. The way the constructor is declared means that if we can supply a value of type Refl a b then a and b must be the same type. Hence we use the hasSameLength to verify the types and pattern match on the produced Refl value and that gives the type-checker enough information that it lets us call dot on the two run-time lists.

So what this essentially means is that the type-checker will force us to manually verify the length of any list not known at compile time in order to compile the code.