Int vs Word in common use?

https://stackoverflow.com/questions/12432154

02-07-2021
|

Question

It seems like the common pattern of taking/returning Int (ie ByteString.hGet and Data.List.length) is contrary to the Haskell pattern of using strongly-descrbing types, since many of these cases can only handle positive numbers. Would it not be better to use Word, or is there a reason these functions are partial on Int?

Solution

It is true that the expressiveness of the Haskell type system encourages users to assign precise types to entities they define. However, seasoned Haskellers will readily acknowledge that one must strike a balance between ultimate type precision (which besides isn't always attainable given the current limits of Haskell type system) and convenience. In short, precise types are only useful to a point. Beyond that point, they often just cause extra bureaucracy for little to no gain.

Let's illustrate the problem with an example. Consider the factorial function. For all n bigger than 1, the factorial of n is an even number, and the factorial of 1 isn't terribly interesting so let's ignore that one. Therefore, in order to make sure that our implementation of the factorial function in Haskell is correct, we might be tempted to introduce a new numeric type, that can only represent unsigned even integers:

module (Even) where

newtype Even = Even Integer

instance Num Even where
  ...
  fromInteger x | x `mod` 2 == 0 = Even x
                | otherwise = error "Not an even number."

instance Integral Even where
  ...
  toInteger (Even x) = x

We seal this datatype inside a module that doesn't export the constructor, to make it abstract, and make it an instance of the all the relevant type classes that Int is an instance of. Now we can give the following signature to factorial:

factorial :: Int -> Even

The type of factorial sure is more precise than if we just said that it returns Int. But you'll find that definining factorial with such a type is really quite annoying, because you need a version of multiplication that multiplies an (even or odd) Int with an Even and produces and Even. What's more, you might have to introduce extraneous calls to toInteger on the result of a call to factorial in client code, which can be a significant source of clutter and noise for little gain. Besides, all these conversion functions could potentially have a negative impact on performance.

Another problem is that when introducing a new, more precise type, you often end up having to duplicate all manner of library functions. For instance, if you introduce the type List1 a of non-empty lists, then you will have to reimplement many of the functions that Data.List already provides, but for [a] only. Sure, one can then make these functions methods of ListLike type class. But you quickly end up with all manner of adhoc type classes and other boilerplate, with again not much gain.

One final point is that one shouldn't consider Word to be an unsigned variant of Int. The Haskell report leaves the actual size of Int unspecified, and only guarantees that this type should be capable of representing integers in the range [− 2²⁹, 2²⁹ − 1]. The type Word is said to provide unsigned integers of unspecified width. It isn't guaranteed that in any conforming implementation the width of a Word corresponds to the width of an Int.

Though I make a point guarding against excessive type proliferation, I do acknowledge that introducing a type of Natural of naturals could be nice. Ultimately, though, whether Haskell should have a dedicated type for natural numbers, in addition to Int, Integer, and the various Word* types, is largely a matter of taste. And the current state of affairs is probably in very large part just an accident of history.

OTHER TIPS

The same reasoning applies as in C. The reason to use more precise types is to prevent mistakes. Mistakes, in this case, such as trying to use negative numbers where they are not meaningful. But the behaviour of Word on over- or underflow, as of unsigned int in C, is to wrap around. If you attempt to use a negative number where a Word (or unsigned int) is expected, you don't get the compiler yelling at you, or even an exception at runtime. You get a large positive number. Which you have no way of distinguishing from any other ("legitimate") large positive number!

Look at this:

Prelude Data.Word> -1 :: Word
4294967295

Instead of making mistakes impossible to commit, you have made them impossible to detect. With Int (and int), you at least have the possibility of checking for negative values manually. With Word and unsigned int, you have nothing.

What would be valuable is an unsigned type that reacts to over- or underflow by throwing an exception. That still wouldn't make mistakes impossible, but it would make them easier to detect. However, it would come at a performance cost.* I don't know if it's possible to rule them out at compile time, but it doesn't seem easy.

* At least, x86 requires an extra instruction -- after every operation! -- to check whether over- or underflow occured. I don't know if there's an architecture that does it "for free", though it would be nice. Or maybe a distinguished NaN value like we have for floating point numbers (perhaps instead of the most negative number) which would be used to denote unrepresentable values...

My first guess is that unsigned arithmetic has some issues that would cause stupid bugs if you'r not paying attention:

Prelude Data.Word> let x = 0 :: Word in if x - 1 > x then "Ouch" else "Ohayoo"
"Ouch"

It will have some issues in polymorphic functions that appears to be correct:

Prelude Data.Word> let f acc x y = if x <= y then f (acc + 1) x (y - 1) else acc
Prelude Data.Word> :t f
f :: (Num a1, Num a, Ord a) => a1 -> a -> a -> a1

Using a standard length function:

Prelude Data.Word> let len' x = if null x then 0 else 1 + len' (tail x) :: Int
Prelude Data.Word> :t len'
len' :: [a] -> Int

Prelude Data.Word> f 0 0 (len' [1,2,3])
4

And using a length function returing Word:

Prelude Data.Word> let len x = if null x then 0 else 1 + len (tail x) :: Word
Prelude Data.Word> :t len
len :: [a] -> Word

Prelude Data.Word> f 0 0 (len [1,2,3])
...diverges...

And, of course, if those functions returned Word instead of Int, you introduce the need to keep transforming your Words into Ints to use it in other common places.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow