Using low bitsize integral types like `Int8` and what they are for

https://stackoverflow.com/questions/8963942

18-04-2021

Question

I recently learned that processors operate on machine words, which on most contemporary processors and operating systems are either 32 or 64 bits wide. So what are the benefits of using smaller types like Int16, Int8 and Word8? What exactly are they for? Is it only storage reduction?

I am writing a complex calculation program that consists of several modules but is exposed through a single function returning a Word64 value, so the whole program results in a Word64 value. I'm interested in the answer because inside this program I found myself using many different Integral types, such as Word16 and Word8, to represent small entities. Seeing how often they got converted with fromIntegral made me wonder: was I making a mistake, and what exactly is the benefit of those types that I was blindly attracted to without knowing it? Did it make sense to use other integral types and eventually convert them with fromIntegral, or should I have just used Word64 everywhere?

Solution

These smaller types give you a memory reduction only when you store them in unboxed arrays or similar. There, each will take as many bits as indicated by the type suffix.
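As a rough illustration of the storage point, the sketch below asks `Foreign.Storable` for the byte width of each type and builds a small unboxed array of octets. The names `bytesWord8`, `bytesWord64` and `octets` are made up for this example, and it assumes the `array` package that ships with GHC is available.

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word8, Word64)
import Foreign.Storable (sizeOf)

-- Byte width of one element when stored unboxed (hypothetical names).
bytesWord8 :: Int
bytesWord8 = sizeOf (0 :: Word8)

bytesWord64 :: Int
bytesWord64 = sizeOf (0 :: Word64)

-- An unboxed array of 1000 octets; each element occupies a single byte
-- of payload instead of a full machine word.
octets :: UArray Int Word8
octets = listArray (0, 999) (cycle [0 .. 255])
```

Loaded into GHCi, `bytesWord8` and `bytesWord64` should evaluate to 1 and 8, and `octets ! 256` to 0, since the fill pattern repeats every 256 elements.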

In general use, they all take exactly as much storage as an Int or Word. The main difference is that values of the fixed-width types are automatically narrowed to the appropriate bit size. There are also (still) more optimisations, mainly in the form of rewrite rules, for Int and Word than for Int8 etc., so some operations will be slower with the smaller types.
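The automatic narrowing can be seen in a few lines. This is only a sketch with illustrative names (`narrowedWord`, `narrowedInt`, `sumWord`); it relies on `fromIntegral` and `+` keeping only the low bits for fixed-width types.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- 300 = 256 + 44, so only the low 8 bits (44) survive the conversion.
narrowedWord :: Word8
narrowedWord = fromIntegral (300 :: Int)

-- 200 does not fit into Int8 (maximum 127); it wraps to 200 - 256 = -56.
narrowedInt :: Int8
narrowedInt = fromIntegral (200 :: Int)

-- Results of arithmetic are narrowed as well: 200 + 100 = 300, kept as 44.
sumWord :: Word8
sumWord = 200 + 100
```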

Whether to use Word64 throughout or to use smaller types depends on your priorities. On a 64-bit system, when compiling with optimisations, the performance of Word and Word64 should mostly be the same, since where it matters both should be unpacked and the work is done on the raw machine Word#. But there are probably still a few rules for Word that have no Word64 counterpart yet, so there may be a difference after all. On a 32-bit system, most operations on Word64 are implemented via C calls, so operations on Word64 are much slower there than operations on Word.

So depending on what is more important, simplicity of code or performance on different systems, either:

- use Word64 throughout: simple code, good performance on 64-bit systems, or
- use Word as long as your values are guaranteed to fit into 32 bits and convert to Word64 at the latest safe moment: more complicated code, but better performance on 32-bit systems.
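The second option might look roughly like the sketch below. The functions `checksum` and `result` are hypothetical stand-ins for the asker's modules, and the modulus 65521 is chosen only to keep every intermediate value well below 32 bits.

```haskell
import Data.List (foldl')
import Data.Word (Word64)

-- Hypothetical inner computation: all intermediate values stay far below
-- 2^32, so a plain machine Word is enough even on a 32-bit system.
checksum :: [Word] -> Word
checksum = foldl' step 0
  where
    step acc x = (acc * 31 + x `mod` 65521) `mod` 65521

-- Widen exactly once, at the boundary where the Word64 result is required.
result :: [Word] -> Word64
result = fromIntegral . checksum
```

Here `fromIntegral` appears in one place only, which keeps the conversion easy to audit; using Word64 throughout would remove it entirely at the cost of slower arithmetic on 32-bit targets.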

OTHER TIPS

In GHC, the fixed-size integral types all take up a full machine word as ordinary boxed values, so there are no space savings to be had there. Using machine-word-sized types (i.e. Int and Word) will probably be faster than the fixed-size types in most cases, but using a fixed-size integral type will be faster than doing explicit wrap-around.
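To make "explicit wrap-around" concrete, here is a small sketch comparing the two styles. The function names `addMod256` and `addWord8` are illustrative only, and no performance claim is made by the code itself.

```haskell
import Data.Word (Word8)

-- Explicit wrap-around on a machine word: the reduction is written by hand.
addMod256 :: Word -> Word -> Word
addMod256 x y = (x + y) `mod` 256

-- With a fixed-size type, the wrapping is part of the type's arithmetic.
addWord8 :: Word8 -> Word8 -> Word8
addWord8 = (+)
```

Both compute the same value for operands in the range 0 to 255; the second version cannot forget the reduction step.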

You should choose the appropriate type for the range of values you're using. maxBound :: Word8 is 255 and 255 + 1 :: Word8 is 0; if you're dealing with octets, that's exactly what you want. (For instance, ByteStrings are defined as storing Word8s.)
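A brief sketch of the octet case, assuming the `bytestring` package that ships with GHC. The names `topOctet`, `wrapped`, `greeting` and `greetingOctets` are invented for the example.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- The largest octet value.
topOctet :: Word8
topOctet = maxBound

-- Adding one wraps around to zero.
wrapped :: Word8
wrapped = topOctet + 1

-- A strict ByteString is built from, and unpacks to, a list of Word8.
greeting :: B.ByteString
greeting = B.pack [104, 105] -- the ASCII codes for 'h' and 'i'

greetingOctets :: [Word8]
greetingOctets = B.unpack greeting
```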

If you just have some integers that don't need a specific number of bits, and the calculations you're doing aren't going to overflow, just use Int or Word (or even Integer). Fixed-size types are less common than the regular integral types because, most of the time, you don't need a specific size.

So, don't use them for performance; use them if you're looking for their specific semantics: fixed-size integral types with defined overflow behaviour.

Licensed under: CC-BY-SA with attribution
