What is the functional programming answer to type-based invariants?

https://softwareengineering.stackexchange.com/questions/284543

08-10-2020
|

Question

I am aware that the concept of invariants exists across multiple programming paradigms. For example, loop invariants are relevant in OO, functional and procedural programming.

However, one very useful kind found in OOP is an invariant of the data of a particular type. This is what I'm calling "type-based invariants" in the title. For example, a Fraction type might have a numerator and denominator, with the invariant that their gcd is always 1 (i.e. the fraction is in a reduced form). I can only guarantee this by having some kind of encapsulation of the type, not letting its data be set freely. In return, I never have to check whether it's reduced, so I can simplify algorithms like equality checks.

On the other hand, if I simply declare a Fraction type without providing this guarantee through encapsulation, I can't safely write any functions on this type which assume that the fraction is reduced, because in the future somebody else could come along and add a way of getting hold of a non-reduced fraction.

Generally, the lack of this kind of invariant might lead to:

More complex algorithms as pre-conditions need to be checked/ensured in multiple places
DRY violations as these repeated pre-conditions represent the same underlying knowledge (that the invariant should be true)
Having to enforce pre-conditions through runtime failures rather than compile-time guarantees

So my question is what the functional programming answer to this kind of invariant is. Is there a functional-idiomatic way of achieving more or less the same thing? Or is there some aspect of functional programming which makes the benefits less relevant?

Solution

Some functional languages such as OCaml have built-in mechanisms to implement abstract data types therefore enforcing some invariants. Languages which do not have such mechanisms rely on the user “not looking under the carpet” to enforce the invariants.

Abstract data types in OCaml

In OCaml, modules are used to structure a program. A module has an implementation and a signature, the latter being a kind of summary of values and types defined in the module, while the former provides the actual definitions. This can be loosely compared to the diptych .c/.h familiar to C programmers.

As an example, we can implement the Fraction module like this:

# module Fraction = struct
  type t = Fraction of int * int
  let rec gcd a b =
    match a mod b with
    | 0 -> b
    | r -> gcd b r

  let make a b =
   if b = 0 then
     invalid_arg "Fraction.make"
   else let d = gcd (abs a) (abs b) in
     Fraction(a/d, b/d)

  let to_string (Fraction(a,b)) =
    Printf.sprintf "Fraction(%d,%d)" a b

  let add (Fraction(a1,b1)) (Fraction(a2,b2)) =
    make (a1*b2 + a2*b1) (b1*b2)

  let mult (Fraction(a1,b1)) (Fraction(a2,b2)) =
    make (a1*a2) (b1*b2)
end;;

module Fraction :
  sig
    type t = Fraction of int * int
    val gcd : int -> int -> int
    val make : int -> int -> t
    val to_string : t -> string
    val add : t -> t -> t
    val mult : t -> t -> t
  end

This definition can now be used like this:

# Fraction.add (Fraction.make 8 6) (Fraction.make 14 21);;
- : Fraction.t = Fraction.Fraction (2, 1)

Anyone can produce values of the type fraction directly, bypassing the safety net built in Fraction.make:

# Fraction.Fraction(0,0);;
- : Fraction.t = Fraction.Fraction (0, 0)

To prevent this, it is possible to hide the concrete definition of the type Fraction.t like that:

# module AbstractFraction : sig
  type t
  val make : int -> int -> t
  val to_string : t -> string
  val add : t -> t -> t
  val mult : t -> t -> t
end = Fraction;;

module AbstractFraction :
sig
  type t
  val make : int -> int -> t
  val to_string : t -> string
  val add : t -> t -> t
  val mult : t -> t -> t
end

The only way to create an AbstractFraction.t is to use the AbstractFraction.make function.

Abstract data types in Scheme

The Scheme language does not have the same mechanism of abstract data types as OCaml does. It relies on the user “not looking under the carpet” to achieve encapsulation.

In Scheme, it is customary to define predicates like fraction? recognising values giving the opportunity to validate the input. In my experience the dominant usage is to let the user validate its input, if he forges a value, rather than to validate input in each library call.

There is however several strategies to enforce abstraction of returned values, like returning a closure which yields the value when applied or returning a reference to a value in a pool managed by the library – but I never saw any of them in practice.

OTHER TIPS

Encapsulation is not a feature that came with OOP. Any language that supports proper modularization has it.

Here's roughly how you do it in Haskell:

-- Rational.hs
module Rational (
    -- This is the export list. Functions not in this list aren't visible to importers.
    Rational, -- Exports the data type, but not its constructor.
    ratio,
    numerator,
    denominator
    ) where

data Rational = Rational Int Int

-- This is the function we provide for users to create rationals
ratio :: Int -> Int -> Rational
ratio num den = let (num', den') = reduce num den
                 in Rational num' den'

-- These are the member accessors
numerator :: Rational -> Int
numerator (Rational num _) = num

denominator :: Rational -> Int
denominator (Rational _ den) = den

reduce :: Int -> Int -> (Int, Int)
reduce a b = let g = gcd a b
             in (a `div` g, b `div` g)

Now, to create a Rational, you use the ratio function, which enforces the invariant. Because data is immutable, you cannot later violate the invariant.

This does cost you something, though: it's no longer possible for the user to use the same deconstructing declaration as denominator and numerator use.

You do it the same way: make a constructor that enforces the constraint, and agree to use that constructor whenever you create a new value.

multiply lhs rhs = ReducedFraction (lhs.num * rhs.num) (lhs.denom * rhs.denom)

But Karl, in OOP you don't have to agree to use the constructor. Oh really?

class Fraction:
  ...
  Fraction multiply(Fraction lhs, Fraction rhs):
    Fraction result = lhs.clone()
    result.num *= rhs.num
    result.denom *= rhs.denom
    return result

In fact, opportunities for this kind of abuse are fewer in FP. You have to put the constructor last, because of immutability. I wish people would stop thinking of encapsulation as some sort of protection against incompetent colleagues, or as obviating the need for communicating constraints. It doesn't do that. It just limits the places you have to check. Good FP programmers also use encapsulation. It just comes in the form of communicating a few preferred functions to make certain kinds of modifications.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange