Question

Both concepts allow new data types to be created. The only difference I can see is that in functional languages, one can perform pattern matching on algebraic data types. But there is no comparably succinct feature in OO languages. Is this an accurate statement?


Solution

I can see three major differences between algebraic data types and OO-style classes, not counting (im)mutability because that varies.

  • Algebraic data types allow sums as well as products, whereas OO-style classes only allow products.
  • OO-style classes allow you to bundle a complex data item with its accepted operations, whereas algebraic data types don't.
  • Algebraic data types don't distinguish between the data passed to the constructor and the data stored in the resulting value, whereas OO-style classes do (or can).
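To make the first difference concrete, here is a minimal Haskell sketch (the Shape type and area function are my own illustration, not from the answer above): a sum of two products, which a single OO class cannot express directly.

```haskell
-- A sum of products: a Shape is *either* a Circle *or* a Rect.
-- Each alternative is itself a product of its fields.
data Shape = Circle Double          -- radius
           | Rect Double Double     -- width, height

-- Pattern matching dispatches on which alternative we have.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h
```

In a classic OO language, the same structure would need an abstract base class plus one subclass per alternative.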

One thing I deliberately left out of that list is subtyping. While the vast majority of OO languages allow you to subclass (non-final, non-sealed, currently accessible) classes, and the vast majority of ML-family functional languages do not, it is clearly possible to forbid inheritance entirely in a hypothetical OO (or at least OO-like) language, and it is likewise possible to produce subtyping and supertyping in algebraic data types; for a limited example of the latter, see O'Haskell, which has been succeeded by Timber.

OTHER TIPS

Algebraic data types are so named because they form an "initial algebra":

  • + represents sum types (disjoint unions, e.g. Either),
  • • represents product types (e.g. structs or tuples),
  • X represents the singleton type (e.g. data X a = X a),
  • 1 represents the unit type (),
  • μ represents the least fixed point (e.g. recursive types), usually implicit.
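For example (my own illustration), the ordinary list type is the least fixed point L = μX. 1 + a • X: either the unit alternative or a product of an element and the rest of the list.

```haskell
-- List a ≅ μX. 1 + a • X
data List a = Nil              -- the 1 (unit) alternative
            | Cons a (List a)  -- the product a • X, where X is the recursive occurrence

-- The recursion in the type is mirrored by recursion in functions over it.
len :: List a -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs
```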

From these operators all regular data types can be constructed. Algebraic data types also support parametric polymorphism -- meaning they can be used as containers for any underlying type, with static guarantees of safety. Additionally, ADTs come with uniform syntax for introducing and eliminating data types (via constructors and pattern matching). E.g.

-- this defines a tree
data Tree a = Empty | Node a (Tree a) (Tree a)

-- this constructs a tree
let x = Node 1 (Node 2 Empty) Empty

-- this deconstructs a tree
f Empty        = 0
f (Node a l r) = a + f l + f r

The richness and uniformity of algebraic data types, along with the fact that they're immutable, distinguish them from OO objects, which largely:

  • only represent product types (so no recursive or sum-types)
  • do not support pattern matching
  • are mutable
  • do not support parametric polymorphism

A class is more than just a type definition -- classes in most OO languages are really kitchen sink features that provide all sorts of loosely related functionality.

In particular, classes act as a kind of module, giving you data abstraction and namespacing. Algebraic data types don't have this built in; modularity is usually provided by a separate, orthogonal feature (usually modules).
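As a sketch of that orthogonality (the NonEmpty type and its invariant are invented for illustration): in Haskell, hiding a constructor behind a module's export list provides the data abstraction that a class would bundle in.

```haskell
-- In a real module you would write:
--   module NonEmpty (NonEmpty, fromList, toList) where
-- exporting the type but NOT its MkNE constructor, so clients can only
-- build values through fromList, which enforces the invariant.

newtype NonEmpty a = MkNE [a]

-- Smart constructor: the only sanctioned way to build a NonEmpty.
fromList :: [a] -> Maybe (NonEmpty a)
fromList [] = Nothing
fromList xs = Just (MkNE xs)

toList :: NonEmpty a -> [a]
toList (MkNE xs) = xs
```

The abstraction lives in the module system, not in the data type itself.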

In some sense one can see it this way: every language has only so many mechanisms to create user-defined types. In functional (ML-, Haskell-style) languages, the only one is the creation of an ADT. (Haskell's newtype can be seen as a degenerate case of an ADT.) In OO languages, it's classes. In procedural languages it is struct or record.
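For instance (a minimal example of my own), a newtype is an ADT degenerated to exactly one constructor with exactly one field:

```haskell
-- A newtype: one constructor, one field, no runtime overhead.
newtype Age = Age Int deriving (Show, Eq)

-- It still supports pattern matching like any other ADT.
nextYear :: Age -> Age
nextYear (Age n) = Age (n + 1)
```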

It goes without saying that the semantics of a user-defined data type vary from language to language, and much more so from a language in one paradigm to a language in another. @Pharien's Flame has already outlined typical differences.

Is the concept of Algebraic Data Type akin to Class definitions in OO languages?

in functional languages, one can perform pattern matching on algebraic data types. But there is no comparably succinct feature for OO languages. Is this an accurate statement?

That is a part of it.

As Andreas said, classes are a kitchen-sink feature in statically typed object-oriented languages derived from Simula, like C++, Java and C#. Classes are a jack of all trades but master of none in this respect: they solve many problems badly.

Comparing vanilla ML and Haskell with OO as seen in C++, Java and C# you find:

  1. Classes can contain other classes, whereas algebraic datatypes can refer to each other but cannot contain each other's definitions.
  2. Class hierarchies can be arbitrarily deep whereas algebraic datatypes are one-level deep: the type contains its type constructors and that is it.
  3. New classes can be derived from old classes so classes are extensible types whereas algebraic datatypes are usually (but not always) closed.
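Point 2 can be seen directly in a definition like this (my own example): the "hierarchy" is exactly one level, the type and its constructors, and nothing can be added from outside.

```haskell
-- One level deep: Color is the "base", its constructors are the only
-- "subtypes", and the type is closed -- no new constructors can be
-- added outside this definition.
data Color = Red | Green | Blue deriving (Show, Eq)

-- Because the type is closed, the compiler can check this match is exhaustive.
describe :: Color -> String
describe Red   = "warm"
describe Green = "cool"
describe Blue  = "cool"
```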

So ADTs are not really "akin" to classes because they only solve one specific problem: class hierarchies that are one level deep. In this sense we can see two approximate observations:

  1. ADTs require composition over inheritance.
  2. Classes make it easy to extend a type but hard to extend the set of member functions whereas ADTs make it easy to extend functions over the type but hard to extend the type.
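Observation 2 is one half of what is often called the expression problem. With an ADT, extending the set of functions over an existing type is trivial (example mine):

```haskell
data Expr = Lit Int | Add Expr Expr

-- Adding a new operation needs no change to the type or to existing code:
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b

-- ...and another one, equally cheap:
render :: Expr -> String
render (Lit n)   = show n
render (Add a b) = "(" ++ render a ++ " + " ++ render b ++ ")"

-- Adding a new *constructor* (say, Mul), by contrast, would force every
-- existing function over Expr to be updated -- the mirror image of OO,
-- where new subclasses are cheap but new methods are expensive.
```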

You might also look at the GoF design patterns. They have been expressed using classes in C++. The functional equivalents are not always ADTs but, rather, things like lambda functions instead of the command pattern and higher-order functions like map and fold instead of the visitor pattern and so on.
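As a sketch of that last point (my own example): where the visitor pattern threads an operation object through a class hierarchy, a higher-order fold does the same job in a few lines.

```haskell
-- A fold plays the role of the visitor pattern: the "visit" methods are
-- just the function arguments, one per constructor.
data Tree a = Empty | Node a (Tree a) (Tree a)

foldTree :: b -> (a -> b -> b -> b) -> Tree a -> b
foldTree e _ Empty        = e
foldTree e n (Node a l r) = n a (foldTree e n l) (foldTree e n r)

-- "Visitors" are now one-liners:
sumTree :: Tree Int -> Int
sumTree = foldTree 0 (\a l r -> a + l + r)

depth :: Tree a -> Int
depth = foldTree 0 (\_ l r -> 1 + max l r)
```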

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow