Differences between data/type constructors and functions?

https://stackoverflow.com/questions/5353555

27-10-2019
|

Question

Could anyone explain to me what are the differences between data/type constructors and functions? Haskell mix them and give us a universal interface (all looks like functions, in particular, we can partially apply them), while the ML family languages distinguish them.

Solution

Your dichotomy "Haskell vs. the ML world" is wrong¹. SML also promotes constructors to functions, and Caml Light used to. I'm not sure why it was removed in OCaml, I think the designer thought there wasn't a real need for it.

There are deep reasons why constructors are slightly particular. They can be used in patterns, while general functions cannot -- I guess this can be explained in term of polarized logic and focusing². Also, a constructor applied to values can be considered a value, while this is not true for general functions. It is common in call-by-value languages to restrict recursive values definition to a subclass that allows constructor application, but not general function application.

I agree that the "semantic sugar" of promoting all constructors to functions is nice. I think however that this would not be so useful if we had a nice syntactic sugar for short abstraction, such as Scala's Some(_). Whether constructors should be curryfied (your "partial application" remark) or not is a different and orthogonal question, in my opinion.

¹: besides being a wrong dichotomy, the tone of your question has a certain taste of "Haskellers and MLers, here is the ring, please fight !". It may not be intentional, but anyway you should probably avoid such formulations. Language design is made of compromises, and assuming when two different languages made different choices that one of them is right and not the other is not a good approach.

PS: the question has since been edited and is now much more neutral. Thanks for editing.

²: as requested by petebu, here is a little more (actually a lot more) information on "focusing and polarity". I would like to point out that I'm really not an expert on this topic (hence the "I guess").

I would recommend first the paper Focusing on Pattern Matching (PDF) by Neelakantan Krishnaswami, 2009. It contains an introduction for people not proficient in polarised logic and focusing (but you need to be at least familiar with sequent calculus). As usual in sequent calculus, types/propositions are introduced "on the right" and eliminated/deconstructed "on the left". But here we separate types by "polarity", product and sums being positive and functions negative (my intuition is that product/sums are data while functions are computations, but the separation is motivated by very natural considerations of their behaviour in sequent calculus), and the paper shows that left-elimination of arrow types corresponds to function application, while left-elimination of sum/product types corresponds to pattern matching. There is a case construct that behaves similarly to pattern matching (with some differences), that eliminates positives, so could not be applied to functions.

The other important reference would be Focusing on Binding and Computation, by Dan Licata, Noam Zeilberger and Robert Harper, 2008 (note that if you're planning to submit a paper on these topics, "Focusing on ..." is becoming a bit cliché). It emphasizes much less the link with ML-style pattern matching, but introduces the very nice idea that while computational arrows are "negative on the left" (easily seen as the classical equivalence A→B ≡ ¬A∨B), one may introduce a different arrow, "positive on the left", therefore positively polarized, that can be pattern-matched. It turns out that this arrow is a good match for representing variable bindings (if you're familiar with Higher Order Abstract Syntax, the idea is that the left polarity rules out "exotic terms" that try to compute on their variables), so that terms with variables bindings are data structure that can be pattern-matched like sums or products.
I found their paper a bit hard to read, so I would start with Dan Licata's slides. Finally, Robert Harper has made other slides that provides a different intuition in termes of inductive judgements/derivations : the positive arrows represent derivability (could you build a derivation of C under the hypothesis A and B?) while negative arrows represent admissibility (given derivations of A and B, how would you rewrite/explore/manipulate them to build a derivation of C ?). Very interesting stuff.

OTHER TIPS

First the difference between value and type must be explained.

"Because Haskell is a purely functional language, all computations are done via the evaluation of expressions (syntactic terms) to yield values (abstract entities that we regard as answers). Every value has an associated type." -- A gentle introduction to Haskell 98 (This tutorial is not quite gentle)

Haskell values are "first-class" while Haskell types are not. Types are used to describe values and the association of a value and its type is called typing.

The difference between data/type constructor is that: applying a data constructor yields a value, but applying a type constructor yields a type.

Functions are merely expressions describing values, which is associated with a type. When you evaluate a function you are just evaluating the expressions describing the values.

Haskell does reduce and simplify some of the OCaml syntax. But Haskell syntax does unambiguously distinguish constructors and functions. Functions start with a lowercase letter and Constructors start with an Upper case letter. With infix operators data constructors must start with a colon and normal operators may never start with a colon.

The interface for constructors can look like simple (perhaps partial) function application, this makes one argument constructors and most newtypes very easy to use (Just "hello"). But Haskell does let you use OCaml-like record names and {field=value,field=value} style, though in Haskell you are not forced to have field names or forced to use this syntax. So OCaml has simple syntax only for single fields, but Haskell lets you have simple syntax for more than one field. Eventually, avoiding the field names is bad for large type as the positional function-like syntax is harder to refactor.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow