Why does Scala name monadic composition as “for comprehension”?

https://softwareengineering.stackexchange.com/questions/307565

11-12-2020

Question

Not sure if it's an appropriate question, but here it goes.

I know Haskell's do notation pretty well, and I realized that Scala's "for comprehension" is mostly the same thing as do notation in Haskell. What I don't quite understand is why the designers chose this name. The words "for" and "yield" seem likely to confuse programmers who are used to, for example, Java's for loop and Ruby's yield.

Yet I don't think monadic composition has much in common with those things at all. Is there a particular reason to name this syntactic sugar with such words? Or is it just mostly because a lot of keywords have already been occupied in the language, so they had to use these keywords which are relatively free?

Solution

I think you are creating a bit of a false dichotomy here.

Haskell has monad comprehensions built into the language. One reason for that is the use of monads for imperative-style I/O. Therefore, the designers of Haskell decided to make it look mostly like a code block in a generic C-style language, complete with curly braces, semicolons and even return. The assignments look a bit non-C, but left-arrow for assignment is typical imperative pseudo-code.

So, monad comprehensions in Haskell were designed to look like imperative sequential side-effecting code blocks in some generic C/pseudo-code like language, because that's an important use case for them. This even permeates the API design with the naming of the aforementioned return function. Note that the name of this function doesn't really make much sense if you use it outside of do notation.

C# has monad comprehensions built into the language. One reason for that is the use of monads for querying of datasets. Therefore, the designers of C# decided to make it look mostly like SQL or XQuery, complete with from, select, where, group by, orderby, etc. The order looks a bit non-SQL (from before select), but it's the same one as used by XQuery, actually.

So, monad comprehensions in C# were designed to look like SQL queries, because that's an important use case for them. This even permeates the API design with the naming of e.g. Select for the transformation function (instead of map), SelectMany (instead of flatMap), Where (instead of select / filter), etc. Note that the names of these operations don't really make much sense if you use them outside of a query comprehension, and in fact, can even be actively confusing (Select is what is usually called map, whereas what is usually called select is Where).

Scala has monad comprehensions built into the language. One reason for that is the use of monads for collection operations. Therefore, the designers of Scala decided to make it look mostly like a for loop in a generic C-style language. The assignments look a bit non-C, but left-arrow for assignment is typical imperative pseudo-code.

So, monad comprehensions in Scala were designed to look like imperative for loops in some generic C/pseudo-code like language, because that's an important use case for them.
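To make the "looks like a for loop" point concrete, here is a minimal sketch of a Scala for comprehension next to the flatMap/map calls it roughly corresponds to. The object name and the sample values are made up for illustration, and the "desugared" form is a simplified approximation of what the compiler produces, not its exact output.

```scala
object ForComprehensionSketch {
  // Hypothetical sample data, chosen only for illustration.
  val xs = List(1, 2)
  val ys = List(10, 20)

  // Reads like a nested for loop over two collections...
  val sums: List[Int] = for {
    x <- xs
    y <- ys
  } yield x + y

  // ...but corresponds roughly to these flatMap/map calls.
  val sumsDesugared: List[Int] = xs.flatMap(x => ys.map(y => x + y))

  // The same syntax works for other types with map/flatMap, e.g. Option,
  // where nothing is being "looped over" in the usual sense.
  val maybeSum: Option[Int] = for {
    a <- Some(1)
    b <- Some(2)
  } yield a + b
}
```

The Option case is the one that most resembles Haskell's do notation: the left arrows sequence computations rather than iterate over elements.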

Note that both in the case of C# and Scala, the comprehension syntax can actually do more and/or less than just perform the monadic operations map and bind (map and flatMap in Scala). In Scala, a for comprehension without yield translates into foreach, i.e. an imperative side-effecting iteration, which really doesn't have much to do with monads at all. A generator can have a guard (for (foo <- foos if bar) yield foo), which translates into a call to withFilter, i.e. filtering elements. C# can do the same with Where. C# can also do aggregation (group by) and sorting (orderby).
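The two non-monadic translations mentioned here can be sketched as follows. This is only an illustration under simplifying assumptions: the object name, the buffer, and the sample list are hypothetical, and the hand-written equivalents approximate the compiler's rewriting rather than reproduce it exactly. In Scala, the guard is attached to the generator inside the for clause.

```scala
import scala.collection.mutable.ListBuffer

object LoopAndGuardSketch {
  // Hypothetical sample data, chosen only for illustration.
  val numbers = List(1, 2, 3, 4)

  // No yield: a side-effecting loop, which corresponds to foreach.
  val seen = ListBuffer.empty[Int]
  for (n <- numbers) seen += n
  // roughly: numbers.foreach(n => seen += n)

  // A guard on the generator corresponds to a withFilter call.
  val evensDoubled: List[Int] =
    for (n <- numbers if n % 2 == 0) yield n * 2

  // Hand-written approximation of the translated form.
  val evensDoubledDesugared: List[Int] =
    numbers.withFilter(n => n % 2 == 0).map(n => n * 2)
}
```

Neither foreach nor withFilter is part of the usual definition of a monad, which is why the for syntax is better described as a generalized comprehension than as pure monadic sugar.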

They are really a fusion of monad comprehensions and list comprehensions, generalized to arbitrary collections and monads. Note that Haskell also has both, but as separate features; in C# and Scala, they are fused together.

Haskell has a lot of history, and has pioneered a lot of concepts. As is typical with pioneers, the people who come after them often discover better, shorter, safer routes. Maybe if Haskell had been designed with hindsight instead of innovation, its designers also would have generalized list comprehensions to work with more collections, and would have fused monad comprehensions and generalized collection comprehensions together. Who knows? I know that there are proposed variants and extensions to Haskell which add Arrow Comprehensions, for example.

> The words "for" and "yield" seem likely to confuse programmers who are used to, for example, Java's for loop and Ruby's yield.

The first one is intentional. It is supposed to look like a for loop.

The second one is unfortunate. The word yield has two meanings (actually related, but not obviously so). One is the meaning used in concurrency, coroutines, fibres and threads, and for the yield keyword and the Fiber.yield method in Ruby, where it means that a piece of code yields control of execution to another piece of code. The other is the meaning of a computation yielding a result. That's the meaning that is used in Python generators, C# generators, and interestingly also in Ruby, in the Enumerator::Yielder#yield method, and it is the interpretation that is meant for the yield keyword in Scala for comprehensions.

OTHER TIPS

I don't have much background on why the Scala designers made that particular wording choice, but in F#, the local equivalent of "do notation" is called a "computation expression". The reasoning behind that word choice is one of PR and marketing rather than any concrete technical reason.

Haskell has a hard-earned reputation of being a language for academics, rocket scientists and other weirdos, and one of the things that perpetuates this notion is its community's insistence on using type-theory derived terms out in the open. This is also a form of marketing: you could say it's a conscious choice that makes the language appealing to certain kinds of people (functional programming hipsters) and off-putting to others (say, enterprise adopters).

For F#, they went with "computation expressions" and "computation expression builders" instead of "do notation" and "monads" supposedly in an attempt to not scare away people who are likely to pick up the language (existing .NET programmers who don't have a functional programming background).

I would imagine the case is similar for Scala. By calling it "for comprehensions", they want to sell the feature to people who might know nothing of monads, but who are familiar with list comprehensions in languages like Python.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange