Why do some languages decouple operations from data type?

https://softwareengineering.stackexchange.com/questions/419466

18-03-2021

Question

This is a pretty difficult question to frame. But I'll try my best.

In some languages, data types are decoupled from the operations they can perform, while in other languages like JavaScript they are tied together.

An example: arrays in JavaScript have a filter method, so you could easily do

array.filter(condition-function)

Other languages do this with a separate Array class providing static utility methods for handling arrays:

Array.filter(array,condition-function)

Seems the second way is usually done in functional languages.

I've got 2 questions:

What is this concept called? So I can do more research on it.
What are some merits to using the second option over the first?

Solution

The question is about how tightly operations are coupled to the types of their operands. Let’s take the example of a hypothetical type List in a hypothetical language.

Core operations on the type

First of all, there is a question of semantics:

the type List does not make any sense without the basic operations to build it and to access its elements;
a mutable List moreover is not very helpful if you can’t add and remove list elements;

So this set of core operations belongs to the abstract type, regardless of how these operations are implemented.

Non core operations on the type

There are more elaborate operations that are very useful when working with List, such as filter, map, reduce or simply sort. But many use cases of List don’t necessarily need them. In other words, you can imagine the List type without these advanced operations.

Moreover, these operations can be implemented efficiently, and in different ways, without access to the type’s internals. So, as a designer, you are free to bundle such operations with the type or to provide them separately.

Lastly, there is a grey zone, for example size. You don’t really need the size of the List to define the abstract type. But many algorithms use the size, and it can only be implemented efficiently with access to the internals. This is why it’s mostly provided together with the type.
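
To make the distinction concrete, here is a minimal sketch in Python. The names SimpleList and filter_list are hypothetical and chosen only for illustration: the class bundles the core operations plus size, while filter lives outside the type and uses nothing but the public interface.

```python
class SimpleList:
    """Hypothetical list type exposing only core operations plus size."""

    def __init__(self, items=()):
        self._items = list(items)  # internal storage, hidden from callers

    # Core operations: access, add, remove.
    def get(self, index):
        return self._items[index]

    def add(self, item):
        self._items.append(item)

    def remove(self, item):
        self._items.remove(item)

    # Grey zone: cheap only because it can see the internals.
    def size(self):
        return len(self._items)


def filter_list(source, predicate):
    """Non-core operation written against the public interface only."""
    result = SimpleList()
    for i in range(source.size()):
        item = source.get(i)
        if predicate(item):
            result.add(item)
    return result


evens = filter_list(SimpleList([1, 2, 3, 4]), lambda n: n % 2 == 0)
print([evens.get(i) for i in range(evens.size())])  # prints [2, 4]
```

Whether filter_list ships with SimpleList or in a separate module is then purely a packaging decision, which is the freedom described above.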

Generic algorithms and specializations

Until now, I have presented the operations from the point of view of the types. But we may also consider the point of view of the software developers.

Filter, map, and reduce are very generic algorithms that make sense with many underlying containers (lists, arrays, queues, sets, bags, maps, ...). So you can design a generic algorithm independently of the type on which you’ll use it:

More knowledge is needed on the algorithms than on the datatype. So a different team could develop them, with a different release schedule.
Depending on the application area, you could think of specialised adaptations of the algorithm that take domain-specific knowledge into account.
There are even further specializations of the reduce: average, standardDeviation, minimum, maximum, median, .... Where to draw the boundary to keep a general, useful but not cluttered type?

All these arguments are in favour of a decoupling of the non-core operations. But here as well, there is a grey area: I can imagine that some of these algorithms could be implemented more efficiently if the internals are known.
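
As a rough illustration of this decoupling, the following Python sketch defines a generic filter that relies only on iteration, plus two specializations of reduce. The names keep_if, minimum, average and is_big are hypothetical; they are not taken from any particular library.

```python
from functools import reduce


def keep_if(predicate, container):
    """Generic filter: needs only iteration, not the container's internals."""
    return [item for item in container if predicate(item)]


def minimum(container):
    """A specialization of reduce: keep the smaller of each pair."""
    return reduce(lambda a, b: a if a <= b else b, container)


def average(container):
    """Another specialization: fold into (sum, count), then divide.

    Assumes a non-empty container; an empty one raises ZeroDivisionError.
    """
    total, count = reduce(
        lambda acc, x: (acc[0] + x, acc[1] + 1), container, (0, 0)
    )
    return total / count


def is_big(n):
    return n > 2


print(keep_if(is_big, [1, 2, 3, 4]))          # works on a list
print(keep_if(is_big, (1, 2, 3, 4)))          # works on a tuple
print(sorted(keep_if(is_big, {1, 2, 3, 4})))  # works on a set (sorted, since set order is unspecified)
print(minimum([3, 1, 2]), average([1, 2, 3, 4]))
```

None of these functions knows how the container stores its elements, so they can evolve separately from the container types. The price is the grey area mentioned above: a version that knew the internals could sometimes be faster.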

Conclusion

It’s difficult to weigh advantages and drawbacks without knowing the context. I have therefore highlighted the criteria that will help you make the analysis yourself.

So there’s no best way. It always depends on the context. Some language designs you could look at:

Ada language is designed around the definition of abstract types (with or without object orientation) and modularity and therefore promotes minimalist types.
C++ provides containers in its standard library. It separates the container libraries (rather minimalist, but with a couple of comfort operations and the operations needed to enable generics) from the generic algorithm libraries. The template mechanism allows specialisations that exploit knowledge of the type to which the algorithm is applied.
JavaScript provides filter, map, reduce as part of the array type. This is probably related to the history of the language, which at its inception was tightly linked to the internet browser domain. Performance considerations in early implementations may also have played a role.

OTHER TIPS

This is a broad question, partly because a supposed method call myobject.method(args) can actually mean very different things depending on the programming language.

In languages like Lua, method could just be a member variable of myobject which happens to be a function, and you call that function.
```
myobject = { method = function(args) ... end }
```
Because it is inefficient to store all of an object's methods in it as data, there are often mechanisms to re-use definitions.
- Python has classes, so myobject.method(args) is the same as myobject.__class__.method(myobject, args). That's why member functions have a self argument.
- Lua has metatables, and an explicit call syntax: myobject:method(args) is the same as myobject.method(myobject, args).
- JavaScript has prototypes and this.

Languages like C++ or Java have a concept of classes and virtual methods.
In D (through uniform function call syntax) and Rust, myobject.method(args) is translated to a plain function call of the form method(myobject, args).
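
The Python equivalence mentioned above can be checked directly. This is a small sketch with a hypothetical Greeter class; type(obj) is used in place of obj.__class__, which gives the same class for ordinary objects.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        # self is the receiver, passed implicitly by obj.greet(...)
        return f"{greeting}, {self.name}"


obj = Greeter("world")

via_object = obj.greet("hello")            # lookup through the instance
via_class = type(obj).greet(obj, "hello")  # same function, receiver passed explicitly

print(via_object == via_class)  # prints True
```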

So moving beyond the syntactic differences between myobject.method(args) and method(myobject, args), let me try and reformulate the core of your question as this:

If I write method(myobject, args) and there are multiple objects which have a method -- like lists, arrays, dictionaries all have a map -- how do we make sure the correct one gets called?

This problem is called method dispatch, and there are various approaches:

Allow only a single function called method in the current namespace. A lot of functional languages take this approach.
Statically resolve the method from the type information available at compile time. Function overloading, generics and Haskell's typeclasses fit here.
Inspect myobject at runtime and decide what to call. That's single-dispatch polymorphism, present in most OO languages. My Lua/Python examples above are idioms for implementing single dispatch in these languages.
Inspect all arguments at runtime and decide what to call. That's multiple dispatch, used extensively in the scientific computing language Julia.
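
The two runtime approaches can be sketched in Python for the function-call style method(myobject, args). The first half uses the standard library's functools.singledispatch, which picks an implementation from the type of the first argument. The second half is a hand-rolled and deliberately naive multiple-dispatch table; the names describe, combine and register_combine are hypothetical, and this is not how Julia implements it.

```python
from functools import singledispatch


@singledispatch
def describe(container):
    """Fallback used when no more specific implementation is registered."""
    return "something else"


@describe.register
def _(container: list):
    return "a list"


@describe.register
def _(container: dict):
    return "a dict"


# Naive multiple dispatch: look at the runtime types of both arguments.
_combine_table = {}


def register_combine(left_type, right_type):
    def decorator(func):
        _combine_table[(left_type, right_type)] = func
        return func
    return decorator


def combine(left, right):
    # Exact type match only; a fuller version would also consider subclasses.
    func = _combine_table.get((type(left), type(right)))
    if func is None:
        raise TypeError(
            f"no combine() for {type(left).__name__}, {type(right).__name__}"
        )
    return func(left, right)


@register_combine(list, list)
def _(left, right):
    return left + right


@register_combine(list, dict)
def _(left, right):
    return left + list(right.items())


print(describe([1]), describe({}), describe(3))
print(combine([1], [2]), combine([1], {"a": 2}))
```

In both cases the operation is defined outside the data types, yet the correct implementation is still selected from the runtime types of the arguments.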

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange