Why do some languages decouple operations from data type?
https://softwareengineering.stackexchange.com/questions/419466
18-03-2021
Question
This is a pretty difficult question to frame. But I'll try my best.
In some languages, data types are decoupled from the operations they can perform, while in other languages like JavaScript they are tied together.
An example: arrays in JavaScript have a filter method, so you can simply write
array.filter(condition-function)
Other languages do this with a separate Array class providing static utility methods for handling arrays:
Array.filter(array,condition-function)
Seems the second way is usually done in functional languages.
I've got 2 questions:
- What is this concept called? So I can do more research on it.
- What are some merits to using the second option over the first?
Solution
The question is about how tightly operations are coupled to the types of their operands. Let’s take the example of a hypothetical type List in a hypothetical language.
Core operations on the type
First of all, there is a question of semantics:
- the type List does not make any sense without the basic operations to build it and to access its elements;
- a mutable List, moreover, is not very helpful if you can’t add and remove list elements.
So this set of core operations belongs to the abstract type, regardless of how these operations are implemented.
Non core operations on the type
There are more elaborate operations that are very useful when working with List, such as filter, map, reduce or simply sort. But many use cases of List don’t necessarily need them. In other words, you can imagine the List type without these advanced operations.
Moreover, you can imagine efficient implementations of these operations that need no access to the type’s internals. So, as a designer, you are free to bundle such operations with the type or to provide them separately.
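As a rough sketch of this split in Python (the class SimpleList and the free function filter_items are hypothetical names invented for illustration, not part of any library), the type exposes only core operations, and a non-core operation is written outside it using nothing but the public interface:

```python
class SimpleList:
    """Hypothetical minimal list type: only core operations live here."""

    def __init__(self):
        self._items = []  # internal storage, not touched by outside code

    def add(self, item):
        self._items.append(item)

    def remove(self, item):
        self._items.remove(item)

    def __iter__(self):
        # Element access: the one thing external algorithms rely on.
        return iter(self._items)


def filter_items(items, predicate):
    """Non-core operation, provided separately from the type.

    It only iterates and calls add(), so it needs no access to _items.
    """
    result = SimpleList()
    for item in items:
        if predicate(item):
            result.add(item)
    return result


numbers = SimpleList()
for n in [1, 2, 3, 4]:
    numbers.add(n)

evens = filter_items(numbers, lambda n: n % 2 == 0)
```

Here filter_items could equally have been a method of SimpleList; the point is only that nothing forces it to be one.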
Lastly, there is a grey zone, for example size. You don’t really need the size of the List to define the abstract type. But many algorithms use the size, and it can only be implemented efficiently with access to the internals. This is why it’s mostly provided together with the type.
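To make the size trade-off concrete, here is a small Python sketch. CountedList and external_size are hypothetical names, and the constant-time claim assumes the internal storage is a built-in Python list:

```python
class CountedList:
    """Hypothetical list type that bundles size() with the type."""

    def __init__(self, items=()):
        self._items = list(items)  # internal storage

    def __iter__(self):
        return iter(self._items)

    def size(self):
        # Inside the type: the storage already knows its length.
        return len(self._items)


def external_size(items):
    # Outside the type: only iteration is available,
    # so every element has to be visited once.
    return sum(1 for _ in items)


letters = CountedList("abc")
```

Both functions give the same answer, but only the bundled one avoids walking the whole list.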
Generic algorithms and specializations
Until now, I have presented the operations from the point of view of the types. But we may also consider the point of view of the software developers.
Filter, map, and reduce are very generic algorithms that make sense with many underlying containers (lists, arrays, queues, sets, bags, maps, ...). So you can design a generic algorithm independently of the type on which you’ll use it:
- More knowledge is needed on the algorithms than on the datatype. So a different team could develop them, with a different release schedule.
- Depending on the application area, you could think of specialised adaptations of the algorithm that take domain-specific knowledge into account.
- There are even further specializations of reduce: average, standardDeviation, minimum, maximum, median, .... Where do you draw the boundary to keep the type general and useful, but not cluttered?
All these arguments are in favour of a decoupling of the non-core operations. But here as well, there is a grey area: I can imagine that some of these algorithms could be implemented more efficiently if the internals are known.
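A minimal Python sketch of such decoupled, generic algorithms follows. The function names keep_if, average and maximum are made up for this example, and the only assumption about the containers is that they can be iterated:

```python
from functools import reduce


def keep_if(container, predicate):
    """Generic filter: works on any iterable, whatever its concrete type."""
    return [item for item in container if predicate(item)]


def average(values):
    # A specialization of reduce: fold the values into (sum, count).
    # Raises ZeroDivisionError for an empty container.
    total, count = reduce(
        lambda acc, value: (acc[0] + value, acc[1] + 1), values, (0, 0)
    )
    return total / count


def maximum(values):
    # Another specialization of reduce: keep the larger of each pair.
    return reduce(lambda a, b: a if a >= b else b, values)


def is_even(n):
    return n % 2 == 0


from_list = keep_if([1, 2, 3, 4], is_even)
from_tuple = keep_if((1, 2, 3, 4), is_even)
from_dict_keys = keep_if({1: "a", 2: "b"}, is_even)
```

None of these functions belongs to a container type, so a new container gets them for free as long as it is iterable. The price is that they cannot exploit the internals of any particular container.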
Conclusion
It’s difficult to think about advantages and inconveniences without knowing the context. I have therefore highlighted the criteria that will help you to make the analysis yourself.
So there’s no best way. It always depends on the context. Some language designs you could look at:
- The Ada language is designed around the definition of abstract types (with or without object orientation) and modularity, and therefore promotes minimalist types.
- C++ provides containers in its standard library. It separates the container libraries (rather minimalist, but with a couple of comfort operations and the operations needed to enable generics) from the generic algorithm libraries. The template mechanism allows specialisations that exploit knowledge of the type to which the algorithm is applied.
- JavaScript provides filter, map and reduce as part of the array type. This is probably related to the history of the language, which was at its inception tightly linked to the internet browser domain. Performance considerations in early implementations may also have played a role.
OTHER TIPS
This is a broad question, partly because a supposed method call myobject.method(args) can actually mean very different things depending on the programming language.
- In languages like Lua, method could just be a member variable of the object which happens to be a function, and you call that function:
  myobject = { method = function(args) ... end }
  Because it is inefficient to store all of an object's methods in it as data, there are often mechanisms to re-use definitions:
  - Python has classes, so myobject.method(args) is the same as myobject.__class__.method(myobject, args). That's why member functions have a self argument.
  - Lua has metatables and an explicit call syntax: myobject:method(args) is the same as myobject.method(myobject, args).
  - JavaScript has prototypes and this.
- Languages like C++ or Java have a concept of classes and virtual methods.
- In D and Rust, myobject.method(args) is translated as method(myobject, args).
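The Python case can be checked directly. In this small sketch the class Greeter and its method greet are invented for illustration, and the equivalence assumes that no instance attribute shadows the method name:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        # 'self' is the object the method was looked up on.
        return f"{greeting}, {self.name}"


myobject = Greeter("Ada")

# Usual method-call syntax: Python passes myobject as 'self'.
bound = myobject.greet("Hello")

# The same call spelled out through the class.
explicit = myobject.__class__.greet(myobject, "Hello")

# A plain function stored on the instance, as in the Lua table example:
# it is just data that happens to be callable, so no 'self' is passed.
myobject.shout = lambda text: text.upper()
shouted = myobject.shout("hi")
```

Both bound and explicit hold the same string, while shout behaves like an ordinary function stored in a field.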
So, moving beyond the syntactic differences between myobject.method(args) and method(myobject, args), let me try to reformulate the core of your question as this:
If I write method(myobject, args) and there are multiple objects which have a method (lists, arrays and dictionaries all have a map, for example), how do we make sure the correct one gets called?
This problem is called method dispatch, and there are various approaches:
- Allow only a single method called method in the current namespace. A lot of functional languages take this approach.
- Statically resolve the method depending on the type information available. Function overloading, generics and Haskell's typeclasses fit here.
- Inspect myobject at runtime and decide what to call. That's single dispatch polymorphism, present in most OO languages. My Lua/Python examples above are idioms for implementing single dispatch in these languages.
- Inspect all arguments at runtime and decide what to call. That's multiple dispatch, used extensively at the moment in the scientific computing language Julia.
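Single dispatch can also be had with the method(myobject, args) spelling. As one possible sketch in Python, the standard library's functools.singledispatch picks an implementation from the runtime type of the first argument. The function name map_values and its two registered implementations are hypothetical:

```python
from functools import singledispatch


@singledispatch
def map_values(container, function):
    # Fallback when no implementation is registered for this type.
    raise NotImplementedError(
        f"map_values is not defined for {type(container).__name__}"
    )


@map_values.register(list)
def _map_list(container, function):
    return [function(item) for item in container]


@map_values.register(dict)
def _map_dict(container, function):
    # For dictionaries, map over the values and keep the keys.
    return {key: function(value) for key, value in container.items()}


def double(x):
    return x * 2


doubled_list = map_values([1, 2, 3], double)
doubled_dict = map_values({"a": 1, "b": 2}, double)
```

The caller writes the same map_values(...) in both cases, and the runtime type of the first argument decides which body runs. Only that first argument is inspected, so this is single dispatch, not multiple dispatch.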