Why is “tight coupling between functions and data” bad?

https://softwareengineering.stackexchange.com/questions/212515

30-09-2020

Question

I found this quote in "The Joy of Clojure" on p. 32, but someone said the same thing to me over dinner last week and I've heard it other places as well:

[A] downside to object-oriented programming is the tight coupling between function and data.

I understand why unnecessary coupling is bad in an application. Also I'm comfortable saying that mutable state and inheritance should be avoided, even in Object-Oriented Programming. But I fail to see why sticking functions on classes is inherently bad.

I mean, adding a function to a class seems like tagging a mail in Gmail, or sticking a file in a folder. It's an organizational technique that helps you find it again. You pick some criteria, then put like things together. Before OOP, our programs were pretty much big bags of methods in files. I mean, you have to put functions somewhere. Why not organize them?

If this is a veiled attack on types, why don't they just say that restricting the type of input and output to a function is wrong? I'm not sure whether I could agree with that, but at least I'm familiar with arguments pro and con type safety. This sounds to me like a mostly separate concern.

Sure, sometimes people get it wrong and put functionality on the wrong class. But compared to other mistakes, this seems like a very minor inconvenience.

So, Clojure has namespaces. How is sticking a function on a class in OOP different from sticking a function in a namespace in Clojure and why is it so bad? Remember, functions in a class don't necessarily operate just on members of that class. Look at java.lang.StringBuilder - it operates on any reference type, or through auto-boxing, on any type at all.

P.S. This quote references a book which I have not read: Multiparadigm Programming in Leda: Timothy Budd, 1995.

Solution

In theory, loose function-data coupling makes it easier to add more functions that work on the same data. The downside is that it makes the data structure itself harder to change, which is why, in practice, well-designed functional code and well-designed OOP code have very similar levels of coupling.

Take a directed acyclic graph (DAG) as an example data structure. In functional programming, you still need some abstraction to avoid repeating yourself, so you're going to make a module with functions to add and delete nodes and edges, find nodes reachable from a given node, create a topological sorting, etc. Those functions are effectively tightly coupled to the data, even though the compiler doesn't enforce it. You can add a node the hard way, but why would you want to? Cohesiveness within one module prevents tight coupling throughout the system.
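As a rough sketch of what such a module might look like (Python is used here only for illustration; the function names and the dict-of-frozensets representation are assumptions, not anything prescribed above):

```python
# Hypothetical "dag" module: free functions over a plain dict that maps
# each node to a frozenset of its direct successors.

def empty():
    """Return a DAG with no nodes."""
    return {}

def add_node(dag, node):
    """Return a DAG that contains node; the input dict is not modified."""
    if node in dag:
        return dag
    return {**dag, node: frozenset()}

def reachable(dag, start):
    """Return the set of nodes reachable from start by following edges."""
    seen = set()
    stack = list(dag.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag[node])
    return seen

def add_edge(dag, src, dst):
    """Return a DAG with the edge src -> dst, refusing edges that form a cycle."""
    dag = add_node(add_node(dag, src), dst)
    if src == dst or src in reachable(dag, dst):
        raise ValueError(f"edge {src!r} -> {dst!r} would create a cycle")
    return {**dag, src: dag[src] | {dst}}

g = add_edge(add_edge(empty(), "a", "b"), "b", "c")
```

Nothing stops a caller from writing `{**g, "d": frozenset()}` by hand, or from building a cycle that way. The coupling between these functions and the dict layout is real, but it is kept only by convention.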

Conversely, on the OOP side, anything beyond the basic DAG operations is going to live in separate "view" classes, with the DAG object passed in as a parameter. It's just as easy to add as many views as you want that operate on the DAG data, which creates the same level of function-data decoupling you would find in the functional program. The compiler won't keep you from cramming everything into one class, but your colleagues will.
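A minimal sketch of that split, again in Python and with hypothetical names (`Dag`, `DotView`): the class holds only the core operations, and the view reaches the data through public methods alone.

```python
class Dag:
    """Core graph operations only; the representation stays private."""

    def __init__(self):
        self._successors = {}

    def add_edge(self, src, dst):
        # Cycle checking is left out to keep the sketch short.
        self._successors.setdefault(src, set()).add(dst)
        self._successors.setdefault(dst, set())

    def nodes(self):
        return list(self._successors)

    def successors(self, node):
        return set(self._successors[node])


class DotView:
    """A separate view: renders a Dag as Graphviz-style text."""

    def __init__(self, dag):
        self._dag = dag

    def render(self):
        lines = ["digraph {"]
        for node in sorted(self._dag.nodes()):
            for succ in sorted(self._dag.successors(node)):
                lines.append(f"  {node} -> {succ};")
        lines.append("}")
        return "\n".join(lines)
```

Adding another view means adding another class that takes a `Dag`. `Dag` itself does not change.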

Changing programming paradigms doesn't change the best practices of abstraction, cohesion, and coupling; it just changes which practices the compiler helps you enforce. In functional programming, when you want function-data coupling, it's enforced by gentlemen's agreement rather than the compiler. In OOP, the model-view separation is enforced by gentlemen's agreement rather than the compiler.

OTHER TIPS

In case you haven't come across it already, consider this insight: objects and closures are two sides of the same coin. So what is a closure? It is a function that takes variables or data from its surrounding scope and binds them inside itself. From an OO perspective you effectively do the same thing when, for example, you pass something into a constructor so that a member function of that instance can use it later. But taking things from the surrounding scope is not a nice thing to do: the larger the surrounding scope, the more evil it is (though pragmatically, some evil is often necessary to get work done). Global variables take this to the extreme, with functions using variables at program scope, which is really, really evil. There are good descriptions elsewhere of why global variables are evil.
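To make the correspondence concrete, here is a small Python sketch (the names `make_counter` and `Counter` are made up for the example). Both versions bind `step` and a running count to a function. The closure captures them implicitly from the enclosing scope, while the class spells them out in its constructor.

```python
def make_counter(step):
    """Closure version: `step` and `count` are captured from this scope."""
    count = 0

    def increment():
        nonlocal count
        count += step
        return count

    return increment


class Counter:
    """Object version: the same captured data, made explicit as fields."""

    def __init__(self, step):
        self._step = step
        self._count = 0

    def increment(self):
        self._count += self._step
        return self._count


closure_counter = make_counter(2)
object_counter = Counter(2)
```

Calling `closure_counter()` and `object_counter.increment()` repeatedly yields the same sequence of values. The difference lies only in how visible the captured state is.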

If you follow OO techniques, you have basically already accepted that every module in your program will have a certain minimum level of evil. If you take a functional approach, you are aiming for an ideal where no module in your program contains closure evil. You may still have some, but a lot less than with OO.

That's the downside of OO - it encourages this kind of evil, coupling of data to function through making closures standard (a kind of a broken window theory of programming).

The only plus side is that, if you knew you were going to use lots of closures to start with, OO at least provides an ideological framework to help organise that approach so that the average programmer can understand it. In particular, the variables being closed over are explicit in the constructor rather than taken implicitly in a function closure. Functional programs that use lots of closures are often more cryptic than the equivalent OO program, though not necessarily less elegant :)

It's about type coupling:

A function built into an object to work on that object can't be used on other types of objects.

In Haskell you write functions against type classes, so any given function can work on many different types, as long as each one is an instance of the type class the function requires.

Free-standing functions allow this kind of decoupling. You don't get it when you write your functions inside type A, because then you can't use them unless you have a type A instance, even though the function might otherwise be general enough to work on a type B or type C instance.
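A loose Python analogue, offered only as a sketch: `typing.Protocol` stands in for a type class here (the two are not equivalent), and `HasDuration`, `Track`, and `Meeting` are invented for the example. The free function is usable with any type that has a `duration`, not just one class.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class HasDuration(Protocol):
    """Structural requirement: anything with a numeric `duration`."""
    duration: float


@dataclass
class Track:
    title: str
    duration: float


@dataclass
class Meeting:
    topic: str
    duration: float


def total_duration(items: Iterable[HasDuration]) -> float:
    """Free function: works on Tracks, Meetings, or any mix of the two."""
    return sum(item.duration for item in items)
```

Had `total_duration` been written as a method on a hypothetical `Playlist` class, reusing it for meetings would mean either building a `Playlist` of meetings or copying the code.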

In Java and similar incarnations of OOP, instance methods (unlike free functions or extension methods) can't be added from other modules.

This becomes more of a restriction when you consider interfaces, which can only be implemented by instance methods. You can't define an interface and a class in different modules and then use code from a third module to bind them together. A more flexible approach, like Haskell's type classes, should be able to do that.
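For comparison, Python's `functools.singledispatch` allows something in this spirit: a generic function is declared in one place, and implementations for existing types are registered from elsewhere without touching those types. The sketch below is an illustration of the idea, not a model of Haskell's mechanism, and `to_json` is a hypothetical function name.

```python
from fractions import Fraction
from functools import singledispatch


# "Interface" module: declares the generic operation.
@singledispatch
def to_json(value):
    raise TypeError(f"no to_json implementation for {type(value).__name__}")


# "Third" module: binds the operation to types it does not own.
@to_json.register(int)
def _int_to_json(value):
    return str(value)


@to_json.register(Fraction)
def _fraction_to_json(value):
    return f'"{value.numerator}/{value.denominator}"'
```

Neither `int` nor `Fraction` had to be edited or subclassed, which is the flexibility that instance-method-only interfaces lack.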

Object Orientation is fundamentally about Procedural Data Abstraction (or Functional Data Abstraction if you take away side-effects which are an orthogonal issue). In a sense, Lambda Calculus is the oldest and most pure Object-Oriented language, since it only provides Functional Data Abstraction (because it doesn't have any constructs besides functions).

Only the operations of a single object can inspect that object's data representation. Not even other objects of the same type can do that. (This is the main difference between Object-Oriented Data Abstraction and Abstract Data Types: with ADTs, objects of the same type can inspect each other's data representation, only the representation of objects of other types is hidden.)

What this means is that several objects of the same type may have different data representations. Even the very same object may have different data representations at different times. (For example, in Scala, Maps and Sets switch between an array and a hash trie depending on the number of elements: for very small collections, linear search in an array is faster than logarithmic search in a search tree, because its constant factors are so small.)
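A toy Python sketch of an object that changes its own representation behind a stable interface. The class name `AdaptiveSet` and the threshold of 4 are arbitrary choices for the example, and this is not how Scala's collections are actually implemented.

```python
class AdaptiveSet:
    """Stores items in a list while small, then switches to a hash set."""

    _THRESHOLD = 4  # arbitrary cutoff chosen for illustration

    def __init__(self):
        self._items = []  # small representation: linear search in a list

    def add(self, item):
        if item in self._items:
            return
        if isinstance(self._items, list):
            self._items.append(item)
            if len(self._items) > self._THRESHOLD:
                # Grown past the cutoff: switch to the hashed representation.
                self._items = set(self._items)
        else:
            self._items.add(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)
```

Callers only ever use `add`, `in`, and `len`, so they cannot tell which representation is currently in use.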

From outside an object, you shouldn't know its data representation, and in fact you can't. That's the opposite of tight coupling.

Tight coupling between data and functions is bad because you want to be able to change each independently of the other. Tight coupling makes this hard: you can't change one without knowing about, and possibly changing, the other.

You want to be able to present different data to a function without changing the function. Similarly, you want to be able to change the function without having to change the data it operates on.

Licensed under: CC-BY-SA with attribution
