What is “premature abstraction”?

https://softwareengineering.stackexchange.com/questions/386570

20-02-2021

Question

I've heard the phrase being thrown around, and to me the arguments sound completely insane (sorry if I'm strawmanning here; it's not my intention). Generally it goes something along the lines of:

You don't want to create an abstraction before you know what the general case is, otherwise (1) you might be putting things in your abstractions that don't belong, or (2) omitting things of importance.

(1) To me this sounds like the programmer isn't being pragmatic enough: they have made assumptions that things would exist in the final program that don't, so they are working at too low a level of abstraction. The problem isn't premature abstraction, it's premature concretion.

(2) Omitting things of importance is one thing; it's entirely possible that something omitted from the spec later turns out to be important. The solution to this isn't to come up with your own concretion and waste resources when you find out you guessed wrong; it's to get more information from the client.

We should always be working from abstractions down to concretions as this is the most pragmatic way of doing things, and not the other way around.

If we don't do so, we risk misunderstanding clients and creating things that need to be changed. If we only build the abstractions the clients have defined in their own language, we never hit this risk (or at least it is nowhere near as likely as when taking a shot in the dark with some concretion). Yes, it's possible clients change their minds about the details, but the abstractions they originally used to communicate what they want tend to remain valid.

Here is an example. Let's say a client wishes you to create an item-bagging robot:

import java.util.Collection;

public abstract class BaggingRobot {
    private Collection<Item> items;

    public abstract void bag(Item item);
}

We are building something from the abstractions the client used without going into more detail about things we don't know. This is extremely flexible. I've seen this called "premature abstraction" when in reality it would be more premature to assume how the bagging is implemented. Let's say that after discussing with the client, they want more than one item to be bagged at once. In order to update my class all I need to do is change the signature, but for someone who started bottom-up that might involve a large system overhaul.

There is no such thing as premature abstraction, only premature concretion. What is wrong with this statement? Where are the flaws in my reasoning? Thanks.

Solution

At least in my opinion, premature abstraction is fairly common, and was especially so early in the history of OOP.

At least from what I saw, the major problem that arose was that people read through the typical examples of object oriented hierarchies. They got told a lot about making everything ready to deal with future changes that might arise (even though there was no particularly good reason to believe they would). Another theme common to many articles for a while was things like the platypus, which defies simple rules about "mammals are all like this" or "birds are all like that."

As a result, we ended up with code that really only needed to deal with, say, records of employees, but was carefully written to be ready if you ever hired an arachnid or maybe a crustacean.
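To make that concrete, here is a deliberately exaggerated sketch. All of the class names are invented for illustration and do not come from any real codebase. The first group is the taxonomy-style hierarchy; the last class is what an employee-records program plausibly needed.

```java
// Over-general: a biological taxonomy that no requirement asked for.
abstract class Organism {
    abstract int legCount();
}

abstract class Mammal extends Organism {
}

class Human extends Mammal {
    @Override
    int legCount() {
        return 2;
    }
}

class OverGeneralEmployee extends Human {
    final String name;

    OverGeneralEmployee(String name) {
        this.name = name;
    }
}

// What the program actually had to handle: a record of an employee.
class Employee {
    final String name;
    final String department;

    Employee(String name, String department) {
        this.name = name;
        this.department = department;
    }
}
```

Nothing in the first hierarchy is wrong as such; it just answers questions (how many legs does an employee have?) that the program never asks.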

OTHER TIPS

It's important to remember that abstraction is a means to an end. You use abstraction to make behavior uniform across your program and to make adding new classes straightforward where you expect new classes to be added, but only when the abstraction is needed at that very moment.

You wouldn't add abstraction simply because you might need it (with no real basis for thinking you'll need to add new classes in the future). A proper metaphor here might be plumbing. Hopefully you'll agree that a 6-directional pipe which allows water to flow up/down, east/west, north/south would be the most flexible type of pipe you could have. You could theoretically use a 6-directional pipe anywhere where a pipe is required, and block off the unneeded directions, right?

If you tried to fix a leak under your sink and found that all the sections of pipe were 6-directional pieces, you'd want to pull your hair out in frustration at the guy who designed it that way. Not only do you not know where the problem is, but it would almost certainly be more straightforward to start from scratch and do it properly.

Of course coding isn't plumbing, but the metaphor still stands. Abstraction is like using those 6-directional pipe pieces. Use them when you honestly believe you may one day in the near future need to connect pipes from all 6 directions. Otherwise abstraction is simply complication, not much different than using a pattern where none is required or using a god class which attempts to do everything. If it is not being used, and will not likely ever be used, you're ultimately adding an additional class for nothing.

Admittedly, the art of writing programs is very very abstract conceptually. It's simply worth mentioning that abstractions don't exist for the sake of being an abstraction but because they're practical in some real way. If you feel the need to use abstraction, so be it, but don't ask me to check your plumbing afterwards. ;)

When I learned about Object Oriented Analysis and Design, many years ago, we would start with a plain English description of the system the customer needed.

Looking through that, any noun (or noun phrase) would be considered as a possible class. Any verbs (or verb phrases) would be potential methods on classes. So "bagging robot" could be a class BaggingRobot. "Open a bag" might become method OpenBag.

After a few iterations, this would turn into a class diagram.

At this point, there are no abstract classes. The customer doesn't want an abstract concept of a bagging robot. They want a robot that puts things in bags. All the classes are concrete and have a set of methods.

Abstract classes are only introduced when it becomes clear that:

There are several similar classes that could form a hierarchy.
They actually share something in common, so that a base class performs a useful purpose.

To me, "premature abstraction" is assuming that any BaggingRobot must inherit from some BaseRobot, and, worse, trying to develop a set of methods for BaseRobot before you even know what is common to all robots.
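A rough sketch of the difference, assuming Java and some made-up details: the methods on BaseRobot and the String items are hypothetical, chosen only for illustration. The base class is a set of guesses about all robots; the concrete class only models what the customer described.

```java
import java.util.ArrayList;
import java.util.List;

// Premature: designed before a second kind of robot exists.
// Each method is a guess about what "all robots" will need.
abstract class BaseRobot {
    abstract void move(int x, int y);          // guess: do all robots move?
    abstract void recharge();                  // guess: do all robots have batteries?
    abstract void performTask(Object target);  // guess: so vague it says nothing
}

// Concrete: a robot that opens a bag and puts things in it.
class BaggingRobot {
    private final List<String> bag = new ArrayList<>();
    private boolean bagOpen = false;

    void openBag() {
        bagOpen = true;
    }

    void bag(String item) {
        if (!bagOpen) {
            throw new IllegalStateException("open a bag first");
        }
        bag.add(item);
    }

    int itemsInBag() {
        return bag.size();
    }
}
```

If a second robot turns up later and really does share behavior with BaggingRobot, the common part can be pulled up into a base class at that point, when it is known rather than guessed.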

I agree with most of what everyone else already answered here (and in the comments) but I thought your separation of Premature Concretion was interesting and thought I could add a little bit:

In practice I've usually seen Premature Abstraction to be something like deriving a Robot and having BaggingRobot extend it before there is any actual use case for another type of Robot. The reasoning is usually that since we have one type of Robot we might want another type of Robot in the future. This is what everyone else mentioned as well.

I also commonly see what I think you would define as Premature Concretion: the most common example I can think of is adding additional, unused options to methods thinking they might be useful in the future.
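As a small illustration of that kind of Premature Concretion, here is a sketch in Java. The doubleBag and maxWeightKg parameters are hypothetical, invented to stand in for options nobody has asked for yet.

```java
import java.util.ArrayList;
import java.util.List;

class Item {
    final String name;

    Item(String name) {
        this.name = name;
    }
}

// Speculative version: extra options added "just in case".
class SpeculativeBaggingRobot {
    private final List<Item> bagged = new ArrayList<>();

    // No caller needs doubleBag or maxWeightKg yet; both are accepted and ignored.
    void bag(Item item, boolean doubleBag, int maxWeightKg) {
        bagged.add(item);
    }

    int baggedCount() {
        return bagged.size();
    }
}

// Version that matches today's requirement and nothing more.
class SimpleBaggingRobot {
    private final List<Item> bagged = new ArrayList<>();

    void bag(Item item) {
        bagged.add(item);
    }

    int baggedCount() {
        return bagged.size();
    }
}
```

Both robots behave identically, but every caller of the speculative one has to pick values for options that do nothing.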

Further, I often see both of these done at the same time: abstracting unnecessary hierarchies & functions, adding unused "helpful" options, etc. The goal usually being something along the lines of making our future lives easier by anticipating future use-cases (which then often turn out to be incorrect).

Since both of these things often happen together, and since they stem from the same goal of anticipating future change, I think they are typically lumped together under the term Premature Abstraction, even though, using your definitions, separating them into Abstraction and Concretion might be more accurate. A much looser definition of Abstraction might be anything that makes the code more general, in which case you could probably lump Concretion in with Abstraction, as the additional Concretions supposedly allow for more general usage.

We should always be working from abstractions down to concretions as this is the most pragmatic way of doing things, and not the other way around.

I think your assertion holds as long as you start with the simplest Abstraction. If you had started w/ Robot and then specialized BaggingRobot out of it I think you could say BaggingRobot is a Concretion of the Robot Abstraction. So starting from the wrong abstraction, working down to concretions, and not refactoring to the simplest structure can still lead to unnecessary complexity (which in practice I think is quite common and often comes from not wanting to "waste" the time spent working on the wrong abstraction).

There is no such thing as premature abstraction, only premature concretion. What is wrong with this statement? Where are the flaws in my reasoning? Thanks.

Since you've accepted an answer it sounds like you were convinced Premature Abstraction exists a long time ago 😅. I'll just add that I think your separation of Premature Concretion from Premature Abstraction is useful, and that I think "Premature Abstraction" is commonly used to refer to both (perhaps because they tend to happen together).

"Premature abstraction" means (in a code review): "I think it is too early to tell what an appropriate abstraction to use is, and if we start using the abstraction you chose, it might make it harder to see and switch to a more appropriate abstraction later."

So if someone tells you your abstraction is "premature", here are some questions to review:

What alternative ways of doing this code did you consider?
How did you decide that this alternative, using this abstraction, is the most appropriate one?
Are there any likely upcoming changes to this code that are big or imminent?
When those changes are implemented, will this still be an appropriate abstraction?
Will thinking about the problem in terms of this abstraction make it harder to spot when another abstraction becomes more appropriate?
Will using this abstraction now make it harder to make those changes?

When people think our abstraction is premature, they might be having trouble seeing some part of these answers, or they're seeing a part that we aren't, or they disagree with us on how likely or significant or difficult some part is. From their perspective, the answers to the above questions probably don't add up to as favorable of an assessment of the abstraction.

Also, remember that in many projects, requirements are like sand: they shift and blow around over time and as things interact with them. Your question sounds like you expect that you can always get a good enough picture of what is needed to know if an abstraction is appropriate. But that's not always the case. Sometimes you don't and can't know soon enough, and sometimes you miss that you don't or can't know something relevant if you don't have enough experience with that thing.

Finally, read The Wrong Abstraction by Sandi Metz. This is a good example of how what seems like the right abstraction at one point in time can be the wrong abstraction in the future, or when you understand the problem better. When that's likely to happen soon enough to be a problem, "premature abstraction" is a good phrase for that. In particular, the All the Little Things presentation she links to in that article features an excellent example of this. Right at the moment that she says "duplication is far cheaper than the wrong abstraction", we're looking at two almost identically-shaped pieces of code which can give rise to the temptation to abstract away the commonality. And abstracting at that moment would be premature because if we do it based on just what's obvious from those two pieces of code we would come up with a different and less useful abstraction than if we waited for the next pieces of code.
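The shape of that problem can be sketched in a few lines. This is not the example from the article or the talk; the label methods below are hypothetical, made up only to show how an abstraction drawn from two cases can fit the third one badly.

```java
import java.util.Locale;

class Labels {
    // Two near-identical methods, left duplicated while only two cases are known.
    static String groceryLabel(String name) {
        return "[grocery] " + name.toUpperCase(Locale.ROOT);
    }

    static String pharmacyLabel(String name) {
        return "[pharmacy] " + name.toUpperCase(Locale.ROOT);
    }

    // The tempting abstraction based on those two cases alone:
    // "a label is a category prefix plus the upper-cased name".
    static String label(String category, String name) {
        return "[" + category + "] " + name.toUpperCase(Locale.ROOT);
    }

    // A third case arrives that keeps its original case and needs a suffix.
    // Forcing it through the same abstraction means bolting on flags.
    static String labelWithFlags(String category, String name,
                                 boolean upperCase, String suffix) {
        String text = upperCase ? name.toUpperCase(Locale.ROOT) : name;
        return "[" + category + "] " + text + (suffix == null ? "" : " " + suffix);
    }
}
```

With only the first two methods in view, label() looks obviously right. Once the third case shows up, it may turn out that the useful commonality was somewhere else entirely, and the duplicated version would have been cheaper to change.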

It sounds a bit like you want to solve a problem that does not yet exist, and that you have no good reason to assume will occur, unless you just believe the client is wrong in what they're requesting. If that's the case, I would recommend more communication over implementing a guess solution, abstract or otherwise. While it might seem innocuous to just slap "abstract" on the class definition and feel safe knowing you can extend it later, you might find that you never need to, or you may find yourself extending it in ways that over-complicate the design down the line, by abstracting the wrong things.

Going with your example, do you imagine that it is the robot itself, the items being bagged or the method of bagging that will change down the line?

Premature abstraction really just means you have no good reason to perform the abstraction and since abstractions are far from free in many languages you may be incurring overhead without justification.

Edit: To answer some points from OP in the comments, I'm updating this to clarify and respond in a way that doesn't require a long comment chain while addressing points (1) and (2) more specifically.

(1) To me this sounds like the programmer isn't being pragmatic enough: they have made assumptions that things would exist in the final program that don't, so they are working at too low a level of abstraction. The problem isn't premature abstraction, it's premature concretion.

(Emphasis mine). Assuming things would exist in the final program that don't is exactly what I see in your example. The customer asked for an item-bagging robot. This is a very specific request, if somewhat vague in its language. The customer wants a robot that bags items, so you produce a pair of objects:

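import java.util.ArrayList;
import java.util.Collection;

public class Item {
    /* Relevant item attributes as specified by customer */
}

public class BaggingRobot {
    private final Collection<Item> items = new ArrayList<>();

    public void bag(Item item) {
        // Placeholder body (an assumption): record the item as bagged.
        items.add(item);
    }
}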

Your model is simple and precise, it follows the requirements set by the customer without adding any complexity to the solution and without making an assumption that the customer wanted more than they asked for. If, when presented with this, they clarify the need for the robot to have interchangeable bagging mechanics, only then do you have enough information to justify creating an interface or other method of abstraction, since you can now point to specific value that is added to the system through the abstraction.
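If the customer does later ask for interchangeable bagging mechanics, the change might look roughly like the sketch below. The BaggingMechanism interface, its two implementations, swapMechanism, and the log are all hypothetical names invented for illustration; the point is only that the abstraction appears after the requirement does.

```java
import java.util.ArrayList;
import java.util.List;

class Item {
    final String name;

    Item(String name) {
        this.name = name;
    }
}

// Introduced only once the customer asked for interchangeable mechanics.
interface BaggingMechanism {
    String bag(Item item);
}

class PaperBagMechanism implements BaggingMechanism {
    @Override
    public String bag(Item item) {
        return item.name + " -> paper bag";
    }
}

class PlasticBagMechanism implements BaggingMechanism {
    @Override
    public String bag(Item item) {
        return item.name + " -> plastic bag";
    }
}

class BaggingRobot {
    private final List<String> log = new ArrayList<>();
    private BaggingMechanism mechanism;

    BaggingRobot(BaggingMechanism mechanism) {
        this.mechanism = mechanism;
    }

    void swapMechanism(BaggingMechanism newMechanism) {
        this.mechanism = newMechanism;
    }

    void bag(Item item) {
        // Delegate to whichever mechanism is currently fitted.
        log.add(mechanism.bag(item));
    }

    List<String> log() {
        return log;
    }
}
```

Here the interface has two real implementations on day one, so you can point to the specific value it adds instead of hoping it will pay off later.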

On the contrary, if you start with abstraction from pure pragmatism, you either spend time creating a separate interface and implementing it in a concretion, or create an abstract class, or whatever manner of abstraction you like. Should it then turn out that the customer is satisfied with this, and no further extension is necessary, you have spent time and resources in vain, and/or introduced overhead and complexity into the system for no gain.

With regard to (2), I agree that omitting things of importance is not on its face a hallmark of premature abstraction. Abstraction or not, you can omit something of importance, and unless it's found far down the line of a chain of abstraction, it will not be harder or easier to sort out either way.

Instead, I would interpret that as meaning that any abstraction runs the risk of obfuscating your domain model. Incorrect abstraction can make a system very difficult to reason about, since you are creating relationships that can drastically affect the further growth of the domain model and domain understanding. By taking your model down the wrong rabbit hole, you run the risk of omitting important information, not about the thing being abstracted, but about the system as a whole.

You always start with some concrete requirement, and the first thing is to study the requirement until you understand it.

At this point you can create an implementation, and assuming you understood the requirements and you can create bug free code, that implementation will work. And at that point it’s nice to hide implementation details so that a user of your code doesn’t need to learn unnecessary things.

So where does abstraction come in? You may figure out that your requirements are just a special case of a more abstract requirement. So instead of implementing your concrete requirements you could have implemented the abstraction, together with the adaptations needed to make it work for your case.

So when do you do that? At the point where you know the abstract requirements, and where you are given other concrete requirements that can be implemented using the same abstraction and different adaptations.

When is the abstraction premature? First, if you don't understand the abstract requirements correctly yet and overestimate or underestimate what adaptations are needed. Second, and worse, if you come up with an abstraction that doesn't fit anything but your concrete case; that will lead to a huge waste of development time, because at some point you'll have to change the abstraction. And often it is premature simply when you abstract before you have a second concrete case, because you did unnecessary work.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange