Is using the Pair class a sign of primitive obsession code smell?

https://softwareengineering.stackexchange.com/questions/385078

18-02-2021

Question

Let's say I use a Pair in this way:

Pair<Long, Date> signup = getSignup();
System.out.println("User with ID " + signup.getLeft() + " signed up on " + signup.getRight());

Is it a form of Primitive Obsession?

I could have something like

Signup signup = getSignup();
System.out.println("User with ID " + signup.getUserId() + " signed up on " + signup.getSignupDate());

If it's not a form of primitive obsession, why is that?

Solution

A rule of thumb I draw from the article is that if you use a Pair or another built-in type "just this once", you need to ensure that it can only ever be used "just this once" (within reason).

So let's consider a lambda. It is also something that can start out as being used "just this once".

When you pass a lambda as an argument to filter(), you ensure (within reason) that the lambda will only be used as that filter's predicate. You needed to tell filter() how to filter, and your filtering condition was specific to that situation, so you used a lambda to provide the predicate as concisely as possible.

If you find that predicate (lambda) being needed in more than one place and you decide to store it in a variable, in my opinion you have probably entered Primitive Obsession territory. You had a good justification for using that lambda "just this once" before, but the situation has changed. You have nothing to lose by converting it to a method with a descriptive name.
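As a rough illustration of that progression, here is a minimal sketch. The class name NameFilters, its methods, and the "blank name" rule are all hypothetical and not taken from the question. The first method keeps its lambda local to a single filter() call; the second predicate is needed twice, so it gets a descriptive name.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; the class name and the filtering rules are illustrative assumptions.
final class NameFilters {
    // Fine "just this once": the lambda exists only inside this filter() call.
    static List<String> shortNames(List<String> names) {
        return names.stream()
                .filter(n -> n.length() <= 3)
                .collect(Collectors.toList());
    }

    // Needed in more than one place, so it becomes a method with a descriptive name.
    static boolean isBlank(String name) {
        return name.trim().isEmpty();
    }

    // First use of the named predicate: drop blank names.
    static List<String> withoutBlanks(List<String> names) {
        return names.stream()
                .filter(n -> !isBlank(n))
                .collect(Collectors.toList());
    }

    // Second use of the same predicate, this time as a method reference.
    static long countBlanks(List<String> names) {
        return names.stream().filter(NameFilters::isBlank).count();
    }
}
```

Anyone reading withoutBlanks or countBlanks can see what the condition means without decoding the lambda body each time.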

So let's go back to the Pair. It is very unlikely that this Pair will be used just this once, to print out the sign-up date. You will likely soon find yourself wanting to save, load, filter, etc. these signups. Every time you add functionality, you need to take that extra bit of time to remind yourself what this Pair really means.

A way I would check whether I am really allowed to use the Pair or other primitive "just this once" is to ask whether it can "leak" so that someone else gets to use it. Then the question is what can happen when it leaks. When someone gets hold of my filter, all they can do is chuck in an item and get out a true or false. If someone finds this Pair sitting there and starts trying to use it, all sorts of havoc can happen if they don't carefully check what is going on elsewhere with that Pair. For example, they may overwrite the information, thinking the Pair means something else. When you have a SignUp class with a long userId, anyone can easily see when they are working with the user ID.

Disclaimer: you should not actually let another class access SignUp.userId directly (have getUserId() instead). I just had it this way for the example.

Another way to check is to interrogate your primitive and ask whether it really means what you intend it to mean. When you have a lambda that takes a list item and spits out a true or false, it is a predicate. You can stick it in anything that wants a predicate.

With your Pair, the two fields only have meaning as a SignUp "mentally". Everywhere you want your code to treat it as a sign-up, you need to tell it where to get the user ID and where to get the date. With a predicate, you just know it's going to give the input a thumbs up or a thumbs down.
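To make the contrast concrete, here is a minimal sketch of what a dedicated type could look like. The field names, the use of java.time.LocalDate instead of Date, and the describe() and signedUpAfter() helpers are assumptions made for illustration; the question only names getUserId() and getSignupDate().

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical Signup type; everything beyond the two getters is an assumption.
final class Signup {
    private final long userId;
    private final LocalDate signupDate;

    Signup(long userId, LocalDate signupDate) {
        this.userId = userId;
        this.signupDate = Objects.requireNonNull(signupDate, "signupDate");
    }

    long getUserId() { return userId; }

    LocalDate getSignupDate() { return signupDate; }

    // The message from the question, written once instead of at every call site.
    String describe() {
        return "User with ID " + userId + " signed up on " + signupDate;
    }

    // New functionality reads in domain terms; nobody has to recall what "left" and "right" mean.
    static List<Signup> signedUpAfter(List<Signup> signups, LocalDate cutoff) {
        return signups.stream()
                .filter(s -> s.getSignupDate().isAfter(cutoff))
                .collect(Collectors.toList());
    }
}
```

The fields are private and final, so code that gets hold of a Signup can read it but cannot overwrite it while mistaking it for something else.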

OTHER TIPS

While there are cases where a Tuple or Pair is appropriate, I consider it a code smell.

As always with this question, it's important to note that some people consider 'code smell' to mean something which is always wrong and needs to be refactored.

I think the correct definition is something that, when you see it, you check to see if it masks or causes some known problem or anti-pattern.

In the case of Tuples, it's a bit of an edge case as their use is pretty rare. But the anti-pattern of using them extensively as a kind of dynamic object rather than coding a class seems to come up often enough that it is worth checking for when you see one.

If I saw your example, I would definitely query why a Signup class wasn't used.

It is a form of anonymous tuple (x, y). Other such anonymous constructs are the now-trendy lambda expression, as in list.stream().filter(e -> e.name.isEmpty()), and indeed method parameters, as in f(12, "ab", x).

In these cases a named construct might be better: new Signup(x, y) or Signup.withX(x).withY(y).build() and .filter(Element::hasNoName) and f(signup).

But no: introducing many small classes, especially with getters and setters, might not smell, but it certainly has a negative influence on development productivity, even in IntelliJ IDEA.

If you have repeated constructs with Pair in a source, it might point to applying a wrong data structure; this would not change with the introduction of a data transfer object / tuple class.

Examples of a "wrong" Pair: a list of (key, value), where the key should be part of the value data structure or a map should be used; and a list of (condition, action), where a list of boolean predicates might be a better (= higher) abstraction.
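For the (key, value) case, a small sketch of the difference might look like this. PriceLookup and its methods are hypothetical names, and Map.Entry stands in for whichever Pair type is in use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example; Map.Entry is used here as a stand-in for a generic Pair.
final class PriceLookup {
    // "Wrong" shape: a list of (key, value) pairs forces a linear scan for every lookup.
    static Integer findInPairs(List<Map.Entry<String, Integer>> pairs, String key) {
        for (Map.Entry<String, Integer> pair : pairs) {
            if (pair.getKey().equals(key)) {
                return pair.getValue();
            }
        }
        return null; // key not present
    }

    // Same data as a map: the structure itself says "look up by key".
    // If a key occurs more than once, the last value wins.
    static Map<String, Integer> toMap(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> byKey = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            byKey.put(pair.getKey(), pair.getValue());
        }
        return byKey;
    }
}
```

Once the data is in a Map, the hand-written search disappears and callers use get(key) directly.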

It depends. If you have an X that comes together with a Y, in other words, if you have a pair, use a pair. If you have an object which consists, just by coincidence, of two items, create a type for that object.

Let’s say you have a car and its price. There’s no point in having a “CarWithPrice” object. Make it a pair. Let’s say you have a teacher, who has a name and teaches a subject. You wouldn’t have a “NameAndSubject” object - however, in this case you would have a Teacher object. Not a pair.

Now, if you are using C++, a pair is just a standard library type (std::pair). In Swift, tuples (pairs, triples, etc.) are one of the five fundamental type categories (class, struct, enum, tuple and closure) that are part of the language's type system. So anyone calling it a “code smell” will be laughed off, and tuples are much easier to use.

Personally, I would use a Signup class, not a Pair (2-tuple), because even if the implementation of Signup is actually a Pair right now, someday it might not be.

A well-designed class is a human-obvious description of what something "is," not merely "how it is (presently ...) constructed." A method like getUserId() plainly indicates what the method does, whereas getLeft() conveys nothing to the reader. It can also return a rigorously defined type, where a generic method like getRight() has no such opportunity.

It also gives you opportunities to use explicit types to detect errors at compile time. When the object code is generated, it will all be about the same, but here you're letting the compiler do a better job for you and keep you clear of more bugs. Java has a very strong type system: plan to use it to maximum advantage.
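A small sketch of the kind of mistake the compiler can catch follows. Everything in it is hypothetical: the minimal Pair class is a stand-in for a library Pair, and Shipment and Greeter are invented to show two values with the same shape but different meanings.

```java
import java.time.LocalDate;

// Minimal stand-in for a library Pair; hypothetical, for illustration only.
final class Pair<L, R> {
    private final L left;
    private final R right;
    Pair(L left, R right) { this.left = left; this.right = right; }
    L getLeft() { return left; }
    R getRight() { return right; }
}

// Two hypothetical domain types: same shape (long, LocalDate), different meaning.
final class Signup {
    private final long userId;
    private final LocalDate signupDate;
    Signup(long userId, LocalDate signupDate) { this.userId = userId; this.signupDate = signupDate; }
    long getUserId() { return userId; }
    LocalDate getSignupDate() { return signupDate; }
}

final class Shipment {
    private final long orderId;
    private final LocalDate shippedOn;
    Shipment(long orderId, LocalDate shippedOn) { this.orderId = orderId; this.shippedOn = shippedOn; }
    long getOrderId() { return orderId; }
    LocalDate getShippedOn() { return shippedOn; }
}

final class Greeter {
    // Accepts any (Long, LocalDate) pair, including one that really describes a shipment.
    static String greetFromPair(Pair<Long, LocalDate> signup) {
        return "Welcome, user " + signup.getLeft();
    }

    // Accepts only a Signup; passing a Shipment here is rejected by the compiler.
    static String greet(Signup signup) {
        return "Welcome, user " + signup.getUserId();
    }
}
```

With these definitions, Greeter.greet(new Shipment(7L, LocalDate.now())) does not compile, while a Pair<Long, LocalDate> built from an order ID and a shipping date passes through greetFromPair unnoticed and greets the order ID as if it were a user.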

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange