Passing around large value objects vs converting to smaller value objects

https://softwareengineering.stackexchange.com/questions/378171

07-02-2021

Question

Let's say I have a project that needs to do the following:

- Make multiple calls to read from the database, where each call is a different query and returns a value object (just getters/setters). Let's say we end up receiving 20 value objects
- Perform some logic with the data from these value objects

I'm trying to brainstorm the best practice for parameters to my methods for the second bullet point given that some value objects might have more fields than what my methods need.

Let's say that I want to accomplish the second bullet point by having many classes that each do a portion of the work. Let's say one class happens to need 7 fields from one value object and 8 fields from another value object and that both value objects each have 30+ fields. I can think of 3 ways to do this:

1) Have my class take in the two value objects it needs as a parameter. The class will then call 15 getters and then do something with the fields. Take this same approach for the rest of my classes where they take in the value objects they need
2) Code up a new value object which only stores the 15 values that I need. Somewhere in my application I will have to do a mapping/conversion from these two value objects to my new value object. My class will call 15 getters and then do something with the fields. Take this same approach for the rest of my classes
3) Code up some huge value object that stores all of the data that I need for the entire second bullet point. We basically take all of our value objects and convert all the data we need to this huge value object. My class will then take in this huge value object. My class will call 15 getters and then do something with the fields. Each class that does a portion of the logic will take in this huge value object

Which is the best approach?

Option 1 seems like the easiest approach. However, it feels weird having a class that asks for more information than it needs. Someone looking at the class might want to know why it asks for 60 fields if it is only going to use 15 of them.
Option 2 seems to be by far the most complicated, as it has many different mappings, but each individual class ends up asking only for what it needs.

Solution

The quest for the best approach is always troublesome in software engineering, because there usually isn't a single best approach. Each approach has trade-offs that make it better or worse in some situations.

One thing that should be high on your list of what to optimize for is readability and maintainability, so that people who haven't seen the code before, or haven't seen it for some time, can easily understand what the code is doing.

One aspect of readability is to have classes that represent a coherent concept. Value objects are not a bunch of properties that get thrown together in a container. Value objects should, just as entities, represent a coherent concept. For example, a street address has several properties, like the street name, city, postal code, etc., but you shouldn't create a value object that consists of the properties for a street address and the color of the house, just because it is convenient for some functions. Most likely, that collection of properties represents multiple concepts and should be broken into multiple value objects.

This brings us to the answer to your question.
If the current value objects you have for option 1 each represent a single, coherent concept, then option 1 is most definitely the option to choose. It is very natural that not all functions need all properties of a value object (especially if it is a rather large one with many properties), and that is not a problem. Seeing only a part of a well-understood concept being used is usually easier to understand than seeing an operation on a random bag of properties.

If the current value objects do not represent single coherent concepts, then I would offer option 4: Create new value objects that each represent a single, coherent concept and base your operations around those.
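As a rough sketch of what "part of a well-understood concept" can look like in code: the types and fields below (`Address`, `Customer`, `ShippingLabelFormatter`) are made up for illustration, and Java 16+ records are assumed.

```java
// Hypothetical value objects; each one models a single coherent concept.
record Address(String street, String city, String postalCode, String countryCode) {}

record Customer(String name, String email, Address billingAddress, Address shippingAddress) {}

// Uses only part of each value object, which is fine: the parameter is a
// well-understood concept rather than an ad hoc bag of fields.
final class ShippingLabelFormatter {
    static String format(Customer customer) {
        Address to = customer.shippingAddress();
        return customer.name() + "\n"
                + to.street() + "\n"
                + to.postalCode() + " " + to.city();
    }
}
```

The formatter ignores the email, the billing address and the country code, yet a reader can still tell at a glance what it works on.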

Other suggestions

1) Have my class take in the two value objects it needs as a parameter. The class will then call 15 getters and then do something with the fields. Take this same approach for the rest of my classes where they take in the value objects they need

This is the anemic domain model approach. It works fine if you never change DBs, tables, or fields, or if you don't mind such changes breaking many things in many places. It also means your application is intimately familiar with DB details.

2) Code up a new value object which only stores the 15 values that I need. Somewhere in my application I will have to do a mapping/conversion from these two value objects to my new value object. My class will call 15 getters and then do something with the fields. Take this same approach for the rest of my classes

This is over-applying the Introduce Parameter Object refactoring. You don't just shove every parameter of some method into one parameter object. You break up and group the method's parameters so that you end up with fewer arguments that each hold a conceptually coherent group. These little data transfer objects can each focus on part of your method's needs. They might even be reusable.

For example, don't refactor this:

void draw(int x, int y, int r, int g, int b)

into this:

void draw(DrawArgs drawArgs)

Refactor it into this:

void draw(Point point, Color color)

Do that even if the DB has only the data and has no idea what points and colors are.
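A minimal sketch of that last point, assuming a flat query result: `ShapeRow`, `ShapeMapper` and `Canvas` are hypothetical names, Java 16+ records are assumed, and `draw` returns a description rather than rendering anything.

```java
// Hypothetical flat row, shaped like the query result rather than the domain.
record ShapeRow(int x, int y, int r, int g, int b) {}

// Small, conceptually coherent parameter objects.
record Point(int x, int y) {}

record Color(int r, int g, int b) {}

final class ShapeMapper {
    // The one place that knows how DB columns map onto domain concepts.
    static Point toPoint(ShapeRow row) {
        return new Point(row.x(), row.y());
    }

    static Color toColor(ShapeRow row) {
        return new Color(row.r(), row.g(), row.b());
    }
}

final class Canvas {
    // Stand-in for real drawing: builds a description instead of rendering.
    static String draw(Point point, Color color) {
        return "draw at (" + point.x() + "," + point.y() + ") in rgb("
                + color.r() + "," + color.g() + "," + color.b() + ")";
    }
}
```

If a column is renamed or moved to another table, only the mapper has to change; `draw` keeps talking in points and colors.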

3) Code up some huge value object that stores all of the data that I need for the entire second bullet point. We basically take all of our value objects and convert all the data we need to this huge value object. My class will then take in this huge value object. My class will call 15 getters and then do something with the fields. Each class that does a portion of the logic will take in this huge value object

This is called a God object. It is what happens when you give up on organizing and just dump everything in one place. It just hides the mess somewhere else.

You have a very complex process that takes value objects belonging to 20 different classes as input. Let's assume that they are well-defined and consistent objects of the application domain.

Your design subdivides the complex processing of the domain objects into smaller tasks that are encapsulated in smaller classes that are easier to manage.

Option 1 seems the most appropriate choice here: you take well-defined, consistent domain objects as input and you only need a part of them. This has significant benefits:

- One day you may find a better way to perform the elementary task by using an additional property of the domain object: you can make the improvement without changing the "natural" interface based on domain objects.
- One day a domain object might evolve and contain more fields to be considered as well: no worry again, no need to change the interfaces of your tasks.
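To make the first benefit concrete, here is a small sketch under made-up assumptions: the `Order` fields, the `FREESHIP` coupon rule and the two method versions are all hypothetical, and Java 16+ records are assumed.

```java
// Hypothetical domain object with more fields than the task first needs.
record Order(String id, long subtotalCents, long shippingCents, String couponCode, String currency) {}

final class TotalCalculator {
    // First version: reads only two of the five fields.
    static long totalV1(Order order) {
        return order.subtotalCents() + order.shippingCents();
    }

    // Later version: also reads couponCode. The parameter list is the same,
    // so no caller has to change. The FREESHIP rule is invented for this sketch.
    static long totalV2(Order order) {
        long total = order.subtotalCents() + order.shippingCents();
        boolean freeShipping = "FREESHIP".equals(order.couponCode());
        return freeShipping ? total - order.shippingCents() : total;
    }
}
```

Had the task taken a trimmed-down object carrying only `subtotalCents` and `shippingCents`, the second version would have required changing that object, its mapping code and every caller.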

All the other choices are less optimal:

- Option 2: You'd create a new intermediary object, hoping to achieve better interface segregation so that the class depends only on what it needs. But in reality, you have created an arbitrary new interface, which does not represent a domain reality and in the end still depends indirectly on the domain classes!
- Option 3: Your huge object simplifies parameter passing to the extreme. But at what cost? You'll lose most of the benefits of encapsulating the work in smaller tasks. You'll find it hard to keep an overview of which classes modify, update, or generate what.

Hmmmm... So you have multiple value-objects (maybe 20 or more according to your post) that have multiple relationships to each other and need to be orchestrated.

This very much sounds like a couple of enterprise integration patterns (EIPs) that compose your end-to-end process. It seems to me that your process needs to accomplish the following operations:

- Message/Data Aggregation: collect all required data from all of the required value objects
- Message Sequencing: organize the inbound value objects into the correct sequence
- Content Enrichment: add/adjust the payload of the value objects based on input

I think that the best tool for what you want to accomplish is a 'Builder pattern' with orchestration, assembly, and enrichment of your VOs. I would use Apache Camel for what you want to do.
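Leaving Camel's own API aside, the builder idea itself can be sketched in plain Java. This is not Camel code: every name below (`CustomerVo`, `OrderVo`, `InvoiceData`, `InvoiceDataBuilder`) and the 20% tax rule are invented for illustration, and Java 16+ records are assumed.

```java
// Hypothetical value objects returned by two separate queries.
record CustomerVo(String name, String countryCode) {}

record OrderVo(String orderId, long amountCents) {}

// The assembled result that downstream logic consumes.
record InvoiceData(String customerName, String orderId, long amountCents, long taxCents) {}

final class InvoiceDataBuilder {
    private CustomerVo customer;
    private OrderVo order;
    private long taxCents;

    // Aggregation: collect the value objects that are needed.
    InvoiceDataBuilder withCustomer(CustomerVo customer) {
        this.customer = customer;
        return this;
    }

    InvoiceDataBuilder withOrder(OrderVo order) {
        this.order = order;
        return this;
    }

    // Enrichment: derive extra data from what has been collected.
    // Sequencing is enforced by refusing to run before both inputs are set.
    InvoiceDataBuilder enrichWithTax() {
        requireInputs();
        // Made-up rule for this sketch: 20% tax for country code "DE", else none.
        taxCents = "DE".equals(customer.countryCode()) ? order.amountCents() * 20 / 100 : 0;
        return this;
    }

    // Assembly: produce the final immutable object.
    InvoiceData build() {
        requireInputs();
        return new InvoiceData(customer.name(), order.orderId(), order.amountCents(), taxCents);
    }

    private void requireInputs() {
        if (customer == null || order == null) {
            throw new IllegalStateException("customer and order must be set first");
        }
    }
}
```

A call chain such as `new InvoiceDataBuilder().withCustomer(c).withOrder(o).enrichWithTax().build()` then reads as the orchestration step, while the classes doing the actual logic only ever see `InvoiceData`.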


Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange