Use a custom value object or a Guid as an entity identifier in a distributed system?

https://softwareengineering.stackexchange.com/questions/239220

03-10-2020

Question

tl;dr

I've been told that in domain-driven design, an identifier for an entity could be a custom value object, i.e. something other than Guid, string, int, etc. Can this really be advisable in a distributed system?

Long version

I will invent a situation analogous to the one I am currently facing.

Say I have a distributed system in which a central concept is an egg. The system allows you to order eggs and see spending reports and inventory-centric data such as quantity on hand, usage, valuation and what have you. There are a variety of services backing these behaviors. And say there is also another app which allows you to compose recipes that link to a particular egg type.

Now egg type is broken down by the species—ostrich, goose, duck, chicken, quail. This is fine and dandy because it means that users don't end up with ostrich eggs when they wanted quail eggs and whatnot. However, we've been getting complaints because jumbo chicken eggs are not even close to equivalent to small ones. The price is different, and they really aren't substitutable in recipes. And here we thought we were doing users a favor by not overwhelming them with too many options.

Currently each of the services (say, OrderSubmitter, EggTypeDefiner, SpendingReportsGenerator, InventoryTracker, RecipeCreator, RecipeTracker, or whatever) is identifying egg types with an industry-standard integer representation of the species (let's call it speciesCode). We realize we've goofed up because this change could affect every service.

There are two basic proposed solutions:

1. Use a predefined identifier type like Guid as the eggTypeID throughout all the services, but make EggTypeDefiner the only service that knows that this maps to a speciesCode and eggSizeCode (and potentially to an isOrganic flag in the future, or whatever).
2. Use an EggTypeID value object which is a combination of speciesCode and eggSizeCode in every service.

I've proposed the first solution because I'm hoping it better encapsulates the definition of what an egg type is in the EggTypeDefiner and will be more resilient to changes, say if some people now want to differentiate eggs by whether or not they are "organic".

The second solution is being suggested by some people who understand DDD better than I do in the hopes that less enrichment and lookup will be necessary that way, with the justification that in DDD using a value object as an ID is fine. Also, they are saying that EggTypeDefiner is not a domain and EggType is not an entity and as such should not have a Guid for an ID.

However, I'm not sure the second solution is viable. This "value object" is going to have to be serialized into JSON and into URLs for GET requests, and used with a variety of technologies (C#, JavaScript...). That breaks encapsulation and strips away any behavior of the identifier value object (for example, rules about whether either of the fields is optional). Is this a case where we want to avoid something that would normally be fine in DDD because we are trying to do DDD in a distributed fashion?

Summary

Can it be a good idea to use a custom value object as an identifier in a distributed system (solution #2)?

Solution

If I were to simplify it a bit to core requirements, it seems that the problem comes down to the following:

How to design the egg type so that it can be extended in the future to hold additional attributes
Whether such extensions can be backward compatible or not (and whether they should be backward compatible or not)

I would think that the backward compatibility problem is going to be the real issue to solve. Let's walk through the example you provided:

Currently each of the services (say, OrderSubmitter, EggTypeDefiner, SpendingReportsGenerator, InventoryTracker, RecipeCreator, RecipeTracker, or whatever) is identifying egg types with an industry-standard integer representation of the species (let's call it speciesCode). We realize we've goofed up because this change could affect every service.

Stepping away from the implementation for a second, let's say the egg type now included a species code and a size code. Now let's say that a new attribute needed to be added, with the assumption that no matter how many attributes are added now, new ones could be added in the future. How should the existing services that are already referencing existing egg types behave?

One approach, which may be quite acceptable in this case, is to make any new attribute 'optional': there is either a default or a "doesn't matter" flag, and existing services continue to operate on the existing set of attributes. This is easier to solve if we assume that attributes will only ever be added, as in your case, and never modified or removed, which is a more involved versioning issue.

If this approach is acceptable, then I would design an object type (typically a class) that holds these various properties, and ensure that serialization and deserialization into various formats is backward compatible.

For example, let's assume the following (assume pseudocode):

class EggType
{
    string speciesCode;
    string sizeCode;

    // Dot-separated form for use in URLs.
    string SerializeForURL()
    {
        return speciesCode + "." + sizeCode;
    }
}

Now, let's say we add a new property called colorCode. Since it is optional, an egg type can be created without providing a colorCode, so the code should handle both the case where colorCode is provided and the case where it is not. Say one of the services that uses this egg type places orders for new eggs from the farms: that service will have to be modified to support colorCode, otherwise new eggs will arrive in random colors regardless of what was ordered. On the other hand, a service that collects data on what is sold can simply not aggregate by colorCode (if its code is not yet updated), and not report it up the management chain, until that change is made. The decision of when to support the new attribute can then rest with each service.
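To make the "each service decides" point concrete, here is a minimal Python sketch of a sales-reporting service that has not been updated yet. The names EggType, aggregate_sales and the sample codes ("chicken", "jumbo", "brown") are made up for illustration; the only assumption is that colorCode is optional and defaults to "not provided".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EggType:
    species_code: str
    size_code: str
    color_code: Optional[str] = None  # optional attribute added later


def aggregate_sales(sold):
    """Count sales per (species, size); color_code is deliberately ignored."""
    return Counter((egg.species_code, egg.size_code) for egg in sold)


# Hypothetical sales: some records carry a color, some do not.
sold = [
    EggType("chicken", "jumbo"),
    EggType("chicken", "jumbo", "brown"),
    EggType("quail", "small", "white"),
]
totals = aggregate_sales(sold)
```

Both jumbo chicken records land in the same bucket, so this service keeps working unchanged until someone decides it should report by color.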

We could then modify SerializeForURL() to be something like this:

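string SerializeForURL()
{
    // Omit the optional colorCode so two-part URLs stay valid.
    if (colorCode == null)
        return speciesCode + "." + sizeCode;
    return speciesCode + "." + sizeCode + "." + colorCode;
}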

In the deserialization method, we could check whether there are two attributes or three. If two, colorCode is not provided, but if three, then colorCode is indeed provided.
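The two-or-three check could look roughly like the following Python sketch. The method names serialize_for_url and parse_from_url are my own, and it assumes that none of the codes contains a "." itself; if they can, a different separator or escaping would be needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EggType:
    species_code: str
    size_code: str
    color_code: Optional[str] = None

    def serialize_for_url(self) -> str:
        # Emit the third segment only when a color was provided.
        parts = [self.species_code, self.size_code]
        if self.color_code is not None:
            parts.append(self.color_code)
        return ".".join(parts)

    @classmethod
    def parse_from_url(cls, text: str) -> "EggType":
        parts = text.split(".")
        if len(parts) == 2:  # older form: no color_code
            return cls(parts[0], parts[1])
        if len(parts) == 3:  # newer form: color_code present
            return cls(parts[0], parts[1], parts[2])
        raise ValueError(f"expected 2 or 3 dot-separated parts, got {len(parts)}")
```

An identifier written by an older service (two segments) and one written by an updated service (three segments) both parse into the same type, and serializing an egg type without a color reproduces the older two-segment form.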

It is easier in the case of JSON, because the value for colorCode would simply be null (or the field would be missing) when it is not provided.
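As a rough illustration of the JSON case, the sketch below uses Python's standard json module. The function name egg_type_from_json and the field names in the payloads are assumptions for the example, not part of any existing API.

```python
import json


def egg_type_from_json(text):
    """Read an egg type, treating a missing or null colorCode as 'not provided'."""
    data = json.loads(text)
    return {
        "speciesCode": data["speciesCode"],
        "sizeCode": data["sizeCode"],
        "colorCode": data.get("colorCode"),  # None when absent or null
    }


# Hypothetical payloads from an older and a newer service.
old_payload = '{"speciesCode": "chicken", "sizeCode": "jumbo"}'
new_payload = '{"speciesCode": "chicken", "sizeCode": "jumbo", "colorCode": "brown"}'

old_egg = egg_type_from_json(old_payload)
new_egg = egg_type_from_json(new_payload)
```

Because fields are named rather than positional, no counting of segments is needed; the reader only has to tolerate the field being absent.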

I would prefer this approach over GUIDs any day, for multiple reasons:

GUIDs make it harder to manage change in the egg type: they do not allow these attributes to be isolated from each other, even though they are really independent. (i.e. if you modify the GUID, every service needs to understand the new one, even though all the change might do is add a property, such as colorCode, which many services may not care about at all.)
GUIDs make every system dependent on a global list of GUIDs and how they translate into egg types, which is unnecessary coupling in my opinion.
A readable value object is easier to debug: programmers are also users of the code, and they need to be able to easily read code as well as trace logs, audit data, and so on.

Licensed under: CC-BY-SA with attribution
