The problem of letting the receiver instead of the sender generate an ID

https://softwareengineering.stackexchange.com/questions/418708

18-03-2021

Problem

Lately I've encountered a similar problem in a few different circumstances, and every time it has required quite a bit of extra code to work around it.

The problem I'm talking about is the one where one application or device needs to send something to another application or device (usually a server). The thing being sent can be identified by a unique ID, but this ID is not known at the time of sending. It is instead generated or calculated by the receiver. Sort of a chicken-and-the-egg type of situation.

A common example is inserting a record into a database, and not knowing the ID of the newly inserted record until it has been inserted. My latest encounter was a situation where a mobile app needed to send an image and an accompanying textfile to a server, but the textfile couldn't be sent until the image had been received and an ID returned and inserted into the textfile. Pretty straightforward, until you start taking mobile network error handling into account...

This has cropped up in various forms lately, and I've noticed that usually it requires a lot more complexity to let the receiver generate an ID than would be required if I could just let the sender generate a UUID/GUID/whatever and send it along with whatever I'm sending.

So I'm curious - does this problem have a common name? And am I right in thinking that the best/least complex approach usually is to let the sender set the ID instead of the receiver?

Solution

This is a common problem in distributed systems. The issue is that you have two competing requirements:

the server wants a globally unique ID (and is in a position to enforce it)
the client wants a client-unique ID (so it isn't affected by other clients)

(I'm assuming you have a single server - it's a little more complicated if the servers are distributed/sharded/whatever as well).

A common solution is to therefore ... just use two IDs:

1. The client sends a request with a request ID it assigns itself. This request ID must be trackable by the client, but doesn't need to be globally unique. The request state is "in-flight" and it has just a local ID.

2. The server assigns a globally unique ID and sends the client an ack with both IDs (which serves as a mapping). Now both parties know the global ID, and everything can use that. The request state is "acked" and it has a global ID.

If the client needs to stream updates without waiting for the global ID ack, it can always keep using the local ID, and the server can map this internally with a (request-ID, requestor-ID) tuple.
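
The two-ID handshake described above can be sketched roughly as follows. This is a minimal in-memory illustration rather than a definitive implementation: the `Server` and `Client` classes, their method names, and the use of simple counters for both kinds of ID are assumptions made for the example.

```python
import itertools


class Server:
    """Hypothetical receiver that owns the global ID namespace."""

    def __init__(self):
        self._next_global_id = itertools.count(1)
        # Maps (requestor ID, request ID) -> global ID.
        self._by_local_key = {}

    def handle(self, requestor_id, request_id, payload):
        key = (requestor_id, request_id)
        # A retried request gets the same global ID back instead of a new one.
        if key not in self._by_local_key:
            self._by_local_key[key] = next(self._next_global_id)
        # The ack carries both IDs, which serves as the mapping.
        return {"request_id": request_id, "global_id": self._by_local_key[key]}


class Client:
    """Hypothetical sender that tracks requests by a locally unique ID."""

    def __init__(self, requestor_id):
        self.requestor_id = requestor_id
        self._next_request_id = itertools.count(1)
        self.in_flight = {}  # request ID -> payload, no global ID yet
        self.acked = {}      # request ID -> global ID

    def send(self, server, payload):
        request_id = next(self._next_request_id)
        self.in_flight[request_id] = payload
        ack = server.handle(self.requestor_id, request_id, payload)
        self.on_ack(ack)
        return request_id

    def on_ack(self, ack):
        # Move the request from "in-flight" to "acked".
        self.in_flight.pop(ack["request_id"], None)
        self.acked[ack["request_id"]] = ack["global_id"]


server = Server()
alice, bob = Client("alice"), Client("bob")
a1 = alice.send(server, "image bytes")
b1 = bob.send(server, "other image")
```

Both clients end up using local request ID 1, which is fine because the server keys its mapping on the (requestor ID, request ID) pair and hands each request a distinct global ID.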

Other tips

Naming

That would be the problem's common name.

What you need to do is flip the problem on its head. How do I refer to your dog before you have told me your dog's name? By the way, I've decided to nickname the dog wuffles. (Hint: one way is "your dog", the other is "wuffles".)

It boils down to either a mechanical description, kind of like an address, or to name it myself and introduce it to you.

On the receiving side, no one says you have to use the same name, or that you cannot augment the name mechanically or randomly. Following the dog example, on receipt you can translate "your dog" or "wuffles" to the dog's actual name, or to "Kain's your dog", or to "User 12345's wuffles".

So I'm curious - does this problem have a common name?

The problem is one of namespace ownership.

And am I right in thinking that the best/least complex approach usually is to let the sender set the ID instead of the receiver?

The approach I typically see is

The sender sets the id of the message (a GUID), sometimes referred to as a "trace ID" or "correlation ID." The primary purposes are deduplication, fault tolerance, debug tracing, and audit logging/non-repudiation, and occasionally as a nonce or salt.
The sender sets (or provides) any ID that serves as a natural key for an entity that needs to be persisted, such as an email address that uniquely identifies a user. However, the receiver still validates that it is unique.
The server sets any key that serves as a surrogate key, such as a confirmation code, that represents an entity that has been persisted in a data store for which the server owns the namespace. Usually the receiver is the only component that can guarantee that something is successfully persisted.
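
As a rough sketch of how these three kinds of ID can coexist, the example below uses a sender-generated message ID for deduplication, a sender-supplied email address as the natural key, and a receiver-issued confirmation code as the surrogate key. The `Receiver` class, its `register` method, and the starting code value are assumptions made for illustration only.

```python
import uuid


class Receiver:
    """Hypothetical receiver that deduplicates on the sender's message ID."""

    def __init__(self):
        self._seen = {}       # message ID -> confirmation code already issued
        self._emails = set()  # natural keys the receiver has accepted
        self._next_code = 1000

    def register(self, message_id, email):
        # A retry with the same message ID returns the earlier result.
        if message_id in self._seen:
            return self._seen[message_id]
        # The sender supplied the natural key, but the receiver checks uniqueness.
        if email in self._emails:
            raise ValueError(f"{email} is already registered")
        self._emails.add(email)
        # The surrogate key is issued only once the entity is stored.
        code = self._next_code
        self._next_code += 1
        self._seen[message_id] = code
        return code


receiver = Receiver()
message_id = str(uuid.uuid4())  # set by the sender before the first attempt
first = receiver.register(message_id, "kain@example.com")
retry = receiver.register(message_id, "kain@example.com")  # e.g. after a timeout
```

Because the retry carries the same message ID, the receiver treats it as a duplicate and returns the confirmation code it already issued instead of rejecting the email as taken.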

My latest encounter was a situation where a mobile app needed to send an image and an accompanying textfile to a server, but the textfile couldn't be sent until the image had been received and an ID returned and inserted into the textfile.

It would've been perfectly possible for the backend to receive both the image and the textfile in a single request, insert the image, update the textfile with the generated ID, and then insert the textfile.
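
A minimal sketch of that single-request flow, assuming in-memory dictionaries in place of a real data store and a hypothetical `{image_id}` placeholder that the sender leaves in the textfile for the backend to fill in:

```python
import itertools

_image_ids = itertools.count(1)
images = {}      # image ID -> image bytes
text_files = {}  # image ID -> textfile contents


def upload(image_bytes, text_template):
    """Hypothetical single-request handler; the receiver generates the ID."""
    image_id = next(_image_ids)
    # Insert the image first so that its ID exists.
    images[image_id] = image_bytes
    # Update the textfile with the ID that was just generated.
    text = text_template.replace("{image_id}", str(image_id))
    # Insert the updated textfile.
    text_files[image_id] = text
    return image_id


new_id = upload(b"\x89PNG...", "caption for image {image_id}")
```

The sender makes one call and never needs to know the ID in advance, at the cost of letting the backend modify the textfile.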

If you don't want the backend to update the textfile, you've made an arbitrary decision that effectively makes it impossible to post the image and textfile in one request while having the receiver generate the ID.

There's nothing wrong with making that decision, but you can't then generalize your observation to software development as a whole, which is what your question starts doing.

until you start taking mobile network error handling into account

You are overgeneralizing a very niche problem here.

One major design decision that comes from mobile connectivity, that you don't usually find in other applications, is the ability to work offline when the network is down.

For your average enterprise application, if the network is down everyone stops working on the system. No one can do any work anymore (i.e. updating the state of the application) until the network connection has resumed. But in a mobile context, you specifically want your mobile application to continue working, queue its updates, and send those out when the connection has been restored.

This leads to your mobile app needing the ability to both remember and reference a completely new entity that it just created but that has not been saved in the database yet (e.g. because the network is down). That reference becomes its ID.

And while it is technically possible to generate a temporary ID and later have the backend replace it with one the backend generated itself, it becomes quite difficult to process the queued updates if those updates might still reference the temporary ID that you just changed into a different permanent one. It can be done, but it's really not worth the effort.

When you reach that point, then I agree that having the mobile app generate its own identifiers makes a lot of sense, provided it uses a non-sequential data type like a GUID. Most definitely not ints, because multiple mobile apps are likely to reuse the same int values while offline.
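
To illustrate why client-generated GUIDs keep the offline queue simple, here is a small sketch. The `OfflineQueue` class, its method names, and the dictionary standing in for the backend are assumptions made for the example, not a prescribed design.

```python
import uuid


class OfflineQueue:
    """Hypothetical client-side queue of updates made while offline."""

    def __init__(self):
        self.pending = []

    def create_note(self, text):
        note_id = str(uuid.uuid4())  # assigned on the device, never rewritten
        self.pending.append({"op": "create", "id": note_id, "text": text})
        return note_id

    def edit_note(self, note_id, text):
        # Later updates reference the same ID the create used.
        self.pending.append({"op": "edit", "id": note_id, "text": text})

    def flush(self, store):
        # Replay the queued updates in order once the connection is back.
        for update in self.pending:
            store[update["id"]] = update["text"]
        self.pending.clear()


server_store = {}
phone, tablet = OfflineQueue(), OfflineQueue()
phone_note = phone.create_note("draft")
phone.edit_note(phone_note, "final")
tablet_note = tablet.create_note("shopping list")
phone.flush(server_store)
tablet.flush(server_store)
```

Since the ID never changes between creation and sync, the queued edit can be replayed as-is, and two devices working offline do not need to coordinate to avoid clashing with each other.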

I've noticed that usually it requires a lot more complexity to let the receiver generate an ID

And am I right in thinking that the best/least complex approach usually is to let the sender set the ID instead of the receiver?

In a mobile context, it is indeed the easier option. In a general context, it isn't.

It would've helped if you posted an example other than in a mobile context. In my experience, with the exception of needing to work offline during network outages, it is vastly easier to delegate the generation of IDs to the database server.

License: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange