Question

Among RPC semantics, Erlang offers "hope for the best" (best-effort) delivery, Sun RPC offers at-least-once, and Java RMI offers at-most-once, but none of them offers exactly-once semantics.

Why does it seem infeasible to have exactly-once semantics?

For example, suppose the client keeps resending a uniquely tagged request until a reply is received, and the server keeps track of all handled requests so that it never executes a duplicate. Would that not be exactly once?
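A minimal in-process sketch of that scheme (hypothetical names; the retry loop stands in for the network) looks like this; the comment marks the crash window the answer below turns on:

```python
import uuid

class Server:
    """Hypothetical server that deduplicates by request ID."""
    def __init__(self):
        self.handled = {}                     # request_id -> cached reply

    def handle(self, request_id, operation):
        if request_id in self.handled:        # duplicate: replay cached reply
            return self.handled[request_id]
        result = operation()                  # carry out the request...
        # ...a crash between these two lines is the problem case below
        self.handled[request_id] = result     # ...then record it as handled
        return result

def client_call(server, operation, retries=5):
    request_id = uuid.uuid4()                 # unique tag, reused on resends
    for _ in range(retries):
        try:
            return server.handle(request_id, operation)
        except ConnectionError:
            continue                          # resend the same tagged request
    raise TimeoutError("no reply received; outcome unknown")
```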


Solution

Consider what happens if the server crashes between carrying out the request and recording that it has carried it out.

You can get at-most-once by recording the request, then carrying it out. If you crash between the two, then you've (erroneously) recorded it as carried out, so you won't do it again. Hence at-most-once.
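As an illustration (my own sketch, not code from the answer), record-then-execute looks like this:

```python
def handle_at_most_once(request_id, operation, handled_log):
    """Record first, execute second: at-most-once."""
    if request_id in handled_log:
        return                   # recorded as done, so never redone
    handled_log.add(request_id)  # 1. record the request
    # A crash here leaves the request recorded but never executed;
    # the check above rejects any retry, so it runs zero times.
    operation()                  # 2. carry it out
```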

Bizarrely, this scheme (with timeouts) is patented: http://www.freepatentsonline.com/7162512.html. Except, as I argue above, it doesn't guarantee exactly-once.

You get at-least-once by carrying it out, then recording it. If you crash between the two, you'll carry it out again if the request is repeated.
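The mirror-image sketch, execute-then-record:

```python
def handle_at_least_once(request_id, operation, handled_log):
    """Execute first, record second: at-least-once."""
    if request_id in handled_log:
        return                   # already executed and recorded
    operation()                  # 1. carry it out
    # A crash here leaves the work done but unrecorded, so a resent
    # request passes the check above and the operation runs again.
    handled_log.add(request_id)  # 2. record the request
```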

But it's not really feasible to guarantee "exactly once" in all circumstances.

(There are similar scenarios for network errors rather than server crashes.)

OTHER TIPS

High-end messaging buses, like IBM's WebSphere MQ, do purport to offer exactly-once delivery. In fact, this is the default behaviour (as of the last time I used WMQ...). They achieve this with write-ahead logs and a variety of locking techniques.

Of course, I don't doubt that buried somewhere in their legal documents "exactly once" is actually defined to mean "the message may or may not be delivered, once, more than once, or lots, or fewer than zero" in order to cover their backs, but it does work in the vast majority of cases, including kicking out power cables, taking axes to network infrastructure, etc.
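The underlying trick, committing the effect and the "handled" record atomically, can be sketched with any transactional store. This is only an illustration of the general idea (SQLite here, with a hypothetical account table), not WebSphere MQ's actual internals:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE handled (id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE account (balance INTEGER)")
conn.execute("INSERT INTO account VALUES (0)")

def handle_transactionally(request_id, amount):
    # One atomic transaction: after a crash, either both the effect and
    # the dedup record survive, or neither does -- no in-between state.
    with conn:
        if conn.execute("SELECT 1 FROM handled WHERE id = ?",
                        (request_id,)).fetchone():
            return                                   # duplicate: do nothing
        conn.execute("UPDATE account SET balance = balance + ?", (amount,))
        conn.execute("INSERT INTO handled VALUES (?)", (request_id,))

handle_transactionally("req-1", 10)
handle_transactionally("req-1", 10)                  # duplicate, ignored
print(conn.execute("SELECT balance FROM account").fetchone())  # (10,)
```

That buys exactly-once processing at the store; the network leg between client and server still has the trade-offs described above.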

I think the answer is that you'd need an indefinite amount of time to get those semantics, because the client would have to wait for a definitive result from the server, which may never come. That requirement is impractical on real networks.

If the client ever gives up trying (or if the server goes down for a prolonged period, either before completing the transaction or before signalling that it is complete, depending on what order it does those things), then there may be no way for the client to know whether the request was received and handled. In practice, RPC systems may want to respect default TCP timeouts, for example, so they do not want to wait indefinitely for a definitive success or failure from the server.

That's a guess though: I have never designed an RPC protocol.
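A small sketch of that ambiguity from the client's side (a hypothetical one-shot exchange over TCP; any transport shows the same problem):

```python
import socket

def rpc_call(host, port, payload, timeout=5.0):
    """Send one request and wait a bounded time for the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            return sock.recv(4096)   # a reply is a definitive result
        except socket.timeout:
            # The request may or may not have been handled: the client
            # cannot tell a lost request from a lost reply.
            raise TimeoutError("gave up waiting; outcome unknown")
```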

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow