A Client Walks Into a Server And Asks "What's New?" – Problems With Sequence Numbers

Question 1

One thing you can do is maintain a heap of sequence numbers that are in use as well as the "next sequence number to assign". Try the following:

When you grab a sequence number, put it in an "in-use" map.
When you are done making changes with that sequence number, remove it from the "in-use" std::set
Keep track of the minimum in the set. Whenever the minimum changes from "x" to "y", have Client C request values from x to y, but no greater than y.

So, in your example, when you update the sequence to 2, 1 is put in the in-use set. Then, when you update to 3, 2 is put in there, and the set contains 1 and 2. When the work for 2 is done, 2 is removed from the set, but client C does not pick up any changes because the minimum, 1, is unchanged. When client A is done with 1, the minumum changes from 1 to 3, and client C may read the changes from 1 to 3.

For a more complex example, suppose you have 6 clients use sequence numbers 11, 12, 13, 14, 15, and 16, yet finish in the following order: 12, 13, 11, 15, 14, 16 (which is the order they are removed from the "in-use" set. In this example, after 11 is gone, client C may read 11 through 13, because the minumum changes from 11 to 14. Then, after 14 is gone, client C may read 14 and 15, as the minimum changes from 14 to 16. Then, when 16 is gone, client C can read 16.

This is basically the algorithm we use in TokuMX replication that decides what oplog entries may be replicated to secondaries. Clients A and B would be threads doing writes to the oplog and Client C would be a tailable cursor from the secondary pulling oplog data.

Question 2

This is clarifying and providing some examples of Zardosht Kasheff's correct answer above.

At the start of an update we add the sequence number to an in-use list
Any client requesting “what’s new” can’t ask for anything >= the lowest number on the list.
Thus if a couple clients were updating with sequences 10, 11, 12, a client would ask for everything from > sequenceOfLastItemReceivedByClient and < 10 (the lowest number on the in use list.)

Here's my original example solved:

Sequence starts at 1
Client A starts update. Uses sequence 2.  In use table now has “2"
Client B starts update. Uses sequence 3. In use table now has “2,3"
Client B updates rows with sequence 3. In use table now has “2"
Client C wants to request >1 .  However, min in-use table is 2.  So it requests >1 and <2 which is none.
Client A updates rows with sequence 2. In use table now empty.
Client C wants to request changes >1. Gets A and B.

And Zardosht's example

Clients start updates with sequences 11, 12, 13, 14, 15, and 16. In use table is "11, 12, 13, 14, 15, 16”
Updates 12, 13, 11 finish.  In use table is “14, 15, 16”
Client C wants to request >1.  Is allowed >1 and <14.  Gets sequences 11, 12, 13
Updates 15, 14, 16 finish.  In use table is empty.
Client C requests >13. Gets 15, 14, 16.

And a final example:

Clients start updates with sequences 11, 12, 13, 14, 15, and 16. In use table is "11, 12, 13, 14, 15, 16”
Updates 12, 16, 11 finish.  In use table is “13, 14, 15”
Client C wants to request >1.  Is allowed >1 and <13.  Gets sequences 11, 12
Updates 13, 14, 15 finish.  In use table is empty.
Client C requests >12. Gets 13, 14, 15, 16

All above examples pass. This seems to be a workable solution.