Question

I'm working on an application that basically needs to store a Map<String,Set<String>> (well, it's much more complicated than that, but that's the basic idea), and I plan to be doing a lot of:

Set<String> strings = storeClient.get("some key");
strings.add("some string");
storeClient.put("some key", strings);

So what I'm trying to understand is: when would StoreClient#put create an inconsistency that would be resolved by the InconsistencyResolver, and when would StoreClient#put just clobber the value?


Solution

Disclaimer: I have not used Voldemort in a long time, and I now work at Basho on Riak. That said, I thought this would be an easy question to answer with citations, but the lack of real documentation (and the difficulty of framing Google searches that don't return things about Harry Potter) actually presents a real challenge. You pose a very good question. I believe the below to be correct.

Since you're talking about the version of put() where you are not sending a version (vector clock) and don't care whether anything is currently in the DB, it is basically just going to overwrite whatever (if anything) is there.

Their architecture has the concept of a master (coordinating) node for any given (hashed) key. Writes always go to that node first, before being replicated to the other nodes on the ring, which allows it to overwrite/purge any previous version of a value. I'm guessing they do this comparison as a CAS or otherwise protected (via locks) operation to prevent concurrency issues. With a BerkeleyDB backend, it's very likely they're simply using its built-in transaction/locking mechanisms. Given that, you should rarely encounter conflicting values/versions that the client needs to resolve.

However, in this post Jay Kreps states:

... concurrent versions occur when different clients (or request routers) disagree on whether or not a particular server is available. In the common case this will not occur--each key has a master server and we always write to that server first which allows us to immediately garbage collect any old versions. However in the case where one writer believes the master is down and another believes it is up, it is possible for these two servers to accept conflicting writes. It is necessary that the storage engine have the ability to retain both of these versions until a client can resolve them.
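
To make the quoted scenario concrete, here is a minimal sketch of how two writes end up as concurrent versions. It is not Voldemort's own code: ClockSketch, Order and compare are hypothetical names, and the clock is just a nodeId -> counter map. I'm assuming both writers start from the same version {n1=1}, one writes through the master n1 and the other, believing n1 is down, writes through n2.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified vector clock comparison; not Voldemort's own class.
class ClockSketch {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    // Compare two clocks given as nodeId -> counter maps (missing node means 0).
    static Order compare(Map<String, Integer> a, Map<String, Integer> b) {
        boolean aBigger = false;
        boolean bBigger = false;
        Set<String> nodes = new HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        for (String node : nodes) {
            int ca = a.getOrDefault(node, 0);
            int cb = b.getOrDefault(node, 0);
            if (ca > cb) aBigger = true;
            if (cb > ca) bBigger = true;
        }
        if (aBigger && bBigger) return Order.CONCURRENT; // neither descends from the other
        if (aBigger) return Order.AFTER;
        if (bBigger) return Order.BEFORE;
        return Order.EQUAL;
    }

    public static void main(String[] args) {
        // Writer A went through the master n1; writer B thought n1 was down and used n2.
        Map<String, Integer> viaMaster = Map.of("n1", 2);
        Map<String, Integer> viaFallback = Map.of("n1", 1, "n2", 1);
        System.out.println(compare(viaMaster, viaFallback)); // prints CONCURRENT
    }
}
```

When the result is AFTER or BEFORE, the older version can simply be dropped. Only the CONCURRENT case leaves two values that something has to reconcile.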

That's where the InconsistencyResolver comes in.
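
For a Set<String> value like yours, one plausible resolution strategy is to union the conflicting sets. The sketch below is only an illustration under assumptions: Resolver is a hypothetical stand-in interface that I've shaped like a "take the conflicting items, return the survivors" callback, and UnionResolver is my own name. Check the real InconsistencyResolver signature before adapting it.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class UnionResolverSketch {
    // Hypothetical stand-in for a conflict-resolution callback (an assumption,
    // not copied from Voldemort).
    interface Resolver<T> {
        List<T> resolveConflicts(List<T> items);
    }

    // Merges concurrent versions of a set by taking their union.
    static class UnionResolver implements Resolver<Set<String>> {
        @Override
        public List<Set<String>> resolveConflicts(List<Set<String>> items) {
            if (items.size() <= 1) {
                return items; // nothing to resolve
            }
            Set<String> merged = new TreeSet<>();
            for (Set<String> version : items) {
                merged.addAll(version);
            }
            return Collections.singletonList(merged);
        }
    }

    public static void main(String[] args) {
        List<Set<String>> conflicting = List.of(Set.of("a", "b"), Set.of("a", "c"));
        System.out.println(new UnionResolver().resolveConflicts(conflicting));
    }
}
```

A union never loses an add, but it also cannot express a removal: an element deleted in one version comes back if the other version still has it. If you need removals, the value would have to carry more information than a plain set.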

When using the version of put() where you're also sending the version from a previous get, and that version is no longer current, the (master) server will return an indicator that the version is stale and the client will throw an ObsoleteVersionException. Again though, in the case of failed/recovered nodes, it's possible that concurrent versions could be in the cluster, and only the client can resolve them via the InconsistencyResolver.
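
The usual way to live with a stale-version rejection is a read-modify-write loop that re-reads and retries. Here is a rough, self-contained sketch of that pattern. Everything in it is a stand-in I made up for illustration: FakeStore is an in-memory, single-node store using integer versions instead of vector clocks, and StaleVersionException plays the role of ObsoleteVersionException.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class VersionedPutSketch {
    // Hypothetical stand-in for the exception thrown on a stale version.
    static class StaleVersionException extends RuntimeException {}

    // A value together with the version it was read at.
    static class Versioned {
        final Set<String> value;
        final int version;
        Versioned(Set<String> value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    // In-memory stand-in for a store; integer versions instead of vector clocks.
    static class FakeStore {
        private final Map<String, Versioned> data = new HashMap<>();

        synchronized Versioned get(String key) {
            Versioned v = data.get(key);
            return v == null
                    ? new Versioned(new HashSet<>(), 0)
                    : new Versioned(new HashSet<>(v.value), v.version);
        }

        // Accept the write only if it was based on the latest version.
        synchronized void put(String key, Versioned update) {
            int current = data.containsKey(key) ? data.get(key).version : 0;
            if (update.version != current) {
                throw new StaleVersionException();
            }
            data.put(key, new Versioned(new HashSet<>(update.value), current + 1));
        }
    }

    // Read, modify, write; on a stale version, re-read and try again.
    static boolean addWithRetry(FakeStore store, String key, String item, int maxTries) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            Versioned read = store.get(key);
            read.value.add(item);
            try {
                store.put(key, read);
                return true;
            } catch (StaleVersionException e) {
                // Someone else wrote in between; loop to re-read the fresh value.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        addWithRetry(store, "some key", "some string", 3);
        System.out.println(store.get("some key").value);
    }
}
```

Compared with the unversioned put in your question, this never silently clobbers another client's add: a write based on an old read is rejected and redone against the current value.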

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow