Question

I'm working on implementing a rollback operation in HBase. My component is fed all the information needed to do a put (actually there are hundreds of such puts): table, timestamp (might be null), family, qualifier, value. It buffers them, then calls HTable.put() in a batch. Since the data is not pre-verified, any put might fail.

I'm trying to implement a way to roll back what was already done before the failing put().

As I see it, there are 3 ways to roll back a put:

  1. Delete the new item (if no such item existed before)
  2. Do nothing (if exactly the same item, including timestamp, existed before)
  3. Execute another Put (if the new Put changed some data in the old row). NB: I know that in HBase there is no way to change data in place. By 'changed' I mean that new data was written to the same row/timestamp/family/qualifier and the old data was discarded, since in my setup HBase is instructed to keep only one version of an item.
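The three cases above can be sketched as a small decision function. This is only an illustration and not HBase API: `RollbackPlanner`, `Action`, and `plan` are hypothetical names, and the sketch assumes the previous value of the cell at the same row/timestamp/family/qualifier was read before the put was applied, with `null` standing for "no such item existed".

```java
import java.util.Arrays;

final class RollbackPlanner {
    /** Hypothetical labels for the three rollback cases. */
    enum Action { DELETE, NOTHING, RESTORE }

    /**
     * Decides how to undo one put.
     *
     * @param before  value stored in the cell before the put, or null if the cell was absent
     * @param written value the put wrote into the cell
     */
    static Action plan(byte[] before, byte[] written) {
        if (before == null) {
            return Action.DELETE;   // case 1: the put created the item
        }
        if (Arrays.equals(before, written)) {
            return Action.NOTHING;  // case 2: identical item already existed
        }
        return Action.RESTORE;      // case 3: write the old value back
    }

    public static void main(String[] args) {
        byte[] oldValue = {1, 2};
        byte[] newValue = {3, 4};
        System.out.println(plan(null, newValue));
        System.out.println(plan(newValue.clone(), newValue));
        System.out.println(plan(oldValue, newValue));
    }
}
```

With a function like this, the remaining cost is reading the previous values, which is where a batched read would matter.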

So the question is: how do I distinguish between these 3 cases? Of course it is a matter of querying HBase for the particular item, but doing a plain get/scan for a few hundred items seems not very efficient to me.

So I'm looking for some way to do a batch get/scan on HBase.

Solution

I'd use something like Apache BookKeeper to store the transaction log, and then read the ledger to perform the rollbacks using HBase checkAndPut.
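The idea can be sketched in memory as follows. None of this is BookKeeper or HBase API: `UndoLogSketch` and its methods are hypothetical, a `HashMap` stands in for the table, and a `Deque` stands in for the ledger. The conditional step in `rollback()` models what checkAndPut would provide, namely that an undo is applied only if the cell still holds the value this batch wrote.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class UndoLogSketch {
    /** One log entry: cell key, value before the put (null if absent), value written. */
    static final class Entry {
        final String cell, before, written;
        Entry(String cell, String before, String written) {
            this.cell = cell; this.before = before; this.written = written;
        }
    }

    private final Map<String, String> store = new HashMap<>(); // stand-in for the table
    private final Deque<Entry> log = new ArrayDeque<>();       // stand-in for the ledger

    /** Records the before-image in the log, then applies the write. */
    void put(String cell, String value) {
        if (value == null) {
            throw new IllegalArgumentException("invalid put for " + cell);
        }
        log.push(new Entry(cell, store.get(cell), value));
        store.put(cell, value);
    }

    /** Marks the batch as successful; nothing is left to undo. */
    void commit() { log.clear(); }

    /** Replays the log newest-first, undoing a write only if the cell still holds it. */
    void rollback() {
        while (!log.isEmpty()) {
            Entry e = log.pop();
            if (!Objects.equals(store.get(e.cell), e.written)) {
                continue; // cell changed since our write; leave it alone
            }
            if (e.before == null) {
                store.remove(e.cell);        // item did not exist before: delete it
            } else {
                store.put(e.cell, e.before); // restore the previous value
            }
        }
    }

    String get(String cell) { return store.get(cell); }

    public static void main(String[] args) {
        UndoLogSketch t = new UndoLogSketch();
        t.put("row1/cf:q", "old");
        t.commit();
        try {
            t.put("row1/cf:q", "new");
            t.put("row2/cf:q", "x");
            t.put("row3/cf:q", null); // simulated failing put
        } catch (IllegalArgumentException ex) {
            t.rollback();
        }
        System.out.println(t.get("row1/cf:q") + " " + t.get("row2/cf:q"));
    }
}
```

Because the before-image is captured when each put is logged, the rollback does not need a separate read pass to tell the three cases apart.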

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow