HBase internals: How is rowkey ordering preserved between StoreFiles?

https://stackoverflow.com/questions/23675118

hbase

23-07-2023

Question

In HBase, how is rowkey ordering preserved between StoreFiles?
(Is rowkey ordering preserved between StoreFiles at all?)

This is my understanding of the inner workings (probably flawed...):
When the MemStore becomes too big, it is flushed and a new StoreFile is created.
Data in the MemStore is ordered by rowkey (and hence also in the StoreFile).

e.g. after 2 flushes we could have:

StoreFile 1:
key1 ...
key3 ...
key4 ...

StoreFile 2:
key2 ...
key5 ...
key6 ...

but what we really want (?) for fast retrieval is:

StoreFile 1:
key1 ...
key2 ...
key3 ...

StoreFile 2:
key4 ...
key5 ...
key6 ...

Potential performance problems if rowkey ordering is not preserved between StoreFiles (see example):
- to get the data associated with a rowkey, we must do a (binary?) search in each StoreFile...
- a region split would also be much more work.

(Context: I am trying to optimize, and understand, a test HBase cluster at work.)

Thanks in advance for your help!

Solution

Rowkey order is preserved only within a single StoreFile, not between StoreFiles.

For a Get, no separate binary search through the data of each StoreFile is required, because:

1) HFiles have B-tree-like block indexes, so a reader can seek directly to the block that may contain a given key.

2) A heap (PriorityQueue) of StoreFile readers is created when reading from multiple StoreFiles. The readers in the heap are compared by their current KeyValue, and we always read from the reader whose current KeyValue is "smallest" in sort order. (Optimizations like lazy seek make things a bit more complicated, though.)

See org.apache.hadoop.hbase.regionserver.KeyValueHeap for more.
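
To make the heap idea concrete, here is a minimal sketch in plain Java. It is not the actual KeyValueHeap code: the FileCursor class and the mergedScan method are hypothetical names invented for this illustration, keys are plain strings rather than KeyValues, and lazy seek, versions and deletes are ignored. It only shows how reading from whichever cursor currently has the smallest key yields a globally sorted stream from files that are each sorted internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedFileMerge {

    /** Hypothetical stand-in for a StoreFile reader: a cursor over one sorted file. */
    static final class FileCursor implements Comparable<FileCursor> {
        private final Iterator<String> rest;
        private String current; // null once the file is exhausted

        FileCursor(List<String> sortedKeys) {
            this.rest = sortedKeys.iterator();
            this.current = rest.hasNext() ? rest.next() : null;
        }

        boolean exhausted() { return current == null; }

        String current() { return current; }

        void advance() { current = rest.hasNext() ? rest.next() : null; }

        // Cursors are ordered by the key they currently point at.
        // Only non-exhausted cursors are ever placed in the heap.
        @Override
        public int compareTo(FileCursor other) {
            return current.compareTo(other.current);
        }
    }

    /** Merges several individually sorted "files" into one sorted list of keys. */
    static List<String> mergedScan(List<List<String>> storeFiles) {
        PriorityQueue<FileCursor> heap = new PriorityQueue<>();
        for (List<String> file : storeFiles) {
            FileCursor cursor = new FileCursor(file);
            if (!cursor.exhausted()) {
                heap.add(cursor);
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the cursor with the smallest current key, emit that key,
            // then re-insert the cursor so it is re-ordered by its next key.
            FileCursor smallest = heap.poll();
            out.add(smallest.current());
            smallest.advance();
            if (!smallest.exhausted()) {
                heap.add(smallest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The two StoreFiles from the question.
        List<String> storeFile1 = Arrays.asList("key1", "key3", "key4");
        List<String> storeFile2 = Arrays.asList("key2", "key5", "key6");
        // prints [key1, key2, key3, key4, key5, key6]
        System.out.println(mergedScan(Arrays.asList(storeFile1, storeFile2)));
    }
}
```

With k files, each step costs O(log k) heap work, so the overlapping key ranges of the files do not have to be rewritten into disjoint ranges for reads to come back in rowkey order.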

As for region splitting, each new region uses a "Reference" to the top or bottom half of the old region's StoreFiles, so no data has to be rewritten at split time. Later, compactions generate actual new HFiles for the new regions.

See org.apache.hadoop.hbase.io.HalfStoreFileReader for more.
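
The reference idea can be sketched roughly as follows. This is an illustration only, not how HalfStoreFileReader is implemented: the HalfReference class, its fields and its scan method are hypothetical, keys are plain strings, and the sketch assumes that "bottom" means keys below the split key and "top" means keys at or above it. The point is that both halves read the same shared parent data and only filter by the split key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HalfFileView {

    /** Hypothetical reference: points at a parent file and records which half it covers. */
    static final class HalfReference {
        private final List<String> parentSortedKeys; // shared with the parent, not copied
        private final String splitKey;
        private final boolean top; // assumption: top = keys >= splitKey, bottom = keys < splitKey

        HalfReference(List<String> parentSortedKeys, String splitKey, boolean top) {
            this.parentSortedKeys = parentSortedKeys;
            this.splitKey = splitKey;
            this.top = top;
        }

        /** Returns only the keys of the parent file that fall into this half. */
        List<String> scan() {
            List<String> out = new ArrayList<>();
            for (String key : parentSortedKeys) {
                boolean inTop = key.compareTo(splitKey) >= 0;
                if (inTop == top) {
                    out.add(key);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> parentFile = Arrays.asList("key1", "key3", "key4");
        HalfReference bottom = new HalfReference(parentFile, "key4", false);
        HalfReference top = new HalfReference(parentFile, "key4", true);
        System.out.println(bottom.scan()); // prints [key1, key3]
        System.out.println(top.scan());    // prints [key4]
    }
}
```

A real reader would seek to the split key instead of filtering every entry. The sketch only shows why a split can be cheap: nothing is copied until a later compaction writes new files.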

Licensed under: CC-BY-SA with attribution
