Does HBase support indexing on list properties and non-equality operators?

Question 1

Chris, maybe this will help at least a little. In HBase everything depends on your row key design (see in particular how OpenTSDB handles it). For example, in your case the key might look like this:

[name-code] [counts-code] [...]

With such a key you can easily select the range of all records having a certain name / counts with O(log n) complexity. If the key doesn't include a component calculated from size, searching for a certain size has O(n) complexity. If the key includes size (or at least some value derived from size), this speeds things up, because it lets you narrow the range in O(log n).
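A rough way to see the difference is to model the table as a sorted list of keys. The sketch below is plain Python, not the HBase API; the field widths, the `make_key` helper, and the toy `size` payload are assumptions made up for illustration.

```python
import bisect

KEY_WIDTH = 8  # assumed layout: 2 bytes name-code + 2 bytes counts-code + 4 bytes row id

def make_key(name_code, counts_code, row_id):
    # Fixed-width big-endian fields, so byte order matches numeric order.
    return (name_code.to_bytes(2, "big")
            + counts_code.to_bytes(2, "big")
            + row_id.to_bytes(4, "big"))

# Toy "table": (key, payload) pairs kept sorted by key, as HBase sorts rows by row key.
table = sorted(
    ((make_key(n, c, r), {"size": (n + c + r) % 5})
     for n in range(4) for c in range(4) for r in range(10)),
    key=lambda row: row[0],
)
keys = [k for k, _ in table]

def scan_by_prefix(prefix):
    """Locate both ends of the key range by binary search: O(log n)."""
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_right(keys, prefix + b"\xff" * (KEY_WIDTH - len(prefix)))
    return table[lo:hi]

def scan_by_size(size):
    """size is not part of the key, so every row has to be examined: O(n)."""
    return [row for row in table if row[1]["size"] == size]

name_and_counts = make_key(2, 3, 0)[:4]      # prefix [name-code][counts-code]
by_prefix = scan_by_prefix(name_and_counts)  # found via two binary searches
by_size = scan_by_size(0)                    # walks the whole table
```

If a value derived from size were added as a key component, the second lookup could use the same bounded range trick as the first.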

HBase is a very straightforward tool that lets you do magical things, but only if you really know how it works. And yes, it is something like a 'raw engine' with minimal abstraction.

Please also note that if you have a lot of records per name / counts field value, you will probably need to balance the load of such requests across cluster nodes, and this affects your table / row key design. For example, I currently have a design where a linear full scan of the table with perfect load balance works better than a limited scan without balancing.
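One common way to get that kind of balance is to "salt" the row key with a bucket prefix. This is not necessarily the design described above, only a sketch of the general idea in plain Python; `NUM_BUCKETS`, `salted_key`, and the `name42:` prefix are hypothetical.

```python
import hashlib

NUM_BUCKETS = 4  # hypothetical; would be tuned to the cluster size

def salted_key(natural_key):
    # One leading byte, derived from a hash of the whole key, picks the bucket.
    bucket = hashlib.sha256(natural_key).digest()[0] % NUM_BUCKETS
    return bytes([bucket]) + natural_key

def stop_after(prefix):
    # Exclusive stop key: smallest key greater than every key starting with prefix.
    # Assumes the prefix contains at least one byte that is not 0xff.
    trimmed = prefix.rstrip(b"\xff")
    return trimmed[:-1] + bytes([trimmed[-1] + 1])

def fan_out_ranges(prefix):
    # A prefix query now needs one (start, stop) range per bucket.
    starts = [bytes([b]) + prefix for b in range(NUM_BUCKETS)]
    return [(s, stop_after(s)) for s in starts]

hot_prefix = b"name42:"
stored = [salted_key(hot_prefix + str(i).encode()) for i in range(1000)]
ranges = fan_out_ranges(hot_prefix)
```

The trade-off is visible here: rows sharing a hot prefix end up spread over all buckets, but every query for that prefix has to issue one range scan per bucket instead of a single one.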

Question 2

I agree with Roman.

HBase:

is a distributed key/value store
has no built-in index structure (apart from third-party tools as described here)
has no built-in query language support (using Hive may make this easier, but it will prevent you from using the data stored in HBase from a programming language without third-party library support. Or you can use HCatalog instead of Hive, Pig and the like. But this will make it behave like an ordinary RDBMS, with a seek latency for every row, as RDBMS platforms have with their B-tree-like structures)
is very good at batch reads by rowkey (the only built-in index available); if you design your rowkey well, a read needs only one fast seek to the startrowkey and then reads in batch, at the disk transfer rate, up to the stoprowkey.

If you can design your data this way, HBase will suit it very well.

Apart from that, of course you can filter your data, whether the filter is on the rowkey or on the payload. But if there is no startrowkey or stoprowkey, the query (or map/reduce job, if one is used) will have to read the entire data set, even if you put filters on the payload or on the rowkey.
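To make the cost concrete, here is a small model of a scan that counts how many rows it reads. It is plain Python standing in for the real thing, not the HBase client API; the `<user>#<event>` key format and the `scan` function are assumptions for illustration.

```python
import bisect

# Toy table sorted by row key "<user>#<event>", zero-padded so byte order is numeric order.
rows = sorted(f"{user:04d}#{event:04d}".encode()
              for user in range(100) for event in range(50))

def scan(start=None, stop=None, row_filter=None):
    """Return (matching keys, number of rows read); stop is exclusive."""
    lo = 0 if start is None else bisect.bisect_left(rows, start)
    hi = len(rows) if stop is None else bisect.bisect_left(rows, stop)
    matched, rows_read = [], 0
    for key in rows[lo:hi]:
        rows_read += 1  # every row in the range is read before the filter sees it
        if row_filter is None or row_filter(key):
            matched.append(key)
    return matched, rows_read

def is_user_42(key):
    return key.startswith(b"0042#")

# Filter on the rowkey but no bounds: the whole table is read.
filtered, read_unbounded = scan(row_filter=is_user_42)
# Same result with bounds ("$" is the byte right after "#"): only that range is read.
bounded, read_bounded = scan(start=b"0042#", stop=b"0042$")
```

Both calls return the same 50 keys, but the filter-only version reads all 5000 rows while the bounded version reads 50.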

So you must consider these when you make your evaluations.

PS: Because of the rowkey design, the startrowkey and stoprowkey are crucial. You may create a compound rowkey, but in that case the order of the fields will be very important.
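The effect of field order, and of how numbers are encoded, can be checked with a sorted list of keys. This is a sketch in plain Python under assumed field widths; `compound_key` and the other helper names are hypothetical.

```python
def compound_key(name, counts):
    # Order matters: name first, then counts. Fixed-width big-endian encoding
    # keeps byte order identical to numeric order.
    return name.to_bytes(2, "big") + counts.to_bytes(4, "big")

def fields(key):
    return int.from_bytes(key[:2], "big"), int.from_bytes(key[2:], "big")

keys = sorted(compound_key(n, c) for n in range(3) for c in range(6))

def positions(predicate):
    # Indexes (in sorted key order) of the rows that satisfy the predicate.
    return [i for i, k in enumerate(keys) if predicate(*fields(k))]

def is_contiguous(idx):
    return idx == list(range(idx[0], idx[-1] + 1))

# Equality on the first field plus a range on the second: one contiguous slice,
# so a single startrowkey/stoprowkey pair covers it.
one_name_range = positions(lambda name, counts: name == 1 and 2 <= counts < 5)

# Range on the second field alone: matches are scattered between other rows,
# so no single start/stop pair can select them.
counts_only = positions(lambda name, counts: counts >= 3)

# Plain decimal text would sort 10 before 9, which breaks range scans.
text_order_is_wrong = str(10).encode() < str(9).encode()
```

So a non-equality condition works as a range scan only on the last key field that is constrained, with all fields before it pinned by equality.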
