Filtering integers with HBase + Python

https://stackoverflow.com/questions/23249731

08-07-2023

Question

I am trying to filter rows from an HBase table using HappyBase. Specifically, I want the rows whose 'id' is less than 1000:

for key, data in graph_table.scan(filter="SingleColumnValueFilter('cf', 'id', <, 'binary:1000')"):
    print key, data

The scan returns only the following rows:

<http://ieee.rkbexplorer.com/id/publication-d2a6837e67d808b41ffe6092db50f7cc> {'cf:type': 'v', 'cf:id': '100', 'cf:label': '<http://www.aktors.org/ontology/portal#Proceedings-Paper-Reference>'}
<http://www.aktors.org/ontology/date#1976> {'cf:type': 'v', 'cf:id': '1', 'cf:label': '<http://www.aktors.org/ontology/support#Calendar-Date>'}
<http://www.aktors.org/ontology/date#1985> {'cf:type': 'v', 'cf:id': '10', 'cf:label': '<http://www.aktors.org/ontology/support#Calendar-Date>'}

The table contains rows with 'id' values from 1 to 1000. If I write the same filter in Java with the HBase client library, it works fine, converting the integer value with the Bytes.toBytes() function.

Thank you.

Solution

The problem was that I was saving the integers as strings. They have to be stored as packed bytes instead:

import struct
table.put(key, {'cf:id': struct.pack(">q", value)})  # ">q" = 8-byte big-endian signed integer
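
To see why the string version returned only ids 1, 10 and 100, here is a small sketch that reproduces the comparison outside HBase. It assumes the 'binary:' comparator compares raw bytes lexicographically; the helper name pack_id and the sample ids are made up for illustration.

```python
import struct

def pack_id(value):
    """Encode an integer as 8 big-endian bytes (same format as the put call)."""
    return struct.pack(">q", value)

sample_ids = (1, 10, 100, 999, 1000)

# Stored as text, values compare byte by byte, so "999" sorts after "1000".
as_text = [str(n).encode("ascii") for n in sample_ids]
text_matches = [s for s in as_text if s < b"1000"]

# Stored as fixed-width big-endian bytes, byte order follows numeric order
# for non-negative values.
as_packed = [pack_id(n) for n in sample_ids]
packed_matches = [struct.unpack(">q", p)[0] for p in as_packed if p < pack_id(1000)]

print(text_matches)    # only the text forms of 1, 10 and 100
print(packed_matches)  # 1, 10, 100 and 999
```

The text comparison keeps exactly the three ids shown in the question, while the packed comparison keeps every id below 1000.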

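When querying the database, the value in the filter has to be packed the same way. The trailing true, false arguments are the filter's optional filterIfMissing and latestVersionOnly flags: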

for key, data in graph_table.scan(filter="SingleColumnValueFilter('cf', 'id', <, 'binary:%s', true, false)" % struct.pack(">q", 1000)):
    print key, data
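
One thing to watch with the formatted filter string: the packed value can itself contain a single quote byte (0x27), for example when the limit is 39, and that would end the quoted argument early. The sketch below builds the filter as bytes and doubles any quote, which is how the HBase filter language documentation describes escaping; treat that step as an assumption to verify against your HBase version. build_id_filter is a hypothetical helper, not part of HappyBase, and whether scan() accepts a bytes filter unchanged on Python 3 is not confirmed here.

```python
import struct

def build_id_filter(limit, family=b"cf", qualifier=b"id"):
    """Return a SingleColumnValueFilter string (as bytes) matching id < limit."""
    packed = struct.pack(">q", limit)
    # Assumption: a single quote inside a quoted argument is escaped by doubling it.
    escaped = packed.replace(b"'", b"''")
    return (b"SingleColumnValueFilter('" + family + b"', '" + qualifier
            + b"', <, 'binary:" + escaped + b"', true, false)")

filter_for_1000 = build_id_filter(1000)
filter_for_39 = build_id_filter(39)
```

For a limit of 1000 this produces the same bytes as the formatted string in the scan call above on Python 2.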

Finally, unpack the value when reading the result:

value = struct.unpack(">q", data['cf:id'])[0]
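
As a quick check of the round trip, and of one caveat, here is a minimal sketch. encode_id and decode_id are illustrative names. The caveat assumes the comparison is unsigned and byte-wise, in which case negative ids would sort after positive ones, so this approach is only safe as written for non-negative ids.

```python
import struct

def encode_id(value):
    """Pack an integer into 8 big-endian bytes for storage."""
    return struct.pack(">q", value)

def decode_id(raw):
    """Unpack 8 big-endian bytes back into an integer."""
    return struct.unpack(">q", raw)[0]

round_trip = decode_id(encode_id(1000))

# Two's complement sets the high bit for negative numbers, so under an
# unsigned byte-wise comparison -1 looks larger than 1.
negative_sorts_after_positive = encode_id(-1) > encode_id(1)

print(round_trip)
print(negative_sorts_after_positive)
```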

Thank you very much.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow