You can do that, but as a MapReduce job, not as a direct query.
HBASE - select distinct query against the rowkey
11-12-2021
Question
I have an HBase table called "users". The rowkey consists of three parts:
- userid
- messageid
- timestamp
The rowkey looks like: ${userid}_${messageid}_${timestamp}
Given that I can hash the userid and make the length of the field fixed, is there any way I can run something like this SQL query:
select distinct(userid) from users
If the rowkey doesn't allow a query like this, does that mean I need to create a separate table that contains just the user ids? I guess that if I do that, inserting a record won't be atomic anymore, because I would be writing to two tables without a transaction.
Solution
Other tips
You can use a HashSet to do that. Something like this:
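import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// conf is assumed to be an existing HBase Configuration field of the enclosing class.
// HTable(conf, tableName) is the older client API; newer clients obtain a Table from a Connection.
public Set<String> getDistinctCol(String tableName, String colFamilyName, String colName)
{
    Set<String> set = new HashSet<String>();
    byte[] family = Bytes.toBytes(colFamilyName);
    byte[] qualifier = Bytes.toBytes(colName);
    Scan scan = new Scan();
    scan.addColumn(family, qualifier);
    // try-with-resources closes the scanner and the table, even if the scan fails
    try (HTable table = new HTable(conf, tableName);
         ResultScanner rs = table.getScanner(scan))
    {
        Result res;
        while ((res = rs.next()) != null)
        {
            // getValue takes the family and the qualifier as separate arguments
            byte[] col = res.getValue(family, qualifier);
            if (col != null)
            {
                set.add(Bytes.toString(col));
            }
        }
    }
    catch (IOException e)
    {
        System.out.println("Exception occurred while retrieving data: " + e.getMessage());
    }
    return set;
}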
* col in your case is the userID.
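In the question the userid is the first part of the rowkey, not a column value, so the same HashSet idea can be applied to the row keys themselves. The following is only a minimal sketch under stated assumptions: RowKeyUserIds, userIdOf and distinctUserIds are hypothetical names (not part of the HBase API), the keys are assumed to look like userid_messageid_timestamp, and the userid itself is assumed not to contain '_'.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the HBase API.
class RowKeyUserIds {
    // Returns the part of the row key before the first '_' (the userid).
    // A key without any '_' is returned unchanged.
    static String userIdOf(String rowKey) {
        int sep = rowKey.indexOf('_');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // Collects the distinct userids, in the order they are first seen.
    static Set<String> distinctUserIds(Iterable<String> rowKeys) {
        Set<String> ids = new LinkedHashSet<>();
        for (String key : rowKeys) {
            ids.add(userIdOf(key));
        }
        return ids;
    }
}
```

Inside the scan loop above, the row key of each Result could be passed in with something like set.add(RowKeyUserIds.userIdOf(Bytes.toString(res.getRow()))). This still reads every row of the table on the client side, so for a large table the MapReduce route is likely the better fit.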
HTH
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow