Question

I have an HBase table called "users". The rowkey consists of three parts:

  1. userid
  2. messageid
  3. timestamp

The rowkey looks like: ${userid}_${messageid}_${timestamp}

Given that I can hash the userid to give the field a fixed length, is there any way to run something like this SQL query:

select distinct(userid) from users

If the rowkey doesn't allow this kind of query, does that mean I need to create a separate table that contains just the user ids? I guess that if I do that, inserting a record won't be atomic anymore, because I would be dealing with two tables without a transaction.

Solution

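You can do that, but as a map/reduce job rather than as a direct query. The job has to scan the table, extract the userid from each rowkey, and deduplicate the results.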
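
To make the shape of such a job concrete, here is a minimal sketch in plain Java that only mimics the two phases on an in-memory list of rowkeys. It is not HBase or Hadoop code: the class name, the method names, and the prefix width USER_ID_LEN = 8 are all assumptions made for illustration. In a real job the map step would run inside a mapper fed by a table scan, and the reduce step inside a reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class DistinctUserIdJobSketch {
    // Assumed width of the fixed-length (hashed) userid prefix; hypothetical value.
    static final int USER_ID_LEN = 8;

    // "Map" step: emit the userid prefix of a single rowkey.
    static String mapRowKey(String rowKey) {
        return rowKey.substring(0, USER_ID_LEN);
    }

    // "Reduce" step: collapse all emitted userids into a duplicate-free, sorted set.
    static Set<String> reduce(List<String> emitted) {
        return new TreeSet<String>(emitted);
    }

    // Runs both steps over an in-memory list that stands in for the table.
    static Set<String> distinctUserIds(List<String> rowKeys) {
        List<String> emitted = new ArrayList<String>();
        for (String rowKey : rowKeys) {
            emitted.add(mapRowKey(rowKey));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        List<String> rowKeys = Arrays.asList(
            "a1b2c3d4_m1_1000",
            "a1b2c3d4_m2_1001",
            "ffee0011_m1_1002");
        System.out.println(distinctUserIds(rowKeys)); // prints [a1b2c3d4, ffee0011]
    }
}
```

Because every rowkey is visited once, the cost grows with the number of rows, not with the number of distinct users.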

Other suggestions

You can use a HashSet to do that. Something like this:

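import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// conf is assumed to be an existing HBase Configuration field of the enclosing class.
public Set<String> getDistinctCol(String tableName, String colFamilyName, String colName)
{
    Set<String> set = new HashSet<String>();
    byte[] family = Bytes.toBytes(colFamilyName);
    byte[] qualifier = Bytes.toBytes(colName);
    HTable table = null;
    ResultScanner rs = null;
    try
    {
        table = new HTable(conf, tableName);
        Scan scan = new Scan();
        scan.addColumn(family, qualifier);   // fetch only the column of interest
        rs = table.getScanner(scan);
        Result res;
        while ((res = rs.next()) != null)
        {
            byte[] col = res.getValue(family, qualifier);
            if (col != null)
            {
                set.add(Bytes.toString(col)); // the set silently drops duplicates
            }
        }
    }
    catch (IOException e)
    {
        System.out.println("Exception occurred in retrieving data: " + e.getMessage());
    }
    finally
    {
        // Guard against a failure before the scanner or table was opened.
        if (rs != null) rs.close();
        if (table != null) { try { table.close(); } catch (IOException ignored) { } }
    }
    return set;
}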

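*In your case, col is the userID. This assumes the userid is also stored as a column value; if it exists only in the rowkey, read it from res.getRow() instead. Note that this still scans the whole table and keeps every distinct id in client memory.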
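
Since rowkeys are stored in sorted order and the userid is a fixed-length prefix, a scan does not have to read every row of a user: after seeing one userid it can jump straight to the first key of the next one. The sketch below shows only that skipping logic, in plain Java over a sorted in-memory set that stands in for the table. The class name, the method names, and USER_ID_LEN = 8 are assumptions for illustration. HBase orders rowkeys by raw bytes, while this sketch compares Java strings, so treat it as a model of the idea rather than client code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class PrefixSkipSketch {
    // Assumed width of the fixed-length (hashed) userid prefix; hypothetical value.
    static final int USER_ID_LEN = 8;

    // Returns the smallest string that sorts after every key starting with prefix,
    // or null if no such string exists (all characters are already at the maximum).
    static String nextPrefix(String prefix) {
        char[] chars = prefix.toCharArray();
        for (int i = chars.length - 1; i >= 0; i--) {
            if (chars[i] != Character.MAX_VALUE) {
                chars[i]++;
                return new String(chars, 0, i + 1);
            }
        }
        return null;
    }

    // Collects distinct userids by jumping from one prefix to the next
    // instead of visiting every key. Assumes every key has at least USER_ID_LEN characters.
    static List<String> distinctUserIds(NavigableSet<String> sortedRowKeys) {
        List<String> ids = new ArrayList<String>();
        String key = sortedRowKeys.isEmpty() ? null : sortedRowKeys.first();
        while (key != null) {
            String userId = key.substring(0, USER_ID_LEN);
            ids.add(userId);
            String seek = nextPrefix(userId);
            // ceiling() plays the role of restarting the scan at the seek key.
            key = (seek == null) ? null : sortedRowKeys.ceiling(seek);
        }
        return ids;
    }

    public static void main(String[] args) {
        NavigableSet<String> rowKeys = new TreeSet<String>(Arrays.asList(
            "a1b2c3d4_m1_1000",
            "a1b2c3d4_m2_1001",
            "ffee0011_m1_1002",
            "ffee0011_m9_1003"));
        System.out.println(distinctUserIds(rowKeys)); // prints [a1b2c3d4, ffee0011]
    }
}
```

With this approach the number of seeks is proportional to the number of distinct users rather than to the number of rows, which helps when each user has many messages.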

HTH

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow