Question

I'm trying to put a filter on my HBase Scan object that skips rows that do not have the necessary columns filled in. I figure I should use a skip filter first, but then I get stumped. I don't see in the package summary anything about whether a column is present or not.

Should I use a column value filter and check whether the columns in question are null or blank? And why do filters return columns (such as ColumnCountGetFilter)? Is there a guide someone could point me towards to learn more about filters that isn't just a collection of javadocs?


Solution

A good way to learn is to read the source code of the HBase filter package.

For example, the source of ColumnCountGetFilter is quite short. Its core is these two methods:

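// True once more than `limit` KeyValues have been counted.
@Override
public boolean filterAllRemaining() {
  return this.count > this.limit;
}

// Called once per KeyValue; the return code says whether to include it.
@Override
public ReturnCode filterKeyValue(KeyValue v) {
  this.count++;
  return filterAllRemaining() ? ReturnCode.SKIP : ReturnCode.INCLUDE;
}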

Note that a filter implementation returns a code such as ReturnCode.SKIP or ReturnCode.INCLUDE; it doesn't return columns directly. The return code is a flag that tells HBase whether each KeyValue should be sent back to the client.
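To make the role of the return code concrete, here is a minimal, self-contained sketch that does not use HBase at all. The ReturnCode enum, CountLimitFilter, and scanRow are hypothetical stand-ins I made up for illustration (the real Filter.ReturnCode has more values, and the real scanner is far more involved); only the counting logic is copied from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

public class ReturnCodeDemo {
    // Simplified stand-in for HBase's Filter.ReturnCode (assumption: two values only).
    enum ReturnCode { INCLUDE, SKIP }

    // Hypothetical stripped-down filter mirroring the ColumnCountGetFilter logic.
    static class CountLimitFilter {
        private final int limit;
        private int count = 0;

        CountLimitFilter(int limit) { this.limit = limit; }

        boolean filterAllRemaining() { return count > limit; }

        ReturnCode filterCell(String cell) {
            count++;
            return filterAllRemaining() ? ReturnCode.SKIP : ReturnCode.INCLUDE;
        }
    }

    // Plays the role of the scanner: it asks the filter about each cell and,
    // based on the returned code, decides whether the cell reaches the client.
    static List<String> scanRow(List<String> cells, CountLimitFilter filter) {
        List<String> result = new ArrayList<>();
        for (String cell : cells) {
            if (filter.filterCell(cell) == ReturnCode.INCLUDE) {
                result.add(cell);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> row = List.of("cf:a", "cf:b", "cf:c", "cf:d");
        // With a limit of 2, only the first two cells are included.
        System.out.println(scanRow(row, new CountLimitFilter(2)));
    }
}
```

The filter never builds the result itself; it only answers "include or skip" for each cell, and the caller assembles what is returned.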

You may need to implement a custom filter. The HBase filter package contains good examples; read through them and then write your own.
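As a rough illustration of what such a custom filter could do for the "skip rows missing required columns" case, here is a plain-Java sketch with no HBase dependency. RequiredColumnsFilter, observe, and scan are hypothetical names of my own. The reset() and filterRow() methods are named after the per-row hooks on HBase's Filter class, but this is a simplified model under that assumption, not a drop-in implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RequiredColumnsDemo {
    // Hypothetical filter: drops any row that lacks one of the required columns.
    static class RequiredColumnsFilter {
        private final Set<String> required;
        private final Set<String> seen = new HashSet<>();

        RequiredColumnsFilter(Set<String> required) { this.required = required; }

        // Clear per-row state before a new row starts.
        void reset() { seen.clear(); }

        // Called for every column in the row; remembers the required ones.
        void observe(String column) {
            if (required.contains(column)) {
                seen.add(column);
            }
        }

        // Returns true when the row should be excluded (a required column is missing).
        boolean filterRow() { return !seen.containsAll(required); }
    }

    // Simplified scan loop over a table modelled as row key -> column names.
    static List<String> scan(Map<String, List<String>> table, RequiredColumnsFilter filter) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            filter.reset();
            for (String column : row.getValue()) {
                filter.observe(column);
            }
            if (!filter.filterRow()) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("row1", List.of("cf:name", "cf:email"));
        table.put("row2", List.of("cf:name"));
        table.put("row3", List.of("cf:email", "cf:name", "cf:age"));
        // row2 is dropped because it has no cf:email column.
        System.out.println(scan(table, new RequiredColumnsFilter(Set.of("cf:name", "cf:email"))));
    }
}
```

The decision is made per row rather than per cell, because whether a column is missing is only known once the whole row has been seen.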

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow