Question

I'm migrating a load of code to stop passing byte[]s, InputStreams and InputSuppliers around and just use ByteSource.

The code currently calculates an ETag for the data by calling Arrays.hashCode on the raw byte[], which translates to this with a ByteSource:

Arrays.hashCode(dataSource.read());

The problem with this is that dataSource.read() on a ByteArrayInputSource clones the underlying byte[], which is worse than what's currently there.

I'd like to use dataSource.hash(HashFunction) but I want to make sure I don't bust the ETags generated through the hashCode, as this will cause a load of cache invalidations.

Does anyone know of a HashFunction that will do the job for me?


Solution

I don't know of any already available HashFunction that'll do what you want, but it should be pretty easy to write it yourself. Something like:

import com.google.common.hash.HashCode;
import com.google.common.hash.Hasher;

public final class ByteArrayHashFunction extends AbstractStreamingHashFunction {

  @Override
  public Hasher newHasher() {
    return new ByteArrayHasher();
  }

  @Override
  public int bits() {
    return 32;
  }

  private static final class ByteArrayHasher extends AbstractByteHasher {

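    // Same starting value as Arrays.hashCode(byte[]) for a non-null array.
    private int hash = 1;

    @Override
    protected void update(byte b) {
      // Same recurrence as Arrays.hashCode(byte[]): result = 31 * result + element.
      hash = 31 * hash + b;
    }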

    @Override
    public HashCode hash() {
      return HashCode.fromInt(hash);
    }
  }
}

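You would need to copy a few of the abstract classes (AbstractStreamingHashFunction, AbstractByteHasher and anything they depend on) from com.google.common.hash into your own package though, since they are not part of Guava's public API.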
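
If copying Guava internals is unappealing, one possible alternative (not part of the answer above, just a sketch) is to feed the bytes through a small OutputStream that accumulates the same value as Arrays.hashCode(byte[]). The class name ArrayHashOutputStream and its hash() accessor are made up for illustration; the only Guava call this relies on is ByteSource.copyTo(OutputStream).

```java
import java.io.OutputStream;

/**
 * Hypothetical helper: accumulates the same int that Arrays.hashCode(byte[])
 * would return for the bytes written to it, without buffering them.
 */
final class ArrayHashOutputStream extends OutputStream {

  // Arrays.hashCode(byte[]) starts from 1 for a non-null array.
  private int hash = 1;

  @Override
  public void write(int b) {
    // OutputStream.write(int) only carries the low 8 bits, so cast back to a
    // signed byte to match the sign extension Arrays.hashCode applies.
    hash = 31 * hash + (byte) b;
  }

  @Override
  public void write(byte[] buf, int off, int len) {
    // Bulk path, so stream copies do not go through write(int) byte by byte.
    for (int i = off; i < off + len; i++) {
      hash = 31 * hash + buf[i];
    }
  }

  /** Returns the hash of everything written so far. */
  int hash() {
    return hash;
  }
}
```

Assuming dataSource is a ByteSource, you would create an ArrayHashOutputStream, call dataSource.copyTo(out) and then read out.hash(). Whether this actually avoids the extra copy depends on how the particular ByteSource implements copyTo, so it is worth checking against the Guava version you are on.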

Licensed under: CC-BY-SA with attribution