Question

I am new to the Bloom filter concept and would like your thoughts on the following. I have 3 types of categories, and each type contains billions of categories.

  1. Do I need 3 Bloom filter objects, or is there a way to manage all the category types in one object?

  2. I am using the Apache Hadoop Bloom filter implementation, i.e. org.apache.hadoop.util.bloom.Filter. Is there any other implementation better than this?

  3. What would be the ideal bit array size to handle a billion records?


Solution

  1. Do I need 3 Bloom filter objects: that depends on what you want to do (you didn't describe it), but yes.

  2. Is there any other implementation: sure! Try using Google.

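  3. Ideal bit array size: it depends on what you want to do, in particular how many elements you insert and what false-positive rate you can tolerate. Try reading the Wikipedia article about Bloom filters. It gives formulas relating the bit array size, the number of elements, the number of hash functions, and the false-positive probability.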
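
To make point 3 concrete, here is a rough sketch of the standard sizing formulas from the Wikipedia article: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, for n expected elements and a target false-positive rate p. The class and method names are made up for illustration and are not part of Hadoop, and the 1% false-positive rate is only an assumed target, since the question doesn't state one.

```java
// Sketch only: BloomSizing and its method names are hypothetical, not a Hadoop API.
final class BloomSizing {

    // Optimal number of bits m for n expected elements and false-positive rate p:
    // m = -n * ln(p) / (ln 2)^2, rounded up.
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Optimal number of hash functions k = (m / n) * ln 2, rounded to the nearest integer.
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L; // one billion records, as in the question
        double p = 0.01;         // assumed target: 1% false positives
        long m = optimalBits(n, p);
        int k = optimalHashes(n, m);
        System.out.printf("bits=%d (about %.2f GB), hashes=%d%n", m, m / 8.0 / 1e9, k);
    }
}
```

With these assumed inputs the result is roughly 9.6 billion bits (about 1.2 GB) and 7 hash functions. That bit count is larger than Integer.MAX_VALUE, so an implementation that sizes its bit vector with an int could not hold a billion records at 1% in a single filter; you would have to accept a higher false-positive rate or split the data across several filters.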

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow