Question

I am new to Bloom filters and would appreciate your thoughts on this. I have 3 types of categories, and each type contains billions of categories.

  1. Do I need 3 Bloom filter objects, or is there a way to manage all the category types in one object?

  2. I am using the Apache Hadoop Bloom filter implementation, i.e. org.apache.hadoop.util.bloom.Filter. Is there a better implementation than this?

  3. What is the ideal bit array size for handling a billion records?


Answer

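  1. Do I need 3 Bloom filter objects: depending on what you want to do (which you didn't describe), yes. Separate filters let you test membership per category type and size each one independently.

  2. Is there any other implementation: sure, a web search will turn up several.

  3. Ideal bit array size: it depends on how many elements you expect to insert and on the false-positive probability you can tolerate. The Wikipedia article on Bloom filters gives the formulas that relate these two numbers to the bit array size and the number of hash functions.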
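
To make point 3 concrete, here is a minimal sketch of the standard sizing formulas from the Wikipedia article: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The class name BloomSizing, its method names, and the 1% target false-positive rate are assumptions chosen for illustration; they are not part of the Hadoop API or of the question.

```java
// Hypothetical helper (not part of Hadoop): standard Bloom filter sizing formulas.
public final class BloomSizing {
    private BloomSizing() {}

    /** Bits needed for n elements at false-positive rate p: m = -n ln(p) / (ln 2)^2, rounded up. */
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Number of hash functions for n elements in m bits: k = (m / n) ln 2, rounded, at least 1. */
    static int optimalHashCount(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    /** Approximate false-positive rate after n insertions: p = (1 - e^(-k n / m))^k. */
    static double falsePositiveRate(long n, long m, int k) {
        return Math.pow(1 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L; // one billion records, as in the question
        double p = 0.01;         // assumed target: 1% false positives
        long m = optimalBits(n, p);
        int k = optimalHashCount(n, m);
        System.out.printf("bits=%d, bytes=%d, hashes=%d, expected fp=%.4f%n",
                m, m / 8, k, falsePositiveRate(n, m, k));
    }
}
```

With these assumed inputs the formulas give roughly 9.6 billion bits (about 1.2 GB) and 7 hash functions per filter. A stricter false-positive target or more records increases the size accordingly, which is why the answer depends on what you want to do.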

License: CC-BY-SA with attribution