Question

I am new to the Bloom filter concept, so please share your thoughts on this. I have 3 types of categories, and each type contains billions of categories.

  1. Do I need 3 Bloom filter objects, or is there a way to manage all the category types in one object?

  2. I am using the Apache Hadoop Bloom filter implementation, i.e. org.apache.hadoop.util.bloom.Filter. Is there a better implementation than this?

  3. What is the ideal bit array size to handle a billion records?

Solution

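  1. Do I need 3 Bloom filter objects? Probably yes, but it depends on what you want to do, which you didn't describe.

  2. Is there any other implementation? Sure, there are several. A web search will turn them up.

  3. Ideal bit array size: it depends on how many entries you store and what false-positive probability you can tolerate. The Wikipedia article on Bloom filters gives formulas relating the bit array size, the number of entries, the number of hash functions, and the false-positive probability.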
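
To make point 3 concrete, here is a minimal sketch of the standard sizing formulas from the Wikipedia article: for n expected entries and a target false-positive probability p, the bit array size is m = -n ln(p) / (ln 2)^2 and the number of hash functions is k = (m / n) ln 2. The class and method names are hypothetical, and the 1% false-positive rate is an assumption chosen for illustration, not something stated in the question.

```java
// Sketch only: BloomSizing, optimalBits and optimalHashes are
// hypothetical names, not part of Hadoop or any other library.
final class BloomSizing {

    // m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits.
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // k = (m / n) * ln 2, rounded to the nearest integer, at least 1.
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L; // expected entries per filter (from the question)
        double p = 0.01;         // assumed tolerable false-positive rate
        long m = optimalBits(n, p);
        int k = optimalHashes(n, m);
        System.out.printf("bits=%d (~%.2f GB), hashes=%d%n", m, m / 8.0 / 1e9, k);
    }
}
```

With these assumed inputs the result is roughly 9.6 billion bits (about 1.2 GB) and 7 hash functions per filter. That bit count is larger than Integer.MAX_VALUE, so an implementation that takes its size as an int cannot hold it in a single filter. In that case you would need to split the data across several filters or accept a higher false-positive rate.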

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow