Question

I am new to the Bloom filter concept and would appreciate your thoughts on the following. I have 3 types of categories, and each type contains billions of categories.

  1. Do I need 3 Bloom filter objects, or is there a way to manage all the category types in one object?

  2. I am using the Apache Hadoop Bloom filter implementation, i.e. org.apache.hadoop.util.bloom.Filter. Is there any other implementation better than this?

  3. What is the ideal bit array size for handling a billion records?

Solution

  1. Do I need 3 Bloom filter objects: depending on what you want to do (which you didn't describe), yes.

  2. Is there any other implementation: sure! A web search turns up several; Guava's com.google.common.hash.BloomFilter is one widely used Java option.

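  3. Ideal bit array size: it depends on how many items you expect to insert and on the false positive probability you can tolerate. Try reading the Wikipedia article about Bloom filters; it gives the formulas that relate the bit array size, the number of hash functions, and that probability.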
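
To make point 3 concrete, here is a minimal sketch of the standard sizing formulas from the Wikipedia article: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, for n expected items and target false positive probability p. The class name BloomSizing, its method names, and the 1% target are assumptions made for illustration; they are not part of the Hadoop API or of the original question.

```java
// Hypothetical helper (not part of Hadoop): standard Bloom filter sizing formulas.
final class BloomSizing {
    private BloomSizing() {}

    /** Optimal number of bits: m = -n * ln(p) / (ln 2)^2, rounded up. */
    static long optimalBits(long n, double p) {
        if (n <= 0 || p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 < p < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Optimal number of hash functions: k = (m / n) * ln 2, rounded, at least 1. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2.0)));
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L; // one billion expected items
        double p = 0.01;         // assumed 1% false positive target
        long m = optimalBits(n, p);
        int k = optimalHashes(n, m);
        System.out.printf("bits=%d (about %.2f GB), hashes=%d%n", m, m / 8.0 / 1e9, k);
    }
}
```

For a billion items at a 1% false positive rate this works out to roughly 9.6 billion bits (about 1.2 GB) and 7 hash functions. Two things follow from the formula. First, m grows linearly with n, so one filter holding all three category types needs about as many bits in total as three separate filters at the same p. Second, a bit count of this size exceeds Integer.MAX_VALUE, so an implementation that sizes its bit vector with an int would need the data split across several filters.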

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow