Question

I need to track a set of perhaps 10 million numbers in Python. (All numbers are between 0 and 2^32.) I'll know the maximum value of an integer beforehand, and between 20% and 80% of the values from 0 to that maximum will be in the set.

My current code uses the built-in set. This is way too slow. As far as performance is concerned, the best way to do this is with a bitarray (such as https://pypi.python.org/pypi/bitarray/ ).

It's easy for me to use a bitarray to build a class with add(n) and remove(n) methods. What I don't know is how to support "for n in bitarray_set:". I think I need to use an iterator or iterable, but I'm not sure how to do that. Is this possible? How?


Answer

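bitarray objects support an itersearch method that iterates over every position where one bitarray occurs in another. Searching for the one-bit pattern [True] therefore yields the index of each set bit, which is exactly the members of your set. (Newer bitarray releases may expose this through search instead, so check the documentation for your version.) Use that: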

def __iter__(self):
    # Yield the index of every set bit (requires "from bitarray import bitarray").
    return self.bits.itersearch(bitarray([True]))
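
To show how the pieces fit together, here is a minimal sketch of the protocol involved. It deliberately does not use bitarray: a standard-library bytearray stands in for the bit storage so the example runs without third-party packages. The class name ByteBitSet and its layout are assumptions made for illustration, and there is no bounds checking in add or remove. The point is only that "for n in s:" needs an __iter__ method that returns an iterator (a generator does this for free), and "n in s" is served by __contains__.

```python
class ByteBitSet:
    """Hypothetical set of ints in [0, max_val], one bit per candidate value."""

    def __init__(self, max_val):
        self.max_val = max_val
        # One bit per value, packed eight to a byte.
        self.bits = bytearray(max_val // 8 + 1)

    def add(self, n):
        self.bits[n >> 3] |= 1 << (n & 7)

    def remove(self, n):
        # Mask with 0xFF so the result stays a valid byte value.
        self.bits[n >> 3] &= ~(1 << (n & 7)) & 0xFF

    def __contains__(self, n):
        # Supports "n in s" without scanning the whole structure.
        return 0 <= n <= self.max_val and bool(self.bits[n >> 3] & (1 << (n & 7)))

    def __iter__(self):
        # Supports "for n in s:"; calling a generator function returns an iterator.
        for byte_index, byte in enumerate(self.bits):
            if byte:  # skip bytes with no bits set
                base = byte_index << 3
                for bit in range(8):
                    if byte & (1 << bit):
                        yield base + bit


s = ByteBitSet(100)
for n in (3, 42, 99):
    s.add(n)
s.remove(42)
members = list(s)  # members come out in ascending order
```

With the real bitarray package, the body of __iter__ would shrink to the search call shown above, and __contains__ could simply index into self.bits.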
Licensed under: CC-BY-SA with attribution