سؤال

I am working on implementing some bloom filter variants, and a very useful data structure for this would be a compact multi-bit array; that is, an array where each element is a compact integer of around 4 bits.

Space efficiency is of the utmost importance here, so while a plain integer array would give me the functionality I want, it would be bulkier than necessary.

Before I try to implement this functionality myself with bit arithmetic, I was wondering if anyone knows of a library out there that already provides such a data structure.

Edit: Static size is fine. The ideal case would be an implementation that is flexible with regard to the number of bits per cell. That might be a bit much to hope for though (no pun intended?).

هل كانت مفيدة؟

المحلول

If you aren't modifying the array after creation, java.util.BitSet does all the bit masking for you but is slow to access since you have to fetch each bit individually and do the masking yourself to re-create the int from 4 bits.

Having said that writing it yourself might be the best way to go. Doing the bit arithmetic yourself isn't that difficult since it's only 2 values per byte so decoding the high bits are (array[i] & 0xF0) >> 4 and the low bits are array[i] & 0x0F

نصائح أخرى

Take a look at the compressed BitSet provided by http://code.google.com/p/javaewah/, it allows to set bits freely and will ensure that it uses memory efficiently via compression algorithms being used.

I.e. something like

        EWAHCompressedBitmap32 set = new EWAHCompressedBitmap32();
        set.set(0);
        set.set(1000000);

will still only occupy a few bytes, not one MB as with the Java BitSet...

You should be able to map the 4-bit integer to the BitSet by multiplying the index into the BitSet accordingly

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top