سؤال

The Java reference here indicates that boolean types, while represented with a "bit" of information, do not have a precisely defined size. In contrast, other types seem to suggest that the size is defined. For example, an int is 32-bits, end of story.

When we look at the spec for a BitSet, we can see that it is composed of boolean values. By the reference above, this seems to suggest that the "size" of a BitSet is undefined - it's composed of boolean values, after all. And sure enough, the documentation specifies:

Note that the size is related to the implementation of a bit set, so it may change with implementation.

So my question is, why not implement a BitSet using another datatype that is precisely defined? For example, if we use a byte, we could guarantee a size of 8-bits, and we wouldn't have the fuzzy feeling that the size may not be what we think it is. It's true that the size would have to be divisible by 8, but at least it seems more size-deterministic this way.

If we have a system that absolutely cannot exceed a certain memory capacity, it seems useful to have a BitSet implementation that is precise in terms of size.

هل كانت مفيدة؟

المحلول

I think that you're getting conceptually stuck by the fact that the method signatures use booleans.

The easiest way to think about a single bit is off/on, so a boolean true/false is a convenient way to model it. Another thing entirely is the BitSet internal storage, which if you have a look at the source code, is using a long array and using bitmasks to twiddle individual bits.

Accordingly, the size of the BitSet is tied pretty closely to the number of bits in use.

نصائح أخرى

Part of the point of BitSet is that its length is conceptually infinite -- that we can manipulate arbitrarily many bits with it. It's not the memory consumption we care about so much as the semantics, and size is only an indication of memory consumption.

A BitSet is not necessarily composed of booleans but would convert the bits to booleans for ease of use (instead of having to check against 0 or 1).

Besides that the implementation would most likely use some datatype but depending on the architecture the bits might be stored using a bunch of 8-, 16-, 32- or 64-bit integers (or something else). In most systems memory constraints are not that hard and thus a bitset with a logical size of say 5 having a real size of 1 or 8 bytes isn't that critical.

True, you could implement a bit set using bytes only but there might be reasons to adhere to the platform's memory alignment (which might be more than one byte).

You let it sound like a BitField needs to be an array of actual booleans, this is not so; You can for example look up the current implementation in the source code of your JDK. Here's a snipplet:

/**
 * The internal field corresponding to the serialField "bits".
 */
private long[] words;

So in this case an array of longs is used and the bits are accessed through bitmasking and shifts.

Unlike most primitives, byte sizes of Java objects are not well defined and are implementation-dependent or may even change during an application's run time due to JIT compilation and different tricks the JVM uses internally. The size of a boolean has changed even between Sun JVM releases (4 vs 1 bytes), and if I'm not mistaken, there was even a time when a single boolean would take 4 bytes and an array of N booleans would take about N*1 bytes (or perhaps it was the byte type?). Anyway, the logical size of a variable or its information capacity may be completely different from the physical memory allocated by JVM.

BitSet consists of boolean values only conceptually and the implementation does not need to follow the logical layout. Indeed, most implementations will use a byte array for BitSet and use approximately only one bit for each value (but there is some slack in order to allow it to grow and some additional housekeeping data).

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top