Domanda

In this youtube tutorial here

it seems that Bitmap index will always create a replica of the whole table when it creates the index. Because it creates the index and against each row, it puts 0 or 1. Is my understadning wrong? The otehr thing is that towards the end of the tutorial it seems that bitmap index cannot operate on a != operator. I thought that = and != seems the same to me from the POV of indexing.

È stato utile?

Soluzione

Every row in the table is represented in a single bit (i.e. either 0 or 1), for at least one distinct value1. I'm not sure that could be considered a replica of the whole table, as that implies that all the data is replicated, and data in other columns is obviously not present. But it does contain data for the whole table, as every row is represented (probably multiple times, all but one with the bit set to zero).

The concepts guide explains what's happening:

Each bit in the bitmap corresponds to a possible rowid. If the bit is set, then the row with the corresponding rowid contains the key value. A mapping function converts the bit position to an actual rowid, so the bitmap index provides the same functionality as a B-tree index although it uses a different internal representation.

The storage structure is also explained.

Coupled with that, when you think of it as a two-dimentional array, it becomes clearer why every row has to be represented for each value. In the example in the documentation, the value for each row has to be represented by one of the distinct values, so a 'column' of the array has to have exactly one bit set to 1. There is no way to have a 'column' that is all zeros - if the column was nullable then null would be another value in the array and null columns in the table would have that bit set to 1 in the index - for a row in the table, so there it wouldn't make sense to not have every row represented.

You can have an array 'column' that is all zeros, but only for rows that don't exist. 'Each bit in the bitmap corresponds to a possible rowid', not necessarily to an actual row. From the storage description you can see that bitmaps are stored against ranges of rowids, and a rowid value in that range might not point to an actual row (in this table).

And that's what makes testing for inequality a problem. You can't just look at one 'row' of the array and say that anything in the 'M' row that is set to zero matches != 'M', because the rowid that bit represents might not actually be a row in the table at all. In a sense, a bit set to zero doesn't tell you anything definite; only a bit set to 1 does. So for an inequality condition, the whole index has to be checked to find values that are 1 for any other value.


1 - Logically every row is represented for every value, but the example data storage in the docs shows different rowid ranges for different values; I guess there's no point storing index data for a range where all the bits are zero, only for ranges where at least one bit is 1. But all rows are still represented in at least one index entry, as a bit set to 1 somewhere. I might be reading too much into their conceptual picture of what's stored.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top