Question

In a common implementation of a B+ tree, we can assume that keys have a fixed length (e.g. 25 bytes). We can then define that each node must have a minimum number of keys, and a maximum.

If I wanted the tree to accept variable length keys, what should I modify? What if I say that the node must have at least 2 keys, but the key which i'm trying to insert is so big that it doesn't fit into the Block that holds the node?

Était-ce utile?

La solution

The simple solution is to store the keys as pointers (wrapped in a type that overrides the relative operators etc) rather than values, but that of course damages the locality that is part of the point of using B+ trees.

That said, the larger the items, the less it matters that items are adjacent in memory. Huge items won't fit even one to a cache page, let alone several in the same page.

Another relatively simple approach is to use a union type or placement new or whatever to allocate items within a memory-for-item type that's big enough for all item types you might use. You still have a fixed number of bytes per item, but the items don't necessarily use all those bytes.

If you're willing to do the work, you could have variable-sized nodes. You'll have some hassles working with those nodes, of course, depending on how you arrange the in-node data structure to cope with that. You might have a small array of item-pointers within the node, for instance, pointing to the items which are also inside the node (not separately allocated on the heap).

Also, every time you change a node you may need to reallocate it. Even if all you're doing is rebalancing, that might move a huge item from one node into another, and even though the destination node has room in the sense of having a slot free for an item it may not have enough bytes to store the value.

In a sense, each node would be a mini-heap in which you can allocate or release space for items big or small, but sometimes you'd have to go back to the heap proper to replace that mini-heap with a bigger or smaller one.

It's again worth mentioning that if the items are that huge, locality within a node probably isn't relevant anyway.

I've implemented B+-style multiway trees in memory before myself, but I've never gone to this extreme.

Autres conseils

You can keep the rest of a big key in an overflow page, like there.

Use hashing. A hash is a fixed-size representation of a key. For good hashing functions see http://www.cse.yorku.ca/~oz/hash.html.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top