Calculating memory usage of a B-Tree in Java

https://stackoverflow.com/questions/15897869

02-04-2022
|

Question

I've implemented a simple B-Tree whichs maps longs to ints. Now I wanted to estimate the memory usage of it using the following method (applies to 32bit JVM only):

class BTreeEntry {

    int entrySize;
    long keys[];
    int values[];
    BTreeEntry children[];
    boolean isLeaf;
    ...
    /** @return used bytes */
    long capacity() {
        long cap = keys.length * (8 + 4) + 3 * 12 + 4 + 1;
        if (!isLeaf) {
            cap += children.length * 4;
            for (int i = 0; i < children.length; i++) {
                if (children[i] != null)
                    cap += children[i].capacity();
            }
        }
        return cap;
    }
}
/** @return memory usage in MB */
public int memoryUsage() {
    return Math.round(rootEntry.capacity() / (1 << 20));
}

But I tried it e.g. for 7mio entries and the memoryUsage method reports much higher values than the -Xmx setting would allow! E.g. it says 1040 (MB) and I set -Xmx300! Is the JVM somehow able to optimize the memory layout, eg. for empty arrays or what could be my mistake?

Update1: Ok, introducing the isLeaf boolean reduces memory usage a lot, but still it is unclear why I observed higher values than Xmx. (You can still try out this via using isLeaf == false for all contructors)

Update2: Hmmh, something is very wrong. When increasing the entries per leaf one would assume that the memory usage decreases (when doing compact for both), because less overhead of references is involved for larger arrays (and btree has smaller height). But the method memoryUsage reports an increased value if I use 500 instead 100 entries per leaf.

La solution

Ohh sh... a bit fresh air solved this issue ;)

When an entry is full it will be splitted. In my original split method checkSplitEntry (where I wanted to avoid waste of memory) I made a big memory waste mistake:

// left child: just copy pointer and decrease size to index
BTreeEntry newLeftChild = this;
newLeftChild.entrySize = splitIndex;

The problem here is, that the old children pointers are still accessible. And so, in my memoryUsage method I'm counting some children twice (especially when I did not compact!). So, without this trick all should be fine and my B-Tree will be even more memory efficient as the garbage collector can do its work!

Licencié sous: CC-BY-SA avec attribution

Non affilié à StackOverflow