I've coded up a KD-Tree in Java using the "median of list" algorithm for constructing a more balanced tree. It seems to work fine when using the data provided by the wiki, note that the wikipedia example uses only X,Y values, so it doesn't evaluate the Z depth.
From wikipedia:
point_list = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
From my java program:
depth=0 id=(7.0, 2.0, 0.0)
├── [left] depth=1 id=(5.0, 4.0, 0.0)
│ ├── [left] depth=2 id=(2.0, 3.0, 0.0)
│ └── [right] depth=2 id=(4.0, 7.0, 0.0)
└── [right] depth=1 id=(9.0, 6.0, 0.0)
└── [left] depth=2 id=(8.0, 1.0, 0.0)
But when I use the "median of list" approach on this data, it doesn't seem to work properly.
point list = [(1,0,-1), (1,0,-2), (1,0,1), (1,0,2)]
I get a tree like this:
depth=0 id=(1.0, 0.0, 1.0)
├── [left] depth=1 id=(1.0, 0.0, -2.0)
│ └── [left] depth=2 id=(1.0, 0.0, -1.0)
└── [right] depth=1 id=(1.0, 0.0, 2.0)
Which doesn't look correct because (1.0, 0.0, 2.0) is to the right of (1.0, 0.0, 1.0) but they are essentially equal because their Y values are equal. Also, (1.0, 0.0, -1.0) is to the left of (1.0, 0.0, -2.0) and it should be to the right since it's Z value is greater.
I think the problem stems from having equal X and Y values and only variable Z values, so the median of the list doesn't really split the list accurately.
... original code following the wiki's python code ...
private static KdNode createNode(List<XYZPoint> list, int k, int depth) {
if (list == null || list.size() == 0) return null;
int axis = depth % k;
if (axis == X_AXIS) Collections.sort(list, X_COMPARATOR);
else if (axis == Y_AXIS) Collections.sort(list, Y_COMPARATOR);
else Collections.sort(list, Z_COMPARATOR);
KdNode node = null;
if (list.size() > 0) {
int mediaIndex = list.size() / 2;
node = new KdNode(k, depth, list.get(mediaIndex));
if ((mediaIndex - 1) >= 0) {
List<XYZPoint> less = list.subList(0, mediaIndex);
if (less.size() > 0) {
node.lesser = createNode(less, k, depth + 1);
node.lesser.parent = node;
}
}
if ((mediaIndex + 1) <= (list.size() - 1)) {
List<XYZPoint> more = list.subList(mediaIndex + 1, list.size());
if (more.size() > 0) {
node.greater = createNode(more, k, depth + 1);
node.greater.parent = node;
}
}
}
return node;
}
... new code based on my comment ...
private static KdNode createNode(List<XYZPoint> list, int k, int depth) {
if (list == null || list.size() == 0) return null;
int axis = depth % k;
if (axis == X_AXIS) Collections.sort(list, X_COMPARATOR);
else if (axis == Y_AXIS) Collections.sort(list, Y_COMPARATOR);
else Collections.sort(list, Z_COMPARATOR);
KdNode node = null;
if (list.size() > 0) {
int medianIndex = list.size() / 2;
node = new KdNode(k, depth, list.get(medianIndex));
List<XYZPoint> less = new ArrayList<XYZPoint>(list.size()-1);
List<XYZPoint> more = new ArrayList<XYZPoint>(list.size()-1);
//Process list to see where each non-median point lies
for (int i=0; i<list.size(); i++) {
if (i==medianIndex) continue;
XYZPoint p = list.get(i);
if (KdNode.compareTo(depth, k, p, node.id)<=0) {
less.add(p);
} else {
more.add(p);
}
}
if (less.size() > 0) {
node.lesser = createNode(less, k, depth + 1);
node.lesser.parent = node;
}
if (more.size() > 0) {
node.greater = createNode(more, k, depth + 1);
node.greater.parent = node;
}
}