Question

If I have a dataset, where the keys are points in 3D, represented by 3 signed 64 bit integers. And I want to use a (sorted)key-value store to store them, where keys are just byte array (but I can specify a comparator). I think I can turn all those points into a byte array by using bit-interleaving, as is done with Z/Morton order like in How to compute a 3D Morton number

In addition to fetching individual points, which can be done more simply without Morton ordering, I want to do range search, where I search within a box that is aligned with the axes. I will define A and B as being respectively the box corners where all coordinates are the lowest, and the opposite corner where all coordinates are the highest.

Now my questions are:

  1. For any point C, which is logically between, A and B, will the Morton number of C also be between the Morton number of A and B? (Isn't that the point of Morton order?)

  2. If 1 is no, can A and B be "rounded" to values that guaranty that C will be included?

  3. Assuming that 1 or 2 is possible, does the search return also points outside of that box, which I have to "post-filter" away? How big is that "error set" (does it depend on the size, or the position, of the search)?

  4. Does the fact that the integers are signed cause problems? And if so, is there a work-around?

To recap, using Morton Numbers is just one possible solution to the real problem: How to efficiently range-search in 3D integer space, when the 3D points have to be mapped to a unidimensional value? I want to get all the points between A and B, by performing a single range select in the DB, using a min-key and a max-key, and ideally, getting as few points outside the box as possible.

Was it helpful?

Solution 2

Seems like I have to answer my question myself. And the answer will relate to the z order curve, which is what I actually asked for. This is as far as I get it:

  1. Yes. It always work like that with z order curves.
  2. Is irrelevant, because 1) is true.
  3. It depends on how big the area is, and where it is, but the worst case scenario is when you actually include the centre of the space, rather then when you are far from it. For any dimensionality, if you use indexes of depth N bits in each dimension, there are special cases, where the z order curve give you an exact match, and you only get points in the search box. This happens when the following conditions are met:
    1. The search area is the same size in all dimension, and a power of two, 2^M, where 0 <= M <= N.
    2. The search area is a hypercube aligned with all axis.
    3. All coordinates of the search area hypercube corners are a multiple of 2^M. When all those conditions are met, the search area corresponds exactly to a sub-tree node, and therefore matches exactly. In the general case, though, the best you can do is find the smallest node that will hold all the desired points, and subdivide it recursively in smaller nodes that provide a partial match, to some maximum desired depth, to get a better match at the cost of using multiple queries. Finding the smallest tree node that contains all corners of the area is equivalent to finding the common prefix of the Morton code for all corners. And when using a single query, the amount of points outside the area returned is the volume of that node queried minus the volume of the search area.
  4. I'm pretty sure that's a problem, but I haven't found info on that yet.

It seems what I'm saying isn't clear, so I'll do a little ASCII ART...

A basic Z-order (Morton order) curve in 2D goes like this (path is A,B,C,D):

x 0 1

0 A B

1 C D

So, A = (0,0) B = (0,1) C = (1,0) D = (1,1)

Now a basic Hilbert curve in 2D goes like this (path is A,B,C,D):

x 0 1

0 A D

1 B C

So, A = (0,0) B = (1,0) C = (1,1) D = (0,1)

With a Z-order (Morton order), the curve will start from the lowest point A (0,0), and end with the highest point D (1,1), while covering all the points in between. This makes a range search easy, and it work in 3D too. All the points in a 3D box (0,0,0) to (1,1,1) are between the Morton order code of (0,0,0) and (1,1,1).

With an Hilbert curve, you start at (0,0) and ends at (0,1) (Probably similar in 3D). Therefore, performing the range-query from the Hilbert code of (0,0) to (1,1) will NOT find all points.

Therefore, if I am to use an Hilbert curve to perform my range query of a 3D box (0,0,0) to (1,1,1) (as a single DB query), I need a function that tells me what point should I use as first point, and what point I should use as last point, because (0,0,0) and (1,1,1) will not work.

So, is there such a function, where you give your 8 (for 3D) box coordinates, and it returns the first and last points to use in your range query? Without that, I cannot use Hilbert curves to solve my problem.

Or, in pseudo-SQL:

Morton Query: SELECT key,data FROM table WHERE (key >= Morton(lowest(box))) AND (key <= Morton(highest(box)))

Hilbert Query: SELECT key,data FROM table WHERE (key >= Hilbert(XXX(box))) AND (key <= Hilbert(YYY(box)))

And so I need the XXX() and YYY().

[EDIT] A part of this answer was targeted at some other answer telling me to use Hilbert curves, but which has since deleted their answer.

OTHER TIPS

4) Yes, the sign will cause an issue, but it's trivial to solve.

Just XOR the sign bit of x, y and z with 1 before you create your Morton Number.

Why this works (using 1 dimension signed bytes instead):

-1 in binary is 11111111

0 in binary is 00000000

1 in binary is 00000001

The order you want is -1, 0, 1, but the current binary order is 0, 1, -1.

-1 XOR 10000000 = 01111111

0 XOR 10000000 = 10000000

1 XOR 10000000 = 10000001

Now your binary order is correct

There is in general not much you can expect from trying to reduce 3-dimensions to 1 dimension. What you could try though: find the major axis (i.e. do a line fit through your points), then project the points onto the line. This gives you requested one-dimensional value for each point. When you project the boxes' corners onto to the line and take enclosing interval of these values, you get the search range.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top