How to compute a 3D Morton number (interleave the bits of 3 ints)

https://stackoverflow.com/questions/1024754

06-07-2019
|

Question

I'm looking for a fast way to compute a 3D Morton number. This site has a magic-number based trick for doing it for 2D Morton numbers, but it doesn't seem obvious how to extend it to 3D.

So basically I have 3 10-bit numbers that I want to interleave into a single 30 bit number with a minimal number of operations.

Solution

You can use the same technique. I'm assuming that variables contain 32-bit integers with the highest 22 bits set to 0 (which is a bit more restrictive than necessary). For each variable x containing one of the three 10-bit integers we do the following:

x = (x | (x << 16)) & 0x030000FF;
x = (x | (x <<  8)) & 0x0300F00F;
x = (x | (x <<  4)) & 0x030C30C3;
x = (x | (x <<  2)) & 0x09249249;

Then, with x,y and z the three manipulated 10-bit integers we get the result by taking:

x | (y << 1) | (z << 2)

The way this technique works is as follows. Each of the x = ... lines above "splits" groups of bits in half such that there is enough space in between for the bits of the other integers. For example, if we consider three 4-bit integers, we split one with bits 1234 into 000012000034 where the zeros are reserved for the other integers. In the next step we split 12 and 34 in the same way to get 001002003004. Even though 10 bits doesn't make for a nice repeated division in two groups, you can just consider it 16 bits where you lose the highest ones in the end.

As you can see from the first line, you actually only need that for each input integer x it holds that x & 0x03000000 == 0.

OTHER TIPS

Here is my solution with a python script:

I took the hint from in his comment: Fabian “ryg” Giesen
Read the long comment below! We need to keep track which bits need to go how far!
Then in each step we select these bits and move them and apply a bitmask (see comment last lines) to mask them!

Bit Distances: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Bit Distances (binary): ['0', '10', '100', '110', '1000', '1010', '1100', '1110', '10000', '10010']
Shifting bits by 1   for bits idx: []
Shifting bits by 2   for bits idx: [1, 3, 5, 7, 9]
Shifting bits by 4   for bits idx: [2, 3, 6, 7]
Shifting bits by 8   for bits idx: [4, 5, 6, 7]
Shifting bits by 16  for bits idx: [8, 9]
BitPositions: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Shifted bef.:   0000 0000 0000 0000 0000 0011 0000 0000 hex: 0x300
Shifted:        0000 0011 0000 0000 0000 0000 0000 0000 hex: 0x3000000
NonShifted:     0000 0000 0000 0000 0000 0000 1111 1111 hex: 0xff
Bitmask is now: 0000 0011 0000 0000 0000 0000 1111 1111 hex: 0x30000ff

Shifted bef.:   0000 0000 0000 0000 0000 0000 1111 0000 hex: 0xf0
Shifted:        0000 0000 0000 0000 1111 0000 0000 0000 hex: 0xf000
NonShifted:     0000 0011 0000 0000 0000 0000 0000 1111 hex: 0x300000f
Bitmask is now: 0000 0011 0000 0000 1111 0000 0000 1111 hex: 0x300f00f

Shifted bef.:   0000 0000 0000 0000 1100 0000 0000 1100 hex: 0xc00c
Shifted:        0000 0000 0000 1100 0000 0000 1100 0000 hex: 0xc00c0
NonShifted:     0000 0011 0000 0000 0011 0000 0000 0011 hex: 0x3003003
Bitmask is now: 0000 0011 0000 1100 0011 0000 1100 0011 hex: 0x30c30c3

Shifted bef.:   0000 0010 0000 1000 0010 0000 1000 0010 hex: 0x2082082
Shifted:        0000 1000 0010 0000 1000 0010 0000 1000 hex: 0x8208208
NonShifted:     0000 0001 0000 0100 0001 0000 0100 0001 hex: 0x1041041
Bitmask is now: 0000 1001 0010 0100 1001 0010 0100 1001 hex: 0x9249249

x &= 0x3ff
x = (x | x << 16) & 0x30000ff   <<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249

So for a 10bit number and 2 interleaving bits (for 32 bit), you need to do the following!:

x &= 0x3ff
x = (x | x << 16) & 0x30000ff   #<<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249

And for a 21bit number and 2 interleaving bits (for 64bit), you need to do the following!:

x &= 0x1fffff
x = (x | x << 32) & 0x1f00000000ffff
x = (x | x << 16) & 0x1f0000ff0000ff
x = (x | x << 8) & 0x100f00f00f00f00f
x = (x | x << 4) & 0x10c30c30c30c30c3
x = (x | x << 2) & 0x1249249249249249

And for a 42bit number and 2 interleaving bits (for 128bit), you need to do the following ( in case you need it ;-)) :

x &= 0x3ffffffffff
x = (x | x << 64) & 0x3ff0000000000000000ffffffffL
x = (x | x << 32) & 0x3ff00000000ffff00000000ffffL
x = (x | x << 16) & 0x30000ff0000ff0000ff0000ff0000ffL
x = (x | x << 8) & 0x300f00f00f00f00f00f00f00f00f00fL
x = (x | x << 4) & 0x30c30c30c30c30c30c30c30c30c30c3L
x = (x | x << 2) & 0x9249249249249249249249249249249L

Python Script to produce and check the Interleaving Patterns!!!

def prettyBinString(x,d=32,steps=4,sep=".",emptyChar="0"):
    b = bin(x)[2:]
    zeros = d - len(b)


    if zeros <= 0: 
        zeros = 0
        k = steps - (len(b) % steps)
    else:
        k = steps - (d % steps)

    s = ""
    #print("zeros" , zeros)
    #print("k" , k)
    for i in range(zeros): 
        #print("k:",k)
        if(k%steps==0 and i!= 0):
            s+=sep
        s += emptyChar
        k+=1

    for i in range(len(b)):
        if( (k%steps==0 and i!=0 and zeros == 0) or  (k%steps==0 and zeros != 0) ):
            s+=sep
        s += b[i]
        k+=1
    return s    

def binStr(x): return prettyBinString(x,32,4," ","0")


def computeBitMaskPatternAndCode(numberOfBits, numberOfEmptyBits):
    bitDistances=[ i*numberOfEmptyBits for i in range(numberOfBits) ]
    print("Bit Distances: " + str(bitDistances))
    bitDistancesB = [bin(dist)[2:] for dist in  bitDistances]
    print("Bit Distances (binary): " + str(bitDistancesB))
    moveBits=[] #Liste mit allen Bits welche aufsteigend um 2, 4,8,16,32,64,128 stellen geschoben werden müssen

    maxLength = len(max(bitDistancesB, key=len))
    abort = False
    for i in range(maxLength):
        moveBits.append([])
        for idx,bits in enumerate(bitDistancesB):
            if not len(bits) - 1 < i:
                if(bits[len(bits)-i-1] == "1"):
                    moveBits[i].append(idx)

    for i in range(len(moveBits)):
        print("Shifting bits by " + str(2**i) + "\t for bits idx: " + str(moveBits[i]))

    bitPositions = range(numberOfBits);
    print("BitPositions: " + str(bitPositions))
    maskOld = (1 << numberOfBits) -1

    codeString = "x &= " + hex(maskOld) + "\n"
    for idx in xrange(len(moveBits)-1, -1, -1):
        if len(moveBits[idx]):


           shifted = 0
           for bitIdxToMove in moveBits[idx]:
                shifted |= 1<<bitPositions[bitIdxToMove];
                bitPositions[bitIdxToMove] += 2**idx; # keep track where the actual bit stands! might get moved several times

           # Get the non shifted part!     
           nonshifted = ~shifted & maskOld

           print("Shifted bef.:\t" + binStr(shifted) + " hex: " + hex(shifted))
           shifted = shifted << 2**idx
           print("Shifted:\t" + binStr(shifted)+ " hex: " + hex(shifted))

           print("NonShifted:\t" + binStr(nonshifted) + " hex: " + hex(nonshifted))
           maskNew =  shifted | nonshifted
           print("Bitmask is now:\t" + binStr(maskNew) + " hex: " + hex(maskNew) +"\n")
           #print("Code: " + "x = x | x << " +str(2**idx)+ " & " +hex(maskNew))

           codeString += "x = (x | x << " +str(2**idx)+ ") & " +hex(maskNew) + "\n"
           maskOld = maskNew
    return codeString


numberOfBits = 10;
numberOfEmptyBits = 2;
codeString = computeBitMaskPatternAndCode(numberOfBits,numberOfEmptyBits);
print(codeString)

def partitionBy2(x):
    exec(codeString)
    return x

def checkPartition(x):
    print("Check partition for: \t" + binStr(x))
    part = partitionBy2(x);
    print("Partition is : \t\t" + binStr(part))
    #make the pattern manualy
    partC = long(0);
    for bitIdx in range(numberOfBits):
        partC  = partC | (x & (1<<bitIdx)) << numberOfEmptyBits*bitIdx
    print("Partition check is :\t" + binStr(partC))
    if(partC == part):
        return True
    else:
        return False

checkError = False        
for i in range(20):
    x = random.getrandbits(numberOfBits);
    if(checkPartition(x) == False):
        checkError = True
        break
if not checkError:
    print("CHECK PARTITION SUCCESSFUL!!!!!!!!!!!!!!!!...")
else:
    print("checkPartition has ERROR!!!!")

I will add the decoding code as well in a while!

The simplest is probably a lookup table, if you've 4K free space:

static uint32_t t [ 1024 ] = { 0, 0x1, 0x8, ... };

uint32_t m ( int a, int b, int c )
{
    return t[a] | ( t[b] << 1 ) | ( t[c] << 2 );
}

The bit hack uses shifts and masks to spread the bits out, so each time it shifts the value and ors it, copying some of the bits into empty spaces, then masking out combinations so only the original bits remain.

for example:

x = 0xabcd;
  = 0000_0000_0000_0000_1010_1011_1100_1101    

x = (x | (x << S[3])) & B[3]; 

  = ( 0x00abcd00 | 0x0000abcd ) & 0xff00ff 
  = 0x00ab__cd & 0xff00ff 
  = 0x00ab00cd
  = 0000_0000_1010_1011_0000_0000_1100_1101
x = (x | (x << S[2])) & B[2]; 
  = ( 0x0ab00cd0 | 0x00ab00cd) & 0x0f0f0f0f 
  =   0x0a_b_c_d & 0x0f0f0f0f 
  = 0x0a0b0c0d 
  = 0000_1010_0000_1011_0000_1100_0000_1101
x = (x | (x << S[1])) & B[1]; 
  = ( 0000_1010_0000_1011_0000_1100_0000_1101 | 
      0010_1000_0010_1100_0011_0000_0011_0100 ) &
      0011_0011_0011_0011_0011_0011_0011_0011
  =   0010_0010_0010_0011_0011_0000_0011_0001
x = (x | (x << S[0])) & B[0]; 
  = ( 0010_0010_0010_0011_0011_0000_0011_0001 | 
      0100_0100_0100_0110_0110_0000_0110_0010 ) &
      0101_0101_0101_0101_0101_0101_0101_0101
  =   0100_0010_0100_0101_0101_0000_0101_0001

In each iteration, each block is split in two, the rightmost bit of the leftmost half of the block moved to its final position, and a mask applied so only the required bits remain.

Once you have spaced the inputs out, shifting them so the values of one fall into the zeros of the other is easy.

To extend that technique for more than two bits between values in the final result, you have to increase the shifts between where the bits end up. It gets a bit trickier, as the starting block size isn't a power of 2, so you could either split it down the middle, or on a power of 2 boundary.

So an evolution like this might work:

0000_0000_0000_0000_0000_0011_1111_1111    
0000_0011_0000_0000_0000_0000_1111_1111    
0000_0011_0000_0000_1111_0000_0000_1111    
0000_0011_0000_1100_0011_0000_1100_0011    
0000_1001_0010_0100_1001_0010_0100_1001    

// 0000_0000_0000_0000_0000_0011_1111_1111    
x = ( x | ( x << 16 ) ) & 0x030000ff;
// 0000_0011_0000_0000_0000_0000_1111_1111    
x = ( x | ( x << 8 ) ) & 0x0300f00f;
// 0000_0011_0000_0000_1111_0000_0000_1111    
x = ( x | ( x << 4 ) ) & 0x030c30c3;
// 0000_0011_0000_1100_0011_0000_1100_0011    
x = ( x | ( x << 2 ) ) & 0x09249249;
// 0000_1001_0010_0100_1001_0010_0100_1001

Perform the same transformation on the inputs, shift one by one and another by two, or them together and you're done.

Good timing, I just did this last month!

The key was to make two functions. One spreads bits out to every-third bit. Then we can combine three of them together (with a shift for the last two) to get the final Morton interleaved value.

This code interleaves starting at the HIGH bits (which is more logical for fixed point values.) If your application is only 10 bits per component, just shift each value left by 22 in order to make it start at the high bits.

/* Takes a value and "spreads" the HIGH bits to lower slots to seperate them.
   ie, bit 31 stays at bit 31, bit 30 goes to bit 28, bit 29 goes to bit 25, etc.
   Anything below bit 21 just disappears. Useful for interleaving values
   for Morton codes. */
inline unsigned long spread3(unsigned long x)
{
  x=(0xF0000000&x) | ((0x0F000000&x)>>8) | (x>>16); // spread top 3 nibbles
  x=(0xC00C00C0&x) | ((0x30030030&x)>>4);
  x=(0x82082082&x) | ((0x41041041&x)>>2);
  return x;
}

inline unsigned long morton(unsigned long x, unsigned long y, unsigned long z)
{
  return spread3(x) | (spread3(y)>>1) | (spread3(z)>>2);
}

The following code finds the Morton number of the three 10 bit input numbers. It uses the idee from your link and performs the bit spreading in the steps 5-5, 3-2-3-2, 2-1-1-1-2-1-1-1, and 1-1-1-1-1-1-1-1-1-1 because 10 is not a power of two.

......................9876543210
............98765..........43210
........987....56......432....10
......98..7..5..6....43..2..1..0
....9..8..7..5..6..4..3..2..1..0

Above you can see the location of every bit before the first and after every of the four steps.

public static Int32 GetMortonNumber(Int32 x, Int32 y, Int32 z)
{
    return SpreadBits(x, 0) | SpreadBits(y, 1) | SpreadBits(z, 2);
}

public static Int32 SpreadBits(Int32 x, Int32 offset)
{
    if ((x < 0) || (x > 1023))
    {
        throw new ArgumentOutOfRangeException();
    }

    if ((offset < 0) || (offset > 2))
    {
        throw new ArgumentOutOfRangeException();
    }

    x = (x | (x << 10)) & 0x000F801F;
    x = (x | (x <<  4)) & 0x00E181C3;
    x = (x | (x <<  2)) & 0x03248649;
    x = (x | (x <<  2)) & 0x09249249;

    return x << offset;
}

I took the above and modified it to combine 3 16-bit numbers into a 48- (really 64-) bit number. Perhaps it will save someone the small bit of thinking to get there.

#include <inttypes.h>
#include <assert.h>
uint64_t zorder3d(uint64_t x, uint64_t y, uint64_t z){
     static const uint64_t B[] = {0x00000000FF0000FF, 0x000000F00F00F00F,
                                    0x00000C30C30C30C3, 0X0000249249249249};           
     static const int S[] =  {16, 8, 4, 2}; 
     static const uint64_t MAXINPUT = 65536;

     assert( ( (x < MAXINPUT) ) && 
      ( (y < MAXINPUT) ) && 
      ( (z < MAXINPUT) )
     );

     x = (x | (x << S[0])) & B[0];
     x = (x | (x << S[1])) & B[1];
     x = (x | (x << S[2])) & B[2];
     x = (x | (x << S[3])) & B[3];

     y = (y | (y << S[0])) & B[0];
     y = (y | (y << S[1])) & B[1];
     y = (y | (y << S[2])) & B[2];
     y = (y | (y << S[3])) & B[3];

     z = (z | (z <<  S[0])) & B[0];
     z = (z | (z <<  S[1])) & B[1];
     z = (z | (z <<  S[2])) & B[2];
     z = (z | (z <<  S[3])) & B[3];

     return ( x | (y << 1) | (z << 2) );
    }

Following is the code snippet to generate Morton key of size 64 bits for 3-D point.

using namespace std;

unsigned long long spreadBits(unsigned long long x)
{
    x=(x|(x<<20))&0x000001FFC00003FF;
    x=(x|(x<<10))&0x0007E007C00F801F;
    x=(x|(x<<4))&0x00786070C0E181C3;
    x=(x|(x<<2))&0x0199219243248649;
    x=(x|(x<<2))&0x0649249249249249;
    x=(x|(x<<2))&0x1249249249249249;
    return x;
}

int main()
{
    unsigned long long x,y,z,con=1;
    con=con<<63;
    printf("%#llx\n",(spreadBits(x)|(spreadBits(y)<<1)|(spreadBits(z)<<2))|con);    
}

I had a similar problem today, but instead of 3 numbers, I have to combine an arbitrary number of numbers of any bit length. I employed my own sort of bit spreading and masking algorithm and applied it to C# BigIntegers. Here is the code I wrote. As a compilation step, it figures out the magic numbers and mask for the given number of dimensions and bit depth. Then you can reuse the object for multiple conversions.

/// <summary>
/// Convert an array of integers into a Morton code by interleaving the bits.
/// Create one Morton object for a given pair of Dimension and BitDepth and reuse if when encoding multiple 
/// Morton numbers.
/// </summary>  
public class Morton
{
    /// <summary>
    /// Number of bits to use to represent each number being interleaved.
    /// </summary>
    public int BitDepth { get; private set; }

    /// <summary>
    /// Count of separate numbers to interleave into a Morton number.
    /// </summary>
    public int Dimensions { get; private set; }

    /// <summary>
    /// The MagicNumbers spread the bits out to the right position.
    /// Each must must be applied and masked, because the bits would overlap if we only used one magic number.
    /// </summary>
    public BigInteger LargeMagicNumber { get; private set; }
    public BigInteger SmallMagicNumber { get; private set; }

    /// <summary>
    /// The mask removes extraneous bits that were spread into positions needed by the other dimensions.
    /// </summary>
    public BigInteger Mask { get; private set; }

    public Morton(int dimensions, int bitDepth)
    {
        BitDepth = bitDepth;
        Dimensions = dimensions;
        BigInteger magicNumberUnit = new BigInteger(1UL << (int)(Dimensions - 1));
        LargeMagicNumber = magicNumberUnit;
        BigInteger maskUnit = new BigInteger(1UL << (int)(Dimensions - 1));
        Mask = maskUnit;
        for (var i = 0; i < bitDepth - 1; i++)
        {
            LargeMagicNumber = (LargeMagicNumber << (Dimensions - 1)) | (i % 2 == 1 ? magicNumberUnit : BigInteger.Zero);
            Mask = (Mask << Dimensions) | maskUnit;       
        }
        SmallMagicNumber = (LargeMagicNumber >> BitDepth) << 1; // Need to trim off pesky ones place bit.
    }

    /// <summary>
    /// Interleave the bits from several integers into a single BigInteger.
    /// The high-order bit from the first number becomes the high-order bit of the Morton number.
    /// The high-order bit of the second number becomes the second highest-ordered bit in the Morton number.
    /// 
    /// How it works.
    /// 
    /// When you multupliy by the magic numbers you make multiple copies of the the number they are multplying, 
    /// each shifted by a different amount.
    /// As it turns out, the high order bit of the highest order copy of a number is N bits to the left of the 
    /// second bit of the second copy, and so forth. 
    /// This is because each copy is shifted one bit less than N times the copy number.
    /// After that, you apply the AND-mask to unset all bits that are not in position.
    /// 
    /// Two magic numbers are needed because since each copy is shifted one less than the bitDepth, consecutive
    /// copies would overlap and ruin the algorithm. Thus one magic number (LargeMagicNumber) handles copies 1, 3, 5, etc, while the 
    /// second (SmallMagicNumber) handles copies 2, 4, 6, etc.
    /// </summary>
    /// <param name="vector">Integers to combine.</param>
    /// <returns>A Morton number composed of Dimensions * BitDepth bits.</returns>
    public BigInteger Interleave(int[] vector)
    {
        if (vector == null || vector.Length != Dimensions)
            throw new ArgumentException("Interleave expects an array of length " + Dimensions, "vector");
        var morton = BigInteger.Zero;
        for (var i = 0; i < Dimensions; i++)
        {
            morton |= (((LargeMagicNumber * vector[i]) & Mask) | ((SmallMagicNumber * vector[i]) & Mask)) >> i;
        }
        return morton;
    }


    public override string ToString()
    {
        return "Morton(Dimension: " + Dimensions + ", BitDepth: " + BitDepth 
            + ", MagicNumbers: " + Convert.ToString((long)LargeMagicNumber, 2) + ", " + Convert.ToString((long)SmallMagicNumber, 2)
            + ", Mask: " + Convert.ToString((long)Mask, 2) + ")";
    }
}

Here's a generator I've made in Ruby for producing encoding methods of arbitrary length:

def morton_code_for(bits)
  method = ''

  limit_mask = (1 << (bits * 3)) - 1
  split = (2 ** ((Math.log(bits) / Math.log(2)).to_i + 1)).to_i
  level = 1

  puts "// Coding for 3 #{bits}-bit values"

  loop do
    shift = split
    split /= 2
    level *= 2

    mask = ([ '1' * split ] * level).join('0' * split * 2).to_i(2) & limit_mask

    expression = "v = (v | (v << %2d)) & 0x%016x;" % [ shift, mask ]

    method << expression

    puts "%s // 0b%064b" % [ expression, mask ]

    break if (split <= 1)
  end

  puts
  print "// Test of method results: "
  v = (1 << bits) - 1
  puts eval(method).to_s(2)
end

morton_code_for(21)

The output is suitably generic and can be adapted as required. Sample output:

// Coding for 3 21-bit values
v = (v | (v << 32)) & 0x7fff00000000ffff; // 0b0111111111111111000000000000000000000000000000001111111111111111
v = (v | (v << 16)) & 0x00ff0000ff0000ff; // 0b0000000011111111000000000000000011111111000000000000000011111111
v = (v | (v <<  8)) & 0x700f00f00f00f00f; // 0b0111000000001111000000001111000000001111000000001111000000001111
v = (v | (v <<  4)) & 0x30c30c30c30c30c3; // 0b0011000011000011000011000011000011000011000011000011000011000011
v = (v | (v <<  2)) & 0x1249249249249249; // 0b0001001001001001001001001001001001001001001001001001001001001001

// Test of method results: 1001001001001001001001001001001001001001001001001001001001001

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow