Question

I'm trying to figure out the purpose of this piece of code, from the Tiled utility's map format documentation.

const int gid = data[i] |
                data[i + 1] << 8 |
                data[i + 2] << 16 |
                data[i + 3] << 24;

It looks like some bits are being OR-ed together and shifted, but I have no idea what the aim is in the context of using data from the Tiled program.


Solution 2

As you noted, the << operator shifts bits to the left by the given number of positions.

This block takes four consecutive (presumably one-byte) elements of the data[] array, starting at index i, and "encodes" those four values into one integer.

Example Time!

data[0] = 0x3A; // 0x3A =  58 = 0011 1010 in binary
data[1] = 0x48; // 0x48 =  72 = 0100 1000 in binary
data[2] = 0xD2; // 0xD2 = 210 = 1101 0010 in binary
data[3] = 0x08; // 0x08 =   8 = 0000 1000 in binary

int tmp0 = data[0];       // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010
int tmp1 = data[1] << 8;  // 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
int tmp2 = data[2] << 16; // 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
int tmp3 = data[3] << 24; // 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000

// "or-ing" these together will set each bit to 1 if any of the bits are 1
int gid = tmp0 | // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010
          tmp1 | // 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
          tmp2 | // 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
          tmp3;  // 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000

gid == 147998778;// 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010

Now, you've just encoded four one-byte values into a single four-byte integer.
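The same packing can be written as a small function. This is only a sketch: packLittleEndian is a made-up name, and it assumes the four inputs are unsigned bytes. Casting to an unsigned 32-bit type before shifting sidesteps two pitfalls of the plain int version: sign extension if data[] holds signed chars, and overflow when a byte of 0x80 or more is shifted left by 24.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Tiled): packs four bytes, least
// significant byte first, into one unsigned 32-bit value.
constexpr std::uint32_t packLittleEndian(std::uint8_t b0, std::uint8_t b1,
                                         std::uint8_t b2, std::uint8_t b3) {
    return  static_cast<std::uint32_t>(b0)          // bits  0..7
         | (static_cast<std::uint32_t>(b1) << 8)    // bits  8..15
         | (static_cast<std::uint32_t>(b2) << 16)   // bits 16..23
         | (static_cast<std::uint32_t>(b3) << 24);  // bits 24..31
}

// Same bytes as in the example above.
static_assert(packLittleEndian(0x3A, 0x48, 0xD2, 0x08) == 0x08D2483Au,
              "bytes 3A 48 D2 08 pack to 0x08D2483A");
```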

If you're (rightfully) wondering why anyone would go through all that effort when you could just store the four single-byte pieces of data directly in four bytes, check out this question:

int, short, byte performance in back-to-back for-loops


Bonus Example!

To get your encoded values back, we use the "and" operator along with the right-shift >>:

int gid = 147998778;    // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010

// "and-ing" will set each bit to 1 if BOTH bits are 1

int tmp0 = gid &        // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
           0x000000FF;  // 00 00 00 FF = 0000 0000 0000 0000 0000 0000 1111 1111
int data0 = tmp0;       // 00 00 00 3A = 0000 0000 0000 0000 0000 0000 0011 1010

int tmp1 = gid &        // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
           0x0000FF00;  // 00 00 FF 00 = 0000 0000 0000 0000 1111 1111 0000 0000
tmp1;      //value of tmp1 00 00 48 00 = 0000 0000 0000 0000 0100 1000 0000 0000
int data1 = tmp1 >> 8;  // 00 00 00 48 = 0000 0000 0000 0000 0000 0000 0100 1000

int tmp2 = gid &        // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
           0x00FF0000;  // 00 FF 00 00 = 0000 0000 1111 1111 0000 0000 0000 0000
tmp2;      //value of tmp2 00 D2 00 00 = 0000 0000 1101 0010 0000 0000 0000 0000
int data2 = tmp2 >> 16; // 00 00 00 D2 = 0000 0000 0000 0000 0000 0000 1101 0010

int tmp3 = gid &        // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
           0xFF000000;  // FF 00 00 00 = 1111 1111 0000 0000 0000 0000 0000 0000
tmp3;      //value of tmp3 08 00 00 00 = 0000 1000 0000 0000 0000 0000 0000 0000
int data3 = tmp3 >> 24; // 00 00 00 08 = 0000 0000 0000 0000 0000 0000 0000 1000

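The last "and-ing" for tmp3 isn't needed here, since the bits that "fall off" when shifting right are just lost and, for an unsigned or non-negative value like this one, the bits coming in are zero. (Right-shifting a negative signed int may bring in ones instead, so the shortcut is only safe when the top bit is clear or the type is unsigned.) So: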

gid;                   // 08 D2 48 3A = 0000 1000 1101 0010 0100 1000 0011 1010
int data3 = gid >> 24; // 00 00 00 08 = 0000 0000 0000 0000 0000 0000 0000 1000

but I wanted to provide a complete example.
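The four extraction steps differ only in the shift amount, so they can be folded into one helper. This is a sketch under assumptions: byteAt is a hypothetical name, and the value is treated as unsigned so the right shift always brings in zeros. It shifts first and masks afterwards, which gives the same result as masking first and then shifting.

```cpp
#include <cstdint>

// Hypothetical helper: returns byte number `index` of `value`,
// where index 0 is the least significant byte. `index` must be 0..3.
constexpr std::uint8_t byteAt(std::uint32_t value, unsigned index) {
    return static_cast<std::uint8_t>((value >> (8u * index)) & 0xFFu);
}

// 147998778 == 0x08D2483A, the gid from the example above.
static_assert(byteAt(147998778u, 0) == 0x3A, "data0");
static_assert(byteAt(147998778u, 1) == 0x48, "data1");
static_assert(byteAt(147998778u, 2) == 0xD2, "data2");
static_assert(byteAt(147998778u, 3) == 0x08, "data3");
```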

OTHER TIPS

Tiled stores its layer "Global Tile ID" (GID) data in an array of 32-bit integers, base64-encoded and (optionally) compressed in the XML file.

According to the documentation, these 32-bit integers are stored in little-endian format -- that is, the first byte of the integer contains the least significant byte of the number. As an analogy, in decimal, writing the number "1234" in little-endian would look like 4321 -- the 4 is the least significant digit in the number (representing a value of just 4), the 3 is the next-least-significant (representing 30), and so on. The only difference between this example and what Tiled is doing is that we're using decimal digits, while Tiled is using bytes, which are effectively digits that can each hold 256 different values instead of just 10.

If we think about the code in terms of decimal numbers, though, it's actually pretty easy to understand what it's doing. It's basically reconstructing the integer value from the digits by doing just this:

int digit[4] = { 4, 3, 2, 1 }; // our decimal digits in little-endian order
int gid = digit[0] +
          digit[1] * 10 +
          digit[2] * 100 +
          digit[3] * 1000;

It's just moving each digit into position to create the full integer value. (In binary, shifting left by 8 bits multiplies by 256, just as multiplying by 10 moves a decimal value into the next 'significant digit' slot; the shifts by 8, 16 and 24 bits play the role of the multiplications by 10, 100 and 1000 above.)
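Going back from decimal digits to bytes, the same reconstruction applied to a whole layer might look like the sketch below. It assumes the layer data has already been base64-decoded (and decompressed, if needed) into a byte buffer; decodeGids is a hypothetical name, not a Tiled API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts raw little-endian layer bytes into GIDs.
// Assumes `bytes` is already base64-decoded and decompressed.
std::vector<std::uint32_t> decodeGids(const std::vector<std::uint8_t>& bytes) {
    // Each GID occupies exactly four bytes.
    assert(bytes.size() % 4 == 0);

    std::vector<std::uint32_t> gids;
    gids.reserve(bytes.size() / 4);

    for (std::size_t i = 0; i + 3 < bytes.size(); i += 4) {
        // Least significant byte comes first, so it gets the smallest shift.
        const std::uint32_t gid =
              static_cast<std::uint32_t>(bytes[i])
            | (static_cast<std::uint32_t>(bytes[i + 1]) << 8)
            | (static_cast<std::uint32_t>(bytes[i + 2]) << 16)
            | (static_cast<std::uint32_t>(bytes[i + 3]) << 24);
        gids.push_back(gid);
    }
    return gids;
}
```

Because the result is built from shifts rather than by copying memory, it comes out the same whether the machine running the code is little-endian or big-endian.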

More information on big-endian and little-endian, and why the difference matters, can be found in On Holy Wars And A Plea For Peace, an important (and entertainingly written) document from 1980 in which Danny Cohen argued for the need to standardise on a single byte ordering for network protocols.

Spoiler: big-endian eventually won that fight, and it has been the standard "network byte order" for network transmissions, and for many file formats, for decades. Tiled's file format uses little-endian integers instead, which is why you need code like the code you quoted to reliably convert the little-endian integers in the data file into the computer's native format. If the data had been stored in big-endian format, most operating systems provide standard utility functions for converting back and forth between big-endian and native order, and you could simply have called ntohl() to get the native-format integer instead of writing and comprehending this sort of byte manipulation code manually.
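For comparison, here is a sketch of what the manual version would look like for big-endian data: only the shift amounts are reversed, so the first byte lands in the most significant position. readBigEndian32 is a hypothetical name used for illustration. On POSIX systems, ntohl() from <arpa/inet.h> performs the equivalent conversion on a 32-bit value that has already been read into memory.

```cpp
#include <cstdint>

// Hypothetical helper: assembles a 32-bit value from four bytes stored
// most significant byte first (big-endian / network byte order).
constexpr std::uint32_t readBigEndian32(std::uint8_t b0, std::uint8_t b1,
                                        std::uint8_t b2, std::uint8_t b3) {
    return (static_cast<std::uint32_t>(b0) << 24)
         | (static_cast<std::uint32_t>(b1) << 16)
         | (static_cast<std::uint32_t>(b2) << 8)
         |  static_cast<std::uint32_t>(b3);
}

// The same number as in the earlier example, with its bytes in big-endian order.
static_assert(readBigEndian32(0x08, 0xD2, 0x48, 0x3A) == 0x08D2483Au,
              "bytes 08 D2 48 3A read as 0x08D2483A");
```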

Licensed under: CC-BY-SA with attribution