Question

The number 4, represented as a 32-bit unsigned integer, would be

on a big endian machine: 00000000 00000000 00000000 00000100 (most significant byte first)

on a little endian machine: 00000100 00000000 00000000 00000000 (most significant byte last)

As an 8-bit unsigned integer it is represented as 00000100 on both machines.

Now, when casting an 8-bit uint to a 32-bit one, I always thought that on a big endian machine this means sticking 24 zero bits in front of the existing byte, and appending 24 zeros to the end if the machine is little endian. However, someone pointed out that in both cases zeros are prepended rather than appended. But wouldn't that mean that on a little endian machine 00000100 becomes the most significant byte, which would result in a very large number? Please explain where I am wrong.
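(For concreteness, one way to see the two layouts described above is to look at the bytes of a 32-bit value through an unsigned char pointer. The following is only a minimal sketch; the variable names are illustrative, and it assumes a C99 compiler.)

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t x = 4;
    const unsigned char *p = (const unsigned char *)&x;

    /* Print the bytes of x in memory order (lowest address first). */
    /* A big endian machine prints:    00 00 00 04                  */
    /* a little endian machine prints: 04 00 00 00                  */
    for (size_t i = 0; i < sizeof x; ++i)
        printf("%02x ", p[i]);
    printf("\n");

    return 0;
}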


Solution

Zeroes are prepended if you consider the mathematical value (which just happens to also be the big-endian representation).

Casts in C always strive to preserve the value, not the representation. That's why, for example, (int)1.25 results (see note below) in 1, as opposed to something that makes much less sense.

As discussed in the comments, the same holds for bit-shifts (and other bitwise operations, for that matter). 50 >> 1 == 25, regardless of endianness.

(* Note: usually; the exact result depends on the rounding mode for float-to-integer conversion.)

In short: operators in C operate on the mathematical value, regardless of representation. One exception is when you take a pointer to the value and cast it (as in (char*)&foo), since that essentially gives a different "view" of the same bytes.
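(A small sketch of the distinction made above; the names narrow, wide and foo are mine, and the code assumes a C99 compiler.)

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t  narrow = 4;
    uint32_t wide   = (uint32_t)narrow;   /* value-preserving cast: wide == 4 */

    printf("%u\n", (unsigned)wide);       /* prints 4 on both BE and LE machines  */
    printf("%d\n", 50 >> 1);              /* prints 25, regardless of endianness  */

    /* The exception: reinterpreting the object through a char pointer   */
    /* exposes the in-memory layout, so this byte differs per machine.   */
    uint32_t foo = 0x12345678;
    printf("%02x\n", *(unsigned char *)&foo);   /* 12 on BE, 78 on LE */

    return 0;
}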

Other tips

Not sure if it answers your question, but will give it a try:

If you take a char variable and cast it to an int variable, then you get the exact same result on both architectures:

char c = 0x12;
int  i = (int)c;     // i == 0x12 on both architectures

If you take an int variable and cast it to a char variable, then you get the exact same result (possibly truncated) on both architectures:

int  i = 0x12345678;
char c = (char)i;    // c == 0x78 on both architectures

But if you take an int variable and read it using a char* pointer, then you get a different result on each architecture:

int  i = 0x12345678;
char c = *(char*)&i; // c == 0x12 on BE architecture and 0x78 on LE architecture

The examples above assume that sizeof(int) == 4 (which may differ on some platforms).

Loosely speaking, "endianness" describes how the processor lays out data in memory. Once a value has been brought into the CPU, every processor sees it the same way.

For example:

int a = 0x01020304;

Irrespective of whether it is a little or big endian machine, it would always have 04 as the least significant byte and 01 as the most significant byte when the value is held in a register.

The question arises when this variable/data has to be stored in memory, which is byte addressable: should 01 (the most significant byte) go into the lowest memory address (big endian) or the highest memory address (little endian)?
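(A minimal sketch of that memory question, reusing the variable a from the example above; it assumes a C99 compiler and sizeof(int) == 4.)

#include <stdio.h>

int main(void)
{
    int a = 0x01020304;
    const unsigned char *p = (const unsigned char *)&a;

    /* Walk the object from its lowest memory address to its highest and */
    /* show which byte of the value is stored at each position.          */
    for (size_t i = 0; i < sizeof a; ++i)
        printf("offset %zu: %02x\n", i, p[i]);

    /* Big endian:    01 02 03 04  (MSB at the lowest address) */
    /* Little endian: 04 03 02 01  (LSB at the lowest address) */
    return 0;
}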

In your particular example, what you have shown is the representation as the processor sees it, with least/most significant bytes.


So technically speaking, both little and big endian machines would have:

00000000 00000000 00000000 00000100

in its 32-bit-wide register, assuming of course that what you have in memory is a 32-bit-wide integer representing 4. How this 4 is stored in, and retrieved from, memory is what endianness is all about.
