In my intro to operating systems course, our task is to determine whether a system is big or little endian. I've found plenty of results on how to do it, and I've done my best to write my own version of the code. I suspect it's not the best way of doing it, but it seems to work:

#include <stdio.h>
int main(void) {
    int a = 0x1234;
    unsigned char *start = (unsigned char*) &a;
    int len = sizeof( int );

    if( start[0] > start[ len - 1 ] ) {
        //first byte is larger: the low-order byte 0x34 is stored first (Little Endian)
        printf("1");
    } else if( start[0] < start[ len - 1 ] ) {
        //first byte is smaller: the high-order byte is stored first (Big Endian)
        printf("0");
    } else {
        //unable to determine with set value
        printf( "Please try a different integer (non-zero). " );
    }
    return 0;
}

I've seen this line of code (or some version of it) in almost all the answers I've found:

unsigned char *start = (unsigned char*) &a;

What is happening here? I understand casting in general, but what happens if you cast a pointer to an int to a char pointer? I know:

unsigned int *p = &a;

assigns the memory address of a to p, and that you can affect the value of a by dereferencing p. But I'm totally lost about what's happening with the char and, more importantly, not sure why my code works.

Thanks for helping me with my first SO post. :)

Solution

When you cast between pointers to different object types, the result is generally not portable: it depends on the system and the compiler. There is no guarantee that the resulting pointer is correctly aligned for the new type (if it is not, the behavior is undefined), or that you may access the object through it.

But for the special case when you cast to a pointer to character, the standard actually guarantees that you get a pointer to the lowest addressed byte of the object (C11 6.3.2.3 §7).

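So the compiler will implement the code you have posted in such a way that you get a pointer to the lowest-addressed byte of the int. As we can tell from your code, that byte may contain different values depending on endianness: on a little-endian system it is the least significant byte, on a big-endian system it is the most significant one.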

If you have a 16-bit CPU (more precisely, a 16-bit int), the char pointer will point at memory containing 0x12 in case of big endian, or 0x34 in case of little endian.

For a 32-bit int, a would contain 0x00001234, so you would get 0x00 in case of big endian and 0x34 in case of little endian.

Other answers

If you dereference an int pointer, you get sizeof(int) bytes of data (4 bytes with gcc on most platforms; it depends on the compiler). If you want only one byte, cast that pointer to a character pointer and dereference it: you will get one byte of data. The cast tells the compiler to read that many bytes instead of the size of the original data type.

Values stored in memory are a set of 1s and 0s which by themselves do not mean anything. Data types are used for recognizing and interpreting what the values mean. So let's say that, at a particular memory location, the data stored is the following sequence of bits, continuing on: 01001010 ..... By itself this data is meaningless.

A pointer (other than a void pointer) carries 2 pieces of information: the starting position of a set of bytes, and (through its type) the way in which that set of bits is to be interpreted. For details, you can see: http://en.wikipedia.org/wiki/C_data_types and references therein.

So if you have

a char *c, a short int *i, and a float *f

which all look at the bits mentioned above, then c, i, and f hold the same address, but *c takes the first 8 bits and interprets them in a certain way. So you can do things like printf("The character is %c", *c). On the other hand, *i takes the first 16 bits (assuming a 16-bit short) and interprets them in a certain way. In this case, it is meaningful to say printf("The integer is %d", *i). Again, for *f, printf("The float is %f", *f) is meaningful.

The real differences come when you do math with these. For example,

c++ advances the pointer by 1 byte,

i++ advances it by sizeof(short int) bytes (typically 2),

and f++ advances it by sizeof(float) bytes (typically 4).

More importantly, for

(*c)++, (*i)++, and (*f)++, the algorithm used for doing the addition is totally different: the first two use integer arithmetic, while the last one uses floating-point arithmetic.

In your question, when you cast from one pointer type to another, you already know that the algorithm you are going to use for manipulating the bits at that location will be easier if you interpret those bits as an unsigned char rather than an unsigned int. The same operators (+, -, etc.) act differently depending on the data type they are applied to. If you have worked on physics problems in which a coordinate transformation made the solution very simple, that is the closest analogy to this operation: you are transforming one problem into another that is easier to solve.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow