문제

Here is function that i am writing on 64 bit linux machine.

void myfunc(unsigned char* arr) //array of 8 bytes is passed by reference
{
   unsigned long a = 0; //8 bytes
   unsigned char* LL = (unsigned char*) &a;

   LL[0] = arr[6];
   LL[1] = arr[3];
   LL[2] = arr[1];
   LL[3] = arr[7];
   LL[4] = arr[5];
   LL[5] = arr[4];
   LL[6] = arr[0];
   LL[7] = arr[2];
}

Now my questions are:

  1. Will variable 'a' be stored in a register so that It wont be accessed again and again from RAM or chache?
  2. Working on 64 bit architecture, should I assume that 'arr' array will be stored in a register as functions parameters are stored in a register in 64 bit arch?
  3. How efficient is Pointer type casting? my guess is that It should be inefficient at all?

Any help would be appriciated.

Regards

도움이 되었습니까?

해결책

  1. a cannot be stored in a register, as you have taken its address. (valdo correctly points out that a really smart compiler could optimize the array accesses into bit operations and leave a in a register, but I've never seen a compiler do that, and I'm not sure it would wind up being faster).
  2. arr (the pointer itself) is stored in a register (%edi, on amd64). The contents of the array are in memory.
  3. Pointer type casting by itself often generates no code at all. However, doing silly things with type casts can lead to very inefficient code, or even to code whose behavior is undefined.

It looks like you are trying to permute the bytes in an array and then shove them into a number, and the machine code your example generates is not terribly bad for that. David's suggestion to use shift and mask operations instead is good (this will also avoid problems if your code ever needs to run on a big-endian machine), and there are also the SSE vector permute instructions, but I have heard they're kind of a pain to use.

Incidentally, you should make the return type of your example function be unsigned long and put return a; at the very end; then you can use gcc -O2 -S and see exactly what you get from compilation. Without the change to return a, GCC will cheerfully optimize away the entire body of the function, since it has no externally visible side effects.

다른 팁

You might do better to use explicit shift and mask instructions to accomplish this, instead of using array indexing.

The array operations are going to make it harder for the compiler to use registers for this, because there typically are not instructions that do things like "load 8 bits from the 3rd byte of register A". (An optimizing compiler could figure out that it's possible to do this with shifts/masks, but I'm not sure how likely that is).

  1. The question about if the variable a will be stored in the register is a matter of optimization. Since there's no volatile modifier IMHO a smart compiler will do this.

  2. It's a question of the calling convention. If by convention a single pointer parameter is transferred in a register - so will be arr.

  3. Pointer type casting is not an operation that CPU interprets. There's no code generated for it. It just the information for the compiler about what do you mean.

(Actually sometimes casting does produce extra code, but this is related to multiple inheritance and polymorphism)

Depends on your optimization level. You can examine the assembly to answer your questions. With gcc, use the "-S" flag.

gcc -S -O0 -o /tmp/xx-O0.s /tmp/xx.c
gcc -S -O3 -o /tmp/xx-O3.s /tmp/xx.c

The generated assembly is completely different. (Be sure to make the return a; change suggested by Zack.)

See also this message for hints on how to generate a mixed c/assembly listing (which quickly becomes useless with optimization).

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top