C programming and error_code variable efficiency

https://stackoverflow.com/questions/1832919

11-09-2019

Question

Most code I have read uses an int for standard error handling (return values from functions and such). I am wondering whether there is any benefit to using a uint8_t instead. Will a compiler (read: most C compilers on most architectures) produce instructions that use the immediate addressing mode, i.e., embed the 1-byte integer into the instruction? The key instruction I'm thinking about is the compare that follows the return of a function whose return type is uint8_t.

I could be thinking about things incorrectly, as introducing a 1-byte type may just cause alignment issues. There is probably a perfectly sane reason why compilers like to pack things in 4 bytes, and this is possibly why everyone just uses ints. And since this is a stack-related issue rather than a heap one, there is no real overhead.

Doing the right thing is what I'm thinking about. But let's say, for the sake of argument, that this is a popular cheap microprocessor for an intelligent watch, configured with 1k of memory, whose instruction set does have different addressing modes :D

Another question to slightly specialize the discussion (x86) would be: is the literal in

uint32_t x=func(); x==1;

and

uint8_t x=func(); x==1;

the same type? Or will the compiler generate an 8-bit literal in the second case? If so, it may use it to generate a compare instruction that has the literal as an immediate value and the returned int as a register reference. See CMP instruction types.

Another reference for the x86 instruction set.

Solution

Here's what one particular compiler will do for the following code:

extern int foo(void);
extern void do_something(void);
extern void do_something_else(void);

void bar(void)
{
        if (foo() == 31) { /* error code 31 */
                do_something();
        } else {
                do_something_else();
        }
}

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 08                sub    $0x8,%esp
   6:   e8 fc ff ff ff          call   7 <bar+0x7>
   b:   83 f8 1f                cmp    $0x1f,%eax
   e:   74 08                   je     18 <bar+0x18>
  10:   c9                      leave
  11:   e9 fc ff ff ff          jmp    12 <bar+0x12>
  16:   89 f6                   mov    %esi,%esi
  18:   c9                      leave
  19:   e9 fc ff ff ff          jmp    1a <bar+0x1a>

That is a 3-byte instruction for the cmp. If foo() returns a char, we get a 2-byte instruction instead:

   b:   3c 1f                   cmp    $0x1f,%al

If you're looking for efficiency, though, don't assume that comparing against %al is faster than comparing against %eax.
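
As for whether the literal has the same type in both snippets from the question: at the C level it does. The literal 1 is an int either way, and a uint8_t operand is promoted to int before the comparison. The following is a minimal sketch, assuming a C11 compiler (for _Generic and static_assert); the macro and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 1 if the expression has type int, 0 otherwise (C11 _Generic). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* An unsuffixed decimal literal that fits in an int has type int. */
static_assert(IS_INT(1), "the literal 1 has type int");

/* Hypothetical helper: a uint8_t operand is promoted to int in arithmetic. */
int promoted_u8_is_int(void)
{
    uint8_t x = 1;
    return IS_INT(x + 0);
}

/* Hypothetical helper: the result of == is an int, whatever the operands. */
int comparison_is_int(void)
{
    uint8_t x = 1;
    return IS_INT(x == 1);
}
```

The compiler is still free to emit a narrower compare, such as the cmp against %al above, because the observable result is the same as if the promoted comparison had been performed.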

OTHER TIPS

There may be very small speed differences between the different integral types on a particular architecture. But you can't rely on them: they may change if you move to different hardware, and the code may even run slower after an upgrade to newer hardware.

And if you are talking about x86, as in the example you give, you are making a false assumption: that an immediate needs to be of type uint8_t.

Actually, 8-bit immediates embedded in the instruction are of type int8_t (they are sign-extended) and can be used with bytes, words, dwords and qwords; in C notation: char, short, int and long long.

So on this architecture there would be no benefit at all, in either code size or execution speed.
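
To make the sign-extension point concrete, here is a small sketch in C of which constants fit a sign-extended 8-bit immediate. The helper names fits_imm8, sign_extend_imm8 and imm8_demo are hypothetical; they only model the encoding rule and are not part of any compiler or library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can v be encoded as a sign-extended 8-bit immediate? */
bool fits_imm8(int32_t v)
{
    return v >= INT8_MIN && v <= INT8_MAX;
}

/* Model of what happens to the immediate byte before a 32-bit compare:
   it is widened to 32 bits by copying its sign bit. */
int32_t sign_extend_imm8(uint8_t encoded)
{
    return (encoded & 0x80) ? (int32_t)encoded - 256 : (int32_t)encoded;
}

/* Small self-check using the error code 31 from the question's example. */
void imm8_demo(void)
{
    assert(fits_imm8(31));                 /* 0x1f fits in one byte */
    assert(!fits_imm8(200));               /* 200 needs a wider immediate */
    assert(sign_extend_imm8(0x1f) == 31);
    assert(sign_extend_imm8(0xff) == -1);  /* 0xff encodes -1, not 255 */
}
```

Under this model an error code in the range -128..127 can use the short immediate form regardless of the operand width, while a code such as 200 cannot, even though it would fit in a uint8_t.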

You should use int or unsigned int types for your calculations, and use smaller types only in compounds (structs and arrays). The reason is that int is normally defined to be the "most natural" integral type for the processor; every other integer type may need extra processing to work correctly. In our project, compiled with gcc on Solaris for SPARC, accesses to 8- and 16-bit variables added an instruction to the code. When loading a smaller type from memory, the compiler had to make sure the upper part of the register was properly set (sign extension for a signed type, zeros for an unsigned one). This made the code longer and increased register pressure, which hurt the other optimisations.

I've got a concrete example:

I declared two members of a struct as uint8_t and got the following SPARC assembly:

    if(p->BQ > p->AQ)

was translated to

ldub    [%l1+165], %o5  ! <variable>.BQ,
ldub    [%l1+166], %g5  ! <variable>.AQ,
and     %o5, 0xff, %g4  ! <variable>.BQ, <variable>.BQ
and     %g5, 0xff, %l0  ! <variable>.AQ, <variable>.AQ
cmp     %g4, %l0    ! <variable>.BQ, <variable>.AQ
bleu,a,pt %icc, .LL586  !

And here is what I got when I declared the two variables as uint_t:

lduw    [%l1+168], %g1  ! <variable>.BQ,
lduw    [%l1+172], %g4  ! <variable>.AQ,
cmp     %g1, %g4    ! <variable>.BQ, <variable>.AQ
bleu,a,pt %icc, .LL587  !

That is two fewer arithmetic operations and two more registers available for other work.
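
One way to apply this advice in C is to keep narrow types inside the struct, where they save memory, and widen the values into int-sized locals before computing. This is a sketch under assumptions: the struct and function names are hypothetical, only the member names AQ and BQ come from the example, and whether it actually removes the extra instructions depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with narrow members: small when stored in bulk. */
struct sample_narrow {
    uint8_t AQ;
    uint8_t BQ;
};

/* The same record with int-sized members: larger, but "natural" width. */
struct sample_wide {
    unsigned int AQ;
    unsigned int BQ;
};

/* The narrow layout is never larger than the wide one. */
static_assert(sizeof(struct sample_narrow) <= sizeof(struct sample_wide),
              "narrow members should not cost more storage");

/* Load each narrow member once into an unsigned int local, then compare. */
int bq_exceeds_aq(const struct sample_narrow *p)
{
    unsigned int aq = p->AQ;
    unsigned int bq = p->BQ;
    return bq > aq;
}

/* Bytes saved per element by choosing the narrow layout. */
size_t bytes_saved_per_element(void)
{
    return sizeof(struct sample_wide) - sizeof(struct sample_narrow);
}
```

The trade-off is the one described above: the narrow struct costs less memory per element, while the calculation itself is done in the processor's natural width.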

Processors typically like to work with their natural register size, which in C is 'int'.

Although there are exceptions, you're overthinking a problem that does not exist.

Licensed under: CC-BY-SA with attribution
