Question

In the mspgcc (GCC for MSP430 microcontrollers) manual, the authors wrote:

Use int instead of char or unsigned char if you want a small integer within a function. The code produced will be more efficient, and in most cases storage isn't actually wasted.

Why is int more efficient?

UPD. And why is (u)int_fast8_t in mspgcc defined as (unsigned) char rather than (unsigned) int? As I understand it, (u)int_fast*_t should be defined as the most efficient type of sufficient size.


Solution 3

In general, and not necessarily specific to this processor, it has to do with sign extension and masking, which require additional instructions to faithfully implement the C source code. A signed 8-bit value on a 16-, 32-, or 64-bit processor MAY involve additional instructions to sign extend. An 8-bit add on a 32-bit processor might involve extra instructions to AND the result with 0xFF, and so on.
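As a rough illustration of what "sign extension" means at the C level, here is a minimal sketch. The function names widen_signed and widen_unsigned are made up for this example, and int8_t/uint8_t are assumed to be available from <stdint.h>, which is the case on common targets.

```c
#include <stdint.h>

/* Widening a signed 8-bit value has to copy the sign bit into the
   upper bits of the wider register (sign extension). */
int widen_signed(int8_t x)
{
    return x;
}

/* Widening an unsigned 8-bit value only has to clear the upper bits
   (zero extension). */
int widen_unsigned(uint8_t x)
{
    return x;
}
```

Called with -1, widen_signed has to return -1 (all bits set) rather than 255, and whether that costs an extra instruction depends on the processor and on how the value arrives in the register.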

You should do some simple experiments. It took a few iterations, but I quickly hit something that showed a difference.

unsigned int fun ( unsigned int a, unsigned int b )
{
    return(a+b)<<3;
}

unsigned char bfun ( unsigned char a, unsigned char b )
{
    return(a+b)<<3;
}


int sfun ( int a, int b )
{
    return(a+b)<<3;
}

char sbfun ( char a, char b )
{
    return(a+b)<<3;
}

produces

00000000 <fun>:
   0:   0f 5e           add r14,    r15 
   2:   0f 5f           rla r15     
   4:   0f 5f           rla r15     
   6:   0f 5f           rla r15     
   8:   30 41           ret         

0000000a <bfun>:
   a:   4f 5e           add.b   r14,    r15 
   c:   4f 5f           rla.b   r15     
   e:   4f 5f           rla.b   r15     
  10:   4f 5f           rla.b   r15     
  12:   30 41           ret         

00000014 <sfun>:
  14:   0f 5e           add r14,    r15 
  16:   0f 5f           rla r15     
  18:   0f 5f           rla r15     
  1a:   0f 5f           rla r15     
  1c:   30 41           ret         

0000001e <sbfun>:
  1e:   8f 11           sxt r15     
  20:   8e 11           sxt r14     
  22:   0f 5e           add r14,    r15 
  24:   0f 5f           rla r15     
  26:   0f 5f           rla r15     
  28:   0f 5f           rla r15     
  2a:   4f 4f           mov.b   r15,    r15 
  2c:   30 41           ret         

The MSP430 has word and byte versions of its instructions, so a simple add or subtract doesn't have to do the clipping or sign extension that you would expect when using variables smaller than the register size. As programmers we might know that we were only going to feed sbfun some very small numbers, but the compiler doesn't, and it has to faithfully implement our code as written, so sbfun takes more instructions than sfun. It is not hard to do these experiments with different compilers and processors to see this in action; the only trick is to write code that the processor cannot handle with a single simple instruction.

Another example:

unsigned int fun ( unsigned int a, unsigned int b )
{
    return(a+b)>>1;
}

unsigned char bfun ( unsigned char a, unsigned char b )
{
    return(a+b)>>1;
}

produces

00000000 <fun>:
   0:   0f 5e           add r14,    r15 
   2:   12 c3           clrc            
   4:   0f 10           rrc r15     
   6:   30 41           ret         

00000008 <bfun>:
   8:   4f 4f           mov.b   r15,    r15 
   a:   4e 4e           mov.b   r14,    r14 
   c:   0f 5e           add r14,    r15 
   e:   0f 11           rra r15     
  10:   4f 4f           mov.b   r15,    r15 
  12:   30 41           ret         
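The reason bfun cannot simply use add.b here is C's integer promotion: both operands are converted to int before the add, so the carry out of bit 7 is still part of the sum when it is shifted right. A small sketch of the difference follows; bfun_truncating is a hypothetical variant written for this comparison, and 8-bit chars are assumed.

```c
/* Same function as bfun above: a and b are promoted to int, so the
   ninth (carry) bit of the sum survives and is shifted into bit 7. */
unsigned char bfun(unsigned char a, unsigned char b)
{
    return (a + b) >> 1;
}

/* Hypothetical variant that truncates the sum to 8 bits before the
   shift, which is what a plain byte add followed by a byte shift
   would compute. */
unsigned char bfun_truncating(unsigned char a, unsigned char b)
{
    unsigned char sum = (unsigned char)(a + b);
    return sum >> 1;
}
```

With a = 200 and b = 100, bfun returns 150 (300 >> 1), while bfun_truncating returns 22 (44 >> 1). For small inputs such as 10 and 20 both return 15, which is why the compiler cannot assume the cheaper form is safe.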

Other suggestions

A general rule of thumb is that CPUs are fastest at operating on integers of their native word size.

This is of course entirely architecture dependent; see the answers to this similar question for more clarification on that point.

TI has published an Application Note on the topic for their Tiva-C (formerly Stellaris) MCUs.

In the "Introduction" section, a table lists factors affecting performance and size. The factor labeled "Variable size" states that "using variables smaller than optimal may mean extra instructions to sign or unsign extend...".

Also, under the section "Size of Variables", it states:

"When the local variables are smaller than the register size, then extra code is usually needed. On a Stellaris part, this means that local variables of size byte and halfword (char and short int respectively) require extra code. Since code ported from an 8-bit or 16-bit microcontroller may have had locals converted to smaller sizes (to avoid the too large problem), this means that such code will run slower and take more code space than is needed."

Please see: http://www.ti.com/lit/an/spma014/spma014.pdf

The following is handled by the compiler, but is still relevant to the issue at hand:

The MSP430 is a 16-bit microprocessor. A char is only 8 bits wide, so chars require packing or padding to keep word-sized data aligned. For instance, 3 chars occupy an odd number of bytes, so whatever follows them is not word aligned. Using a 16-bit integer instead avoids this, since it is always aligned.

When you use variable sizes that are multiples of 16 bits (e.g. 16 and 32), you can also use memory more efficiently, because no padding is needed to align the data.
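The padding effect is easiest to see in a struct. The sketch below uses two made-up layouts, mixed and grouped; the exact sizes depend on the compiler and ABI, so treat the numbers in the note after it as an expectation rather than a guarantee.

```c
#include <stddef.h>

/* chars interleaved with an int: the compiler typically inserts
   padding after each char so that value stays aligned. */
struct mixed {
    char flag;
    int  value;
    char tag;
};

/* Same members with the int first: the two chars can share the
   space that would otherwise be padding. */
struct grouped {
    int  value;
    char flag;
    char tag;
};
```

On a target with a 2-byte int aligned to 2 bytes, which is what one would expect on the MSP430, sizeof(struct mixed) would come out as 6 and sizeof(struct grouped) as 4. Comparing sizeof and offsetof on your own toolchain is the way to confirm.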

int matches the native size of the processor in question (16 bits), so when you ask for a store to an unsigned char variable, the compiler may have to emit extra code to ensure that the value is between 0 and 255.
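A minimal sketch of that range requirement, with made-up function names and assuming 8-bit chars: the same source expression gives different results depending on the variable's type, so the compiler has to make sure the byte version wraps.

```c
/* The sum is computed as an int and then converted back to
   unsigned char, which reduces it modulo 256. */
unsigned char bump_byte(unsigned char x)
{
    x = x + 10;
    return x;
}

/* With unsigned int no reduction to 8 bits is needed. */
unsigned int bump_word(unsigned int x)
{
    x = x + 10;
    return x;
}
```

bump_byte(250) returns 4, while bump_word(250) returns 260. On the MSP430 a byte instruction may provide the wrap for free in simple cases like this one, but as the shift examples show, that is not always possible.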

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow