Question

Does anyone have experience replacing floating-point operations on ATmega (2560) based systems? There are a couple of very common situations that come up every day.

For example:

  • Are comparisons faster than divisions/multiplications?

  • Is a float-to-int type cast followed by an integer multiplication/division faster than pure floating-point operations without a cast?

I hope I don't have to write a benchmark just for myself.

Example one:

int iPartialRes = (int)fArg1 * (int)fArg2;
iPartialRes *= iFoo;

faster than?:

float fPartialRes = fArg1 * fArg2;
fPartialRes *= iFoo;

And example two:

iSign = fVal < 0 ? -1 : 1;

faster than?:

iSign = fVal / fabs(fVal);

Solution

These questions can be answered by thinking about the hardware for a moment.

  1. AVRs do not have an FPU, so all floating-point arithmetic is done in software. A floating-point multiplication therefore involves much more work than a simple integer multiplication.

  2. AVRs also have no hardware divider, not even for integers, so a simple branch is much faster than a software division. A floating-point division is the worst case of all :)

But please note that the two snippets in your first example produce very different results: the casts truncate each operand before the multiplication.
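To make the difference concrete, here is a minimal sketch that wraps the snippets from the question in functions. The function names are made up for illustration; the bodies follow the question's code.

```c
/* Example one, variant A: each float is truncated to int BEFORE multiplying. */
int product_cast_first(float fArg1, float fArg2, int iFoo)
{
    int iPartialRes = (int)fArg1 * (int)fArg2;
    iPartialRes *= iFoo;
    return iPartialRes;
}

/* Example one, variant B: all arithmetic stays in (software) floating point. */
float product_float(float fArg1, float fArg2, int iFoo)
{
    float fPartialRes = fArg1 * fArg2;
    fPartialRes *= iFoo;
    return fPartialRes;
}

/* Example two: the sign via a comparison. No division is involved,
 * and the result is also defined for fVal == 0 (where 0 / fabs(0) is NaN). */
int sign_by_branch(float fVal)
{
    return fVal < 0 ? -1 : 1;
}
```

With fArg1 = fArg2 = 2.5 and iFoo = 2, variant A computes 2 * 2 * 2 = 8, while variant B computes 2.5 * 2.5 * 2 = 12.5. The integer version is only a valid replacement when the fractional parts do not matter.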

OTHER TIPS

This is an old question, but I will add a more detailed answer for the curious.

Simply typecasting a float truncates it, i.e. 3.7 becomes 3; there is no rounding.
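A small sketch of the difference between truncating and rounding. The helper names are hypothetical, and the rounding variant (adding or subtracting 0.5 before the cast) is just one common way to do it, not something from the original answer.

```c
/* A plain cast drops the fractional part (truncation toward zero). */
int truncate_to_int(float value)
{
    return (int)value;
}

/* Round half away from zero by shifting the value by 0.5 before the cast.
 * Assumes the result fits into an int. */
int round_to_int(float value)
{
    return (int)(value < 0 ? value - 0.5f : value + 0.5f);
}
```

Note that the rounding variant costs one extra floating-point addition, which is itself done in software on an AVR.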

The fastest math on a 2560 is (+, -, *); division is the slowest because there is no hardware divide. For the best compromise between range and accuracy, multiply all operands by a pseudo decimal point (a constant scale factor) that suits the range of fractional numbers (1) your floats are expected to see, typecast the result to an unsigned long int, and track the sign separately as a bool. If your loop needs to be as fast as possible, avoid even integer division and multiply by a pseudo fraction instead. Convert back to a float only at the end, with myFloat (defined elsewhere) = float(myPseudoFloat) / myPseudoDecimalConstant;
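The scheme described above could be sketched roughly as follows. Everything here is an assumption for illustration: the scale of 1000, the type and function names, and the choice of a 16-bit fraction (Q16) as the "pseudo fraction".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scale factor: keeps three decimal places. */
#define PSEUDO_DECIMAL 1000UL

typedef struct {
    bool     negative;   /* sign tracked separately as a bool */
    uint32_t magnitude;  /* |value| * PSEUDO_DECIMAL, truncated */
} pseudo_float_t;

/* Convert once, outside the hot loop. */
pseudo_float_t to_pseudo(float value)
{
    pseudo_float_t p;
    p.negative  = value < 0;
    p.magnitude = (uint32_t)((p.negative ? -value : value) * PSEUDO_DECIMAL);
    return p;
}

/* Convert back only when the result is printed or stored. */
float from_pseudo(pseudo_float_t p)
{
    float value = (float)p.magnitude / PSEUDO_DECIMAL;
    return p.negative ? -value : value;
}

/* Multiply by the fraction frac_q16 / 65536 instead of dividing.
 * Example: 6554 is 65536 / 10 rounded up, so this approximates value / 10.
 * The result can be off by one for large inputs. */
uint32_t mul_frac_q16(uint16_t value, uint16_t frac_q16)
{
    return ((uint32_t)value * frac_q16) >> 16;
}
```

For instance, mul_frac_q16(12345, 6554) yields 1234 using only a multiplication and a shift. Whether this is actually faster than the compiler's own division routine on a given board is something a benchmark would have to confirm.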

Not sure if you came across the Show info page in the playground. It is basically a sketch that runs a benchmark on your (insert Arduino model here) and shows the actual compute times for various operations. The Mega 2560 will be very close to an ATmega328 as far as FLOPS go: up to 12.5K/s (80 µs per float divide). Typecasting would likely handicap the CPU more, as it introduces extra overhead, and it might even give erroneous results due to rounding errors and lack of precision.

(1) For example, 543.509291 * 1000000 = 543,509,291 moves the decimal point 6 places. Keep in mind that a 32-bit float on an 8-bit AVR only carries about 6 to 7 significant decimal digits. If you first multiply all values by the same constant, such as 1000 or 100000, the decimal point is preserved, and you convert back to a float by dividing by that constant when you are ready to print or store the value.

float f = 3.1428;
int x;

x = f * 10000;

x now contains 31428. (On an AVR, int is 16 bits wide, so scaled values above 32767 need a long.)

Licensed under: CC-BY-SA with attribution