Question

I'm still a beginner at programming and I always have more questions than our book or internet searches can answer (unless I missed something). So I apologize in advance if this was answered but I couldn't find it.

I understand that float has a smaller range than double making it less precise, and from what I understand, long double is even more precise(?). So my question is why would you want to use a variable that is less precise in the first place? Does it have something to do with different platforms, different OS versions, different compilers? Or are there specific moments in programming where its strategically more advantageous to use a float over a double/long double?

Thanks everyone!

Was it helpful?

Solution

In nearly all processors, "smaller" floating point numbers take the same or less clock-cycles in execution. Sometimes the difference isn't very big (or nothing), other times it can be literally twice the number of cycles for double vs. float.

Of course, memory foot-print, which is affecting cache-usage, will also be a factor. float takes half the size of double, and long double is bigger yet.

Edit: Another side-effect of smaller size is that the processor's SIMD extensions (3DNow!, SSE, AVX in x86, and similar extensions are available in several other architectures) may either only work with float, or can take twice as many float vs. double (and as far as I know, no SIMD instructions are available for long double in any processor). So this may improve performance if float is used vs. double, by processing twice as much data in one go. End edit.

So, assuming 6-7 digits of precision is good enough for what you need, and the range of +/-10+/-38 is sufficient, then float should be used. If you need either more digits in the number, or a bigger range, move to double, and if that's not good enough, use long double. But for most things, double should be perfectly adequate.

Obviously, the importance of using "the right size" becomes more important when you have either lots of calculations, or lots of data to work with - if there are 5 variables, and you just use each a couple of times in a program that does a million other things, who cares? If you are doing fluid dynamics calculations for how well a Formula 1 car is doing at 200 mph, then you probably have several tens of million datapoints to calculate, and every data point needs to be calculated dozens of times per second of the cars travel, then using up just a few clockcycles extra in each calculation will make the whole simulation take noticeably longer.

OTHER TIPS

There are two costs to using float, the obvious one of its limited range and precision, and, less obviously, the more difficult analysis those limitations impose.

It is often relatively easy to determine that double is sufficient, even in cases where it would take significant numerical analysis effort to show that float is sufficient. That saves development cost, and risk of incorrect results if the more difficult analysis is not done correctly.

Float's biggest advantage on many processors is its reduced memory footprint. That translates into more numbers per cache line, and more memory bandwidth in terms of numbers transferred per second. Any gain in compute performance is usually relatively slight - indeed, popular processors do all floating point arithmetic in one format that is wider than double.

It seems best to use double unless two conditions are met - there are enough numbers for their memory footprint to be a significant performance issue, and the developers can show that float is precise enough.

You might be interested in seeing the answer posted here Should I use double or float?

But it boils down to memory footprint vs the amount of precision you need for a given situation. In a physics engine, you might care more about precision, so it would make more sense to use a double or long double.

Bottom line: You should only use as much precision as you need for a given algorithm

The basic principle here would be don't use more than you need.

The first consideration is memory use, you probably realized that already, if you are making only one double no big deal, but what if you create a billion than you just used twice as much memory space as you had too.

Next is processor utilization, I believe on many processors if you use smaller data types it can do a form of threading where it does multiple operations at once.

So an extension to this part of the answer is SSE instructions basically this allows you to used packed data to do multiple floating point operations at once, which in an idealized case can double the speed of your program.

Lastly is readability, when someone is reading your code if you use a float they will immediately realize that you are not going over a certain number. IMO sometimes the right precision number will just flow better in the code.

A float uses less memory than a double, so if you don't need your number to be the size of a double, you might as well use a float since it will take up less memory.

Just like you wouldn't use a bus to drive yourself and a friend to the beach... you would be far better off going in a 2 seater car.

The same applies for a double over a long double... only reserve as much memory as you are going to need. Otherwise with more complex code you run the risk of using too much memory and having processes slow down or crash.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top