What are the two different means by which a floating-point number can be written?

  • Float vs Double: Difference You should know
  • Conclusion

Float and double are primitive data types used by programming languages to store floating-point real (decimal) numbers like 10.923455, 433.45554598 and so on. This article explains the differences between the float and double data types in detail.

Float vs Double: Difference You should know

In the computing world, numeric data can be represented in two ways: fixed-point and floating-point arithmetic. Fixed-point data is an integer with some sort of scaling factor. For example, 3.14 is stored as 314 with a scaling factor of 100, and 2.3456 can be rounded to 2.346 to achieve a fixed number of digits. This method compromises on the accuracy of the result and is not suitable in all situations.
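The scaling-factor idea can be sketched in Java as follows. This is only an illustration: the class name, the helper methods and the choice of two decimal places are assumptions made for the example, and the formatting helper assumes non-negative values.

```java
public class FixedPointDemo {
    // Assumption for this sketch: two decimal places, so the scaling factor is 100.
    static final long SCALE = 100;

    // Store a decimal value as a scaled integer, e.g. 3.14 -> 314.
    static long toFixed(double value) {
        return Math.round(value * SCALE);
    }

    // Render a non-negative scaled integer back as text, e.g. 314 -> "3.14".
    static String format(long fixed) {
        return (fixed / SCALE) + "." + String.format("%02d", fixed % SCALE);
    }

    public static void main(String[] args) {
        long pi = toFixed(3.14);        // stored as 314
        long rounded = toFixed(2.3456); // stored as 235: digits beyond the scale are lost
        System.out.println(format(pi));
        System.out.println(format(rounded));
        // Adding two fixed-point values is plain integer addition.
        System.out.println(format(pi + rounded));
    }
}
```

Note how 2.3456 loses its last two digits once it is forced into two decimal places, which is the accuracy compromise described above.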

It is thus easier and more accurate to use floating-point representation for high-precision calculations (as we will see in the next section).


If you want to read about floating-point numbers from an arithmetic point of view, you can read this Wikipedia article. For this blog, we have focussed on how programming languages use floating-point numbers to obtain precision and range.

Why use floating point data?

Mixed precision

In mixed precision, lower-precision values are used alongside higher-precision ones for complex calculations. You can consider it a trade-off between accuracy and memory efficiency. By combining float16 (half precision) and float32 (single precision), applications can improve performance and data transfer speed. Half precision was defined by Microsoft and Nvidia to reduce the storage space and complexity of floating-point calculations.

But, not always.
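Standard Java has no half-precision primitive, so the sketch below illustrates the same trade-off one level up: the data is kept as float, and only the running total is kept in double. The class and method names are made up for this example, and the exact totals printed depend on the input.

```java
public class MixedPrecisionDemo {
    // Sum n copies of a float value using a float accumulator.
    static float sumInFloat(float value, int n) {
        float total = 0f;
        for (int i = 0; i < n; i++) {
            total += value; // rounding error grows as total gets large
        }
        return total;
    }

    // Same float data, but the accumulator is a double.
    static double sumInDouble(float value, int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += value; // value is widened to double before the addition
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        System.out.println("float accumulator:  " + sumInFloat(0.1f, n));
        System.out.println("double accumulator: " + sumInDouble(0.1f, n));
    }
}
```

The data still occupies 4 bytes per value, while the double accumulator lands much closer to the expected total of one million.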

Arbitrary precision

We use floating-point and integer math for precise calculations where the result is limited only by the amount of memory available on the system. This type of calculation is called arbitrary-precision or infinite-precision calculation. One of the most important applications of arbitrary precision is public-key cryptography, where computations with numbers having hundreds of digits (exponential) are a common sight.

Another similar precision type is symbolic computation, where we use exact values of symbols (like PI) for complex computations.
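In Java, arbitrary-precision integer math is available through java.math.BigInteger. The sketch below is only an illustration (the class and method names are invented for it): it computes a power of two that overflows double, and a small modular exponentiation of the kind public-key cryptography performs on much larger numbers.

```java
import java.math.BigInteger;

public class ArbitraryPrecisionDemo {
    // 2^exponent computed exactly; the digit count is limited only by memory.
    static BigInteger powerOfTwo(int exponent) {
        return BigInteger.valueOf(2).pow(exponent);
    }

    // (base^exponent) mod modulus, without ever building the full power.
    static BigInteger modPow(long base, long exponent, long modulus) {
        return BigInteger.valueOf(base)
                .modPow(BigInteger.valueOf(exponent), BigInteger.valueOf(modulus));
    }

    public static void main(String[] args) {
        // 2^1024 is just beyond the range of double, so Math.pow overflows.
        System.out.println(Math.pow(2, 1024));                        // Infinity
        System.out.println(powerOfTwo(1024).toString().length());     // 309 digits
        System.out.println(modPow(4, 13, 497));                       // 445
    }
}
```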


Float and double

Double is more precise than float: it uses 64 bits, twice the 32 bits that float uses.

Because double is more precise, we prefer it over float for storing large numbers. For example, to store the annual salary of the CEO of a company, double will be a more accurate choice. In Java, trigonometric functions like sin, cos and tan and mathematical functions like sqrt return double values. However, double comes at a cost. Unless we need precision up to 15 or 16 significant digits, we can stick to float in most applications, as double is more expensive: it takes 8 bytes to store a variable, against 4 bytes for a float. In Java, we append ‘f’ or ‘F’ to a number to indicate that it is a float; without the suffix it is taken as a double.

A small table that gives the memory requirement and range of float and double is shown below –

Floating point type | Memory requirement | Range | Precision
Float | 4 bytes | ±3.40282347E+38F | 6-7 significant digits
Double | 8 bytes | ±1.79769313486231570E+308 | 15-16 significant digits
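These figures can be checked from Java itself using the constants on the Float and Double wrapper classes. The class and helper names below are invented for this sketch.

```java
public class FloatDoubleLimits {
    // Size and largest finite value of float, taken from the standard constants.
    static String describeFloat() {
        return Float.BYTES + " bytes, max " + Float.MAX_VALUE;
    }

    // Size and largest finite value of double.
    static String describeDouble() {
        return Double.BYTES + " bytes, max " + Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("float:  " + describeFloat());
        System.out.println("double: " + describeDouble());

        // The same literal stored in both types: float keeps fewer digits.
        float f = 1.23456789012345678f;
        double d = 1.23456789012345678;
        System.out.println(f);
        System.out.println(d);
    }
}
```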

Float and double behave in much the same way across programming languages that follow the IEEE 754 standard. For example, in Java, Float.parseFloat and Double.parseDouble throw a NumberFormatException when the input string is not a valid number. Because this is a runtime exception, the compiler will not detect it.

String sname = "DR";
float fname = Float.parseFloat(sname); // leads to NumberFormatException at runtime
System.out.println(fname/num1); // never reached; num1 is any float declared earlier

Dividing float and double by zero will give an output of ‘Infinity’ in Java.

double num2 = 344.55555555;
System.out.println(num2/0);

This won’t result in an error; the result is the special value Infinity. An invalid operation such as dividing zero by zero (0.0/0) yields NaN instead. Learn more about NaN here.
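A minimal sketch of how a program might deal with these cases is shown below. The helper parseOrNaN is hypothetical, not part of the Java library, and it assumes a non-null input string.

```java
public class SpecialValuesDemo {
    // Hypothetical helper: returns NaN instead of throwing when text is not a number.
    static double parseOrNaN(String text) {
        try {
            return Double.parseDouble(text);
        } catch (NumberFormatException e) {
            return Double.NaN;
        }
    }

    // Floating-point division never throws; it yields Infinity or NaN instead.
    static double divide(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        double infinite = divide(344.55555555, 0); // Infinity
        double undefined = divide(0.0, 0);         // NaN

        System.out.println(Double.isInfinite(infinite)); // true
        System.out.println(Double.isNaN(undefined));     // true
        System.out.println(undefined == undefined);      // false: NaN never equals itself
        System.out.println(parseOrNaN("DR"));            // NaN
    }
}
```

Because NaN is not equal to anything, including itself, Double.isNaN is the reliable way to test for it.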

Where will we use precision values?

Almost everywhere!

If you work with small quantities of data – like average marks, area of triangle etc… use double by default. But, if you deal with a lot of numbers where high precision is involved and any rounding off can change results – like trigonometry, width of a human hair, neural networks, spin of an electron, coordinates of a location and so on – it is important to know about the differences between float and double. While Java encourages you to use double, in languages like C you have the flexibility of using whichever you want.

A typical java declaration will look like –

float number1 = (float) 12.211111111111;

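Without that typecast (or an ‘f’ suffix on the number), the code will not compile in Java, because the literal is a double and assigning it to a float could lose precision. With the cast, printing the number gives only about 7 significant digits, which here means 6 digits after the decimal point.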

Consider a simple program of multiplying two numbers – 1.42222*234.56433 written in C.

This could be anything like atomic mass or gravitational force which has to have all its significant digits intact!

float num1 = 1.42222*234.56433;
double num2 = 1.42222*234.56433;
printf("%f", num1);
printf("%.10f", num2);

While num1 prints as 333.602081, num2 declared as double prints as 333.6020814126, which is precise up to 10 decimal places as requested in our printf statement. A float can also be printed with more digits and a double with fewer; it all depends on how we write the printf statement. A plain %f prints 6 digits after the decimal point, whereas a specifier such as %.10f prints the requested number of decimal places. Asking for more digits from a float does not add precision, since only about 7 significant digits are stored. To print the value in exponential notation, you should use “%e”.

In Java, as we have seen earlier, the assignment to a float compiles only if we typecast to (float) or add the ‘f’ suffix. Java treats all decimal literals as double by default.

float values;
double doubes;
values = (float) (1.42222*234.56433);
doubes = 1.42222*234.56433;
System.out.println(values);
System.out.println(doubes);

will yield 333.60208 and 333.6020814126 respectively.

Logical comparisons

We use the operators <, <=, >= and > to compare float and double values. With integers, we can also use != and ==, but with floating-point values these equality operators are unreliable, because rounding errors can make two values that should be equal differ slightly, or make two different values compare as equal.

When float is used, exact comparison is not possible as precision is only about 7 significant digits. Any difference beyond that is not caught.

float number1 = (float) 3.1434343;
float number2 = (float) 3.1434343333;
if(number1 == number2)
    System.out.println("equal");
else
    System.out.println("not equal");

double number3 = 3.1434343;
double number4 = 3.1434343333;
if(number3 == number4)
    System.out.println("equal");
else
    System.out.println("not equal");

What do you think the output will be?

You might have guessed it – the first one will give “equal”, while the second one will give “not equal”.
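A common way around this is to treat two values as equal when they differ by less than a chosen tolerance. The sketch below uses an absolute tolerance; the helper name and the value 1e-9 are assumptions for the example, and very large or very small numbers usually call for a relative tolerance instead.

```java
public class ApproxEqualsDemo {
    // True when a and b differ by no more than the given absolute tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                  // false: sum is 0.30000000000000004
        System.out.println(nearlyEqual(sum, 0.3, 1e-9)); // true
    }
}
```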

To avoid the typecasting every time we write the number in float, we can suffix the number with ‘f’. For example,

float number1 = 3.1434343f;

Big Decimal

.NET and Java also have a Decimal/BigDecimal class that offers higher precision than double. For calculations that must be exact, as in financial and banking applications, Decimal is used because it stores decimal fractions exactly and so further reduces rounding errors.
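The difference is easy to see with a value like 0.1, which has no exact binary representation. The following is a small sketch using java.math.BigDecimal; the class and helper names are invented for it, and the operands are passed as strings so that they are never rounded to double first.

```java
import java.math.BigDecimal;

public class BigDecimalDemo {
    // Add two decimal numbers given as text, without any binary rounding.
    static BigDecimal addExact(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);              // 0.30000000000000004
        System.out.println(addExact("0.1", "0.2")); // 0.3
    }
}
```

Constructing a BigDecimal from a double, as in new BigDecimal(0.1), would carry the binary rounding error along, which is why the string constructor is used here.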

long double

Some programming languages like C provide long double, which offers at least as much precision as double and, on many platforms, more. Check out the different data types of C.

Division with float and double

Same as in multiplication or addition, the division will give more precision digits in double. Consider this simple example –

float number1 = 3.1434343f;
double number2 = 3.1434343;
float divide = 22/7f;
// first let us print the result as double
double result1 = number1/divide;
/* the same result but now it is a float value, note the difference in significant digits */
float result3 = number1/divide;
// the double value
double result2 = number2/divide;
System.out.println(result1); // 1.0001837015151978
System.out.println(result3); // 1.0001837
System.out.println(result2); // 1.000183662587488

This is particularly useful when denominator is bigger than numerator and the result is in small fractions like –

float pie = 22/7f;
float pieby4096 = pie/4096;
double dpie = 22/7d;
double dpieby4096 = dpie/4096;
System.out.println("Float Pie is - " + pie);
System.out.println("Double pie is - " + dpie);
System.out.println("Float Pie divided by 4096 - " + pieby4096);
System.out.println("Double Pie divided by 4096 - " + dpieby4096);
double pieby4096usingfloatpie = pie/4096;
System.out.println("Float Pie divided by 4096 with result as double - " + pieby4096usingfloatpie);

See the results –

Float Pie is - 3.142857
Double pie is - 3.142857142857143
Float Pie divided by 4096 - 7.672991E-4
Double Pie divided by 4096 - 7.672991071428571E-4
Float Pie divided by 4096 with result as double - 7.672990905120969E-4

Pay attention to the last three results. Whether we divide the float value (pie) or the double value (dpie) makes a difference in the significant digits of the result. This is exactly the precision we are talking about!

Concatenation with String

In Java, it is possible to concatenate strings with double and float using + operator.

String str = "test";
float flo = 23.2f;
String concat = str + flo;
double dou = 3.45555555;
concat += dou;
System.out.println(concat); // result will be test23.23.45555555

Now that we know what float and double are, it will be good to create a table of differences for quick reference and recap.

Float | Double
Single precision value | Double precision value
Can store up to 7 significant digits | Stores up to 15 significant digits
Occupies 4 bytes of memory (32 bits IEEE 754) | Occupies 8 bytes of memory (64 bits IEEE 754)
If more than 7 digits are present, value is rounded off | 7-15 digits are stored as they are
With Java, one needs a suffix or a typecast to declare float: float fnum = 2.344f; or float fnum = (float) 2.344; | Double is the default decimal point type for Java: double dnum = 2.344;
If high precision is not required and the program only needs a huge array of decimal numbers to be stored, float is a cost-effective way of storing data and saves memory. | Double is costlier, occupies more space and is more effective when more precision is required. For example, currency conversion, financial reports and transactions, scientific calculations etc…

Conclusion

That covers the differences between double and float. While typecasting from float to double and from double to float is perfectly allowed and valid, it should be done carefully in the code. If you convert too often, precision might be lost and you will defeat the entire purpose of using double. During the initial stages of development, decide whether you want to use float or double and maintain the same choice throughout the application. It is also a good idea to know how particular data is stored in the database. If your application needs to be performant with large data sets, use float, because double takes twice the memory and could make your program slow. If your data needs more precision, use double.
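The precision loss from converting back and forth can be seen in a short sketch like the one below. The class and method names are invented for the example.

```java
public class ConversionLossDemo {
    // Narrow a double to float and widen it back to double.
    static double roundTrip(double value) {
        return (double) (float) value;
    }

    public static void main(String[] args) {
        double original = 0.1;
        System.out.println(original);            // 0.1
        System.out.println(roundTrip(original)); // 0.10000000149011612

        // Values that float can represent exactly survive the round trip.
        System.out.println(roundTrip(0.5) == 0.5); // true
    }
}
```

Widening the float back to double does not restore the digits that were dropped when it was narrowed.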
