C program - Why does this happen??

silverscorpion · 2009-04-15T18:47:57+00:00


Updated: Oct 14, 2024

Hi all,
Please explain this strange behavior to me. It's a simple program.
```
#include<stdio.h>
main()
{
float a=1.5;
if(a==1.5) printf("Equal");
else printf("Not Equal");
}
```
Output I got was "Equal".

Now,
```
#include<stdio.h>
main()
{
float a=1.2;
if(a==1.2) printf("Equal");
else printf("Not Equal");
}
```
Now, Output was "Not Equal".

Why does this happen? That is, with the float data type, when you compare values like 1, 1.5, 2, 2.5 etc., the result is correct.
If you compare 1.2, 1.7, 2.4 etc., the result is not correct. Why is this??

PS - Strangely, if you use double, output is correct for all cases. I use Turbo C 16 bit compiler. Any idea??

Replies


slashfear

Member • Apr 16, 2009
Hey Scorpion,

The program does not give the desired output because floating point arithmetic is different from real number arithmetic.

Computers use bits to store numbers. Each bit holds either 0 or 1, so every number is stored as a sum of powers of 2. This is easy to see for integers: the integer 7 is (111) in binary, that is 1*4 + 1*2 + 1*1. Fractions work the same way with negative powers of 2: 0.5 is stored as binary 0.1, that is 1*(1/2), and 0.75 is stored as binary 0.11, because its value is 1*(1/2) + 1*(1/4).

Floating point representation of real values is, in most cases, only an approximation of those values. The format stores a value as a binary mantissa and a binary exponent. That is, the value is:

binary-mantissa * (2 ^ binary-exponent)

Where * is multiplication and ^ is exponent (to the power of).

Alright, let's see our program:

a=1.2;
if(a==1.2) // output false

This looks really unacceptable, doesn't it?

Well, we must accept it. Floating point values carry tiny inaccuracies, and these show up when you use a float in an "if" or any other comparison statement.

With 1.5 you get the desired output because 1.5 is 1 + 1/2, which binary floating point stores exactly. Many other values, such as 1.2, cannot be represented exactly, so comparisons don't always do what you'd like them to do. In other words, if the computer actually multiplies 10.0 by 1.0/10.0, it might not get exactly 1.0 back. It's a similar case here: the value 1.2 is rounded when it is saved in the float a (a float holds only about 7 significant decimal digits, while a double holds about 15), whereas the constant 1.2 in the comparison is a double. The two rounded values differ, so you won't get the desired output.

Bottom line: Never use == to compare two floating point numbers.

So an alternative for comparing two floats, which tolerates these small rounding errors, is shown below:
```
/* Written by Arvind (slashfear)
language: C++ */

#include <cmath>
#include <iostream>
using namespace std;

// Treat two floats as equal when they differ by less than a small tolerance.
bool IsEqual(float x, float y, float tolerance = 0.00001f)
{
    return std::fabs(x - y) < tolerance;
}

int main()
{
    float a = 1.3f;
    if (IsEqual(a, 1.3f))
        cout << "Equal";
    else
        cout << "Not Equal";
    return 0;
}
```
So what we have done here is use a function that returns bool to compare the two floats ;-). Instead of testing for exact equality, it checks whether the two values differ by less than a small tolerance.

Note: When you are comparing two floats, use a smarter test like this one so that rounding errors do not make the output inconsistent. 😎

Hope this helps!!!!!
silverscorpion

Member • Apr 16, 2009

Ok. That sure was clear. What I dont get is, why is the output correct when I use double? I'm sure double also uses the same representation as float. Right??

And how to do the 'bool' thing in C? There's no data type of boolean in C, right? How do you go about it then??

Anyways, thanks for the clear and detailed answer. It was good!!

Saandeep Sreerambatla

Member • Apr 16, 2009

Good answer slash.
And regarding Silver's question: float is stored in 4 bytes and double in 8 bytes.
So maybe the value it calculates for 1.2 is correct when double is used.

PS: I am not sure, but I gave it a try!!

slashfear

Member • Apr 17, 2009
silverscorpion
Ok. That sure was clear. What I dont get is, why is the output correct when I use double? I'm sure double also uses the same representation as float. Right??

And how to do the 'bool' thing in C? There's no data type of boolean in C, right? How do you go about it then??

Anyways, thanks for the clear and detailed answer. It was good!!

Hi Silver,

OK, the reasoning behind the correct output when using a double starts with what SCARED said,
```
 float     4 bytes     32 bits
 double    8 bytes     64 bits
```
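A double uses more bytes, so it stores a closer approximation of 1.2 than a float does, although still not the exact value. In our case the comparison succeeds because the constant 1.2 is itself a double, so both sides of "==" hold the same rounded value. Even so, comparing doubles with "==" can still give unexpected results once the values come out of calculations.

And yes, double and float use the same type of representation; the difference is the number of bytes they use, as shown above, which gives double more bits for both the exponent and the fraction.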

OK, so to do the same thing in C without bool, we can do the following:
```
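/* Written by: Arvind (slashfear)
   Language : C
*/

#include<stdio.h>

int main(void)
{
    float a = 1.3;
    float precision = 0.00001;   /* tolerance for the comparison */

    /* a counts as equal to 1.3 when it lies within +/- precision of it */
    if (((a - precision) < 1.3) &&
        ((a + precision) > 1.3))
    {
        printf("Equal");
    }
    else
    {
        printf("Not Equal");
    }
    return 0;
}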
```
So what we have done here is use the precision value as a tolerance: a is treated as equal to 1.3 when it lies within 0.00001 of 1.3 on either side, so the tiny rounding error in the stored value no longer matters.

Alright, hope this helps and clears your doubt!!! 😁
pradeep_agrawal

Member • Apr 17, 2009
To understand the logic behind this, it's good to have an understanding of how variables are stored in memory. This is described below under "Number Representation". Readers who already know this, or who want to go straight to the root cause, can skip "Number Representation" and refer to "Root Cause Discussion".

Number Representation

The storage of char, int, or long in memory is simple and fairly straightforward. The value is simply converted into binary and stored in memory, e.g., if the char is 'A', its numeric equivalent is 65 and it is stored in memory as the binary of 65 (i.e., 01000001).

The conversion is a little different when storing signed char, signed int, or signed long. In that case the first bit represents the sign of the value (0 for positive and 1 for negative) and the rest of the bits represent the binary equivalent of the number (negative values are stored in 2's complement form).

The storage of float and double needs a better representation, as float and double contain both an integer part and a fractional part. If we simply split the allocated memory between the two parts, the range will be small. For example, if we give 2 of the 4 bytes allocated for float to the integer part and 2 bytes to the fractional part, and represent each in the same way as we represent char, int or long, then the value of a float will range from -32768.65535 to 32767.65535, which is too small.

Various representations were suggested for efficient storage of floats and doubles in memory. The most widely accepted and used one is the IEEE 754 floating point representation.

The IEEE 754 floating point representation defines four formats: single precision (32 bits), double precision (64 bits), single extended precision (>= 43 bits), and double extended precision (>= 79 bits). Of these, single precision and double precision are widely used.

To store a floating point number, the memory allocated for it is divided into three parts: sign, exponent, and fraction. For single precision, the sign is 1 bit, the exponent is 8 bits (with a bias of 127) and the fraction is 23 bits. For double precision, the sign is 1 bit, the exponent is 11 bits (with a bias of 1023) and the fraction is 52 bits.

When we represent floating point numbers in this format, there are many numbers whose fractional part can't be converted into an exact binary representation, and hence some precision is lost when they are stored. As double precision has more bits for the fraction, the loss of precision is smaller in double precision than in single precision.

For more details on this, refer to the Wikipedia article on IEEE floating point (https://en.wikipedia.org/wiki/IEEE_floating_point) and this PDF (https://www.validlab.com/goldberg/paper.pdf).

Root Cause Discussion

The float of the C language is 32 bits and stores data in single precision format, whereas the double of the C language is 64 bits and stores data in double precision format. Hence, a number represented as a double is more accurate than the same number represented as a float. And when we assign a double to a float there may be some loss of precision (numbers that can be represented in float without loss of precision, such as 1.5, do not suffer any precision loss in this conversion).

Consider the below code.
```
#include <stdio.h>

int main() {
  float f = 1.2;
  double d = 1.2;

  printf("Float:  %64.63e\n", f);
  printf("Double: %64.63e\n", d);

  return 0;
}
```
The output of the code is

Float: 1.200000047683715820312500000000000000000000000000000000000000000e+00
Double: 1.199999999999999955591079014993738383054733276367187500000000000e+00

As you can see above, though we assign the same value '1.2' to both the float and the double, the value that actually gets stored is different in each case, with some loss of precision in both.

Also, as double uses double precision, it should contain the more accurate value. This is reflected in the above example: the value represented by the double is closer to '1.2' than the value represented by the float.

Consider another code given below.
```
#include <stdio.h>

int main() {
  float f = 1.5;
  double d = 1.5;

  printf("Float:  %64.63e\n", f);
  printf("Double: %64.63e\n", d);

  return 0;
}
```
The output of the code is
Float: 1.500000000000000000000000000000000000000000000000000000000000000e+00
Double: 1.500000000000000000000000000000000000000000000000000000000000000e+00

So values that can be converted into floating point representation without loss of precision do not differ between float and double.

Consider one more example which will make things more clear.
```
#include <stdio.h>

int main() {
  float f1 = 1.2;
  float f2 = 1.2f;
  double d1 = 1.2;
  double d2 = 1.2f;

  printf("Float f1:  %64.63e\n", f1);
  printf("Float f2:  %64.63e\n", f2);
  printf("1.2f:      %64.63e\n", 1.2f);
  printf("Double d1: %64.63e\n", d1);
  printf("Double d2: %64.63e\n", d2);
  printf("1.2:       %64.63e\n", 1.2);

  return 0;
}
```
The output of the code is
Float f1: 1.200000047683715820312500000000000000000000000000000000000000000e+00
Float f2: 1.200000047683715820312500000000000000000000000000000000000000000e+00
1.2f: 1.200000047683715820312500000000000000000000000000000000000000000e+00
Double d1: 1.199999999999999955591079014993738383054733276367187500000000000e+00
Double d2: 1.200000047683715820312500000000000000000000000000000000000000000e+00
1.2: 1.199999999999999955591079014993738383054733276367187500000000000e+00

Here, when we assign '1.2' or '1.2f' to a float, or print '1.2f', the output is always

1.200000047683715820312500000000000000000000000000000000000000000e+00.

Whereas among the double cases (assigning '1.2' or '1.2f' to a double, or printing '1.2'), only the double assigned from '1.2f' gives a different output: it shows the float value.

The reason is that in the C language a constant written as '1.2' is a double by default. To get a float we need to append 'f' at the end, as in '1.2f', or use an explicit type conversion, as in '(float)1.2'.

- In "float f1 = 1.2;" we assign a double to a float. Some extra precision is lost in the implicit conversion, and the final value it holds is that of 1.2 as a float.
- In "float f2 = 1.2f;" we assign a float to a float, and hence the value it holds is that of 1.2 as a float.
- In "double d1 = 1.2;" we assign a double to a double, and hence the value it holds is that of 1.2 as a double.
- In "double d2 = 1.2f;" we assign a float to a double. The float has already lost precision, which we can't get back when assigning to a double, and hence the value it holds is that of 1.2 as a float.
- In "printf("1.2f: %64.63e\n", 1.2f);" we print the value of 1.2 as a float.
- In "printf("1.2: %64.63e\n", 1.2);" we print the value of 1.2 as a double.

I feel by now most of the things should be clear, so let's take the actual code.
```
#include<stdio.h>
main()
{
float a=1.2;
if(a==1.2) printf("Equal");
else printf("Not Equal");
}
```
Here we are assigning the double '1.2' to a float, and hence some extra precision is lost. Then we compare the float value of 1.2 with the double value of 1.2. The float is converted to double for the comparison, but the two values are different, and hence the output is "Not Equal".

Whereas when we have "a=1.5", the corresponding float and double values are the same, and thus the comparison returns "Equal".

Suggestions

When doing floating point arithmetic (or any arithmetic in general), take care of the types involved and the implicit type conversions that will happen. For example, if the above code is written as:
```
#include<stdio.h>
main()
{
float a=1.2f;
if(a==1.2f) printf("Equal");
else printf("Not Equal");
}
```
It will output "Equal", as we will be comparing the float of 1.2 with the float of 1.2 here.

Let me know if any item needs more clarification.

-Pradeep
silverscorpion

Member • Apr 19, 2009

wow, You rock!!
I think you guys must be called programming Gurus of CE.

Well, everything is very much clear, and a ton of thanks for both of you, pradeep and slashfear. Way to go!!

Kaustubh Katdare

Administrator • Apr 19, 2009

@SS: CEan- Pradeep Agrawal is one of the most talented software engineers I've ever met. We're blessed to have engineers like him on CE.

pradeep_agrawal

Member • Apr 19, 2009

Thanks for your appreciation, Big K.

-Pradeep

slashfear

Member • Apr 20, 2009

silverscorpion
wow, You rock!!
I think you guys must be called programming Gurus of CE.

Well, everything is very much clear, and a ton of thanks for both of you, pradeep and slashfear. Way to go!!

Hey Silver,

Thanks for your compliments buddy 😉
