I have two integer variables:
int i1 = 0xdeadbeef and int i2 = 0xffffbeef
(11011110101011011011111011101111 or 3735928559,
and 11111111111111111011111011101111 or 4294950639,
respectively).
(int) (float) i1 == i1 evaluates as false, yet (int) (float) i2 == i2 evaluates as true.
Why is this? In this system, both ints and floats are stored in 4 bytes.
This is because float has far less precision than int: it can't store all possible int values without some of them suffering damage. Sometimes this damage rounds your value; sometimes the rounding leaves the value unchanged.
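A quick way to see the damage is to round-trip a handful of values; the sample values below are my own picks for illustration (2²⁴ = 16777216 is where consecutive ints stop being exactly representable in a float):

#include <stdio.h>

int main(void) {
    /* Arbitrary sample values around the 24-bit precision limit. */
    int samples[] = { 42, 16777216, 16777217, 559038737 };
    for (int i = 0; i < 4; i++) {
        int v = samples[i];
        printf("%10d -> %10d %s\n", v, (int) (float) v,
               (int) (float) v == v ? "(intact)" : "(rounded)");
    }
}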
A 32-bit float can only store 24 significand bits of numerical data (23 stored explicitly, plus one implicit leading bit). The other bits are reserved for the sign and the exponent, which also encodes NaN and infinity, and that eats into the remaining storage space.
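You can see where those bits live by copying a float's representation into an integer and masking out the fields; a minimal sketch, assuming IEEE-754 binary32 (1 sign bit, 8 exponent bits, 23 stored fraction bits plus the implicit leading 1):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = 16777217.0f;  /* 2^24 + 1: needs 25 bits, gets rounded */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    printf("sign:     %u\n", bits >> 31);            /* 1 bit   */
    printf("exponent: %u\n", (bits >> 23) & 0xffu);  /* 8 bits  */
    printf("fraction: 0x%06x\n", bits & 0x7fffffu);  /* 23 bits */
}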
A double does have the required precision: it's usually a 64-bit representation that can store 53 bits of numerical data, so every 32-bit int survives a round trip through double.
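Assuming a 64-bit IEEE-754 double, the question's failing round trip succeeds when routed through double instead:

#include <stdio.h>

int main(void) {
    int i1 = 0xdeadbeef;                      /* implementation-defined wrap */
    printf("%d\n", (int) (float)  i1 == i1);  /* 0: 24-bit significand       */
    printf("%d\n", (int) (double) i1 == i1);  /* 1: 53 bits hold any int32   */
}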
Lots of conversions are going on. int i1 = 0xdeadbeef; and int i2 = 0xffffbeef; incur implementation-defined conversions, as the constants are out of int range. Here, they are "wrapped" modulo 2³².
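To compute the wrapped values without relying on the implementation-defined conversion, you can do the same subtraction in a wider type; a sketch:

#include <stdio.h>

int main(void) {
    /* On wrapping implementations the stored value is the constant
       minus 2^32. */
    long long w1 = 0xdeadbeefLL - (1LL << 32);
    long long w2 = 0xffffbeefLL - (1LL << 32);
    printf("%lld %lld\n", w1, w2);   /* -559038737 -16657 */
}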
i2 is a small value (15 significant bits) exactly representable as a float.
i1 is not. i1 has 30 significant bits, 6 more than the 24 of float. Those lower 6 bits are not all 0, so (float) i1 results in a rounded value. (A small helper after the output below counts this bit span.)
#include <stdio.h>

int main() {
    int i1 = 0xdeadbeef;
    int i2 = 0xffffbeef;
    printf("%d\n", (int) (float) i1 == i1);
    printf("%d\n", (int) (float) i2 == i2);
    printf("%u %10d %17f %10d\n", 0xdeadbeef, i1, (float) i1, (int) (float) i1);
    printf("%u %10d %17f %10d\n", 0xffffbeef, i2, (float) i2, (int) (float) i2);
}
Output
0
1
3735928559 -559038737 -559038720.000000 -559038720
4294950639 -16657 -16657.000000 -16657
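The "significant bits" counts used above can be checked mechanically. Below is a minimal sketch; bit_span is a hypothetical helper of my own, not a standard function. It measures the width from the leading 1 bit to the trailing 1 bit of a value's magnitude, which must be at most 24 for an exact float conversion:

#include <stdio.h>

/* Width in bits from the leading 1 to the trailing 1 of |v|,
   inclusive. A value converts to float exactly when this span
   is at most 24. */
static int bit_span(int v) {
    unsigned long long m = v < 0 ? -(unsigned long long) v
                                 : (unsigned long long) v;
    if (m == 0) return 0;
    int hi = 0, lo = 0;
    while (m >> hi) hi++;          /* one past the leading 1 bit  */
    while (!(m >> lo & 1)) lo++;   /* index of the trailing 1 bit */
    return hi - lo;
}

int main(void) {
    printf("%d\n", bit_span(-559038737));  /* 30: too wide for float */
    printf("%d\n", bit_span(-16657));      /* 15: fits with room     */
}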
C implementations commonly use a 32-bit int, and 0xdeadbeef does not fit in such an int (which has one sign bit and 31 value bits). Initializing i1 with 0xdeadbeef results in a conversion to int. This conversion is implementation-defined. GCC, for example, defines it to wrap modulo 2³², and this is not uncommon.
So int i1 = 0xdeadbeef; initializes i1 to deadbeef₁₆ − 2³² = 3735928559 − 2³² = −559038737 = −21524111₁₆. As you can see from the 8 hexadecimal digits in "−21524111", this number spans 30 bits from its leading 1 bit to its trailing 1 bit, inclusive (8 digits hold 32 bits, but the first two bits are zeros). The format commonly used for float, IEEE-754 binary32, has only 24 bits in its significand. Any number whose significant bits span more than 24 bits does not fit in the format and will be rounded when converted to this float format. So i1 != (int) (float) i1.
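The rounding can also be reproduced by hand. Assuming the default round-to-nearest mode, the low 6 bits of the 30-bit magnitude are 010001₂ = 17, which is less than half of 2⁶ = 64, so rounding discards them:

#include <stdio.h>

int main(void) {
    unsigned m = 0x21524111;           /* magnitude of -559038737      */
    unsigned rounded = m & ~0x3fu;     /* drop low 6 bits: rounds down */
    printf("%d\n", -(int) rounded);    /* -559038720                   */
    int i1 = 0xdeadbeef;
    printf("%d\n", (int) (float) i1);  /* -559038720, matches          */
}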
In contrast, int i2 = 0xffffbeef; initializes i2 to ffffbeef₁₆ − 2³² = 4294950639 − 2³² = −16657 = −4111₁₆. This spans 15 bits (4 digits hold 16 bits, but the first bit is a zero). So it fits in the 24 bits of a float significand, and its value does not change when converted to float. So i2 == (int) (float) i2.
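One more way to watch this: printf's %a conversion prints the exact hexadecimal significand, so the lost low bits are directly visible. The expected outputs in the comments assume glibc's formatting; other C libraries may print trailing zero digits differently:

#include <stdio.h>

int main(void) {
    int i1 = 0xdeadbeef;
    int i2 = 0xffffbeef;
    printf("%a\n", (double) i1);          /* -0x1.0a920888p+29 (exact)     */
    printf("%a\n", (double) (float) i1);  /* -0x1.0a9208p+29   (bits lost) */
    printf("%a\n", (double) (float) i2);  /* -0x1.0444p+14     (unchanged) */
}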