I have the following bitwise code which converts an int value to the equivalent float, returning the float's bit pattern packaged in an unsigned int.
Question: There are rounding issues, so it fails in cases where the input is 0x80000001, for example. How do I handle this?
Here is the code:
if (x == 0) return x;
unsigned int signBit = 0;
unsigned int absX = (unsigned int)x;
if (x < 0)
{
    signBit = 0x80000000u;
    absX = -(unsigned int)x;   /* negate in unsigned to avoid overflow at INT_MIN */
}
unsigned int exponent = 158;
while ((absX & 0x80000000u) == 0)
{
    exponent--;
    absX <<= 1;
}
unsigned int mantissa = absX >> 8;   /* truncates the low 8 bits -- no rounding */
unsigned int result = signBit | (exponent << 23) | (mantissa & 0x7fffffu);
printf("\nfor x: %x, result: %x", x, result);
return result;
That's because the precision of 0x80000001 exceeds that of a float. As the linked article explains, the precision of a float is 24 bits, so any pair of values whose difference (x - y) is smaller than the larger of the two shifted right by 24 bits simply cannot be distinguished.
gdb agrees with your cast:
main.c:
#include <stdio.h>
int main() {
    float x = 0x80000001;
    printf("%f\n", x);
    return 0;
}
gdb:
Breakpoint 1, main () at test.c:4
4 float x = 0x80000001;
(gdb) n
5 printf("%f\n",x);
(gdb) p x
$1 = 2.14748365e+09
(gdb) p (int)x
$2 = -2147483648
(gdb) p/x (int)x
$3 = 0x80000000
(gdb)
The limit of this imprecision:
(gdb) p 0x80000000 == (float)0x80000080
$21 = 1
(gdb) p 0x80000000 == (float)0x80000081
$20 = 0
The actual bitwise representation:
(gdb) p/x (int)(void*)(float)0x80000000
$27 = 0x4f000000
(gdb) p/x (int)(void*)(float)0x80000080
$28 = 0x4f000000
(gdb) p/x (int)(void*)(float)0x80000081
$29 = 0x4f000001
doubles do have enough precision to make the distinction:
(gdb) p 0x80000000 == (float)0x80000001
$1 = 1
(gdb) p 0x80000000 == (double)0x80000001
$2 = 0