Fast floating point abs function

Question

What is the fastest way to take the absolute value of a standard 32-bit float on x86-64 in C99? The built-in functions fabsf and fabs are not fast enough. My current approach is bit twiddling:

unsigned int tmp = *((unsigned int *)&f) & 0x7fffffff;
float abs = *((float *)&tmp);

It works but is ugly, and I'm not sure it is optimal.

Please stop telling me about type-punned pointers because it's not what I'm asking about. I know the code can be phrased using unions but it doesn't matter because on all compilers (written in the last 10 years) it will emit exactly the same code.

Tags: c, performance, floating-point, c99

asked on Stack Overflow Jul 20, 2019 by Björn Lindqvist • edited Jul 20, 2019 by Björn Lindqvist

1 Answer

Fewer standard violations:

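#include <math.h>   /* fabsf */
#include <stdint.h> /* uint32_t */

/* use union type punning instead of pointer casts, so the aliasing and alignment rules are respected */
static inline float float2absf(float f) {
  /* the condition is a compile-time constant, so the optimizer keeps only one branch */
  if (sizeof(float) == sizeof(uint32_t)) {
    union {
      float f;
      uint32_t i;
    } u;
    u.f = f;
    u.i &= 0x7fffffff; /* clear the sign bit (bit 31) */
    return u.f;
  }
  /* fallback when float is not 32 bits wide */
  return fabsf(f);
}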

IMHO, it would be safer to use the library function. This will improve code portability, especially on platforms where you might encounter a non-IEEE float representation or where type sizes might differ.

In general, once compiled for your platform, the library function should provide the fastest solution.

Having said that, a library call requires both stack management and a jump unless it is optimized away. For a simple bit-altering function, that could mean more than twice the number of operations, as well as cache misses. In many cases this is avoidable through compiler builtins, which the compiler can apply automatically by turning the library call into inline instructions.
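As a rough illustration of the builtin route, assuming GCC or Clang: `__builtin_fabsf` is a GCC builtin that Clang also supports, and both compilers normally treat a plain `fabsf` call the same way unless builtins are disabled (for example with `-fno-builtin`). On x86-64 either form typically ends up as a single sign-mask AND instruction, but check the generated assembly for your own compiler and flags. The helper names `abs_libm` and `abs_builtin` are made up for this sketch.

```c
#include <math.h>

/* Hypothetical helper names, for illustration only. */

/* Plain library call. GCC and Clang usually recognize fabsf as a builtin
   and inline it, so no actual call is emitted. */
static inline float abs_libm(float f) {
  return fabsf(f);
}

/* Explicit builtin on GCC/Clang; falls back to fabsf on other compilers. */
static inline float abs_builtin(float f) {
#if defined(__GNUC__) || defined(__clang__)
  return __builtin_fabsf(f);
#else
  return fabsf(f);
#endif
}
```

Compiling with something like `gcc -O2 -S` and comparing the two functions is a quick way to see whether they produce the same instructions on your setup.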

Your bit-twiddling approach is (in theory) correct and could eliminate the operations related to function calls, as well as improve code locality... although the same could be achieved using compiler builtins and optimizations.

Also, please note that your approach isn't standard-compliant (the pointer casts break the strict-aliasing rules) and it assumes that sizeof(int) == sizeof(float)... I think that type punning through a union improves on that a little bit.
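Another way to sidestep the aliasing question, sketched here as an alternative rather than as part of the answer above, is to copy the bits with `memcpy`. This is well-defined in C99, and optimizing compilers generally turn the fixed-size copies into plain register moves. The name `abs_memcpy` is hypothetical, and the sketch assumes an IEEE 754 single-precision `float` whenever the sizes match.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name. Assumes IEEE 754 binary32 when float is 32 bits wide. */
static inline float abs_memcpy(float f) {
  uint32_t bits;
  /* Compile-time constant condition: fall back if float is not 32 bits. */
  if (sizeof bits != sizeof f)
    return fabsf(f);
  memcpy(&bits, &f, sizeof bits);  /* read the float's object representation */
  bits &= UINT32_C(0x7fffffff);    /* clear the sign bit (bit 31) */
  memcpy(&f, &bits, sizeof f);     /* write the modified bits back */
  return f;
}
```

Like the union version, this clears the sign bit of negative zero, infinities and NaNs as well, which matches what `fabsf` does for those values.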

In addition, wrapping the code in an inline function works much like a macro while making the code more readable. It also allows a fallback to the library function if the type sizes don't match.

answered on Stack Overflow Jul 20, 2019 by Myst • edited Jul 20, 2019 by Myst

User contributions licensed under CC BY-SA 3.0