The following code calls the GCC built-in functions for clz/ctz and falls back to C versions on other compilers. Obviously, the C versions are suboptimal when the target has a native clz/ctz instruction, as x86 and ARM do.
#ifdef __GNUC__
#define clz(x) __builtin_clz(x)
#define ctz(x) __builtin_ctz(x)
#else
static uint32_t ALWAYS_INLINE popcnt( uint32_t x )
{
x -= ((x >> 1) & 0x55555555);
x = (((x >> 2) & 0x33333333) + (x & 0x33333333));
x = (((x >> 4) + x) & 0x0f0f0f0f);
x += (x >> 8);
x += (x >> 16);
return x & 0x0000003f;
}
static uint32_t ALWAYS_INLINE clz( uint32_t x )
{
x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
return 32 - popcnt(x);
}
static uint32_t ALWAYS_INLINE ctz( uint32_t x )
{
return popcnt((x & -x) - 1);
}
#endif
What functions do I need to call, which headers do I need to include, etc., to add a proper #ifdef for MSVC here? I've already looked at this page, but I'm not entirely sure what the #pragma is for (is it required?) or what restrictions it places on the MSVC version needed for compilation. As someone who doesn't really use MSVC, I also don't know whether these intrinsics have C equivalents on other architectures, or whether I have to #ifdef x86/x86_64 as well when #defining them.
Building on sh0dan's code, the implementation should be corrected like this:
#ifdef _MSC_VER
#include <intrin.h>
uint32_t __inline ctz( uint32_t value )
{
unsigned long trailing_zero = 0; // DWORD would need <windows.h>; the intrinsic takes unsigned long *
if ( _BitScanForward( &trailing_zero, value ) )
{
return trailing_zero;
}
else
{
// The result is undefined for 0; returning 32 is a better choice than 0
return 32;
}
}
uint32_t __inline clz( uint32_t value )
{
unsigned long leading_zero = 0; // DWORD would need <windows.h>; the intrinsic takes unsigned long *
if ( _BitScanReverse( &leading_zero, value ) )
{
return 31 - leading_zero;
}
else
{
// Same remarks as above
return 32;
}
}
#endif
As commented in the code, both ctz and clz are undefined if value is 0. In our abstraction, we defined __builtin_clz(value) as (value ? __builtin_clz(value) : 32), but that is a choice.
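The zero-handling choice described above can be applied uniformly to both compilers. The following is only a sketch under that assumption: the names ctz32_safe and clz32_safe are hypothetical (they are not part of either compiler), and returning 32 for an input of 0 is the convention chosen in this answer, not something the intrinsics guarantee.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
/* _BitScanForward/_BitScanReverse return 0 when the mask is 0,
   so the result for that case can be chosen here (32 by convention). */
static __inline uint32_t ctz32_safe(uint32_t x)
{
    unsigned long idx;
    return _BitScanForward(&idx, x) ? (uint32_t)idx : 32u;
}
static __inline uint32_t clz32_safe(uint32_t x)
{
    unsigned long idx;
    /* idx is the position of the highest set bit, hence 31 - idx. */
    return _BitScanReverse(&idx, x) ? 31u - (uint32_t)idx : 32u;
}
#elif defined(__GNUC__)
/* __builtin_ctz/__builtin_clz are undefined for 0, so guard the call. */
static inline uint32_t ctz32_safe(uint32_t x)
{
    return x ? (uint32_t)__builtin_ctz(x) : 32u;
}
static inline uint32_t clz32_safe(uint32_t x)
{
    return x ? (uint32_t)__builtin_clz(x) : 32u;
}
#else
#error "No intrinsic available: plug in a portable fallback such as the popcnt-based versions"
#endif
```

With these wrappers, callers get the same answer for 0 on both compilers, at the cost of one extra branch on the GCC path.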
If MSVC has a compiler intrinsic for this, it'll be here:
Otherwise, you'll have to write it using __asm
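I found this on a Korean website, https://torbjorn.tistory.com/317: with the MSVC compiler, you can use __lzcnt(unsigned int) in place of GCC's __builtin_clz(unsigned int). Note that __lzcnt requires a CPU that supports the LZCNT instruction, and that it returns 32 for an input of 0, whereas __builtin_clz is undefined for 0.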
The MSVC equivalent of int __builtin_ctz (unsigned int x) is unsigned int _tzcnt_u32 (unsigned int a), which returns the count of trailing zeros of a 32-bit integer. For 64-bit integers, use unsigned __int64 _tzcnt_u64 (unsigned __int64 a).
The MSVC equivalent of int __builtin_clz (unsigned int x) is unsigned int _lzcnt_u32 (unsigned int a), which returns the count of leading zeros of a 32-bit integer. For 64-bit integers, use unsigned __int64 _lzcnt_u64 (unsigned __int64 a).
C++ Header: immintrin.h
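As a rough sketch of how these intrinsics might be wrapped: the names tzcnt32 and lzcnt32 below are hypothetical, and the first branch assumes that the CPU running the code actually implements BMI1 (for tzcnt) and LZCNT. GCC and Clang define __BMI__ and __LZCNT__ when those instruction sets are enabled at compile time; MSVC has no equivalent macros, so on MSVC the assumption has to be checked by other means. The fallback branch mimics the instructions' behaviour of returning 32 for an input of 0.

```c
#include <stdint.h>

#if (defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))) || \
    (defined(__BMI__) && defined(__LZCNT__))
#include <immintrin.h>
/* ASSUMPTION: the CPU this runs on implements BMI1 and LZCNT. */
static __inline uint32_t tzcnt32(uint32_t x) { return _tzcnt_u32(x); }
static __inline uint32_t lzcnt32(uint32_t x) { return _lzcnt_u32(x); }
#else
/* Portable fallback with the same results, including 32 for x == 0. */
static inline uint32_t tzcnt32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0) return 32u;
    while ((x & 1u) == 0) { x >>= 1; ++n; }   /* shift out low zero bits */
    return n;
}
static inline uint32_t lzcnt32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0) return 32u;
    while ((x & 0x80000000u) == 0) { x <<= 1; ++n; }  /* shift out high zero bits */
    return n;
}
#endif
```

If the target CPU cannot be assumed to support these instructions, the _BitScanForward/_BitScanReverse approach from the other answers is the safer choice.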
Tested on Linux and Windows (x86):
#ifdef _MSC_VER /* test the compiler, not the OS: MinGW on Windows already provides __builtin_clz */
#include <intrin.h>
static uint32_t __inline __builtin_clz(uint32_t x) {
unsigned long r = 0;
_BitScanReverse(&r, x);
return (31-r);
}
#endif
uint32_t clz64(const uint64_t x)
{
uint32_t u32 = (x >> 32);
uint32_t result = u32 ? __builtin_clz(u32) : 32;
if (result == 32) {
u32 = x & 0xFFFFFFFFUL;
result += (u32 ? __builtin_clz(u32) : 32);
}
return result;
}
There are two intrinsics, "_BitScanForward" and "_BitScanReverse", which serve the same purpose in MSVC. Include <intrin.h>. The functions are:
#ifdef _MSC_VER
#include <intrin.h>
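static uint32_t __inline ctz( uint32_t x )
{
unsigned long r = 0;
// Index of the lowest set bit = number of trailing zeros (undefined if x == 0)
_BitScanForward(&r, x);
return r;
}
static uint32_t __inline clz( uint32_t x )
{
unsigned long r = 0;
// Index of the highest set bit, so leading zeros = 31 - index (undefined if x == 0)
_BitScanReverse(&r, x);
return 31 - r;
}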
#endif
There are equivalent 64-bit versions, "_BitScanForward64" and "_BitScanReverse64", which are only available when compiling for a 64-bit target.
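For illustration, the 64-bit intrinsics could be wrapped along the following lines. This is a sketch only: ctz64_safe and clz64_safe are hypothetical names, and returning 64 for an input of 0 is an assumed convention. The non-MSVC branch uses the GCC built-ins __builtin_ctzll and __builtin_clzll so the same names work on both compilers.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
#include <intrin.h>
/* The 64-bit bit-scan intrinsics only exist for 64-bit targets. */
static __inline uint32_t ctz64_safe(uint64_t x)
{
    unsigned long idx;
    return _BitScanForward64(&idx, x) ? (uint32_t)idx : 64u;
}
static __inline uint32_t clz64_safe(uint64_t x)
{
    unsigned long idx;
    /* idx is the position of the highest set bit, hence 63 - idx. */
    return _BitScanReverse64(&idx, x) ? 63u - (uint32_t)idx : 64u;
}
#elif defined(__GNUC__)
/* The built-ins are undefined for 0, so guard the call. */
static inline uint32_t ctz64_safe(uint64_t x)
{
    return x ? (uint32_t)__builtin_ctzll(x) : 64u;
}
static inline uint32_t clz64_safe(uint64_t x)
{
    return x ? (uint32_t)__builtin_clzll(x) : 64u;
}
#else
#error "No 64-bit intrinsic available: split the value into two 32-bit halves instead"
#endif
```

On 32-bit MSVC targets, where the 64-bit intrinsics are missing, the value has to be split into two 32-bit halves and scanned with the 32-bit intrinsics.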
Read more here: