Using bit shifting with rand() to allow for a larger random range

1

I am reviewing a function to generate keys for a Radix Map and found the implementation of rand() to be novel to me.

Here is the function:

static int make_random(RadixMap *map)
{
    size_t i = 0;

    for (i = 0; i < map->max - 1; i++){
        uint32_t key = (uint32_t) (rand() | (rand() << 16)); /* <-- this was interesting */
        check(RadixMap_add(map, key, i) == 0, "Failed to add key %u", key);
    }

    return i;

error:
    return 0;
}

----- Type definitions --------
typedef union RMElement {
    uint64_t raw;
    struct {
        uint32_t key;
        uint32_t value;
    } data;
} RMElement; 

typedef struct RadixMap {
    size_t max; 
    size_t end;
    uint32_t counter;
    RMElement *contents;
    RMElement *temp; 
} RadixMap;

from ex35 Learn C the Hard Way by Zed Shaw

The specific part I found interesting was

uint32_t key = (uint32_t) (rand() | (rand() << 16)); /* <-- this was interesting */

It is interesting to me because it would have been possible to simply write:

uint32_t key = rand();

because RAND_MAX (0x7FFFFFFF) is less than the uint32_t maximum (0xFFFFFFFF).

The bit-shifting implementation appears to have the following advantages:

  1. Allows for a larger random value range, 0xFFFFFFFF vs 0x7FFFFFFF
  2. Values (other than initial 0) are at least 5 digits decimal (65537) (0x10001)
  3. Reduced probability of seeing "0".

And the following disadvantage:

  1. Increased code complexity?

Are there other reasons for using this bit shift implementation of rand()?

I've been trying to hash out the reason for using this implementation in my code review and wanted to make sure I was on the right track with my thinking.

c
asked on Stack Overflow Apr 27, 2021 by TechBaumgartner

3 Answers

2

The C standard only guarantees that RAND_MAX is at least 32767. This code accounts for that by calling rand twice and shifting to ensure it gets at least 30 bits of randomness.

However, this does not properly account for the case where RAND_MAX is larger.

The rand function returns an int which is signed. If RAND_MAX was the same as INT_MAX, rand() << 16 would most likely shift a "1" bit into the sign bit, triggering undefined behavior.

The proper way to implement this to handle both cases is:

uint32_t key = rand() | ((uint32_t)rand() << 16);

This works because left shifting an unsigned number is well defined as long as the shift amount is less than the width of the type.

Or better yet:

uint32_t key = (((uint32_t)rand() & 0x7FFF) << 17) | 
               (((uint32_t)rand() & 0x7FFF) << 2) | 
               ((uint32_t)rand() & 0x3);

This gets a full 32 bits of randomness.
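The bit layout of the three-call version can be checked by separating the packing from the calls to rand(). This is a minimal sketch, not code from the book: pack32 and rand32_3calls are hypothetical names used only for illustration. Calling rand() in three separate statements also fixes the order in which the calls happen.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: pack 15 + 15 + 2 bits into one 32-bit value.
 * Only the low 15 bits of a and b and the low 2 bits of c are used. */
static uint32_t pack32(uint32_t a, uint32_t b, uint32_t c)
{
    return ((a & 0x7FFF) << 17) | ((b & 0x7FFF) << 2) | (c & 0x3);
}

/* Hypothetical wrapper: three rand() calls, each converted to an
 * unsigned type before any shifting takes place. */
static uint32_t rand32_3calls(void)
{
    uint32_t a = (uint32_t)rand();
    uint32_t b = (uint32_t)rand();
    uint32_t c = (uint32_t)rand();
    return pack32(a, b, c);
}
```

Feeding pack32 the largest inputs (0x7FFF, 0x7FFF, 0x3) gives 0xFFFFFFFF, so every one of the 32 bits is reachable and no two fields overlap.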

answered on Stack Overflow Apr 27, 2021 by dbush
2

uint32_t key = (uint32_t) (rand() | (rand() << 16)); has shortcomings.

  • Not uniform when RAND_MAX != 65535, which is the usual case.

  • Undefined behavior when int is 16 bits wide. Also UB in other cases due to possible signed integer overflow in rand() << 16.

  • The cast comes too late to protect against a narrow int. The code is effectively the same as uint32_t key = rand() | (rand() << 16); whereas uint32_t key = rand() + (rand() * (RAND_MAX + (uint32_t)1)); would make a bit more sense.

A key failing is using | to append bits when the number of bits zeroed on the right by the shift (16) is not the same as the bit width of RAND_MAX.
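To see why the mismatch matters, here is a small sketch that scales the problem down so every case can be enumerated. The function name count_bit_set is hypothetical. It pretends the generator returns width-bit values and counts how often a chosen bit ends up set in a | (b << shift).

```c
#include <stdint.h>

/* Count, over every pair (a, b) of width-bit values, how often the given
 * bit is 1 in a | (b << shift). Assumes width + shift <= 16 so the
 * enumeration stays small. */
static uint32_t count_bit_set(unsigned width, unsigned shift, unsigned bit)
{
    uint32_t limit = (uint32_t)1 << width;
    uint32_t hits = 0;

    for (uint32_t a = 0; a < limit; a++) {
        for (uint32_t b = 0; b < limit; b++) {
            uint32_t v = a | (b << shift);
            hits += (v >> bit) & 1u;
        }
    }
    return hits;
}
```

With 3-bit values and a shift of 3, bit 2 is set in 32 of the 64 pairs (one half). With a shift of only 2, the two values overlap at bit 2 and it is set in 48 of 64 pairs (three quarters). The same thing happens to bits 16 through 30 when RAND_MAX is 0x7FFFFFFF and the shift is 16.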

A second weakness is assuming that shifting is better than multiplying by a power of 2. A good compiler emits efficient code either way.
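As a rough illustration of the multiply form, the sketch below appends one generator value after another using max + 1 as the radix. Both function names are hypothetical. The multiply version works for any maximum, while the shift version is only equivalent when max + 1 is a power of 2.

```c
#include <stdint.h>

/* Append lo after hi, where both came from a generator whose largest
 * value is max. Assumes the result fits in 32 bits. */
static uint32_t append_mul(uint32_t hi, uint32_t lo, uint32_t max)
{
    return hi * (max + 1u) + lo;
}

/* Same idea with a shift; only valid when max + 1 == 2^bits. */
static uint32_t append_shift(uint32_t hi, uint32_t lo, unsigned bits)
{
    return (hi << bits) | lo;
}
```

For a 15-bit generator, append_mul(0x7FFF, 0x7FFF, 0x7FFF) and append_shift(0x7FFF, 0x7FFF, 15) both give 0x3FFFFFFF.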


Instead, call your random function (1, 2 or 3 times) as needed based on its RAND_MAX. Below works well when RAND_MAX is a Mersenne number.
See "Is there any way to compute the width of an integer type at compile-time?".

#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
// Bit width of RAND_MAX, which is at least 15
#define RAND_MAX_BITS IMAX_BITS(RAND_MAX)

_Static_assert(((RAND_MAX + 1u) & RAND_MAX) == 0, "RAND_MAX is not a Mersenne number");

uint32_t rand32(void) {
  uint32_t r = rand();
  #if RAND_MAX_BITS < 32
    r = (r << RAND_MAX_BITS) | rand();
  #endif
  #if RAND_MAX_BITS*2 < 32
    r = (r << RAND_MAX_BITS) | rand();
  #endif
  return r;
}
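The IMAX_BITS macro is terse, so it can help to check it against a few known values. The sketch below repeats the macro and wraps it in a hypothetical function, imax_bits_u32, purely so it can be called on sample maxima of the form 2^n - 1.

```c
#include <stdint.h>

#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))

/* Hypothetical wrapper so the macro can be checked on sample values.
 * The argument is expected to be of the form 2^n - 1. */
static unsigned imax_bits_u32(uint32_t m)
{
    return (unsigned)IMAX_BITS(m);
}
```

For the minimum RAND_MAX the standard allows (0x7FFF) this gives 15, and for 0x7FFFFFFF it gives 31, which is why rand32() needs three calls in the first case and two in the second.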

(Bit shifting) Increased code complexity?

No.

Are there other reasons for using this bit shift implementation of rand()?

OP's code is not uniform, as it generally favors 1 bits because of its potential or-ing of bits past the 15th.

I've been trying to hash out the reason for using this implementation ...

Do not use it.

answered on Stack Overflow Apr 27, 2021 by chux - Reinstate Monica • edited Apr 27, 2021 by chux - Reinstate Monica
0

Or, you could just use a really fast random number generator. Be careful not to seed it with values that have too many zero bytes.

uint64_t
xorshift128plus(uint64_t seed[2])
{
    uint64_t x = seed[0];
    uint64_t y = seed[1];
    seed[0] = y;
    x ^= x << 23;
    seed[1] = x ^ y ^ (x >> 17) ^ (y >> 26);
    return seed[1] + y;
}

Convert the result to float or just modulo your max int value...
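A minimal sketch of both conversions follows. The helpers to_unit_double and reduce_mod are hypothetical names, not part of any library. The generator is repeated here with the return statement using seed[1] so the block compiles on its own.

```c
#include <stdint.h>

/* Same generator as above; the seed must not be all zero, or every
 * output will be zero. */
static uint64_t xorshift128plus(uint64_t seed[2])
{
    uint64_t x = seed[0];
    uint64_t y = seed[1];
    seed[0] = y;
    x ^= x << 23;
    seed[1] = x ^ y ^ (x >> 17) ^ (y >> 26);
    return seed[1] + y;
}

/* Hypothetical helper: map the top 53 bits to a double in [0, 1). */
static double to_unit_double(uint64_t r)
{
    return (double)(r >> 11) * (1.0 / 9007199254740992.0); /* 2^53 */
}

/* Hypothetical helper: reduce to [0, n) by modulo; n must be nonzero.
 * The result is slightly biased unless n divides 2^64. */
static uint64_t reduce_mod(uint64_t r, uint64_t n)
{
    return r % n;
}
```

For a 32-bit key like the one in the question, casting the 64-bit result to uint32_t (or taking the upper half with a shift by 32) would also work.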

answered on Stack Overflow Apr 27, 2021 by ChuckCottrill

User contributions licensed under CC BY-SA 3.0