I would like to generate a list of random numbers on the GPU using Numba's CUDA support.
The standard example from the official Numba documentation is as follows:

from numba import cuda
from numba.cuda.random import create_xoroshiro128p_states, xoroshiro128p_uniform_float32
import numpy as np

@cuda.jit
def compute(rng_states, iterations, out):
    """Write one uniform random float32 per thread into out."""
    i = cuda.grid(1)
    if i < iterations:
        out[i] = xoroshiro128p_uniform_float32(rng_states, i)

iterations = 100
threads_per_block = 64
blocks = (iterations + (threads_per_block - 1)) // threads_per_block
# One RNG state is allocated for every launched thread.
rng_states = create_xoroshiro128p_states(threads_per_block * blocks, seed=1)
out = np.zeros(iterations, dtype=np.float32)

compute[blocks, threads_per_block](rng_states, iterations, out)
print(out)

However, this approach does not scale if I want to generate a list of size 10**20.
The rng_states array needs one state per thread, so it causes an out-of-memory error, and even if I split the work into sequential batches, it still takes too long.
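
To make the memory problem concrete, here is a rough back-of-the-envelope calculation. It assumes each xoroshiro128+ state is two 64-bit words (the field names `s0` and `s1` are my assumption about the layout), and it only counts the states, not the output array:

```python
import numpy as np

# Assumed layout of one xoroshiro128+ state: two 64-bit words, 16 bytes in total.
state_dtype = np.dtype([("s0", np.uint64), ("s1", np.uint64)])

n_values = 10**20  # one state per output value, as in the code above
state_bytes = n_values * state_dtype.itemsize

print(f"{state_dtype.itemsize} bytes per state")
print(f"{state_bytes / 1e21:.1f} zettabytes for the states alone")
```

So with one state per value, the states alone are far beyond what any GPU can hold.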
Is there a way to generate random numbers without rng_states, so that I do not need that much GPU memory?
Here is one way, which is not very satisfying:

from numba import cuda
import numpy as np

MAX32 = np.uint32(0xffffffff)

@cuda.jit(device=True)
def cuda_xorshift(idi):
    # Scramble the thread index with xorshift* style shifts; no stored state.
    x = idi
    x ^= x >> 12
    x ^= x << 25
    x ^= x >> 27
    # Note: this constant is the 64-bit xorshift* multiplier and does not fit in 32 bits.
    return np.uint32(x) * np.uint32(2685821657736338717)

@cuda.jit(device=True)
def cuda_xorshift_float(idi):
    # Keep the low 32 bits and scale them to a float32 in [0, 1].
    return np.float32(np.float32(MAX32 & cuda_xorshift(idi)) / np.float32(MAX32))

@cuda.jit
def compute(iterations, out):
    """Write one pseudo-random float32 per thread into out, derived from the thread index."""
    i = cuda.grid(1)
    if i < iterations:
        out[i] = cuda_xorshift_float(i)

iterations = 100
threads_per_block = 64
blocks = (iterations + (threads_per_block - 1)) // threads_per_block
out = np.zeros(iterations, dtype=np.float32)

compute[blocks, threads_per_block](iterations, out)
print(out)

However, this approach produces the same output on every run, because the result depends only on the thread index. Is there a function that generates a thread-local RNG state?
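
For illustration, the kind of thing I am after is a stateless generator that hashes a seed together with the index, so that a different seed gives a different sequence. Below is a minimal sketch in plain Python, so the arithmetic is easy to check on the CPU. The function names `splitmix64`, `mix` and `uniform_from_index` are hypothetical helpers of mine, not part of Numba. The mixing steps follow the SplitMix64 finalizer. I have not checked the statistical quality of this construction, and porting it to a `@cuda.jit(device=True)` function is an untested assumption.

```python
import os

MASK64 = (1 << 64) - 1  # emulate 64-bit wraparound with Python integers


def splitmix64(x):
    """Apply the SplitMix64 mixing steps to a 64-bit integer."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def mix(seed, index):
    """Hash (seed, index) to 64 bits; nothing is stored between calls."""
    return splitmix64(splitmix64(seed & MASK64) ^ (index & MASK64))


def uniform_from_index(seed, index):
    """Map (seed, index) to a float in [0, 1) using the top 24 bits."""
    return (mix(seed, index) >> 40) / float(1 << 24)


# A fresh seed per run makes the output differ between runs.
seed = int.from_bytes(os.urandom(8), "little")
print([uniform_from_index(seed, i) for i in range(5)])
```

With a fixed seed the values are reproducible, and only the seed (one integer) has to be passed to the kernel instead of an array of states.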