Question

I'm pretty new to the CUDA language, and I need to run a simulation of particles whose positions are updated at each time step by adding a random value (a different value for each particle, but all drawn from the same distribution).

My idea is to give every particle its own curandState (each with a different seed), and at each time step simply call curand(&curandState[particle_id]).
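
Roughly something like this, as an untested sketch (the kernel and array names are just placeholders for what I'd write):

    #include <curand_kernel.h>

    // One RNG state per particle, kept on the GPU between time steps.
    __global__ void update(float *pos, curandState *states, int n)
    {
        int id = blockIdx.x * blockDim.x + threadIdx.x;
        if (id < n)
            pos[id] += curand_normal(&states[id]); // one random draw per particle per step
    }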

I was thinking I could store the random states and particle IDs in constant memory on the GPU, but I haven't seen anyone do that anywhere. Would that cause memory problems? Could it speed up the program?

Thank you for your help :)


Solution

I don't think that makes sense. __constant__ memory is constant: it cannot be modified by threads running on the GPU. A curandState, however, must be updated every time a thread generates a random number (otherwise you would get the same number over and over).

There's nothing wrong with giving every particle its own state; that would be the typical usage for this scenario.
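
For example, a minimal setup kernel along these lines (the names here are only illustrative) initializes each particle's state once, before the simulation loop. Note that the pattern the cuRAND documentation suggests is a single seed shared by all threads with a distinct sequence number per thread, rather than a distinct seed per thread, which avoids possible correlations between the per-thread streams:

    #include <curand_kernel.h>

    // Run once before the simulation loop: each particle/thread initializes its own state.
    // Same seed for all threads, distinct sequence number per thread.
    __global__ void setup_states(curandState *states, unsigned long long seed, int n)
    {
        int id = blockIdx.x * blockDim.x + threadIdx.x;
        if (id < n)
            curand_init(seed, id, 0, &states[id]);
    }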

Since retrieving and updating curandState and generating random numbers are handled by an NVIDIA library running on the GPU, you can assume the NVIDIA engineers have done a reasonably good job of making those memory accesses efficient and coalesced.
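
One habit worth adopting from the cuRAND device-API examples is to copy the state into a local variable for the duration of the kernel and write it back afterwards, so the generator works on a register/local copy instead of touching global memory for every draw. A sketch of the per-step kernel in that style (the kernel name and the sigma parameter are made up for illustration; assume curand_kernel.h is included as above):

    __global__ void step_particles(float *pos, curandState *states, float sigma, int n)
    {
        int id = blockIdx.x * blockDim.x + threadIdx.x;
        if (id < n) {
            curandState local = states[id];           // pull this thread's state into local storage
            pos[id] += sigma * curand_normal(&local); // one normally distributed step
            states[id] = local;                       // store the advanced state back for the next step
        }
    }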

__constant__ memory also has the characteristic that it can service only one 32-bit value per SM per clock, so it is useful when all threads read the same data element (i.e. a broadcast), but not generally useful when each thread reads a different element (e.g. its own curandState), even if those accesses would coalesce nicely in ordinary global memory.
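
To make the distinction concrete: a single broadcast value such as a step size is a good fit for __constant__ memory, while the per-thread states stay in ordinary global memory. A small sketch with made-up names:

    // Good use of __constant__: one parameter that every thread reads (a broadcast).
    __constant__ float d_sigma;

    // Host side, set once before launching the step kernel:
    //   float sigma = 0.01f;
    //   cudaMemcpyToSymbol(d_sigma, &sigma, sizeof(float));

    // Per-thread curandState, by contrast, is different for every thread and is updated
    // every step, so allocate it in ordinary global memory:
    //   curandState *d_states;
    //   cudaMalloc(&d_states, n * sizeof(curandState));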

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow