Question

Here is an image taken from the CUDA C Programming Guide:

[Image: diagram of threads mapped to shared memory banks, from the CUDA C Programming Guide]

The guide says that this is an example of a Conflict-free access since threads 3, 4, 6, 7 and 9 access the same word within bank 5.

I don't quite understand why this is conflict-free: threads 3, 4, 6, 7 and 9 all access the same word within the same bank (shouldn't that be a bank conflict?), and thread 5 also has to access bank 4.

Could you please explain to me this case?


Solution

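Note that a bank is not the same thing as a word or location in shared memory. A bank refers collectively to all words in shared memory that satisfy a certain address pattern condition: successive 32-bit words are assigned to successive banks, so many different words belong to the same bank.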
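To make the address pattern concrete, here is a minimal host-side sketch in plain C++ (not device code). It assumes 4-byte-wide banks and 32 banks, which is the usual configuration on compute capability 2.x and later; compute capability 1.x has 16 banks. The helper name `bank_of` is hypothetical and is not part of any CUDA API.

```cpp
#include <cstddef>

// Assumption: each bank is one 32-bit word (4 bytes) wide and successive
// words map to successive banks, wrapping around after num_banks words.
constexpr std::size_t kWordBytes = 4;

// Hypothetical helper: bank index for a byte offset into shared memory.
constexpr std::size_t bank_of(std::size_t byte_offset,
                              std::size_t num_banks = 32) {
    return (byte_offset / kWordBytes) % num_banks;
}
```

Under these assumptions, byte offsets 20 and 148 are different words but both fall in bank 5, which is why "same bank" and "same word" are not interchangeable.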

In general, shared memory bank conflicts can be avoided if all accesses from a warp (or half-warp on compute capability 1.x) go to separate banks. These accesses need not be in warp order, i.e. they can be scrambled, as long as the request from each thread targets a separate bank.
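The "separate banks, in any order" rule can be checked with a small host-side sketch. This is only a model in plain C++, not device code: `all_banks_distinct` is a hypothetical helper, its input is the 32-bit word index each thread accesses, and it deliberately ignores the same-word exception so that it captures only the basic rule.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Returns true when every thread's word lands in a different bank.
// word_index[i] is the 32-bit word index accessed by thread i.
// Assumption: word w lives in bank (w % num_banks).
bool all_banks_distinct(const std::vector<std::size_t>& word_index,
                        std::size_t num_banks = 32) {
    std::set<std::size_t> used_banks;
    for (std::size_t w : word_index) {
        // insert().second is false if this bank was already requested.
        if (!used_banks.insert(w % num_banks).second) {
            return false;
        }
    }
    return true;
}
```

With this model, a scrambled pattern such as words `{3, 0, 2, 1}` passes because each thread still hits its own bank, while words `{0, 32}` fail because both map to bank 0.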

The above description covers every arrow in your diagram except those arrows pointing to bank 5.

If we had no other information, then multiple arrows targeting a single bank would indicate a potential bank conflict.

However, there is an exception: when the accesses target not only the same bank but also the same word in memory. When multiple shared memory requests target the same word, the shared memory system uses a broadcast mechanism to read that word once and deliver it to all the requesting threads in a single cycle. That is the case for the arrows pointing to bank 5, so they do not conflict.

From the documentation (http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#shared-memory-1-x):

Shared memory features a broadcast mechanism whereby a 32-bit word can be read and broadcast to several threads simultaneously when servicing one memory read request. This reduces the number of bank conflicts when several threads read from an address within the same 32-bit word.
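The effect of broadcast can be illustrated with a simplified host-side model in plain C++ (again, not device code). It assumes that requests to the same bank serialize only when they target different words, so the cost is the largest number of distinct words requested from any one bank. The function `conflict_degree` and the word indices used with it are hypothetical, and real hardware differs in detail between compute capabilities.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Simplified model: returns how many serialized passes a set of reads
// needs. 1 means conflict-free; an empty request list returns 0.
// word_index[i] is the 32-bit word index read by thread i.
// Assumption: word w lives in bank (w % num_banks).
std::size_t conflict_degree(const std::vector<std::size_t>& word_index,
                            std::size_t num_banks = 32) {
    // Collect the distinct words requested from each bank. Threads that
    // read the same word collapse into one entry (the broadcast case).
    std::map<std::size_t, std::set<std::size_t>> words_per_bank;
    for (std::size_t w : word_index) {
        words_per_bank[w % num_banks].insert(w);
    }
    std::size_t degree = 0;
    for (const auto& entry : words_per_bank) {
        degree = std::max(degree, entry.second.size());
    }
    return degree;
}
```

In this model, several threads reading word 5 plus one thread reading word 4, as in `{5, 5, 4, 5, 5}`, give a degree of 1, matching the conflict-free case in the question. Reading words `{5, 37}` gives 2, because both words sit in bank 5 but are different words.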

Licensed under: CC-BY-SA with attribution