CUDA: When can someone achieve coalescing memory?

https://stackoverflow.com/questions/17285465

01-06-2022

Question

I have trouble understanding this concept. I've researched a lot online and the only thing I understood is that threads need to access consecutive data.

So if we have an array of 10000 integers and thread i accesses the i-th element of the array, then the memory accesses will be coalesced.

What if, instead of having 10000 threads for all the integers, we decide to have 5000 threads where each thread accesses two consecutive integers? Will memory coalescing be possible in this case?

And what if we decide to allow a thread to access more than 2 values, for example 10?

How would memory coalescing behave in this case? And when does "consecutive access" stop being "consecutive" in the example I described above?

Thank you in advance

Solution

"I have trouble understanding this concept"

It's not something that can be thoroughly covered in a short description, especially with all the clarification questions that are likely to occur to you.

My suggestion is to take one of these webinars:

GPU Computing using CUDA C – Advanced 1 (2010)

CUDA Global Memory Usage & Strategy + Live Q&A with Dr Justin Luitjens, NVIDIA

Then come back when you have specific questions that are based on a general understanding of the topic.

Licensed under: CC-BY-SA with attribution
