Question

I'm fairly new to OpenCL so please bear with me.

In the first iteration of my code, I used basic memory buffers for large datasets and declared them global. Now that I'm looking to improve performance, I want to use texture memory instead. In the CUDA version, we use cudaBindTexture and tex1Dfetch to read a large 1D float array. From my understanding of the specification, texture memory is the same thing as image memory. However, since there are only 2D and 3D image objects, each with a maximum height and width, I run into a problem: my array is larger than the max height or max width, but not larger than max height * max width. Must I convert my 1D array into 2D? Or is there a better way to do it?

Or am I completely off?

I did read http://forums.nvidia.com/index.php?showtopic=151743 and http://forums.nvidia.com/index.php?showtopic=150454 but they weren't conclusive about whether the texture memory referred to in the Best Practices and Programming Guide is in fact image objects.

Thanks and any help/suggestions are greatly welcome!


Solution

I found the best answer as a reply to my post on NVIDIA's forum here.

OTHER TIPS

My array is larger than the max height or max width, but not larger than max height * max width. Must I convert my 1D array into 2D?

Yes, the texture hardware has constraints on the maximum index value in each dimension. If your array exceeds them, you'll need to store it as a 2D image and address it with two index values (x and y) instead of one.

That said, I'm not implying that converting to texture access is going to speed up your program.
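As a rough illustration of the conversion, here is a minimal C sketch of the index arithmetic, assuming a row-major layout and a single-channel float image (one array element per texel). The names ImageDims, Coord2D, dims_for_length and coord_for_index are hypothetical helpers made up for this example, not part of OpenCL. In practice max_width would come from querying CL_DEVICE_IMAGE2D_MAX_WIDTH with clGetDeviceInfo, and the resulting height still has to be checked against CL_DEVICE_IMAGE2D_MAX_HEIGHT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types, for illustration only. */
typedef struct { size_t width; size_t height; } ImageDims;
typedef struct { size_t x; size_t y; } Coord2D;

/* Pick a 2D image size that can hold n elements when no row may be
 * wider than max_width. The last row may be only partially used, so
 * the host buffer should be padded to width * height elements. */
static ImageDims dims_for_length(size_t n, size_t max_width) {
    assert(max_width > 0);
    ImageDims d;
    if (n == 0) {                 /* degenerate case: keep a 1x1 image */
        d.width = 1;
        d.height = 1;
        return d;
    }
    d.width = (n < max_width) ? n : max_width;
    d.height = (n + d.width - 1) / d.width;   /* ceiling division */
    return d;
}

/* Map a linear index i to (x, y) in a row-major image of the given width.
 * The same arithmetic would be done inside the kernel before the fetch. */
static Coord2D coord_for_index(size_t i, size_t width) {
    assert(width > 0);
    Coord2D c;
    c.x = i % width;
    c.y = i / width;
    return c;
}
```

Inside an OpenCL kernel, the equivalent of tex1Dfetch(tex, i) would then be something along the lines of read_imagef(img, sampler, (int2)(i % width, i / width)).x, with width passed in as a kernel argument and a sampler that uses unnormalized coordinates and nearest filtering. For example, a 20000-element array with a maximum width of 8192 would fit in an 8192 x 3 image.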

OpenCL 1.2 supports 1D textures (1D image objects). The issue is that, at the time of writing, NVIDIA only supports OpenCL 1.1, unlike AMD or Intel.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow