Question

In my application I am showing all available OpenCL devices so that the user can select the devices on which he wants to perform the computation. The results I am getting on my laptop have left me befuddled.


Following is an excerpt of the code that produced these results:

            //CL_DEVICE_TYPE
            {
                cl_device_type devtype;
                QString temp = "Unknown";
                err = clGetDeviceInfo(devices[i][j], CL_DEVICE_TYPE, sizeof(devtype), &devtype, NULL);
                if(err == CL_SUCCESS)
                {
                    // cl_device_type is a bit-field, so test individual bits instead of comparing with ==
                    if(devtype & CL_DEVICE_TYPE_CPU)
                        temp = "CPU";
                    else if(devtype & CL_DEVICE_TYPE_GPU)
                        temp = "GPU";
                    else if(devtype & CL_DEVICE_TYPE_ACCELERATOR)
                        temp = "Accelerator";
                    // any other type keeps the default "Unknown"
                }
                ilist->append(temp);
            }

            //CL_DEVICE_MAX_CLOCK_FREQUENCY
            {
                cl_uint devfreq;
                err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(devfreq), &devfreq, NULL);
                if(err == CL_SUCCESS)
                    ilist->append(QString::number((unsigned int)devfreq));
                else
                    ilist->append("Unknown");
            }

            //CL_DEVICE_GLOBAL_MEM_SIZE
            {
                cl_ulong devmem;
                err = clGetDeviceInfo(devices[i][j], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(devmem), &devmem, NULL);
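                if(err == CL_SUCCESS)
                {
                    // Convert bytes to megabytes only after the query succeeded,
                    // and keep the full 64-bit width when formatting
                    devmem /= 1000000;
                    ilist->append(QString::number((qulonglong)devmem));
                }
                else
                    ilist->append("Unknown");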
            }

            //CL_DEVICE_MAX_COMPUTE_UNITS * CL_DEVICE_MAX_WORK_GROUP_SIZE
            {
                cl_uint devcores;
                err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(devcores), &devcores, NULL);
                if(err == CL_SUCCESS)
                {
                    size_t devcores2;
                    err = clGetDeviceInfo(devices[i][j], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(devcores2), &devcores2, NULL);
                    if(err == CL_SUCCESS)
                        ilist->append(QString::number(((unsigned int)(devcores)) * ((unsigned int)(devcores2))));
                    else
                        ilist->append("Unknown");
                }
                else
                    ilist->append("Unknown");
            }

What I do not understand is the memory size and the number of parallel computations shown for the CPU. Any idea why I am getting these results?


Solution

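It is because CL_DEVICE_MAX_WORK_GROUP_SIZE is not an indicator of parallel computation ability. It is the maximum number of work-items allowed in a single work-group, not the number of work-items the device executes at the same time, so multiplying it by CL_DEVICE_MAX_COMPUTE_UNITS does not give a count of parallel computations.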
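To illustrate what the work-group limit actually controls, here is a minimal sketch in plain C++ (no OpenCL calls). The helper name workGroupCount and the numbers in the comments are made up for illustration; they are not taken from the question's screenshot. The limit only determines how a global range is split into work-groups. How many of those work-items run at the same instant depends on the hardware and the driver.

```cpp
#include <cstddef>

// Hypothetical helper: number of work-groups needed to cover `globalSize`
// work-items when each work-group holds at most `localSize` work-items.
// This is a ceiling division; `localSize` must be greater than zero.
std::size_t workGroupCount(std::size_t globalSize, std::size_t localSize)
{
    return (globalSize + localSize - 1) / localSize;
}

// Example with assumed values (not measured on any real device):
//   a device reporting CL_DEVICE_MAX_WORK_GROUP_SIZE = 1024
//   and a global range of 1000000 work-items
//   workGroupCount(1000000, 1024) gives 977 work-groups.
// Those work-groups are scheduled onto the compute units over time;
// the limit of 1024 says nothing about how many run concurrently.
```

For the device table, it is probably more honest to show the compute-unit count and the maximum work-group size as two separate columns rather than as a product.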

OTHER TIPS

Measuring device performance is a complicated task.

The metrics you used are not suitable for determining how fast a device is. Even simple tasks such as matrix multiplication do not show it reliably. You need to run benchmarks to determine a device's computing capabilities.
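As a rough starting point, a benchmark can be as simple as timing the same workload on each device. The sketch below is only a generic timing harness in standard C++; the name bestTimeSeconds is hypothetical, and the workload you pass in is assumed to enqueue your real kernel on one device and block until it has finished (for example by calling clFinish on that device's command queue).

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `workload` `repeats` times and returns the best
// (smallest) wall-clock time in seconds. Returns a negative value if
// `repeats` is zero or negative, because nothing was measured.
double bestTimeSeconds(const std::function<void()>& workload, int repeats)
{
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        workload();  // assumed to block until the device has finished
        const auto stop = std::chrono::steady_clock::now();
        const double elapsed =
            std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Taking the best of several runs reduces the influence of one-time costs such as kernel compilation and cache warm-up. The result is only meaningful for the kernel and data size you actually timed, so it is worth benchmarking something close to the real computation.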

Licensed under: CC-BY-SA with attribution