Question
Below I've posted some code that I'm using to get a feel for the CUDA Thrust library. Before anyone says anything: I know this is an extremely inefficient way to find prime numbers; I just want something to test parallelism. Unfortunately, when I run it I get an error. Here is what pops up:
Unhandled exception at at 0x76FCC41F in Thrust_2.exe: Microsoft C++ exception: thrust::system::system_error at memory location 0x0022F500.
If I switch the device_vector to a host_vector in the doTest function, I no longer get the error and the program works flawlessly. Why does this happen, and how can I get it to use the device_vector without crashing? I would like to do as much in parallel as possible.
Also, the entire program works as intended with a host_vector.
PS:
I'm using VS2012
CUDA: 5.5
GPU: GeForce GT 540M
Thrust: the version bundled with CUDA
Thanks in advance!
#include <cmath>
#include <thrust/device_vector.h>
#include <thrust/sequence.h>

// Overwrites every value that fails the primality test with -1.
struct prime{
    __host__ __device__
    void operator()(long& x){
        bool result = true;
        long stop = ceil(sqrt((float)x));
        if(x%2!=0){
            // <= so that odd perfect squares such as 9 and 25 are rejected
            for(int i = 3;i<=stop;i+=2){
                if(x%i==0){
                    result = false;
                    break;
                }
            }
        }else{
            result = false;
        }
        if(!result)
            x = -1;
    }
};

void doTest(long gen){
    using namespace thrust;
    device_vector<long> tNum(gen);
    sequence(tNum.begin(),tNum.end()); // fails here when using a device_vector
}

int main(){
    doTest(1000);
    return 0;
}
Solution 2
The issue was that I had the wrong compiler arguments. I feel really stupid now...
I was compiling for compute capability 1.0; I switched it to 2.0 and now it's working.
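The Visual Studio dialog in the question only names the exception type. thrust::system::system_error derives from std::runtime_error, so catching std::exception and printing what() would show the underlying error text, which may make a build mismatch like this one easier to spot. Below is a minimal sketch of that pattern. To keep it compilable without the CUDA toolkit it uses a hypothetical stand-in, failingCall, in place of the Thrust call; describeFailure and the message text are likewise made up for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a Thrust call that fails at run time.
// The message text is invented for this example.
inline void failingCall() {
    throw std::runtime_error("simulated device error");
}

// Runs fn and returns the exception message, or an empty string on success.
// Catching std::exception also covers thrust::system::system_error,
// because that class derives from std::runtime_error.
template <typename Fn>
std::string describeFailure(Fn fn) {
    try {
        fn();
        return std::string();
    } catch (const std::exception& e) {
        return e.what();
    }
}
```

In the real program the try block would wrap the doTest(1000) call in main, and the returned text would be printed to the console before exiting.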
OTHER TIPS
This is a problem:
void operator()(long& x){
    bool result = true;
    long stop = ceil(sqrt(x));
And in fact you should be receiving a warning message from the compiler about it.
The sqrt function available in device code is only available for float and double arguments. Your argument is of type long. This means the compiler will attempt to use a host-library version of the sqrt function, which will not work in device code. When you create your vector as a host vector, this is not a problem, as the functor is being run in host code. However, when you switch to a device vector, the functor (running on the device) will crash at that point and throw a thrust error.
As a simple test, you could modify it to:
long stop = ceil(sqrt((float)x));
and see if it eliminates the crash. Whether or not the cast from long to float is valid for your code is something you will have to decide.
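On whether the cast is valid: a float has a 24-bit significand, so long values above 2^24 = 16,777,216 are not all exactly representable, while a double is exact up to 2^53. Here is a minimal sketch of a double-based alternative. The names HOST_DEVICE, ceilSqrt and throughFloat are hypothetical helpers, not part of the original code. The macro only lets the same source build with nvcc or with a plain host compiler.

```cpp
#include <math.h>

// Expands to the CUDA qualifiers under nvcc and to nothing elsewhere,
// so the helpers can be exercised in an ordinary host build.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: ceiling of the square root, computed in double so
// that inputs up to 2^53 convert exactly before the sqrt call.
HOST_DEVICE inline long ceilSqrt(long x) {
    return static_cast<long>(ceil(sqrt(static_cast<double>(x))));
}

// Hypothetical helper: round-trips a long through float to show where
// the float cast starts to lose information.
HOST_DEVICE inline long throughFloat(long x) {
    return static_cast<long>(static_cast<float>(x));
}
```

With these definitions, throughFloat(16777217) comes back as 16777216, which is the kind of rounding the float cast introduces for large inputs. For the 1000-element test in the question, either cast gives the same result.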