Question

I'm using cublasDgemm to multiply two matrices.

I wrote a method that uses cublasDgemm and returns the pointer to the output.

It seems to work well in my unit tests but it fails in my application code (return code CUBLAS_STATUS_EXECUTION_FAILED).

I have gone over the code many times and everything seems fine. Is there any way to get a more detailed error explanation?

Update: It seems that every second cublasDgemm call works. The first call returns this error and the second one succeeds. Any ideas?

Update 2: This is my call:

    const double alpha = 1.0;
    const double beta = 0;

    cublasStatus_t ret = cublasDgemm(RmCudaMatrix::handle_, CUBLAS_OP_N, CUBLAS_OP_N,
        Rows(), b.Cols(), Cols(), &alpha,
        device_matrix_, Rows(), b.device_matrix_, b.Rows(), &beta,
        output->device_matrix_, output->Rows());

Thanks.

Solution

cuBLAS functions may run asynchronously, so when a cuBLAS call returns a cublasStatus_t other than CUBLAS_STATUS_SUCCESS, the error may actually come from an earlier call. To determine whether this is the case, check the CUDA error status after each cuBLAS call with cudaGetLastError().

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow