CUBLAS memory allocation error

https://stackoverflow.com/questions/1603401

05-07-2019

Question

I tried to allocate 17338896 floating-point elements as follows (roughly 70 MB):

    state = cublasAlloc(theSim->Ndim*theSim->Ndim, 
                       sizeof(*(theSim->K0)), 
                       (void**)&K0cuda);
    if(state != CUBLAS_STATUS_SUCCESS) {
        printf("Error allocation video memory.\n");
        return -1;
    }

However, the variable state comes back as CUBLAS_STATUS_ALLOC_FAILED. Does this have anything to do with the amount of video card memory available on the machine (128 MB on mine), or is there a limit on how much memory I can allocate with cublasAlloc() (i.e. one unrelated to the amount of memory available on the machine)? I tried cudaMalloc() and ran into the same problem. Thanks in advance for looking into this.
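
As a quick sanity check on the size quoted above, the request can be worked out by hand. The sketch below is plain C with no CUDA calls. `bytes_needed` is a hypothetical helper introduced only for this illustration, and the figures assume a 4-byte float, which is what CUDA devices use.

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUBLAS): number of bytes requested by
   an allocation of `count` elements of `elem_size` bytes each. */
size_t bytes_needed(size_t count, size_t elem_size) {
    return count * elem_size;
}
```

With count = 17338896 and elem_size = 4 this gives 69,355,584 bytes, about 66 MiB, which is more than half of a 128 MB card.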

-------------- Added error reproduction --------------

    #include <cublas.h>   // legacy CUBLAS API (cublasInit, cublasAlloc, ...)
    #include <stdio.h>

    int main (int argc, char *argv[]) {

        // CUDA setup
        cublasStatus state;

        // Treat any status other than success as an initialization failure
        if(cublasInit() != CUBLAS_STATUS_SUCCESS) {
            printf("CUBLAS init error.\n");
            return -1;
        }

        // Instantiate video memory pointers
        float *K0cuda;

        // Allocate video memory needed
        state = cublasAlloc(20000000,
                            sizeof(float),
                            (void**)&K0cuda);
        if(state != CUBLAS_STATUS_SUCCESS) {
            printf("Error allocation video memory.\n");
            return -1;
        }

        // Copy K0 from CPU memory to GPU memory
        // Note: before so, decide whether to integrate as a part of InsertionSim or
        //      CUDA content as a separate class
        //state = cublasSetMatrix(theSim->Ndim, theSim->Ndim, sizeof(*theSim->K0),
        //                      theSim->K0, theSim->Ndim, K0cuda, theSim->Ndim);
        //if(state != CUBLAS_STATUS_SUCCESS) {
        //  printf("Error copy to video memory.\n");
        //  return -1;
        //}

        // Free memory
        if(cublasFree(K0cuda) != CUBLAS_STATUS_SUCCESS) {
            printf("Error freeing video memory.\n");
            return -1;
        }

        // CUDA shutdown
        if(cublasShutdown() != CUBLAS_STATUS_SUCCESS) {
            printf("CUBLAS shutdown error.\n");
            return -1;
        }

        return 0;
    }

Solution

Memory can become fragmented, which means you may still be able to allocate several smaller blocks but not a single large one. Your video card obviously needs some memory for its normal 2D work. If that happens to split the 128 MB into two blocks of almost 64 MB each, you would see exactly this kind of failure.
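
The reasoning above can be illustrated with a toy model. This is only a sketch in plain C, not how the CUDA driver actually manages device memory: it assumes free memory is a list of contiguous free blocks and that a request succeeds only if one single block can hold it. The function names (`total_free`, `largest_free`, `can_allocate`) are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Toy model, not the real CUDA allocator: free memory is a list of
   contiguous free blocks, and a request succeeds only if a single
   block is large enough to hold it. */

/* Sum of all free blocks, i.e. what a "free memory" counter would report. */
size_t total_free(const size_t *blocks, size_t n) {
    size_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += blocks[i];
    return sum;
}

/* Size of the largest contiguous free block. */
size_t largest_free(const size_t *blocks, size_t n) {
    size_t best = 0;
    for (size_t i = 0; i < n; ++i)
        if (blocks[i] > best)
            best = blocks[i];
    return best;
}

/* Returns 1 if a contiguous request of `request` bytes fits, 0 otherwise. */
int can_allocate(const size_t *blocks, size_t n, size_t request) {
    return largest_free(blocks, n) >= request;
}
```

With two free blocks of 60 MiB each (an assumed figure standing in for "almost 64 MB"), `total_free` reports 120 MiB, yet `can_allocate` rejects the 17338896-float request from the question (69,355,584 bytes with 4-byte floats), while a 32 MiB request still fits.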

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow