Question

Note: this is not really a question. I solved it myself and am posting it here to share what I learned.

I ran into a problem with numpy last night, and this is the short example I reduced it to. At first it looked like a bug to me, but while writing it up I realized it was my own mistake. I hope it helps anyone else who runs into the same issue!

This is reproducible on my Win 7 x64 machine with the WinSDK 7.1 C compiler. Python is 3.3.3, built with MSC v.1600. Numpy is 1.8.0.

0) Brief summary: when I pass an ndarray to my DLL compiled from C code, the C code sees a different array than the one I passed in.

1) Write the C code:

// testdll.c
#include <stdlib.h>

__declspec(dllexport) void copy_ndarray(double *array1, double *array2, size_t array_length);

void copy_ndarray(double *array1, double *array2, size_t array_length)
{
    size_t i;
    for(i=0; i<array_length; i++)
        array2[i] = array1[i];
    return;
}

2) Write the Python code:

import numpy as np
import ctypes


# wrap the function from dll to python
lib = ctypes.cdll.LoadLibrary('./testdll.dll')
fun = lib.copy_ndarray
fun.restype = None
fun.argtypes = [np.ctypeslib.ndpointer(ctypes.c_double), np.ctypeslib.ndpointer(ctypes.c_double), ctypes.c_size_t]
# Initialize array1 and array2
array_length= 10
temp = np.c_[100.*np.ones(array_length), 200.*np.ones(array_length)]
array1 = temp[:, 1]
array2 = np.zeros(array_length)
fun(array1, array2, array_length)
# Print both arrays so the mismatch is visible
print(array1)
print(array2)

3) Run the code. array2 ends up different from array1, even though the C function is a plain element-by-element copy.

Solution

Of course they should be different!

When I wrote array1 = temp[:, 1], array1 was not a standalone ndarray of shape (10,). It is a view into temp, which has shape (10, 2). Think about how the data is laid out in memory: temp is stored row by row, so consecutive elements of array1 sit two doubles apart. When the C pointer advances by sizeof(double), it lands on the next element of temp, not on the next element of array1.
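
To make the layout concrete, here is a small sketch that needs no DLL. It reads memory through a raw pointer the way the C loop does, using ctypes on the Python side. The shorter length n = 4 and the np.ascontiguousarray wrapper (which pins temp to row-major order) are assumptions for this demo, not part of my original script.

```python
import ctypes
import numpy as np

n = 4  # assumption: shorter than the original 10, to keep the output readable
# Pin temp to row-major (C) order so the demo does not depend on np.c_ internals.
temp = np.ascontiguousarray(np.c_[100.0 * np.ones(n), 200.0 * np.ones(n)])
array1 = temp[:, 1]  # a view: no data is copied

print(array1.flags['C_CONTIGUOUS'])  # False
print(array1.strides)                # (16,): two doubles between neighbours

# Walk memory the way the C loop does: start at the view's first element
# and advance one double (8 bytes) at a time, ignoring the stride.
ptr = array1.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
seen_by_c = [ptr[i] for i in range(n)]
print(seen_by_c)  # [200.0, 100.0, 200.0, 100.0]
```

The values from column 0 show up because the pointer steps through temp's rows, not through the view.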

The fix: do not pass a non-contiguous ndarray view to C code that expects a plain contiguous array. Use this line

array1 = temp[:, 1].copy()

to make a copy, instead of simply using the view.
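
A possible variant of the same fix, as a sketch: np.ascontiguousarray copies only when the input is not already contiguous, so it can be applied to every array headed for the C function. The variable names and n = 4 are illustrative.

```python
import numpy as np

n = 4  # illustrative length
# Pin temp to row-major (C) order so the demo does not depend on np.c_ internals.
temp = np.ascontiguousarray(np.c_[100.0 * np.ones(n), 200.0 * np.ones(n)])

view = temp[:, 1]                          # strided view into temp
fixed = np.ascontiguousarray(view)         # strided input, so a fresh copy is made
print(fixed.flags['C_CONTIGUOUS'])         # True
print(np.may_share_memory(fixed, temp))    # False

plain = np.zeros(n)                        # already contiguous
same = np.ascontiguousarray(plain)         # no copy is needed here
print(np.may_share_memory(same, plain))    # True
```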

The corrected Python code is:

import numpy as np
import ctypes


# wrap the function from dll to python
lib = ctypes.cdll.LoadLibrary('./testdll.dll')
fun = lib.copy_ndarray
fun.restype = None
fun.argtypes = [np.ctypeslib.ndpointer(ctypes.c_double), np.ctypeslib.ndpointer(ctypes.c_double), ctypes.c_size_t]
# Initialize array1 and array2
array_length= 10
temp = np.c_[100.*np.ones(array_length), 200.*np.ones(array_length)]
array1 = temp[:, 1].copy()
array2 = np.zeros(array_length)
fun(array1, array2, array_length)
# Print both arrays to confirm they now match
print(array1)
print(array2)
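
One way to catch this mistake early, sketched under the assumption that the C function only ever handles contiguous 1-D double arrays: give np.ctypeslib.ndpointer the ndim and flags arguments. A strided view then raises a TypeError instead of silently handing C the wrong memory. The example calls from_param directly, which is the hook ctypes invokes for each argument, so it runs without the DLL. The name ContiguousDoubles is made up for illustration.

```python
import numpy as np

# Hypothetical name; describes "1-D, float64, C-contiguous" arguments.
ContiguousDoubles = np.ctypeslib.ndpointer(
    dtype=np.float64, ndim=1, flags='C_CONTIGUOUS')

n = 4  # illustrative length
# Pin temp to row-major (C) order so the demo does not depend on np.c_ internals.
temp = np.ascontiguousarray(np.c_[100.0 * np.ones(n), 200.0 * np.ones(n)])

# The contiguous copy passes the check.
ContiguousDoubles.from_param(temp[:, 1].copy())

# The strided view is rejected before any C code would run.
try:
    ContiguousDoubles.from_param(temp[:, 1])
    rejected = False
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
```

With the real library, this type would replace the two ndpointer entries in fun.argtypes.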

I personally find this tricky because, as a data analyst (honestly I'm not, I'm a researcher, but close enough!), 99% of the time a view is better than a copy: it is faster, and we don't need the original ndarray once the data has been read in anyway.

It is good to learn this and keep it in mind!

Licensed under: CC-BY-SA with attribution