Defining the type of fid won't help, because calling Python functions is still costly. Try compiling your example with the "-a" flag to see what I mean. However, you can use low-level C functions for file handling to avoid the Python overhead in your loop. For the sake of example, I assumed that the data starts right at the beginning of the file and that its type is double.

    from libc.stdio cimport *

    cdef extern from "stdio.h":
        FILE *fdopen(int, const char *)

    import numpy as np
    cimport numpy as np

    DTYPE = np.double  # or whatever your type is
    ctypedef np.double_t DTYPE_t  # or whatever your type is

    def FromFileSkip(fid, int count=1, int skip=0):
        cdef int k
        cdef FILE* cfile
        cdef np.ndarray[DTYPE_t, ndim=1] data
        cdef DTYPE_t* data_ptr

        cfile = fdopen(fid.fileno(), 'rb')  # attach a C stream to the file descriptor
        data = np.zeros(count, dtype=DTYPE)
        data_ptr = <DTYPE_t*>data.data

        # maybe skip some header bytes here
        # ...

        for k in range(count):
            # fread returns the number of items read, so anything but 1 means EOF or an error
            if fread(<void*>(data_ptr + k), sizeof(DTYPE_t), 1, cfile) != 1:
                break
            # fseek returns nonzero on failure
            if fseek(cfile, skip, SEEK_CUR):
                break

        return data

Note that the output of cython -a example.pyx shows no Python overhead inside the loop.
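
To sanity-check the compiled function, it may help to compare it against a slow pure-Python reference that follows the same pattern: read one item, then seek skip bytes forward, count times. The sketch below is only an illustration. The name from_file_skip_reference, the fmt parameter, and the 12-byte record layout (one double followed by 4 padding bytes) are assumptions made up for the test, not part of the code above. It also returns a plain list that stops at a short read, whereas the Cython version returns a zero-filled array of length count.

```python
import io
import struct

def from_file_skip_reference(fid, count=1, skip=0, fmt="d"):
    """Read up to `count` items of struct type `fmt`, seeking `skip` bytes past each one."""
    size = struct.calcsize("=" + fmt)  # "=" means native byte order, no padding
    values = []
    for _ in range(count):
        chunk = fid.read(size)
        if len(chunk) < size:  # short read: end of file reached
            break
        values.append(struct.unpack("=" + fmt, chunk)[0])
        fid.seek(skip, io.SEEK_CUR)  # jump over the bytes we do not want
    return values

# Hypothetical test data: five records, each a double plus a 4-byte int of padding.
records = b"".join(struct.pack("=di", float(i), 0) for i in range(5))
result = from_file_skip_reference(io.BytesIO(records), count=5, skip=4)
print(result)  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Writing the same bytes to a real file and passing the open file object to FromFileSkip with count=5 and skip=4 should give the same five values, assuming the Python-level file position has not been advanced by buffered reads before the call.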