First: That's pretty much how it should be done.
Second: It is my understanding that you should use PyMem_Malloc/PyMem_Free here. PyObject_Malloc/PyObject_Free are intended for allocating Python objects and are optimized for a large number of small objects. However, unless you allocate a lot of memory or make a very large number of allocations, it will not matter much.
Third & Fourth: You could use the Capsule API (PyCapsule) to store your C structures and pass them between API functions. Instead of storing a global matrix, you could have your initialization function (initMatrix) return a capsule containing the matrix and pass this capsule to the other functions (e.g. deltaC). When the capsule is no longer referenced, its destructor is called, so the matrix is freed automatically as long as you supply a destructor that releases it.
If you do not plan on storing anything but the matrix, a possible alternative is to use NumPy arrays. Their data is directly accessible from C as a matrix through the NumPy C API. This saves you from calling a Python C API function to retrieve each element, from allocating the memory yourself, and from deallocating it when the matrix is destroyed.