Question

Over at http://scicomp.stackexchange.com I asked this question about parallel matrix algorithms in IDL. The answers suggest using a multi-threaded LAPACK implementation and describe some hacks to get IDL to use a specific LAPACK library, but I haven't been able to get any of them to work.

Ideally, I would like the existing LAPACK DLM to simply use a multi-threaded LAPACK library; it feels like this should be possible, but I have not had any success. Failing that, I guess the next simplest step would be to create a new DLM that wraps a matrix inversion call in some C code (as sketched below) and make sure that DLM links against the desired LAPACK implementation. The documentation for creating DLMs is making me cross-eyed, though, so any pointers on doing this (if it is required) would also be appreciated.
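
For reference, the operation I want to speed up boils down to LAPACK's standard factor-then-invert sequence, so the C the new DLM would wrap is roughly the following (a minimal sketch, assuming a Fortran-convention LAPACK such as a threaded OpenBLAS build with 32-bit integers; invert_matrix is just a name I made up):

/* invert.c: the call I would like a DLM to expose to IDL.
 * Build against the LAPACK of your choice, e.g.:
 *   cc -shared -fPIC invert.c -o invert.so -lopenblas
 */
#include <stdlib.h>

/* Fortran-style LAPACK entry points: every argument is a pointer. */
extern void dgetrf_(int *m, int *n, double *a, int *lda, int *ipiv, int *info);
extern void dgetri_(int *n, double *a, int *lda, int *ipiv,
                    double *work, int *lwork, int *info);

/* Invert an n x n column-major matrix in place; returns LAPACK's info code
 * (0 on success) or -1 if workspace allocation fails. */
int invert_matrix(double *a, int n)
{
    int info = -1;
    int lwork = 64 * n;                       /* generous workspace for dgetri */
    int *ipiv = malloc(n * sizeof(int));
    double *work = malloc(lwork * sizeof(double));

    if (ipiv && work) {
        dgetrf_(&n, &n, a, &n, ipiv, &info);  /* LU factorization */
        if (info == 0)
            dgetri_(&n, a, &n, ipiv, work, &lwork, &info);  /* invert from LU */
    }

    free(work);
    free(ipiv);
    return info;
}

What I have not managed is the DLM glue (the IDL_Load / IDL_SysRtnAdd registration described in the External Development Guide) to expose something like this to IDL.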

Solution

What platform are you targeting?

Looking at idl_lapack.so with nm on my platform (Mac OS X, IDL 8.2.1) suggests that the LAPACK routines are compiled directly into the .so, so my (albeit limited) understanding is that they would not be simple to swap out (e.g., by setting LD_LIBRARY_PATH).

$ nm idl_lapack.so
...
000000000023d5bb t _dgemm_
000000000023dfcb t _dgemv_
000000000009d9be t _dgeqp3_
000000000009e204 t _dgeqr2_
000000000009e41d t _dgeqrf_
000000000023e714 t _dger_
000000000009e9ad t _dgerfs_
000000000009f4ba t _dgerq2_
000000000009f6e1 t _dgerqf_
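
One way to confirm that those symbols really are local (the lowercase 't' in the nm output) rather than exported, and therefore not candidates for swapping in a different LAPACK at load time, is to try to resolve them dynamically. A small sketch (pass the full path to idl_lapack.so on the command line; on Linux, link with -ldl):

/* check_symbols.c: probe whether IDL's bundled LAPACK exports its routines. */
#include <stdio.h>
#include <dlfcn.h>

int main(int argc, char *argv[])
{
    const char *path = (argc > 1) ? argv[1] : "idl_lapack.so";
    const char *names[] = { "dgemm_", "dgetrf_", "dgetri_" };
    void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);

    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* dlsym only finds exported symbols; a NULL result here is consistent
     * with the lowercase 't' (local text) entries shown by nm. */
    for (int i = 0; i < 3; i++) {
        void *sym = dlsym(handle, names[i]);
        printf("%-10s %s\n", names[i], sym ? "exported" : "not exported");
    }

    dlclose(handle);
    return 0;
}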

Some other possibilities...

My personal library has a directory, src/dist_tools/bindings, containing routines for automatically creating bindings for a library given "simple" (i.e., not using typedefs) function prototypes. LAPACK would be fairly easy to create bindings for; an example of the kind of prototype involved is shown below. (The hardest part would probably be building the LAPACK package you want to use: ATLAS, PLAPACK, ScaLAPACK, etc.) The library is free to use; a small consulting contract could be arranged if you would like the bindings written for you.
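
To give an idea of what "simple" means here, the Fortran-style LAPACK entry points are already flat, typedef-free prototypes, e.g. (standard reference-LAPACK signatures, assuming 32-bit Fortran integers):

/* Every argument is a plain pointer to a built-in type, which is exactly
 * the kind of prototype a binding generator can consume directly. */
void dgesv_(int *n, int *nrhs, double *a, int *lda,
            int *ipiv, double *b, int *ldb, int *info);
void dgemm_(char *transa, char *transb, int *m, int *n, int *k,
            double *alpha, double *a, int *lda, double *b, int *ldb,
            double *beta, double *c, int *ldc);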

The next version of GPULib will contain a GPU implementation of LAPACK based on the MAGMA library. This is a highly parallel option, but it only works on CUDA-capable graphics cards. It also works best if operations other than the matrix inversion can be done on the GPU as well, to minimize memory transfers. This option costs money.

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow