I have a large set of large arrays that need to be Fourier transformed one after another, repeatedly, and they do not all fit in memory at the same time. A typical array shape is (350, 250000), but it varies considerably. The general procedure is:
    while True:
        for data in data_set:
            array = generate_array(data)
            fft(array, farray)
            do_something_with_farray()
            ifft(farray, array)
            do_something_with_array()
This needs to be fast, so ideally I would make plans for all the arrays beforehand and reuse them in the loop. This is especially important because even constructing a plan with FFTW_ESTIMATE is too slow for me to do inside the loop: construction is more than 10x slower than executing the plan when I build it as pyfftw.FFTW(array, farray, flags=['FFTW_ESTIMATE', 'FFTW_DESTROY_INPUT'], threads=nthread, axes=[-1]). However, each plan holds a reference to the arrays that were used to construct it, so keeping all the plans in memory also keeps all the arrays in memory, which I can't afford.
Is it possible to make pyfftw release the references it holds to the arrays? After all, I am planning to repoint them to fully compatible new arrays inside the loop anyway. If not, is there some other way of getting around this problem? I guess I could make plans for single rows, or for chunks of rows, but that could easily lead to slowdowns.
PS. I use FFTW_ESTIMATE rather than FFTW_MEASURE, despite planning to reuse each plan many times, because FFTW_MEASURE takes forever for these array sizes, and when I specify a time limit, performance is no better than with FFTW_ESTIMATE.
Edit: Actually, the slowness of constructing the plan only happens the first time I construct a plan of that shape (due to wisdom, I guess), so the approach of not storing the plans works after all. Still, if it is possible to store plans without the array references, that would be nice to know about.