I'm curious if there is a better way of doing a numpy ravel+reshape.
I load up a large stack of large images and get an array of shape (num-rasters, h, w), where num-rasters is the number of images and h/w are the height/width of an image (the images are all the same size).
I wish to convert the array into a shape (h*w, num-rasters).
Here is the way I do it now:
res = my_function(some_variable) #(num-rasters, h, w)
res = res.ravel(order='F').reshape((res.shape[1] * res.shape[2], res.shape[0])) #(h*w, num-rasters)
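To make the transformation concrete, here is a minimal sketch on a tiny stand-in stack (the array contents and sizes are made up for illustration). Each row of the result holds one pixel position across all rasters, and each column is one image flattened in Fortran (column-major) order because of the order='F' ravel:

```python
import numpy as np

# Hypothetical tiny stand-in for the image stack: 2 rasters of 2x3 pixels
res = np.arange(12).reshape(2, 2, 3)  # (num_rasters, h, w)

out = res.ravel(order='F').reshape((res.shape[1] * res.shape[2], res.shape[0]))

print(out.shape)          # (6, 2) == (h*w, num_rasters)
print(out[:, 0].tolist()) # image 0, pixels in Fortran order: [0, 3, 1, 4, 2, 5]
```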
It works fine, but my 'res' variable (the stack of images) is several GB in size, and even with a ton of RAM (32 GB), the operation consumes all of it.
I'm curious if any pythonistas or numpy pros have any suggestions.
thanks!
############### post question edit/follow-up
First, reshaping in place ended up being waaaaay faster than a .reshape() call, which would presumably return a copy with all the associated memory overhead. I should have known better with that.
Shortly after I posted, I discovered "swapaxes" (http://docs.scipy.org/doc/numpy/reference/generated/numpy.swapaxes.html), so I made a version with that too:
res2 = res.swapaxes(0, 2).reshape((res.shape[1] * res.shape[2], res.shape[0]))
It took 9.2 seconds.
That was only a wee bit faster than my original (9.3 seconds), and with only one discernible memory peak in my process... but still a big and slow peak.
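A quick way to see why the swapaxes version still peaks: swapaxes itself returns a view (no data copied), but the view is not C-contiguous, so the subsequent reshape silently falls back to making a full copy. A minimal check, with a made-up small array:

```python
import numpy as np

res = np.zeros((4, 8, 16))            # hypothetical small stand-in stack
view = res.swapaxes(0, 2)             # just a view: shares memory with res
print(np.shares_memory(res, view))    # True
print(view.flags['C_CONTIGUOUS'])     # False

# Reshaping a non-contiguous view can't be done as a view,
# so NumPy makes a copy here -- that's the memory peak:
res2 = view.reshape(8 * 16, 4)
print(np.shares_memory(res, res2))    # False
```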
As magic suggested:
res.shape = (res.shape[0], res.shape[1]*res.shape[2])
res_T = res.T
This took basically no time (2.4e-5 seconds), with no memory spike.
And throwing in a copy:
res.shape = (res.shape[0], res.shape[1]*res.shape[2])
res_T = res.T.copy()
makes the operation take 0.85 seconds, with a similar (but brief) memory spike for the copy.
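Checking with np.shares_memory confirms why there is no spike until .copy(): the in-place shape assignment only changes metadata, and .T is just another view of the same buffer. One caveat worth noting: compared to the original order='F' ravel, this flattens each image in C (row-major) order, so the rows of the result are in a different pixel order, though the shape and per-column meaning are the same. A sketch with a made-up small array:

```python
import numpy as np

res = np.arange(24, dtype=float).reshape(2, 3, 4)  # (num_rasters, h, w)

# In-place reshape: only metadata changes, no data moves
res.shape = (res.shape[0], res.shape[1] * res.shape[2])
res_T = res.T                              # transpose is also just a view
print(np.shares_memory(res, res_T))        # True -- no copy, no spike

res_T_copy = res.T.copy()                  # materializes the (h*w, n) layout
print(np.shares_memory(res, res_T_copy))   # False -- this is the brief spike
```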
The take-home for me is that 'swapaxes' does the same thing as a transpose, but lets you swap whichever axes you want, whereas .T has one fixed way of flipping (it reverses all of the axes). It's also nice to see how a transpose behaves in 3-D; that is the main point for me: not needing to ravel at all. Also, the transpose returns a view, so no data gets copied until you explicitly ask for it.
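As a small footnote to the swapaxes-vs-transpose point: in 3-D, reversing all axes happens to be the same permutation as swapping axes 0 and 2, so .T, transpose(2, 1, 0), and swapaxes(0, 2) all produce the same view. A quick sketch on a made-up array:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# swapaxes(0, 2) and transpose(2, 1, 0) are the same view in 3-D
same_swap = np.array_equal(a.swapaxes(0, 2), a.transpose(2, 1, 0))
# plain .T reverses all axes, which for 3-D is also the (2, 1, 0) permutation
same_T = np.array_equal(a.T, a.transpose(2, 1, 0))
print(same_swap, same_T)  # True True
```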