Question

I'm using SciPy alongside MATLAB, and I can't build a NumPy array from scratch that matches what scipy.io.loadmat() returns for a MATLAB cell array.

For example, say I create a cell array containing a pair of double arrays in MATLAB and then load it using scipy.io (I'm using SPM for imaging analyses in conjunction with pynifti and the like):

MATLAB

>> onsets{1} = [0 30 60 90]
>> onsets{2} = [15 45 75 105]
>> save('onsets.mat', 'onsets')

Python

>>> import scipy.io as scio
>>> mat = scio.loadmat('onsets.mat')
>>> mat['onsets'][0]
array([[[ 0 30 60 90]], [[ 15  45  75 105]]], dtype=object)

>>> mat['onsets'][0].shape
(2,)

My question is this: Why does this numpy array have the shape (2,) instead of (2,1,4)? In real life I'm trying to use Python to parse a logfile and build these onsets cell arrays, so I'd like to be able to build them from scratch.

When I try to build the same array from the printed output, I get a different shape back:

>>> from numpy import array
>>> new_onsets = array([[[ 0, 30, 60, 90]], [[ 15,  45,  75, 105]]], dtype=object)
>>> new_onsets
array([[[0, 30, 60, 90]],

       [[15, 45, 75, 105]]], dtype=object)

>>> new_onsets.shape
(2, 1, 4)

Unfortunately, the shape (vectors of doubles in a cell array) is coded in a spec upstream, so I need to be able to get this saved exactly in this format. Of course, it's not a big deal since I could just write the parser in MATLAB, but it would be nice to figure out what's going on and add a little to my [minuscule] knowledge of numpy.


Solution 2

Travis from the scipy mailing list responded that the right way to build this is to create the structure first, then populate the arrays:

http://article.gmane.org/gmane.comp.python.scientific.user/31760

> You could build what you saw before with: 
> 
> new_onsets = empty((2,), dtype=object) 
> new_onsets[0] = array([[0, 30, 60, 90]]) 
> new_onsets[1] = array([[15, 45, 75, 105]])
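
The quoted recipe can be checked with a round trip through scipy.io. This is a minimal sketch, not part of the original reply: it assumes the arrays should be doubles (hence `dtype=float`, to match MATLAB's default type), and it writes to an in-memory `io.BytesIO` buffer instead of a real onsets.mat file so the example is self-contained.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Create the two-slot container first, then fill each slot with a 1x4 array.
new_onsets = np.empty((2,), dtype=object)
new_onsets[0] = np.array([[0, 30, 60, 90]], dtype=float)
new_onsets[1] = np.array([[15, 45, 75, 105]], dtype=float)

# Object arrays are written as MATLAB cell arrays.
# The buffer stands in for a file such as 'onsets.mat' (an assumption
# made only to keep this sketch self-contained).
buf = io.BytesIO()
savemat(buf, {"onsets": new_onsets})

# Read it back from a fresh buffer and inspect the structure.
reloaded = loadmat(io.BytesIO(buf.getvalue()))["onsets"]
print(reloaded.shape)        # outer cell array
print(reloaded[0, 0].shape)  # first cell's contents
```

Passing a real filename to `savemat` in place of the buffer should produce a file that MATLAB opens as a 1x2 cell array of row vectors.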

OTHER TIPS

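This is one of those things I personally find kind of annoying about scipy.io. It happens because loadmat can "squeeze" out dimensions of length 1.

This answer was written when squeeze_me=True appeared to be the default; current SciPy releases default to squeeze_me=False, so pass the flag explicitly. With squeezing on, you get this: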

>>> import scipy.io as sio
>>> x = sio.loadmat('mymat.mat', squeeze_me=True)
>>> y = x['onsets']
>>> y.shape
(2,)

If you use loadmat with squeeze_me set to False then you don't get one dimension squeezed out:

>>> a = sio.loadmat('mymat.mat', squeeze_me=False)
>>> b = a['onsets']
>>> b.shape
(1, 2)
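
The two settings can be compared side by side without a MATLAB session. The sketch below is an illustration under assumptions: it builds a stand-in for the 1x2 cell array in NumPy and saves it to an in-memory buffer rather than to mymat.mat.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Stand-in for the MATLAB cell array: a 1x2 object array of 1x4 doubles.
cell = np.empty((1, 2), dtype=object)
cell[0, 0] = np.array([[0.0, 30.0, 60.0, 90.0]])
cell[0, 1] = np.array([[15.0, 45.0, 75.0, 105.0]])

buf = io.BytesIO()
savemat(buf, {"onsets": cell})
raw = buf.getvalue()

# Same bytes, loaded twice with the flag set explicitly each time.
unsqueezed = loadmat(io.BytesIO(raw), squeeze_me=False)["onsets"]
squeezed = loadmat(io.BytesIO(raw), squeeze_me=True)["onsets"]

print(unsqueezed.shape)  # keeps the length-1 leading dimension
print(squeezed.shape)    # length-1 dimension removed
```

Setting the flag explicitly avoids depending on whichever default the installed SciPy version uses.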

That said, I can't for the life of me figure out how to get another dimension to show up (that is, b.shape == (1, 2, 4)) for a cell array like 'onsets'. I've only been able to get it for plain-old vanilla (non-cell) MATLAB arrays, such as:

onset_array = [onsets{1}; onsets{2}];

I think the problem here is that cell arrays aren't really arrays, which is why scio.loadmat loads onsets.mat to an object array.
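
The difference between a regular array and a cell-like object array can be seen in NumPy alone. In this small sketch the variable names are illustrative, and the second cell is deliberately given a different length to show what an object array permits and a regular array does not.

```python
import numpy as np

# Equal-length nested lists: NumPy descends through every level and builds
# one 3-D block, even when dtype=object is requested.
dense = np.array([[[0, 30, 60, 90]], [[15, 45, 75, 105]]], dtype=object)
print(dense.shape)  # (2, 1, 4)

# Cell-like container: two slots, each holding its own independent array.
cell_like = np.empty(2, dtype=object)
cell_like[0] = np.array([[0.0, 30.0, 60.0, 90.0]])
cell_like[1] = np.array([[15.0, 45.0, 75.0]])  # a different length is fine
print(cell_like.shape)                     # (2,)
print([item.shape for item in cell_like])  # [(1, 4), (1, 3)]
```

The outer shape of `cell_like` only counts the slots; the shape of each slot's contents is stored with the inner array, which is why loadmat's result reports (2,) rather than (2, 1, 4).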

Here, your cell array could be reduced to a normal array of shape (2,1,4), but what if, instead, your data looked like:

>> onsets{1} = {0 30 60 'bob'}
>> onsets{2} = {15 45 75 'fred'}

I'm not sure what the best solution is, but if you know your data is a regular array, you should probably convert it to a normal array, either before saving in MATLAB or after loading with SciPy.
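
On the Python side, the conversion might look like the sketch below. It assumes every cell holds an array of the same shape, and it builds the object array by hand as a stand-in for what loadmat returns.

```python
import numpy as np

# Stand-in for a loaded cell array: two slots, each a 1x4 array of doubles.
onsets = np.empty(2, dtype=object)
onsets[0] = np.array([[0.0, 30.0, 60.0, 90.0]])
onsets[1] = np.array([[15.0, 45.0, 75.0, 105.0]])

# Stack along a new leading axis, keeping each cell's own 1x4 shape.
stacked = np.stack(list(onsets))
print(stacked.shape)  # (2, 1, 4)

# Or join the rows into a plain 2-D array, like [onsets{1}; onsets{2}].
rows = np.vstack(list(onsets))
print(rows.shape)  # (2, 4)
```

Both calls raise an error if the cells differ in shape, which is a useful check that the data really is a regular array.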

Edit: The example cell array above could, in theory, be cast into a numpy structured array, but note that's not generally true of cell arrays, because the entries don't have to share a data type or a length. The logical way to represent lists of arbitrary data types is a generic container such as a Python list or, as here, a NumPy object array, which is what loadmat returns.

Edit 2: Fix cell array syntax, as suggested by Erik Kastman.

Licensed under: CC-BY-SA with attribution