Question

Is there a way to force genfromtxt to output data with the shape (xx, 1) when only one column of data is loaded? The usual shape is (xx,). In my example, xx could be any integer.

update: here is an example of code:

import numpy as np
a = np.zeros([1000, 10])
nbcols = 1
for ind in range(0, 10, nbcols):
    a[:, ind : ind + nbcols] = np.genfromtxt('file_1000x10.csv', usecols = range(nbcols))

This piece of code works only for nbcols >= 2, assuming nbcols is an integer in [1, 10]. Is there a way to make it work for nbcols = 1 without adding an if statement?

I simplified the original code too much for this post, though that shouldn't affect the answers to my problem. In the real code, the filename is built from a variable as follows:

filename = 'file_1000x10_' + '%02d' % ind.astype(int) + '.csv'

So at each iteration in the for loop, np.genfromtxt loads data from another file.

Solution

I think the trick is to call reshape(-1, nbcols) on what you get from np.genfromtxt, so your assignment should look like:

a[:, ind:ind + nbcols] = np.genfromtxt('file_1000x10.csv',
                                       usecols = range(nbcols)).reshape(-1, nbcols)
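
As a minimal sketch of why the reshape helps, the snippet below reads from an in-memory string instead of a file. The sample text and the helper name load_block are made up for illustration; only np.genfromtxt and reshape come from the answer.

```python
import numpy as np
from io import StringIO

# Hypothetical stand-in for the CSV file: 4 rows, 3 whitespace-separated columns.
text = "0 1 2\n3 4 5\n6 7 8\n9 10 11"

def load_block(text, nbcols):
    """Read the first nbcols columns and always return a 2-D array."""
    raw = np.genfromtxt(StringIO(text), usecols=range(nbcols))
    # For nbcols == 1 genfromtxt returns shape (rows,); reshape restores (rows, 1).
    return raw.reshape(-1, nbcols)

one_col = load_block(text, 1)
two_cols = load_block(text, 2)
print(one_col.shape)   # (4, 1)
print(two_cols.shape)  # (4, 2)
```

Because the reshape is a no-op when the array is already (rows, nbcols), the same line covers every value of nbcols without an if statement.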

On a separate note, if every iteration reads the same file, looping over ind and rereading it each time is unnecessary. You can do a little bit of higher dimensionality voodoo as follows:

import numpy as np
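from io import StringIO

def make_data(rows, cols):
    # Build an in-memory, space-separated table holding 0 .. rows*cols - 1.
    data = ((str(k + cols * j) for k in range(cols)) for j in range(rows))
    data = '\n'.join(' '.join(row) for row in data)
    return StringIO(data)

def read_data(f, rows, cols, nbcols):
    # One slot of width nbcols per block; ceil(cols / nbcols) blocks per row.
    a = np.zeros((rows, (cols + nbcols - 1) // nbcols, nbcols))
    # Broadcast the first nbcols columns of the file into every block.
    a[...] = np.genfromtxt(f, usecols=range(nbcols)).reshape(-1, 1, nbcols)
    # Flatten the blocks back into columns and trim any overhang.
    return a.reshape(rows, -1)[:, :cols]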

>>> read_data(make_data(3, 6), 3, 6, 2)
array([[  0.,   1.,   0.,   1.,   0.,   1.],
       [  6.,   7.,   6.,   7.,   6.,   7.],
       [ 12.,  13.,  12.,  13.,  12.,  13.]])
>>> read_data(make_data(3, 6), 3, 6, 1)
array([[  0.,   0.,   0.,   0.,   0.,   0.],
       [  6.,   6.,   6.,   6.,   6.,   6.],
       [ 12.,  12.,  12.,  12.,  12.,  12.]])
>>> read_data(make_data(3, 6), 3, 6, 4)
array([[  0.,   1.,   2.,   3.,   0.,   1.],
       [  6.,   7.,   8.,   9.,   6.,   7.],
       [ 12.,  13.,  14.,  15.,  12.,  13.]])
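
The repeated columns in these outputs come from broadcasting. The sketch below traces the intermediate shapes using a hand-built array (block, a made-up stand-in for what genfromtxt would return) so that it runs without any file.

```python
import numpy as np

rows, cols, nbcols = 3, 6, 4

# Stand-in for the (rows, nbcols) array that genfromtxt would return.
block = np.arange(rows * nbcols, dtype=float).reshape(rows, nbcols)

# Number of nbcols-wide blocks needed to cover cols columns (ceiling division).
nblocks = (cols + nbcols - 1) // nbcols

a = np.zeros((rows, nblocks, nbcols))
# A (rows, 1, nbcols) array broadcasts along the middle axis,
# so every block receives the same nbcols columns.
a[...] = block.reshape(-1, 1, nbcols)

# Merge the last two axes, then drop the columns beyond cols.
result = a.reshape(rows, -1)[:, :cols]
print(a.shape)       # (3, 2, 4)
print(result.shape)  # (3, 6)
print(result[0])
```

The first row prints the values 0, 1, 2, 3 followed by 0, 1 again: the second block is a truncated copy of the first.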

ORIGINAL ANSWER

You can add that extra dimension of size 1 to your_array using:

your_array.reshape(your_array.shape + (1,))

or, for a 1-D array, the equivalent

your_array.reshape(-1, 1)

The same can be achieved with

your_array[..., np.newaxis]

or the equivalent

your_array[..., None]
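
A quick check, assuming a small 1-D array v made up for illustration, shows that all four spellings give the same shape. It also shows where reshape(-1, 1) stops being equivalent: on an array that is already 2-D it flattens everything into one column rather than appending an axis.

```python
import numpy as np

v = np.arange(5.0)  # 1-D, shape (5,)

candidates = [
    v.reshape(v.shape + (1,)),
    v.reshape(-1, 1),
    v[..., np.newaxis],
    v[..., None],
]
shapes = [c.shape for c in candidates]
print(shapes)  # [(5, 1), (5, 1), (5, 1), (5, 1)]

# On a 2-D input the spellings diverge.
m = np.zeros((2, 3))
flattened = m.reshape(-1, 1)   # shape (6, 1)
appended = m[..., np.newaxis]  # shape (2, 3, 1)
print(flattened.shape, appended.shape)
```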

OTHER TIPS

If you can use loadtxt instead of genfromtxt, and if you are using version 1.6.0 or later of numpy, the ndmin argument allows you to specify the (minimum) number of dimensions of the array. E.g.:

a[:, ind : ind + nbcols] = np.loadtxt('file_1000x10.csv', usecols=range(nbcols), ndmin=2)
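
A self-contained sketch of the same idea follows, reading from an in-memory string rather than file_1000x10.csv. The sample text and the 3x3 size are assumptions made for illustration.

```python
import numpy as np
from io import StringIO

# Hypothetical stand-in for the CSV file: 3 rows, 3 whitespace-separated columns.
text = "0 1 2\n3 4 5\n6 7 8"

nbcols = 1
# ndmin=2 keeps the result 2-D even when a single column is selected.
col = np.loadtxt(StringIO(text), usecols=list(range(nbcols)), ndmin=2)
print(col.shape)  # (3, 1)

# The slice assignment from the question now works for nbcols = 1.
a = np.zeros((3, 3))
for ind in range(0, 3, nbcols):
    a[:, ind:ind + nbcols] = np.loadtxt(
        StringIO(text), usecols=list(range(nbcols)), ndmin=2
    )
print(a)
```

As in the question, every iteration reads the first nbcols columns, so each column of a ends up holding 0, 3, 6.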
Licensed under: CC-BY-SA with attribution