Question

I have a 2d array that looks like this:

XX
xx

What's the most efficient way to add an extra row and column:

xxy
xxy
yyy

For bonus points, I'd like to also be able to knock out single rows and columns, so for example in the matrix below I'd like to be able to knock out all of the a's leaving only the x's - specifically I'm trying to delete the nth row and the nth column at the same time - and I want to be able to do this as quickly as possible:

xxaxx
xxaxx
aaaaa
xxaxx
xxaxx
Was it helpful?

Solution

The shortest in terms of lines of code i can think of is for the first question.

>>> import numpy as np
>>> p = np.array([[1,2],[3,4]])

>>> p = np.append(p, [[5,6]], 0)
>>> p = np.append(p, [[7],[8],[9]],1)

>>> p
array([[1, 2, 7],
   [3, 4, 8],
   [5, 6, 9]])

And the for the second question

    p = np.array(range(20))
>>> p.shape = (4,5)
>>> p
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
>>> n = 2
>>> p = np.append(p[:n],p[n+1:],0)
>>> p = np.append(p[...,:n],p[...,n+1:],1)
>>> p
array([[ 0,  1,  3,  4],
       [ 5,  6,  8,  9],
       [15, 16, 18, 19]])

OTHER TIPS

A useful alternative answer to the first question, using the examples from tomeedee’s answer, would be to use numpy’s vstack and column_stack methods:

Given a matrix p,

>>> import numpy as np
>>> p = np.array([ [1,2] , [3,4] ])

an augmented matrix can be generated by:

>>> p = np.vstack( [ p , [5 , 6] ] )
>>> p = np.column_stack( [ p , [ 7 , 8 , 9 ] ] )
>>> p
array([[1, 2, 7],
       [3, 4, 8],
       [5, 6, 9]])

These methods may be convenient in practice than np.append() as they allow 1D arrays to be appended to a matrix without any modification, in contrast to the following scenario:

>>> p = np.array([ [ 1 , 2 ] , [ 3 , 4 ] , [ 5 , 6 ] ] )
>>> p = np.append( p , [ 7 , 8 , 9 ] , 1 )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/dist-packages/numpy/lib/function_base.py", line 3234, in append
    return concatenate((arr, values), axis=axis)
ValueError: arrays must have same number of dimensions

In answer to the second question, a nice way to remove rows and columns is to use logical array indexing as follows:

Given a matrix p,

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )

suppose we want to remove row 1 and column 2:

>>> r , c = 1 , 2
>>> p = p [ np.arange( p.shape[0] ) != r , : ] 
>>> p = p [ : , np.arange( p.shape[1] ) != c ]
>>> p
array([[ 0,  1,  3,  4],
       [10, 11, 13, 14],
       [15, 16, 18, 19]])

Note - for reformed Matlab users - if you wanted to do these in a one-liner you need to index twice:

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )    
>>> p = p [ np.arange( p.shape[0] ) != r , : ] [ : , np.arange( p.shape[1] ) != c ]

This technique can also be extended to remove sets of rows and columns, so if we wanted to remove rows 0 & 2 and columns 1, 2 & 3 we could use numpy's setdiff1d function to generate the desired logical index:

>>> p = np.arange( 20 ).reshape( ( 4 , 5 ) )
>>> r = [ 0 , 2 ]
>>> c = [ 1 , 2 , 3 ]
>>> p = p [ np.setdiff1d( np.arange( p.shape[0] ), r ) , : ] 
>>> p = p [ : , np.setdiff1d( np.arange( p.shape[1] ) , c ) ]
>>> p
array([[ 5,  9],
       [15, 19]])

Another elegant solution to the first question may be the insert command:

p = np.array([[1,2],[3,4]])
p = np.insert(p, 2, values=0, axis=1) # insert values before column 2

Leads to:

array([[1, 2, 0],
       [3, 4, 0]])

insert may be slower than append but allows you to fill the whole row/column with one value easily.

As for the second question, delete has been suggested before:

p = np.delete(p, 2, axis=1)

Which restores the original array again:

array([[1, 2],
       [3, 4]])

I find it much easier to "extend" via assigning in a bigger matrix. E.g.

import numpy as np
p = np.array([[1,2], [3,4]])
g = np.array(range(20))
g.shape = (4,5)
g[0:2, 0:2] = p

Here are the arrays:

p

   array([[1, 2],
       [3, 4]])

g:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

and the resulting g after assignment:

   array([[ 1,  2,  2,  3,  4],
       [ 3,  4,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

You can use:

>>> np.concatenate([array1, array2, ...]) 

e.g.

>>> import numpy as np
>>> a = [[1, 2, 3],[10, 20, 30]]
>>> b = [[100,200,300]]
>>> a = np.array(a) # not necessary, but numpy objects prefered to built-in
>>> b = np.array(b) # "^
>>> a
array([[ 1,  2,  3],
       [10, 20, 30]])
>>> b
array([[100, 200, 300]])
>>> c = np.concatenate([a,b])
>>> c
array([[  1,   2,   3],
       [ 10,  20,  30],
       [100, 200, 300]])
>>> print c
[[  1   2   3]
 [ 10  20  30]
 [100 200 300]]

~-+-~-+-~-+-~

Sometimes, you will come across trouble if a numpy array object is initialized with incomplete values for its shape property. This problem is fixed by assigning to the shape property the tuple: (array_length, element_length).

Note: Here, 'array_length' and 'element_length' are integer parameters, which you substitute values in for. A 'tuple' is just a pair of numbers in parentheses.

e.g.

>>> import numpy as np
>>> a = np.array([[1,2,3],[10,20,30]])
>>> b = np.array([100,200,300]) # initialize b with incorrect dimensions
>>> a.shape
(2, 3)
>>> b.shape
(3,)
>>> c = np.concatenate([a,b])

Traceback (most recent call last):
  File "<pyshell#191>", line 1, in <module>
    c = np.concatenate([a,b])
ValueError: all the input arrays must have same number of dimensions
>>> b.shape = (1,3)
>>> c = np.concatenate([a,b])
>>> c
array([[  1,   2,   3],
       [ 10,  20,  30],
       [100, 200, 300]])

maybe you need this.

>>> x = np.array([11,22])
>>> y = np.array([18,7,6])
>>> z = np.array([1,3,5])
>>> np.concatenate((x,y,z))
array([11, 22, 18,  7,  6,  1,  3,  5])
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top