Pythonic Way of Inserting Columns at Arbitrary Positions

https://stackoverflow.com/questions/20272275

06-08-2022

Question

I have a grid of data represented by a list of rows:

tableData = [
    [-27.37, 36.61 , 8.90  , -11.20, -36.03, -42.34],
    [16.83 , -33.45, -5.15 , 12.90 , -48.60, -8.70],
    [-19.73, 2.64  , 7.21  , 24.16 , 18.38 , 20.47],
    [-31.05, 15.07 , 42.69 , -32.13, -36.02, 42.31],
    [15.18 , 30.54 , -47.31, 48.38 , 31.60 , -1.98]
]

Now I want to insert two columns containing None values, for example between columns 2 and 3 and between columns 4 and 5, so that I have the following:

tableDataWithNones = [
    [-27.37, 36.61 , None,  8.90  , -11.20, None, -36.03, -42.34],
    [16.83 , -33.45, None,  -5.15 , 12.90 , None,  -48.60, -8.70],
    [-19.73, 2.64  , None,  7.21  , 24.16 , None,  18.38 , 20.47],
    [-31.05, 15.07 , None,  42.69 , -32.13, None,  -36.02, 42.31],
    [15.18 , 30.54 , None,  -47.31, 48.38 , None,  31.60 , -1.98]
]

I can do it with a double for loop like so:

spacerPositions = [2, 4]
for i in tableData:
    for j in reversed(spacerPositions):
        i.insert(j, None)

But that doesn't feel like the Pythonic way to do this.

I was thinking I could transpose the data using numpy so that columns become rows, use insert to put rows of Nones in, and then transpose the data back. But insert can't take multiple values for the index, so I would still have to use a for loop.

Any ideas how I can do this in a better way?

Solution

Using numpy.insert:

>>> import numpy as np
>>> arr = np.array(tableData)
>>> np.insert(arr, (2, 4), None, axis=1)
array([[-27.37,  36.61,    nan,   8.9 , -11.2 ,    nan, -36.03, -42.34],
       [ 16.83, -33.45,    nan,  -5.15,  12.9 ,    nan, -48.6 ,  -8.7 ],
       [-19.73,   2.64,    nan,   7.21,  24.16,    nan,  18.38,  20.47],
       [-31.05,  15.07,    nan,  42.69, -32.13,    nan, -36.02,  42.31],
       [ 15.18,  30.54,    nan, -47.31,  48.38,    nan,  31.6 ,  -1.98]])
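Note that the output above contains nan rather than None: the array has a float dtype, so the inserted None is converted. If actual None values are required, one option is to build the array with dtype=object. The following is a minimal sketch of both variants, not part of the original answer; the names table_data, with_nan and with_none are illustrative.

```python
import numpy as np

table_data = [
    [-27.37, 36.61, 8.90, -11.20, -36.03, -42.34],
    [16.83, -33.45, -5.15, 12.90, -48.60, -8.70],
    [-19.73, 2.64, 7.21, 24.16, 18.38, 20.47],
    [-31.05, 15.07, 42.69, -32.13, -36.02, 42.31],
    [15.18, 30.54, -47.31, 48.38, 31.60, -1.98],
]

# Float array: the inserted None is stored as nan.
float_arr = np.array(table_data)
with_nan = np.insert(float_arr, (2, 4), None, axis=1)
print(with_nan.dtype, with_nan.shape)

# Object array: None is kept as None, at the cost of losing fast float storage.
obj_arr = np.array(table_data, dtype=object)
with_none = np.insert(obj_arr, (2, 4), None, axis=1).tolist()
print(with_none[0])
```

The object-dtype variant gives back plain lists containing None after tolist(), but it is likely to be slower than the float version for large tables.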

OTHER TIPS

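A non-numpy, functional way of doing it:

from functools import partial

def insertFunc(data, locs):
    # Build a new row, adding None before each original index listed in locs
    out = []

    for i, d in enumerate(data):
        if i in locs:
            out.append(None)

        out.append(d)

    return out

# In Python 3, map returns an iterator, so wrap it in list() to get the rows
list(map(partial(insertFunc, locs=spacerPositions), tableData))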
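As a usage sketch of the append-based idea above, here is a self-contained variant applied to a small sample table. The function name insert_none_columns and the sample data are hypothetical. The handling of a position equal to the row length (a trailing column) is an addition that the original function does not have.

```python
def insert_none_columns(rows, positions):
    """Return new rows with None inserted before each original column index in positions."""
    wanted = set(positions)  # set membership tests are O(1)
    new_rows = []
    for row in rows:
        new_row = []
        for index, value in enumerate(row):
            if index in wanted:
                new_row.append(None)
            new_row.append(value)
        # A position equal to len(row) is treated as "append a trailing column".
        if len(row) in wanted:
            new_row.append(None)
        new_rows.append(new_row)
    return new_rows


sample = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
result = insert_none_columns(sample, [2, 4])
print(result)
# [[1, 2, None, 3, 4, None, 5, 6], [7, 8, None, 9, 10, None, 11, 12]]
```

Because the rows are rebuilt rather than modified, the input table is left unchanged.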

Comparing this to

# Iterate positions in reverse so earlier inserts don't shift later ones
for i in reversed(spacerPositions):
    for row in tableData:
        row.insert(i, None)

with timeit(..., number=10000), I get:

0.144109010696 seconds for the first algorithm
0.606826066971 seconds for the second algorithm

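This makes sense since insert is O(n) but append is O(1). Note also that the second loop modifies tableData in place, so if the same list is reused across timeit iterations, its rows keep growing, which would slow it down further.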
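A comparison along these lines can be sketched as follows. This is not the original benchmark: the function names with_append and with_insert and the generated table are assumptions, and the insert-based version copies each row first so that repeated runs do not keep lengthening the input. Timings depend on the machine and Python version, so none are shown.

```python
import timeit

# Small stand-in table with the same shape as the one in the question.
table = [[float(r * 6 + c) for c in range(6)] for r in range(5)]
positions = [2, 4]


def with_append(rows, locs):
    # Rebuild every row, appending None before each listed index.
    out = []
    for row in rows:
        new_row = []
        for i, value in enumerate(row):
            if i in locs:
                new_row.append(None)
            new_row.append(value)
        out.append(new_row)
    return out


def with_insert(rows, locs):
    # Copy first so the input is not mutated and rows do not grow across runs.
    rows = [list(row) for row in rows]
    for row in rows:
        for i in reversed(locs):
            row.insert(i, None)
    return rows


# Both approaches should produce the same table.
assert with_append(table, positions) == with_insert(table, positions)

t_append = timeit.timeit(lambda: with_append(table, positions), number=10000)
t_insert = timeit.timeit(lambda: with_insert(table, positions), number=10000)
print(f"append-based: {t_append:.4f} s, insert-based: {t_insert:.4f} s")
```

With rows this short, the difference between the two is likely to be dominated by interpreter overhead rather than by the asymptotic cost of insert.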

Surprisingly, timing the accepted numpy solution with:

timeit.timeit('np.insert(arr, (2, 4), None, axis=1)', number=10000, setup='import numpy as np; from __main__ import tableData; arr = np.array(tableData)')

gives 0.44481992721557617 seconds.

Licensed under: CC-BY-SA with attribution
