Efficient simultaneous update of objects in a list, possibly involving copying. How?

https://stackoverflow.com/questions/18940845

29-06-2022
|

Pergunta

Suppose I have a list g (In practice a doubly nested list which is three dimensions, but here it's simplified to one dimension for clarity.)

g = [1,2,3,2,1]

I want a function that can make g[x] = g[x-1]+g[x]. An incorrect way would be

def f(thing):
    for x in xrange(0,len(thing)):
        thing[x] += thing[x-1]
f(g)

which is wrong because it updates the integers one-by-one instead of all at once. An alternative is to copy g.

gcopy = g[:]
def f(thing, copy):
    for x in xrange(0,len(thing)):
        thing[x] = copy[x]+copy[x-1]
g = gcopy[:]
f(g,gcopy)

If g has objects, as in my case, copy.deepcopy(g) seems to work.

The issue is that copying becomes my performance bottleneck and deepcopy takes as long to run as the rest of my code combined.

I've searched around on SO and Google for ideas/solutions, and brainstormed a few, but none seemed promising.

An answer on this thread: https://stackoverflow.com/a/16711895/1858363, suggests returning the objects instead of modifying them. I am confused by what that entails though. How does returning the modified objects help?

I've heard that copying objects is computationally expensive with deepcopy. If this is true, a possible solution is to replace the objects with lists and store the attributes of the objects as integers and floats in the list. This has the effect of making everything nearly unreadable and might get even more confusing since I have subclasses and inheritance. Since there are only lists, it is then hopefully possible to shallowcopy every list, sublist, subsublist, etc, and reassemble them. Probably not a good solution and it's doubtful the speedup is substantial (though i haven't tested it).

To summarize, is there a way to more efficiently simultaneously change the values of every object in a list with or without copying the list? Or am i stuck with deepcopy? Thank you!

Solução

In general, I don't think you are going to find a fast way to do an in-place list edit with this kind of data dependency. Backwards iteration will work if you only depend on earlier items. Forward iteration will work if you only have forward dependencies. If you have both, you are going to need temporary variables to store the original values until they are no longer needed. Especially with multi-dimensional lists this could end up being less efficient than just copying the lists.

Can you use numpy arrays to do what you want? Numpy is good at this sort of operation and while it would end up making a copy, it is much faster to copy a numpy array than to do a deepcopy() of a nested list.

Outras dicas

To be honest I have no idea what you're doing. But have you tried backwards iteration? It's a powerful technique.

def f(thing):
    for x in xrange(len(thing) -1, 0, -1):
        thing[x] += thing[x-1]
f(g)

This will allow you to update without messing stuff up. You can also use backwards iteration to delete elements from a list. Essentially any operation which relies only on previous elements of a sequence remaining stable can be done through backwards iteration.

If you want to additionally do g[0] = g[0] + g[len(g)-1] then change the '0' in the code above to a '-1'. It will tell the loop to go one step further.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow