Question

I would like to operate on lists element by element without using numpy. For example, I want add([1,2,3], [2,3,4]) = [3,5,7] and mult([1,1,1],[9,9,9]) = [9,9,9], but I'm not sure which way of doing it is considered 'correct' style.

The two solutions I came up with were

def add(list1,list2):
    list3 = []
    for x in xrange(0,len(list1)):
        list3.append(list1[x]+list2[x])
    return list3

def mult(list1, list2):
    list3 = []
    for x in xrange(0,len(list1)):
        list3.append(list1[x]*list2[x])
    return list3

def div(list1, list2):
    list3 = []
    for x in xrange(0,len(list1)):
        list3.append(list1[x]/list2[x])
    return list3

def sub(list1, list2):
    list3 = []
    for x in xrange(0,len(list1)):
        list3.append(list1[x]-list2[x])
    return list3

where each operator is given a separate function

and

def add(a,b):
    return a+b
def mult(a,b):
    return a*b
def div(a,b):
    return a/b
def sub(a,b):
    return a-b
def elementwiseoperation(list1, list2, function):
    list3 = []
    for x in xrange(0,len(list1)):
        list3.append(function(list1[x],list2[x]))
    return list3

where all the basic functions are defined, and I have a separate function to apply them to each element. I skimmed through PEP 8 but didn't find anything directly relevant. Which way is better?


Solution

The normal way to do this would be to use map or itertools.imap:

import operator
multiadd = lambda a,b: map(operator.add, a,b)
print multiadd([1,2,3], [2,3,4]) #=> [3, 5, 7]

Ideone: http://ideone.com/yRLHxW

map is a C-implemented version of your elementwiseoperation, with the advantages of having a standard name, working with any iterable type, and being faster (on some versions; see @nathan's answer for some profiling).
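For readers on Python 3, the same idea can be sketched as below. This assumes Python 3, where map returns a lazy iterator rather than a list, so the result is wrapped in list(). The name elementwise is hypothetical and simply mirrors the elementwiseoperation function from the question.

```python
import operator


def elementwise(function, list1, list2):
    """Apply a two-argument function to paired items of two iterables.

    Hypothetical helper: in Python 3, map() is lazy, so list() forces it.
    """
    return list(map(function, list1, list2))


print(elementwise(operator.add, [1, 2, 3], [2, 3, 4]))  # [3, 5, 7]
print(elementwise(operator.mul, [1, 1, 1], [9, 9, 9]))  # [9, 9, 9]
```

Because map accepts any iterable, the same helper should also work with tuples or generators as inputs.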

Alternatively, you could use partial and map for a pleasingly pointfree style:

import operator
import functools

multiadd = functools.partial(map, operator.add)
print multiadd([1,2,3], [2,3,4]) #=> [3, 5, 7]

Ideone: http://ideone.com/BUhRCW
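The pointfree version still works on Python 3, with one difference worth illustrating: calling the partial returns a lazy map object, not a list. A minimal sketch, assuming Python 3 (multimul is a hypothetical extra name added for illustration):

```python
import functools
import operator

# Each partial fixes the first argument of map to a binary operator.
multiadd = functools.partial(map, operator.add)
multimul = functools.partial(map, operator.mul)

# In Python 3 the call returns a lazy map object; list() materialises it.
result = multiadd([1, 2, 3], [2, 3, 4])
print(list(result))  # [3, 5, 7]
print(list(multimul([1, 1, 1], [9, 9, 9])))  # [9, 9, 9]
```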

Anyway, you've taken the first steps in functional programming yourself. I suggest you read around the topic.

As a general matter of style, iterating by index using range is generally considered the wrong approach if you want to visit every item. The usual way is simply to iterate over the structure directly. Use zip or itertools.izip to iterate in parallel:

for x in l:
    print x

for a,b in zip(l,k):
    print a+b

And the usual way to iterate to create a list is not to use append, but a list comprehension:

[a+b for a,b in itertools.izip(l,k)]
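Putting these two points together, the add function from the question could be sketched roughly as follows. The explicit length check is an assumption about the desired behaviour, not something the question asks for: zip silently stops at the shorter input, so without the check a length mismatch would go unnoticed. The sketch uses the built-in zip, which is available in both Python 2 and Python 3.

```python
def add(list1, list2):
    """Element-wise sum using zip and a list comprehension."""
    if len(list1) != len(list2):
        # zip() silently stops at the shorter input, so check up front.
        raise ValueError("lists must have the same length")
    return [a + b for a, b in zip(list1, list2)]


print(add([1, 2, 3], [2, 3, 4]))  # [3, 5, 7]
```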

Other tips

This could be done with just using map and operator module:

>>> from operator import add,mul
>>> map(add, [1,2,3], [2,3,4])
[3, 5, 7]
>>> map(mul, [1,1,1],[9,9,9])
[9, 9, 9]

You can use zip:

sum = [x+y for x,y in zip(list1, list2)]
diff = [x-y for x,y in zip(list1, list2)]
mult = [x*y for x,y in zip(list1, list2)]
div = [x/y for x,y in zip(list1, list2)]
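One caveat with the div line: the result of / depends on the Python version. In Python 2, / between two integers performs floor division, while in Python 3 it performs true division. A minimal sketch, assuming Python 3 and sample values chosen only for illustration, makes the choice explicit with operator.truediv and operator.floordiv:

```python
from operator import floordiv, truediv

list1 = [7, 8, 9]
list2 = [2, 4, 3]

# truediv yields floats for integer inputs; floordiv rounds down.
true_quotients = [truediv(x, y) for x, y in zip(list1, list2)]
floor_quotients = [floordiv(x, y) for x, y in zip(list1, list2)]

print(true_quotients)   # [3.5, 2.0, 3.0]
print(floor_quotients)  # [3, 2, 3]
```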

Performance Comparison

@Marcin said that map "would be cleaner and more efficient" than a list comprehension. I find that the list comprehension looks better, but that's a matter of taste. The efficiency claim I also found surprising, and that one we can test.

Here's a comparison for varying list sizes; the code to generate the plot is below. Note that it needs to be run in a Jupyter notebook or at least IPython, and that it takes a while to finish. numpy is not really a fair contender, because the OP needed to work with lists, but I included it because, if you're interested in performance, it's worth knowing what the alternative would be.

As you can see, there's no reason to favor one approach over the other based on efficiency concerns.

[Figure: performance comparison, time (s) against list size on log-log axes, for list comprehension, map, and numpy]

import numpy as np
import operator
import matplotlib.pyplot as plt
%matplotlib inline

lc_mean = []  # list comprehension
lc_std = []
map_mean = []
map_std = []
np_mean = []
np_std = []

for n in range(1, 8):
    l1 = np.random.rand(10 ** n)
    l2 = np.random.rand(10 ** n)

    np_time = %timeit -o l1 + l2
    np_mean.append(np_time.average)
    np_std.append(np_time.stdev)

    l1 = l1.tolist()
    l2 = l2.tolist()

    lc_time = %timeit -o [x + y for x, y in zip(l1, l2)]
    lc_mean.append(lc_time.average)
    lc_std.append(lc_time.stdev)

    map_time = %timeit -o list(map(operator.add, l1, l2))
    map_mean.append(map_time.average)
    map_std.append(map_time.stdev)

list_sizes = [10 ** n for n in range(1, 8)]
plt.figure(figsize=(8, 6))

np_mean = np.array(np_mean)
plt.plot(list_sizes, np_mean, label='np')
plt.fill_between(list_sizes, np_mean - np_std, np_mean + np_std, alpha=0.5)

lc_mean = np.array(lc_mean)
plt.plot(list_sizes, lc_mean, label='lc')
plt.fill_between(list_sizes, lc_mean - lc_std, lc_mean + lc_std, alpha=0.5)

map_mean = np.array(map_mean)
plt.plot(list_sizes, map_mean, label='map')
plt.fill_between(list_sizes, map_mean - map_std, map_mean + map_std, alpha=0.5)

plt.loglog()
plt.xlabel('List Size')
plt.ylabel('Time (s)')
plt.title('List Comprehension vs Map Add (vs numpy)')
plt.legend()
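For readers without IPython, a rough equivalent of the list-versus-map timing can be sketched with the standard-library timeit module. This is not the code that produced the plot: the single list size, the repeat counts, and the function names with_comprehension and with_map are all assumptions chosen for illustration, and numpy is left out.

```python
import operator
import random
import timeit

n = 10 ** 3  # assumed list size; the script above sweeps 10**1 to 10**7
l1 = [random.random() for _ in range(n)]
l2 = [random.random() for _ in range(n)]


def with_comprehension():
    return [x + y for x, y in zip(l1, l2)]


def with_map():
    return list(map(operator.add, l1, l2))


# Both approaches must produce the same result before timing means anything.
assert with_comprehension() == with_map()

# repeat() returns one total time per run; the minimum is the least noisy.
lc_time = min(timeit.repeat(with_comprehension, number=100, repeat=5))
map_time = min(timeit.repeat(with_map, number=100, repeat=5))

print("list comprehension: %.6f s per 100 calls" % lc_time)
print("map:                %.6f s per 100 calls" % map_time)
```

The absolute numbers will vary by machine and Python version, so only the relative difference between the two lines is meaningful.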

How about:

import operator

a = [1, 2, 3]
b = [2, 3, 4]

sum = map(operator.add, a, b)
mul = map(operator.mul, a, b)

No, there is no point in writing your own functions in this case.
Just use map and operator; you won't implement anything better.
Any wrapper around map is just one more call frame to put on the stack.
Any custom implementation is slower than the builtin solution.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow