Question

I have written the following function for doing linear regression with 2 parameters (the actual math behind it is not that relevant to this question). It takes two functions, f1 and f2, and two lists, xs and ys:

def lr2par(f1, f2, xs, ys):
    c11 = sum(map(lambda x: (f1(x))**2, xs))
    c12 = sum(map(lambda x: f1(x) * f2(x), xs))
    c22 = sum(map(lambda x: (f2(x))**2, xs))
    d1 = sum(map(lambda x, y: y*f1(x), xs,ys))
    d2 = sum(map(lambda x, y: y*f2(x), xs,ys))

    a1 = -(c22*d1 - c12*d2)/(c12*c12 - c11*c22)
    a2 = (c12*d1 - c11*d2)/(c12*c12 - c11*c22)

    return (c11, c12, c22, d1, d2, a1, a2)

It works as expected as long as xs and ys are lists. However, as you can see, it is written in a quite functional style, so of course I would like to be able to use this function elegantly in functional code. That includes calling a function like map on a list before I pass it to the function, as with the ys argument in this example:

lr2par(lambda x: x, lambda x: 1, [1, 3, 5, 7], map(math.log, [130, 150, 175, 210]))

This looks very natural to me, and I would expect it to work (I am a Python noob though). It turns out it does not. I am quite sure the problem is that the ys argument is no longer a list but an iterator (which seems to be the main type when working with the functional tools in Python) that can only be iterated over once, so when it comes to the line

d2 = sum(map(lambda x, y: y*f2(x), xs,ys))

ys is just empty. I want to solve this problem in an idiomatic, functional, Pythonic way. My current solution is to add the lines

xs = list(xs)
ys = list(ys)

to the start of the function body. This works, but is this a good way to solve the problem? Will I have to add these lines to almost all functions that use collections of objects and are expected to work nicely with functions like map, filter and zip?


Solution

In general, you can't iterate over an arbitrary iterable twice: it may be an iterator, which is exhausted after a single pass. If you need more than one pass, you must convert it to a sequence first, as you did with list; that is actually a pretty standard solution.
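To make the one-pass behaviour concrete, here is a minimal sketch assuming Python 3, where map returns an iterator rather than a list. The variable names are only illustrative; the input numbers are the ones from the question.

```python
import math

# In Python 3, map() returns a one-shot iterator, not a list.
values = map(math.log, [130, 150, 175, 210])

first_pass = sum(values)   # consumes every item
second_pass = sum(values)  # nothing is left, so this sums an empty iterator

# Materializing the iterator once makes the data reusable.
values = list(map(math.log, [130, 150, 175, 210]))
total_a = sum(values)
total_b = sum(values)      # same result, because a list can be re-iterated
```

After the first sum, the iterator silently yields nothing, which is why the later sums in lr2par see an empty ys instead of raising an error.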

The main problem with that is that it's potentially inefficient, because it copies any lists or tuples coming in, so you might want to check for those and replace list with this helper function:

from collections.abc import Sequence  # ABC implemented by lists and tuples

def tosequence(it):
    """Convert iterable to sequence, avoiding unnecessary copies."""
    return it if isinstance(it, Sequence) else list(it)
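As a rough illustration of how the helper behaves, the sketch below repeats its definition (importing Sequence from collections.abc, which is where it lives in current Python 3 releases) and calls it on a list and on a map object. The names data, same and converted are made up for this example.

```python
from collections.abc import Sequence


def tosequence(it):
    """Convert iterable to sequence, avoiding unnecessary copies."""
    return it if isinstance(it, Sequence) else list(it)


data = [1, 3, 5, 7]

# A list is already a Sequence, so it is returned as-is (no copy is made).
same = tosequence(data)

# A map object is an iterator, so it is materialized into a new list.
converted = tosequence(map(abs, [-1, -2, -3]))
```

One thing to keep in mind: because the original list is returned rather than copied, a function that mutates the result would also mutate the caller's list. Strings and range objects also count as Sequence, so they pass through unchanged as well.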

(As a side note, your code gets much more readable if you replace map and lambda with generator expressions; it may also become a lot faster if you rewrite it to use NumPy, which I'd recommend for any number crunching in Python.)
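For illustration, this is one way the function from the question might look with generator expressions in place of map and lambda. It is a sketch, not the only reasonable rewrite: it keeps the original formulas, materializes both inputs with list at the top, and introduces a local name det (not in the original) for the shared denominator.

```python
import math


def lr2par(f1, f2, xs, ys):
    # Materialize the inputs once so they can be traversed several times.
    xs = list(xs)
    ys = list(ys)

    c11 = sum(f1(x) ** 2 for x in xs)
    c12 = sum(f1(x) * f2(x) for x in xs)
    c22 = sum(f2(x) ** 2 for x in xs)
    d1 = sum(y * f1(x) for x, y in zip(xs, ys))
    d2 = sum(y * f2(x) for x, y in zip(xs, ys))

    # Shared denominator of both coefficients (illustrative local name).
    det = c12 * c12 - c11 * c22
    a1 = -(c22 * d1 - c12 * d2) / det
    a2 = (c12 * d1 - c11 * d2) / det

    return (c11, c12, c22, d1, d2, a1, a2)


# ys can now be passed as a map object, as in the question.
result = lr2par(lambda x: x, lambda x: 1, [1, 3, 5, 7],
                map(math.log, [130, 150, 175, 210]))
```

Like the two-argument map in the original, zip stops at the shorter of xs and ys, so inputs of mismatched length are truncated in the same way.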

Licensed under: CC-BY-SA with attribution