Question

Summary: I'm trying to learn about itertools.islice.


I'm trying to find the best way to build a list from a subset of the values yielded by an infinite generator function. For example, I might want a list of the 1000th through 2000th items from a generator.

This is my example generator:

def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

These are the generator indices at which I want the list to start and stop:

start = 1000
end = 2000

Method 1: list comprehension (fails)

[val for ind,val in enumerate(infinite_counter()) if start <= ind <= end ]

This will quite obviously never return, as becomes clear when you expand it into this:

for ind, val in enumerate(infinite_counter()):
    if start <= ind <= end:
        val
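
A small sketch of why the filter cannot stop the loop: the `if` clause only decides which values are kept, it never ends iteration. The `capped_counter` helper and its `pulled` log are hypothetical, added only so that the demonstration terminates.

```python
def capped_counter(limit, pulled):
    """Like infinite_counter, but stops after `limit` items and logs each pull."""
    i = 0
    for _ in range(limit):
        i += 2
        pulled.append(i)
        yield i

start, end = 3, 5
pulled = []
kept = [val for ind, val in enumerate(capped_counter(20, pulled)) if start <= ind <= end]

# Only three values are kept, yet the comprehension consumed all 20 items.
# With a truly infinite generator it would keep pulling forever.
print(kept, len(pulled))
```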

Method 2: list() (works)

list(next(iter([])) if ind > end else val for ind,val in enumerate(infinite_counter()) if ind >= start)

This works, but it really feels like a hack, and it is quite hard to follow. I had mistakenly expected it to be faster than Method 3.
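
One caveat worth checking on your own interpreter: Method 2 relies on a `StopIteration` escaping from inside the generator expression. Since PEP 479 (enabled by default from Python 3.7), that exception is converted to `RuntimeError`, so the trick should only be expected to work on older versions. A minimal sketch, with `method_2` as a hypothetical wrapper name:

```python
def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

def method_2(start, end):
    # Same expression as Method 2, wrapped in a function for testing.
    return list(
        next(iter([])) if ind > end else val
        for ind, val in enumerate(infinite_counter())
        if ind >= start
    )

try:
    method_2(2, 4)
    outcome = "returned a list"       # expected on Python < 3.7
except RuntimeError:
    outcome = "raised RuntimeError"   # expected on Python 3.7+ (PEP 479)

print(outcome)
```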

Method 3: easy method (works)

my_list = []
for ind,val in enumerate(infinite_counter()):
    if ind >= start:
        my_list.append(val)
        if ind >= end:
            break

This is the first way I would think of doing this, before I chided myself for not being Pythonic. I was surprised that its timing was almost exactly the same as Method 2's.

Method 4: itertools.takewhile (works)

[val for ind, val in itertools.takewhile(lambda tup: tup[0] <= end, enumerate(infinite_counter())) if ind >= start]

At first I thought takewhile didn't work, because I had written the lambda as "lambda ind, val:". However, takewhile passes the lambda a single tuple holding both values, so I just need to take the first element of the tuple as the index for the early exit. This is slower than Methods 2 and 3, and about as slow as Method 5.
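
To illustrate the point about the predicate receiving a single tuple, here is a minimal sketch with small bounds. It treats both bounds as inclusive, which is an assumption chosen to match Method 1.

```python
import itertools

def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

start, end = 3, 6

# enumerate() yields (index, value) tuples, so the predicate takes ONE argument.
bounded = itertools.takewhile(lambda tup: tup[0] <= end, enumerate(infinite_counter()))

# takewhile stops the iteration; the `if` clause only skips the leading items.
result = [val for ind, val in bounded if ind >= start]
print(result)
```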

Method 5: wrapping generator (works)

def top_ending_generator(end):
    for ind,val in enumerate(infinite_counter()):
        if ind > end:
            break
        yield ind,val

[val for ind, val in top_ending_generator(end) if ind >= start]

This is, as expected, considerably slower than methods 2 and 3.

Overall, I was surprised to see that the timing of Method 3 is very close to that of Method 2. It is more code, but much easier for someone to follow. This is how I currently have it implemented.

Are there any other methods that I should consider or better solutions for this?

Edit:

Method 6: itertools.islice (the winner)

list(itertools.islice(infinite_counter(), start, end))

This is slightly faster than my initial itertools.islice solution with list comprehension:

[val for val in itertools.islice(infinite_counter(), start, end)]

Amazing what finding the right method does.

For those keeping score, my timing found the following:

Method 6 = unit time

Method 2 ~= 2.5 * unit time

Method 3 ~= 3 * unit time

Method 4 ~= 4.2 * unit time

Method 5 ~= 4 * unit time
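
The ratios above come from the asker's own runs. A rough harness along the following lines could be used to reproduce the comparison; the function names `method_3` and `method_6` and the repeat count are assumptions, and the resulting numbers will vary by machine and Python version.

```python
import itertools
import timeit

def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

start, end = 1000, 2000

def method_3():
    my_list = []
    for ind, val in enumerate(infinite_counter()):
        if ind >= start:
            my_list.append(val)
            if ind >= end:
                break
    return my_list

def method_6():
    # end + 1 so that both functions cover the same inclusive index range.
    return list(itertools.islice(infinite_counter(), start, end + 1))

t3 = timeit.timeit(method_3, number=200)
t6 = timeit.timeit(method_6, number=200)
print(f"Method 3 / Method 6 = {t3 / t6:.2f}")
```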


Solution

from itertools import islice

list(islice(infinite_counter(), 1000, 2000))
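
One detail to keep in mind: `islice(iterable, start, stop)` follows slice semantics, so `stop` is exclusive. A small sketch with short bounds, chosen here only for readability:

```python
from itertools import islice

def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

# Items at indices 3, 4, 5 (stop=6 is excluded), like some_list[3:6].
half_open = list(islice(infinite_counter(), 3, 6))

# To include the item at index 6 as well, pass stop + 1.
inclusive = list(islice(infinite_counter(), 3, 6 + 1))

print(half_open, inclusive)
```

With 1000 and 2000 as arguments, the call therefore returns 1000 items, at indices 1000 through 1999.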

Note that this

list(next(iter([])) if ind > end else val for ind,val in enumerate(infinite_counter()) if ind >= start)

transforms to this

def _secret():
    for ind, val in enumerate(infinite_counter()):
        if ind >= start:
            if ind > end:
                yield next(iter([]))
            else:
                yield val

list(_secret())

which can easily be improved to

def _secret():
    for ind, val in enumerate(infinite_counter()):
        if ind < start:
            continue

        if ind > end:
            break

        yield val

list(_secret())

which looks fine to me.
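
As a sanity check, the generator above can be parameterised and compared against `islice`. The name `bounded_values` and its arguments are hypothetical, introduced only for this sketch. Because it keeps `end` inclusive, the matching `islice` call stops at 2001.

```python
from itertools import islice

def infinite_counter():
    i = 0
    while True:
        i += 2
        yield i

def bounded_values(iterable, start, end):
    """Yield the items whose index lies in [start, end], then stop."""
    for ind, val in enumerate(iterable):
        if ind < start:
            continue
        if ind > end:
            break
        yield val

manual = list(bounded_values(infinite_counter(), 1000, 2000))
sliced = list(islice(infinite_counter(), 1000, 2001))
print(manual == sliced)
```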

Licensed under: CC-BY-SA with attribution