How do I know if a generator is empty from the start?

https://stackoverflow.com/questions/661603

20-08-2019
|

Question

Is there a simple way of testing if the generator has no items, like peek, hasNext, isEmpty, something along those lines?

Solution

The simple answer to your question: no, there is no simple way. There are a whole lot of work-arounds.

There really shouldn't be a simple way, because of what generators are: a way to output a sequence of values without holding the sequence in memory. So there's no backward traversal.

You could write a has_next function or maybe even slap it on to a generator as a method with a fancy decorator if you wanted to.

OTHER TIPS

Suggestion:

def peek(iterable):
    try:
        first = next(iterable)
    except StopIteration:
        return None
    return first, itertools.chain([first], iterable)

Usage:

res = peek(mysequence)
if res is None:
    # sequence is empty.  Do stuff.
else:
    first, mysequence = res
    # Do something with first, maybe?
    # Then iterate over the sequence:
    for element in mysequence:
        # etc.

A simple way is to use the optional parameter for next() which is used if the generator is exhausted (or empty). For example:

iterable = some_generator()

_exhausted = object()

if next(iterable, _exhausted) == _exhausted:
    print('generator is empty')

Edit: Corrected the problem pointed out in mehtunguh's comment.

The best approach, IMHO, would be to avoid a special test. Most times, use of a generator is the test:

thing_generated = False

# Nothing is lost here. if nothing is generated, 
# the for block is not executed. Often, that's the only check
# you need to do. This can be done in the course of doing
# the work you wanted to do anyway on the generated output.
for thing in my_generator():
    thing_generated = True
    do_work(thing)

If that's not good enough, you can still perform an explicit test. At this point, thing will contain the last value generated. If nothing was generated, it will be undefined - unless you've already defined the variable. You could check the value of thing, but that's a bit unreliable. Instead, just set a flag within the block and check it afterward:

if not thing_generated:
    print "Avast, ye scurvy dog!"

next(generator, None) is not None

Or replace None but whatever value you know it's not in your generator.

Edit: Yes, this will skip 1 item in the generator. Often, however, I check whether a generator is empty only for validation purposes, then don't really use it. Or otherwise I do something like:

def foo(self):
    if next(self.my_generator(), None) is None:
        raise Exception("Not initiated")

    for x in self.my_generator():
        ...

That is, this works if your generator comes from a function, as in generator().

I hate to offer a second solution, especially one that I would not use myself, but, if you absolutely had to do this and to not consume the generator, as in other answers:

def do_something_with_item(item):
    print item

empty_marker = object()

try:
     first_item = my_generator.next()     
except StopIteration:
     print 'The generator was empty'
     first_item = empty_marker

if first_item is not empty_marker:
    do_something_with_item(first_item)
    for item in my_generator:
        do_something_with_item(item)

Now I really don't like this solution, because I believe that this is not how generators are to be used.

Sorry for the obvious approach, but the best way would be to do:

for item in my_generator:
     print item

Now you have detected that the generator is empty while you are using it. Of course, item will never be displayed if the generator is empty.

This may not exactly fit in with your code, but this is what the idiom of the generator is for: iterating, so perhaps you might change your approach slightly, or not use generators at all.

I realize that this post is 5 years old at this point, but I found it while looking for an idiomatic way of doing this, and did not see my solution posted. So for posterity:

import itertools

def get_generator():
    """
    Returns (bool, generator) where bool is true iff the generator is not empty.
    """
    gen = (i for i in [0, 1, 2, 3, 4])
    a, b = itertools.tee(gen)
    try:
        a.next()
    except StopIteration:
        return (False, b)
    return (True, b)

Of course, as I'm sure many commentators will point out, this is hacky and only works at all in certain limited situations (where the generators are side-effect free, for example). YMMV.

All you need to do to see if a generator is empty is to try to get the next result. Of course if you're not ready to use that result then you have to store it to return it again later.

Here's a wrapper class that can be added to an existing iterator to add an __nonzero__ test, so you can see if the generator is empty with a simple if. It can probably also be turned into a decorator.

class GenWrapper:
    def __init__(self, iter):
        self.source = iter
        self.stored = False

    def __iter__(self):
        return self

    def __nonzero__(self):
        if self.stored:
            return True
        try:
            self.value = next(self.source)
            self.stored = True
        except StopIteration:
            return False
        return True

    def __next__(self):  # use "next" (without underscores) for Python 2.x
        if self.stored:
            self.stored = False
            return self.value
        return next(self.source)

Here's how you'd use it:

with open(filename, 'r') as f:
    f = GenWrapper(f)
    if f:
        print 'Not empty'
    else:
        print 'Empty'

Note that you can check for emptiness at any time, not just at the start of the iteration.

>>> gen = (i for i in [])
>>> next(gen)
Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    next(gen)
StopIteration

At the end of generator StopIteration is raised, since in your case end is reached immediately, exception is raised. But normally you shouldn't check for existence of next value.

another thing you can do is:

>>> gen = (i for i in [])
>>> if not list(gen):
    print('empty generator')

In my case I needed to know if a host of generators was populated before I passed it on to a function, which merged the items, i.e., zip(...). The solution is similar, but different enough, from the accepted answer:

Definition:

def has_items(iterable):
    try:
        return True, itertools.chain([next(iterable)], iterable)
    except StopIteration:
        return False, []

Usage:

def filter_empty(iterables):
    for iterable in iterables:
        itr_has_items, iterable = has_items(iterable)
        if itr_has_items:
            yield iterable


def merge_iterables(iterables):
    populated_iterables = filter_empty(iterables)
    for items in zip(*populated_iterables):
        # Use items for each "slice"

My particular problem has the property that the iterables are either empty or has exactly the same number of entries.

Just fell on this thread and realized that a very simple and easy to read answer was missing:

def is_empty(generator):
    for item in generator:
        return False
    return True

If we are not suppose to consume any item then we need to re-inject the first item into the generator:

def is_empty_no_side_effects(generator):
    try:
        item = next(generator)
        def my_generator():
            yield item
            yield from generator
        return my_generator(), False
    except StopIteration:
        return (_ for _ in []), True

Example:

>>> g=(i for i in [])
>>> g,empty=is_empty_no_side_effects(g)
>>> empty
True
>>> g=(i for i in range(10))
>>> g,empty=is_empty_no_side_effects(g)
>>> empty
False
>>> list(g)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

If you need to know before you use the generator, then no, there is no simple way. If you can wait until after you have used the generator, there is a simple way:

was_empty = True

for some_item in some_generator:
    was_empty = False
    do_something_with(some_item)

if was_empty:
    handle_already_empty_generator_case()

Here is my simple approach that i use to keep on returning an iterator while checking if something was yielded I just check if the loop runs:

        n = 0
        for key, value in iterator:
            n+=1
            yield key, value
        if n == 0:
            print ("nothing found in iterator)
            break

Here's a simple decorator which wraps the generator, so it returns None if empty. This can be useful if your code needs to know whether the generator will produce anything before looping through it.

def generator_or_none(func):
    """Wrap a generator function, returning None if it's empty. """

    def inner(*args, **kwargs):
        # peek at the first item; return None if it doesn't exist
        try:
            next(func(*args, **kwargs))
        except StopIteration:
            return None

        # return original generator otherwise first item will be missing
        return func(*args, **kwargs)

    return inner

Usage:

import random

@generator_or_none
def random_length_generator():
    for i in range(random.randint(0, 10)):
        yield i

gen = random_length_generator()
if gen is None:
    print('Generator is empty')

One example where this is useful is in templating code - i.e. jinja2

{% if content_generator %}
  <section>
    <h4>Section title</h4>
    {% for item in content_generator %}
      {{ item }}
    {% endfor %
  </section>
{% endif %}

Simply wrap the generator with itertools.chain, put something that will represent the end of the iterable as the second iterable, then simply check for that.

Ex:

import itertools

g = some_iterable
eog = object()
wrap_g = itertools.chain(g, [eog])

Now all that's left is to check for that value we appended to the end of the iterable, when you read it then that will signify the end

for value in wrap_g:
    if value == eog: # DING DING! We just found the last element of the iterable
        pass # Do something

using islice you need only check up to the first iteration to discover if it is empty.

from itertools import islice

def isempty(iterable):
return list(islice(iterable,1)) == []

What about using any()? I use it with generators and it's working fine. Here there is guy explaining a little about this

Use the peek function in cytoolz.

from cytoolz import peek
from typing import Tuple, Iterable

def is_empty_iterator(g: Iterable) -> Tuple[Iterable, bool]:
    try:
        _, g = peek(g)
        return g, False
    except StopIteration:
        return g, True

The iterator returned by this function will be equivalent to the original one passed in as an argument.

Prompted by Mark Ransom, here's a class that you can use to wrap any iterator so that you can peek ahead, push values back onto the stream and check for empty. It's a simple idea with a simple implementation that I've found very handy in the past.

class Pushable:

    def __init__(self, iter):
        self.source = iter
        self.stored = []

    def __iter__(self):
        return self

    def __bool__(self):
        if self.stored:
            return True
        try:
            self.stored.append(next(self.source))
        except StopIteration:
            return False
        return True

    def push(self, value):
        self.stored.append(value)

    def peek(self):
        if self.stored:
            return self.stored[-1]
        value = next(self.source)
        self.stored.append(value)
        return value

    def __next__(self):
        if self.stored:
            return self.stored.pop()
        return next(self.source)

I solved it by using the sum function. See below for an example I used with glob.iglob (which returns a generator).

def isEmpty():
    files = glob.iglob(search)
    if sum(1 for _ in files):
        return True
    return False

*This will probably not work for HUGE generators but should perform nicely for smaller lists

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow