I can't figure out how to look ahead one element in a Python generator. As soon as I look, it's gone.

Here is what I mean:

gen = iter([1,2,3])
next_value = next(gen)  # okay, I looked forward and see that next_value = 1
# but now:
list(gen)  # is [2, 3]  -- the first value is gone!

Here is a more realistic example:

gen = element_generator()
if gen.next_value() == 'STOP':  # wished-for peek; generators have no such method
  quit_application()
else:
  process(next(gen))

Can anyone help me write a generator that lets you look one element ahead?


Solution

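The Python generator API is one-way: you can't push back elements you have already read. But you can create a new iterator with the itertools module and prepend the element: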

import itertools

gen = iter([1,2,3])
peek = next(gen)
print(list(itertools.chain([peek], gen)))

Other tips

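For completeness, the more-itertools package (which should probably be part of any Python programmer's toolbox) includes a peekable wrapper that implements this behavior. As the code example shows: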

>>> from more_itertools import peekable
>>> p = peekable(range(2))
>>> p.peek()
0
>>> next(p)
0
>>> p.peek()
1
>>> next(p)
1

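When this answer was written, the package was compatible with both Python 2 and 3, even though its documentation showed Python 2 syntax.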

OK, two years too late, but I came across this question and did not find any of the answers to my satisfaction. I came up with this meta-generator:

class Peekorator:

    def __init__(self, generator):
        self.empty = False
        self.peek = None
        self.generator = generator
        try:
            # prefetch the first element so it can be peeked at
            self.peek = next(self.generator)
        except StopIteration:
            self.empty = True

    def __iter__(self):
        return self

    def __next__(self):
        """
        Return the self.peek element, or raise StopIteration
        if empty
        """
        if self.empty:
            raise StopIteration()
        to_return = self.peek
        try:
            self.peek = next(self.generator)
        except StopIteration:
            self.peek = None
            self.empty = True
        return to_return

def simple_iterator():
    for x in range(10):
        yield x*3

pkr = Peekorator(simple_iterator())
for i in pkr:
    print(i, pkr.peek, pkr.empty)

Result:

0 3 False
3 6 False
6 9 False
9 12 False    
...
24 27 False
27 None True

That is, at any moment during the iteration you have access to the next item in the list.

You can use itertools.tee to produce a lightweight copy of the generator. Peeking ahead in one copy then does not affect the second copy:

import itertools

def process(seq):
    peeker, items = itertools.tee(seq)

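    # initial peek ahead
    # so that peeker is one ahead of items
    # (the None default avoids StopIteration on an empty sequence)
    if next(peeker, None) == 'STOP':
        return

    for item in items:

        # peek ahead; the default keeps next() from raising
        # StopIteration when item is the last element
        if next(peeker, None) == "STOP":
            return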

        # process items
        print(item)

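The 'items' generator is unaffected by whatever you do to 'peeker'. Note that you shouldn't use the original 'seq' after calling 'tee' on it; that will break things.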

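FWIW, this is the wrong way to solve this problem. Any algorithm that requires you to look one item ahead in a generator can instead be written to use the current generator item and the previous item. Then you don't have to mangle your use of generators, and your code will be much simpler. See my other answer to this question.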

>>> gen = iter(range(10))
>>> peek = next(gen)
>>> peek
0
>>> gen = (value for g in ([peek], gen) for value in g)
>>> list(gen)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Just for fun, I created an implementation of a lookahead class based on the suggestion by Aaron:

import itertools

class lookahead_chain(object):
    def __init__(self, it):
        self._it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def peek(self, default=None, _chain=itertools.chain):
        it = self._it
        try:
            v = next(it)
            # re-wrap so the peeked value is yielded again first
            self._it = _chain((v,), it)
            return v
        except StopIteration:
            return default

lookahead = lookahead_chain

With this, the following works:

>>> t = lookahead(range(8))
>>> list(itertools.islice(t, 3))
[0, 1, 2]
>>> t.peek()
3
>>> list(itertools.islice(t, 3))
[3, 4, 5]

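With this implementation it is a bad idea to call peek many times in a row, because every call wraps the iterator in another chain...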

While looking at the CPython source code I just found a better way that is both shorter and more efficient:

class lookahead_tee(object):
    def __init__(self, it):
        self._it, = itertools.tee(it, 1)

    def __iter__(self):
        return self._it

    def peek(self, default=None):
        try:
            return next(self._it.__copy__())
        except StopIteration:
            return default

lookahead = lookahead_tee

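Usage is the same as above, but here you don't pay a price for calling peek many times in a row. With a few more lines you can also look ahead more than one item in the iterator (up to available RAM).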
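
As a rough sketch of that last remark, a multi-item peek could copy the tee iterator and slice the copy. The class name LookaheadMany and the parameter n are made up for illustration and are not part of the answer above:

```python
import copy
import itertools


class LookaheadMany:
    """Hypothetical variant of lookahead_tee that peeks several items ahead."""

    def __init__(self, it):
        # tee(it, 1) returns a 1-tuple holding a copyable tee iterator.
        self._it, = itertools.tee(it, 1)

    def __iter__(self):
        return self._it

    def peek(self, n=1):
        # Read from a copy so the position of self._it is untouched.
        # Returns fewer than n items when the iterator is nearly exhausted.
        return list(itertools.islice(copy.copy(self._it), n))


t = LookaheadMany(range(8))
print(t.peek(3))                     # [0, 1, 2]
print(list(itertools.islice(t, 2)))  # [0, 1]
print(t.peek(3))                     # [2, 3, 4]
```

Items that have been peeked at but not yet consumed stay buffered inside tee, which is where the "up to available RAM" limit comes from.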

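Instead of using items (i, i+1), where 'i' is the current item and i+1 is the 'peek ahead' version, you should be using (i-1, i), where 'i-1' is the previous item from the generator.

Tweaking your algorithm this way will produce something identical to what you currently have, apart from the extra needless complexity of trying to 'peek ahead'.

Peeking ahead is a mistake, and you should not be doing it.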
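
To make the (i-1, i) idea concrete, here is a minimal sketch of the 'STOP' check written with a held-back previous item instead of a peek. The function name process_with_previous and the handle callback are assumptions for illustration only:

```python
def process_with_previous(items, handle):
    """Call handle(item) for every item that is not followed by 'STOP'."""
    it = iter(items)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input, nothing to do
    if prev == "STOP":
        return
    for cur in it:
        if cur == "STOP":
            return  # prev is followed by 'STOP', so it is not handled
        handle(prev)
        prev = cur
    handle(prev)  # the last item has no successor


seen = []
process_with_previous([1, 2, 3, "STOP", 4], seen.append)
print(seen)  # [1, 2]
```

The loop only ever looks at values it has already read, so the generator itself needs no wrapper.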

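This will work: it buffers one item and calls a function with each item and the next item in the sequence.

Your requirements are murky on what happens at the end of the sequence. What does "look ahead" mean when you're at the last one?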

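def process_with_lookahead( iterable, aFunction ):
    iterable = iter(iterable)   # accept any iterable, not only iterators
    prev= next(iterable)        # raises StopIteration if the input is empty
    for item in iterable:
        aFunction( prev, item )
        prev= item
    aFunction( prev, None )     # the last item has no successor

def someLookaheadFunction( item, next_item ):
    print(item, next_item)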

A simple solution is to use a function like this:

import itertools

def peek(it):
    first = next(it)
    return first, itertools.chain([first], it)

Then you can do:

>>> it = iter(range(10))
>>> x, it = peek(it)
>>> x
0
>>> next(it)
0
>>> next(it)
1

If anybody is interested, and please correct me if I am wrong, I believe it is pretty easy to add push-back functionality to any iterator.

class Back_pushable_iterator:
    """Class whose constructor takes an iterator as its only parameter, and
    returns an iterator that behaves in the same way, with added push back
    functionality.

    The idea is to be able to push back elements that need to be retrieved once
    more with the iterator semantics. This is particularly useful to implement
    LL(k) parsers that need k tokens of lookahead. Lookahead or push back is
    really a matter of perspective. The pushing back strategy allows a clean
    parser implementation based on recursive parser functions.

    The invoker of this class takes care of storing the elements that should be
    pushed back. A consequence of this is that any elements can be "pushed
    back", even elements that have never been retrieved from the iterator.
    The elements that are pushed back are then retrieved through the iterator
    interface in a LIFO-manner (as should logically be expected).

    This class works for any iterator but is especially meaningful for a
    generator iterator, which offers no obvious push back ability.

    In the LL(k) case mentioned above, the tokenizer can be implemented by a
    standard generator function (clean and simple), that is completed by this
    class for the needs of the actual parser.
    """
    def __init__(self, iterator):
        self.iterator = iterator
        self.pushed_back = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.pushed_back:
            return self.pushed_back.pop()
        else:
            return next(self.iterator)

    def push_back(self, element):
        self.pushed_back.append(element)

it = Back_pushable_iterator(x for x in range(10))

x = next(it) # 0
print(x)
it.push_back(x)
x = next(it) # 0
print(x)
x = next(it) # 1
print(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)
it.push_back(y)
it.push_back(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)

for x in it:
    print(x) # 4-9

Although itertools.chain() is the natural tool for the job here, beware of loops like this:

for elem in gen:
    ...
    peek = next(gen)
    gen = itertools.chain([peek], gen)

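...because this will consume a linearly growing amount of memory and eventually grind to a halt. (This code essentially seems to create a linked list, one node per chain() call.) I know this not because I inspected the library but because it caused a major slowdown in my program; getting rid of the gen = itertools.chain([peek], gen) line sped it up again. (Python 3.3)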
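
One way to peek inside a loop without re-wrapping the iterator on every pass is to keep the peeked value in a single cache slot. This is only a sketch; the class name PeekBuffer is made up here, and more-itertools' peekable (mentioned above) covers the same ground:

```python
class PeekBuffer:
    """Iterator wrapper with a one-item cache, so peeking never nests wrappers."""

    _EMPTY = object()  # sentinel: nothing cached (None may be a real value)

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache is not self._EMPTY:
            # hand out the previously peeked value and clear the slot
            value, self._cache = self._cache, self._EMPTY
            return value
        return next(self._it)

    def peek(self, default=None):
        if self._cache is self._EMPTY:
            try:
                self._cache = next(self._it)
            except StopIteration:
                return default
        return self._cache


gen = PeekBuffer(range(5))
for elem in gen:
    print(elem, gen.peek())  # 0 1, 1 2, 2 3, 3 4, 4 None
```

Memory use stays constant because there is exactly one wrapper object, however many times peek() is called.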

Python 3 snippet for @jonathan-hartley's answer:

def peek(iterator, eoi=None):
    iterator = iter(iterator)

    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: the generator yields nothing

    for elm in iterator:
        yield prev, elm
        prev = elm

    yield prev, eoi


for curr, nxt in peek(range(10)):
    print((curr, nxt))

# (0, 1)
# (1, 2)
# (2, 3)
# (3, 4)
# (4, 5)
# (5, 6)
# (6, 7)
# (7, 8)
# (8, 9)
# (9, None)

It would be straightforward to create a class that does this in __iter__, yields just the prev item, and stores elm in some attribute.
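
A minimal sketch of such a class might look like this. The names PeekingIterable and next_item are assumptions for illustration, not something taken from the answer:

```python
class PeekingIterable:
    """Yields each item and exposes the following one as self.next_item."""

    def __init__(self, iterable, eoi=None):
        self._iterable = iterable
        self._eoi = eoi          # value reported after the last item
        self.next_item = eoi

    def __iter__(self):
        it = iter(self._iterable)
        try:
            prev = next(it)
        except StopIteration:
            return               # empty input: yield nothing
        for elm in it:
            self.next_item = elm
            yield prev
            prev = elm
        self.next_item = self._eoi
        yield prev


items = PeekingIterable(range(3))
for curr in items:
    print(curr, items.next_item)

# 0 1
# 1 2
# 2 None
```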

cytoolz has a peek function.

>>> from cytoolz import peek
>>> gen = iter([1,2,3])
>>> first, continuation = peek(gen)
>>> first
1
>>> list(continuation)
[1, 2, 3]

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow