How to look ahead one element (peek) in a Python generator?

https://stackoverflow.com/questions/2425270

19-09-2019

Question

I can't figure out how to look ahead one element in a Python generator. As soon as I look, it's gone.

Here is what I mean:

gen = iter([1,2,3])
next_value = next(gen)  # okay, I looked forward and see that next_value = 1
# but now:
list(gen)  # is [2, 3]  -- the first value is gone!

Here is a more realistic example:

gen = element_generator()
if gen.next_value() == 'STOP':  # next_value() is the peek method I wish existed
  quit_application()
else:
  process(next(gen))

Can anyone help me write a generator that lets you look one element ahead?

Solution

The Python generator API is one-way: you can't push back elements you have already read. However, you can create a new iterator with the itertools module and prepend the element:

import itertools

gen = iter([1,2,3])
peek = next(gen)
# chain() puts the peeked element back in front of the remaining ones
print(list(itertools.chain([peek], gen)))
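
Applied to the STOP example from the question, the same idea might look like the sketch below. The names element_generator and peek_or_stop are hypothetical stand-ins, and next() is given a sentinel default so that an empty generator does not raise StopIteration.

```python
import itertools

def element_generator():
    # Hypothetical stand-in for the generator in the question.
    yield from ["a", "b", "c"]

def peek_or_stop(gen):
    """Return an iterator over all elements, or None if gen is empty or starts with 'STOP'."""
    missing = object()            # sentinel: distinguishes "empty" from a real value
    first = next(gen, missing)
    if first is missing or first == "STOP":
        return None               # the question would call quit_application() here
    # Prepend the peeked element so the caller still sees every element.
    return itertools.chain([first], gen)

restored = peek_or_stop(element_generator())
print(list(restored))
```

The caller has to use the returned iterator from then on, because the original generator has already lost its first element.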

Other tips

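For the sake of completeness, the more-itertools package (which should probably be part of any Python programmer's toolbox) includes a peekable wrapper that implements this behavior. As the code example in the documentation shows: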

>>> from more_itertools import peekable
>>> p = peekable(xrange(2))
>>> p.peek()
0
>>> p.next()
0
>>> p.peek()
1
>>> p.next()
1

The package is compatible with both Python 2 and 3, even though the documentation shows Python 2 syntax.

OK, two years too late, but I came across this question and did not find any of the answers to my satisfaction. I came up with this meta-generator:

class Peekorator(object):

    def __init__(self, generator):
        self.empty = False
        self.peek = None
        self.generator = generator
        try:
            # pre-fetch the first element so it can be peeked at
            self.peek = next(self.generator)
        except StopIteration:
            self.empty = True

    def __iter__(self):
        return self

    def __next__(self):
        """
        Return the self.peek element, or raise StopIteration
        if empty
        """
        if self.empty:
            raise StopIteration()
        to_return = self.peek
        try:
            self.peek = next(self.generator)
        except StopIteration:
            self.peek = None
            self.empty = True
        return to_return

def simple_iterator():
    for x in range(10):
        yield x*3

pkr = Peekorator(simple_iterator())
for i in pkr:
    print(i, pkr.peek, pkr.empty)

Results:

0 3 False
3 6 False
6 9 False
9 12 False
...
24 27 False
27 None True

That is, at any moment during the iteration you have access to the next item in the list.

You can use itertools.tee to create a lightweight copy of the generator. Peeking ahead on one copy then does not affect the second copy.

import itertools

def process(seq):
    peeker, items = itertools.tee(seq)

    # initial peek ahead
    # so that peeker is one ahead of items
    if next(peeker, None) == 'STOP':
        return

    for item in items:

        # peek ahead; the default avoids StopIteration on the last item
        if next(peeker, None) == "STOP":
            return

        # process items
        print(item)

The 'items' generator is unaffected by whatever you do to 'peeker'. Note that you should not use the original 'seq' after calling 'tee' on it.
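
The two tee copies can also be zipped together so that each item arrives paired with its successor. The following is only a sketch of that variation: with_next and process_pairs are hypothetical names, and None is assumed to be safe as the "no next item" filler.

```python
import itertools

def with_next(iterable, fill=None):
    """Yield (item, next_item) pairs; the last pair uses `fill` as next_item."""
    items, peeker = itertools.tee(iterable)
    next(peeker, None)   # advance the second copy by one; harmless if it is empty
    return itertools.zip_longest(items, peeker, fillvalue=fill)

def process_pairs(seq):
    """Collect items until the current or the upcoming item is 'STOP'."""
    processed = []
    for item, upcoming in with_next(seq):
        if item == "STOP" or upcoming == "STOP":
            break
        processed.append(item)
    return processed

print(process_pairs([1, 2, 3, "STOP", 4]))
```

As in the function above, the item directly before 'STOP' is not processed, because the loop stops as soon as the peeked value is 'STOP'.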

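FWIW, this is the wrong way to solve this problem. Any algorithm that requires you to look one item ahead in a generator could instead be written to use the current generator item and the previous item. Then you don't have to mangle your use of generators, and your code will be much simpler. See my other answer to this question.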

>>> gen = iter(range(10))
>>> peek = next(gen)
>>> peek
0
>>> gen = (value for g in ([peek], gen) for value in g)
>>> list(gen)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Just for fun, I wrote an implementation of a lookahead class based on Aaron's suggestion:

import itertools

class lookahead_chain(object):
    def __init__(self, it):
        self._it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def peek(self, default=None, _chain=itertools.chain):
        it = self._it
        try:
            v = next(it)
            # put the peeked value back in front of the remaining items
            self._it = _chain((v,), it)
            return v
        except StopIteration:
            return default

lookahead = lookahead_chain

With this, the following works:

>>> t = lookahead(range(8))
>>> list(itertools.islice(t, 3))
[0, 1, 2]
>>> t.peek()
3
>>> list(itertools.islice(t, 3))
[3, 4, 5]

With this implementation it is a bad idea to call peek many times in a row...

While looking at the CPython source code, I found a better way that is both shorter and more efficient:

class lookahead_tee(object):
    def __init__(self, it):
        # a one-element tee gives us a copyable iterator
        self._it, = itertools.tee(it, 1)

    def __iter__(self):
        return self._it

    def peek(self, default=None):
        try:
            # advance a copy, leaving self._it where it is
            return next(self._it.__copy__())
        except StopIteration:
            return default

lookahead = lookahead_tee

Usage is the same as above, but here you don't pay a price for calling peek many times in a row. With a few more lines you can also look ahead more than one item in the iterator (up to available RAM).
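
One way such a multi-item lookahead might be written is sketched below. It does not extend the tee-based class; it swaps in a plain deque buffer instead, and the class name Lookahead and the n parameter of peek are assumptions made for this example.

```python
from collections import deque
from itertools import islice

class Lookahead:
    """Iterator wrapper that can peek several items ahead (illustrative sketch)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()   # items already read from _it but not yet consumed

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)

    def peek(self, n=0, default=None):
        """Return the item n positions ahead (0 = the next item) without consuming it."""
        shortfall = n + 1 - len(self._buffer)
        if shortfall > 0:
            # Read only as many items as needed to reach position n.
            self._buffer.extend(islice(self._it, shortfall))
        return self._buffer[n] if n < len(self._buffer) else default

t = Lookahead(range(8))
print(list(islice(t, 3)), t.peek(), t.peek(2))
```

Memory use grows with the largest n that is peeked, which matches the "up to available RAM" caveat.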

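Instead of using items (i, i+1), where 'i' is the current item and i+1 is the 'peek ahead' version, you should be using (i-1, i), where 'i-1' is the previous item from the generator.

Tweaking your algorithm this way will produce something identical to what you currently have, minus the needless extra complexity of trying to 'peek ahead'.

Peeking ahead is a mistake, and you should not be doing it.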
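
A minimal sketch of the (i-1, i) formulation, assuming None is an acceptable placeholder for the missing predecessor of the first item; with_previous is a hypothetical helper name:

```python
def with_previous(iterable, first_prev=None):
    """Yield (previous, current) pairs; the first pair uses `first_prev`."""
    prev = first_prev
    for current in iterable:
        yield prev, current
        prev = current

pairs = list(with_previous(x * 3 for x in range(4)))
print(pairs)  # [(None, 0), (0, 3), (3, 6), (6, 9)]
```

Nothing is read ahead of the consumer here, so the underlying generator is never advanced further than the loop has actually reached.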

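This will work: it buffers one item and calls a function with each item and the next item in the sequence.

Your requirements are murky about what happens at the end of the sequence. What does "look ahead" mean when you are at the last item?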

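def process_with_lookahead(iterable, aFunction):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty sequence: nothing to process
    for item in iterator:
        aFunction(prev, item)
        prev = item
    # last item: there is no next item, so pass None
    aFunction(prev, None)

def someLookaheadFunction(item, next_item):
    print(item, next_item)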

A simple solution is to use a function like this:

import itertools

def peek(it):
    first = next(it)
    # return the peeked value plus a new iterator that still starts with it
    return first, itertools.chain([first], it)

Then you can do:

>>> it = iter(range(10))
>>> x, it = peek(it)
>>> x
0
>>> next(it)
0
>>> next(it)
1

In case anyone is interested, and please correct me if I am wrong, I believe it is pretty easy to add push-back functionality to any iterator.

class Back_pushable_iterator:
    """Class whose constructor takes an iterator as its only parameter, and
    returns an iterator that behaves in the same way, with added push back
    functionality.

    The idea is to be able to push back elements that need to be retrieved once
    more with the iterator semantics. This is particularly useful to implement
    LL(k) parsers that need k tokens of lookahead. Lookahead or push back is
    really a matter of perspective. The pushing back strategy allows a clean
    parser implementation based on recursive parser functions.

    The invoker of this class takes care of storing the elements that should be
    pushed back. A consequence of this is that any elements can be "pushed
    back", even elements that have never been retrieved from the iterator.
    The elements that are pushed back are then retrieved through the iterator
    interface in a LIFO-manner (as should logically be expected).

    This class works for any iterator but is especially meaningful for a
    generator iterator, which offers no obvious push back ability.

    In the LL(k) case mentioned above, the tokenizer can be implemented by a
    standard generator function (clean and simple), that is completed by this
    class for the needs of the actual parser.
    """
    def __init__(self, iterator):
        self.iterator = iterator
        self.pushed_back = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.pushed_back:
            return self.pushed_back.pop()
        else:
            return next(self.iterator)

    def push_back(self, element):
        self.pushed_back.append(element)

it = Back_pushable_iterator(x for x in range(10))

x = next(it) # 0
print(x)
it.push_back(x)
x = next(it) # 0
print(x)
x = next(it) # 1
print(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)
it.push_back(y)
it.push_back(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)

for x in it:
    print(x) # 4-9

Although itertools.chain() is the natural tool for the job here, beware of loops like this:

for elem in gen:
    ...
    peek = next(gen)
    gen = itertools.chain([peek], gen)

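...because this consumes a linearly growing amount of memory and eventually grinds to a halt. (This code essentially seems to build a linked list, one node per chain() call.) I know this not because I inspected the libs, but because it caused a major slowdown of my program; getting rid of the gen = itertools.chain([peek], gen) line sped it up again. (Python 3.3)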
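
One way to peek inside a loop without re-wrapping the iterator on every pass is to keep a single wrapper object that caches at most one item. This is a sketch under that assumption, not the code from the answer above; the class name Peekable is made up for this example.

```python
class Peekable:
    """Wrap an iterator once and cache at most one peeked item (illustrative sketch)."""

    _EMPTY = object()   # marks "nothing cached"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache is not self._EMPTY:
            # Hand out the cached item first, then clear the cache.
            value, self._cache = self._cache, self._EMPTY
            return value
        return next(self._it)

    def peek(self, default=None):
        if self._cache is self._EMPTY:
            self._cache = next(self._it, self._EMPTY)
        return default if self._cache is self._EMPTY else self._cache

gen = Peekable(range(5))
seen = [(elem, gen.peek()) for elem in gen]
print(seen)
```

Because the wrapper is created once, the number of objects between the loop and the source iterator stays constant however many times peek is called.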

A Python 3 snippet for @jonathan-hartley's answer:

def peek(iterator, eoi=None):
    iterator = iter(iterator)

    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield

    for elm in iterator:
        yield prev, elm
        prev = elm

    yield prev, eoi


for curr, nxt in peek(range(10)):
    print((curr, nxt))

# (0, 1)
# (1, 2)
# (2, 3)
# (3, 4)
# (4, 5)
# (5, 6)
# (6, 7)
# (7, 8)
# (8, 9)
# (9, None)

It would be straightforward to write a class that does this in __iter__, yielding just the prev item and storing elm in some attribute.
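
A sketch of what such a class might look like. The class name NextTracker and the attribute name upcoming are hypothetical; eoi plays the same role as in the function above.

```python
class NextTracker:
    """Iterate normally; `upcoming` holds the item that follows the current one."""

    def __init__(self, iterable, eoi=None):
        self._iterable = iterable
        self._eoi = eoi
        self.upcoming = eoi

    def __iter__(self):
        iterator = iter(self._iterable)
        try:
            prev = next(iterator)
        except StopIteration:
            return   # empty input: nothing to yield
        for elm in iterator:
            self.upcoming = elm   # expose the next item before yielding the current one
            yield prev
            prev = elm
        self.upcoming = self._eoi  # the last item has no successor
        yield prev

tracker = NextTracker(range(3))
result = [(curr, tracker.upcoming) for curr in tracker]
print(result)  # [(0, 1), (1, 2), (2, None)]
```

The loop body sees only the current item, and reads tracker.upcoming when it needs the lookahead.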

WRT @David Z's post, the newer seekable tool can reset a wrapped iterator to a prior position.

>>> import more_itertools as mit
>>> s = mit.seekable(range(3))
>>> next(s)
# 0

>>> s.seek(0)                                              # reset iterator
>>> next(s)
# 0

>>> next(s)
# 1

>>> s.seek(1)
>>> next(s)
# 1

>>> next(s)
# 2

cytoolz has a peek function.

>>> from cytoolz import peek
>>> gen = iter([1,2,3])
>>> first, continuation = peek(gen)
>>> first
1
>>> list(continuation)
[1, 2, 3]

License: CC-BY-SA with attribution

Not affiliated with StackOverflow