Python: The mechanism behind list comprehension

https://stackoverflow.com/questions/4844010

27-10-2019

Question

When using a list comprehension, or the in keyword in a for loop context, i.e.:

for o in X:
    do_something_with(o)

or

l=[o for o in X]

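How does the mechanism behind in work?
Which functions/methods within X does it call?
If X can comply with more than one method, what is the precedence?
How do I write an efficient X, so that the list comprehension will be quick?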

Solution

The complete and correct answer, as far as I know.

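for, in both for loops and list comprehensions, calls iter() on X. iter() returns an iterator if X has either an __iter__ method or a __getitem__ method. If X implements both, __iter__ is used. If it has neither, you get TypeError: 'Nothing' object is not iterable (with the name of X's class in place of Nothing).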
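The precedence rule and the error case can be checked directly. The following is a minimal Python 3 sketch; the class names Both and Nothing are made up for illustration and are not part of the original answer.

```python
class Both:
    """Hypothetical class that defines both protocols."""

    def __iter__(self):
        # iter() prefers this method when it exists.
        return iter(["from __iter__"])

    def __getitem__(self, index):
        # Only used by iter() when __iter__ is absent.
        if index > 0:
            raise IndexError(index)
        return "from __getitem__"


class Nothing:
    """Hypothetical class that defines neither protocol."""


print([item for item in Both()])  # ['from __iter__']

try:
    iter(Nothing())
except TypeError as exc:
    print(exc)  # 'Nothing' object is not iterable
```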

This implements __getitem__:

class GetItem(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, x):
        return self.data[x]

Usage:

>>> data = range(10)
>>> print [x*x for x in GetItem(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
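To see how the __getitem__ fallback drives the iteration, here is a small Python 3 sketch that records every index it is asked for. LoggingGetItem is a hypothetical variant of the class above, added only for illustration.

```python
class LoggingGetItem:
    """Hypothetical variant of GetItem that records requested indexes."""

    def __init__(self, data):
        self.data = data
        self.requested = []

    def __getitem__(self, index):
        self.requested.append(index)
        # Indexing past the end raises IndexError, which ends the iteration.
        return self.data[index]


seq = LoggingGetItem("abc")
result = [ch.upper() for ch in seq]
print(result)         # ['A', 'B', 'C']
print(seq.requested)  # [0, 1, 2, 3]
```

The indexes start at 0 and increase by one until __getitem__ raises IndexError, which is why index 3 is requested even though the data has only three items.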

An example of implementing __iter__:

class TheIterator(object):
    def __init__(self, data):
        self.data = data
        self.index = -1

    # Note: In  Python 3 this is called __next__
    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    def __iter__(self):
        return self

class Iter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Use the instance's own data, not a global variable named data
        return TheIterator(self.data)

Usage:

>>> data = range(10)
>>> print [x*x for x in Iter(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

As you can see, you need both an iterator class and an __iter__ method that returns an instance of it.

You can combine them:

class CombinedIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

Usage:

>>> well, you get it, it's all the same...

But then you can only have one iterator going at a time. OK, in this case you could just do this:

class CheatIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

But that's cheating: you are just reusing the __iter__ method of list. An easier way is to use yield and turn __iter__ into a generator:

class Generator(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x

This last approach is the one I would recommend. Easy and efficient.
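The "one iterator at a time" limitation of the combined class, and why the generator version avoids it, shows up as soon as the same object is iterated in nested loops. This is a Python 3 sketch (so the method is named __next__); the pairs helper is hypothetical and exists only for the comparison.

```python
class CombinedIter:
    """Iterable and iterator in one object: all loops share one index."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration from None


class Generator:
    """Each call to __iter__ creates a fresh, independent generator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x


def pairs(iterable):
    # Hypothetical helper: iterates the same object in two nested loops.
    return [(a, b) for a in iterable for b in iterable]


print(pairs(Generator([1, 2])))     # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(pairs(CombinedIter([1, 2])))  # [(1, 1), (1, 2)]
```

With CombinedIter the inner loop resets and then exhausts the shared index, so the outer loop ends after its first item.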

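Other tips

X must be iterable. It must implement __iter__(), which returns an iterator object. The iterator object must implement next() (__next__() in Python 3), which returns the next item each time it is called, or raises StopIteration if there is no next item.

Lists, tuples and generators are all iterable.

Note that a plain for statement uses the same mechanism.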

In response to the comments on the question: reading the source is not the best idea in this case. The code responsible for executing compiled bytecode (ceval.c) is not very readable for someone seeing the Python sources for the first time. Here is the snippet that implements iteration in for loops:

   TARGET(FOR_ITER)
        /* before: [iter]; after: [iter, iter()] *or* [] */
        v = TOP();

        /*
          Here tp_iternext corresponds to next() in Python
        */
        x = (*v->ob_type->tp_iternext)(v); 
        if (x != NULL) {
            PUSH(x);
            PREDICT(STORE_FAST);
            PREDICT(UNPACK_SEQUENCE);
            DISPATCH();
        }
        if (PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(
                            PyExc_StopIteration))
                break;
            PyErr_Clear();
        }
        /* iterator ended normally */
        x = v = POP();
        Py_DECREF(v);
        JUMPBY(oparg);
        DISPATCH();

To find out what actually happens here you need to dive into a bunch of other files that are not much more readable. So in cases like this, I think the documentation and sites like SO are the first place to go, while the source should be consulted only for implementation details that are not covered elsewhere.
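A lighter-weight way to connect a Python loop to the FOR_ITER opcode, without reading ceval.c, is the standard dis module. The sketch below assumes CPython; opcode names are implementation details and the full instruction list varies between versions, so no exact output is shown. The function name loop is made up for the example.

```python
import dis


def loop(X):
    # An ordinary for loop whose bytecode we want to inspect.
    for o in X:
        pass


# Collect the opcode names that CPython compiled the loop into.
opnames = [ins.opname for ins in dis.get_instructions(loop)]

# Show only the iteration-related opcodes, if present in this version.
print([name for name in opnames if name in ("GET_ITER", "FOR_ITER")])
```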

X must be an iterable object, meaning it needs to have an __iter__() method.

So, to start a for..in loop, or a list comprehension, first X's __iter__() method is called to obtain an iterator object; then that object's next() method is called for each iteration until StopIteration is raised, at which point the iteration stops.
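The sequence of calls described above can be written out by hand. This is a rough Python 3 equivalent of a for loop, using the built-ins iter() and next(); manual_for is a hypothetical helper name, and the real loop is implemented in the interpreter rather than like this.

```python
def manual_for(X, body):
    """Roughly what `for o in X: body(o)` does, spelled out."""
    iterator = iter(X)            # calls X.__iter__() (or the __getitem__ fallback)
    while True:
        try:
            o = next(iterator)    # calls iterator.__next__()
        except StopIteration:
            break                 # the iterator is exhausted, so the loop ends
        body(o)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```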

I'm not sure what your third question means, or how to give a meaningful answer to your fourth question, except that your iterator should not construct the entire list in memory at once.
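As an illustration of not building everything in memory, here is a Python 3 sketch of an iterable that produces items only on demand. The Squares class and its produced counter are assumptions made for the example, not something from the question.

```python
import itertools


class Squares:
    """Hypothetical iterable that yields squares lazily."""

    def __init__(self, limit):
        self.limit = limit
        self.produced = 0  # counts how many items were actually computed

    def __iter__(self):
        for n in range(self.limit):
            self.produced += 1
            yield n * n


squares = Squares(1_000_000)
first_three = list(itertools.islice(squares, 3))
print(first_three)       # [0, 1, 4]
print(squares.produced)  # 3
```

Only the three requested items are computed; a version that built a full list in __iter__ would compute all one million before returning the first.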

Maybe this helps (tutorial http://docs.python.org/tutorial/classes.html Section 9.9):

Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method next() which accesses elements in the container one at a time. When there are no more elements, next() raises a StopIteration exception which tells the for loop to terminate.

To answer your questions:

How does the mechanism behind in work?

It is the exact same mechanism as used for ordinary for loops, as others have already noted.

Which functions/methods within X does it call?

As noted in a comment below, it calls iter(X) to get an iterator. If X has a method function __iter__() defined, this will be called to return an iterator; otherwise, if X defines __getitem__(), this will be called repeatedly to iterate over X. See the Python documentation for iter() here: http://docs.python.org/library/functions.html#iter

If X can comply to more than one method, what's the precedence?

I'm not sure what your question is here, exactly, but Python has standard rules for how it resolves method names, and they are followed here. Here is a discussion of this:

Method Resolution Order (MRO) in new style Python classes

How to write an efficient X, so that list comprehension will be quick?

I suggest you read up more on iterators and generators in Python. One easy way to make any class support iteration is to write its __iter__() method as a generator function. Here is a discussion of generators:

http://linuxgazette.net/100/pramode.html

License: CC-BY-SA with attribution

Not affiliated with StackOverflow