Python: the mechanism behind list comprehension

https://stackoverflow.com/questions/4844010

27-10-2019
Question

When you use a list comprehension or the in keyword in a for loop context, i.e.:

for o in X:
    do_something_with(o)

or

l=[o for o in X]

How does the mechanism behind in work?
Which functions/methods within X does it call?
If X can comply with more than one method, what is the precedence?
How do I write an efficient X, so that the list comprehension will be quick?

Solution

The, afaik, complete and correct answer.

for, in both for loops and list comprehensions, calls iter() on X. iter() returns an iterator if X has either an __iter__ method or a __getitem__ method. If it implements both, __iter__ is used. If it has neither, you get TypeError: 'Nothing' object is not iterable.
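The precedence rule can be checked directly. The following is a minimal sketch in Python 3 syntax; the class names Both and Neither are made up for illustration and do not come from the answer.

```python
class Both:
    """Hypothetical class that defines both protocols."""

    def __iter__(self):
        # iter() should pick this method when both are present.
        return iter(["from __iter__"])

    def __getitem__(self, index):
        # Only reached if iter() fell back to the sequence protocol.
        if index > 0:
            raise IndexError(index)
        return "from __getitem__"


class Neither:
    """Hypothetical class that defines neither protocol."""


print(list(Both()))  # ['from __iter__']

try:
    iter(Neither())
except TypeError as exc:
    # The exact message wording depends on the Python version.
    print("TypeError:", exc)
```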

This implements a __getitem__:

class GetItem(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, x):
        return self.data[x]

Usage:

>>> data = range(10)
>>> print [x*x for x in GetItem(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
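To see what the __getitem__ fallback actually does, one can record the indices that are requested. This is a sketch in Python 3 syntax; LoggingGetItem and its requested attribute are hypothetical names added for the demonstration.

```python
class LoggingGetItem:
    """Like GetItem above, but remembers every index it was asked for."""

    def __init__(self, data):
        self.data = data
        self.requested = []

    def __getitem__(self, index):
        self.requested.append(index)
        # Indexing past the end raises IndexError, which ends the iteration.
        return self.data[index]


source = LoggingGetItem([1, 2, 3])
squares = [x * x for x in source]

print(squares)           # [1, 4, 9]
print(source.requested)  # [0, 1, 2, 3]
```

The last requested index (3) is the one that raised IndexError; the iteration machinery treats that as the end of the sequence.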

This is an example of implementing __iter__:

class TheIterator(object):
    def __init__(self, data):
        self.data = data
        self.index = -1

    # Note: In Python 3 this is called __next__
    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

    def __iter__(self):
        return self

class Iter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return TheIterator(self.data)

Usage:

>>> data = range(10)
>>> print [x*x for x in Iter(data)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

As you can see, you need to implement both an iterator and an __iter__ method that returns the iterator.

You can combine them:

class CombinedIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1
        return self

    def next(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration

Usage:

>>> well, you get it, it's all the same...

But then you can only have one iterator running at a time. OK, in this case you could just do this:

class CheatIter(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

But that's cheating, because you are just reusing the __iter__ method of the underlying list. An easier way is to use yield and turn __iter__ into a generator:

class Generator(object):
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x

The latter is what I would recommend. Easy and efficient.
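The remark that the combined variant allows only one iteration at a time can be made concrete with a nested loop over the same object. This is a sketch in Python 3 syntax (so next is spelled __next__); SharedState, FreshGenerator and pairs are illustrative names that mirror CombinedIter and Generator above.

```python
class SharedState:
    """Same idea as CombinedIter: the object is its own iterator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        self.index = -1  # every new loop resets the one shared position
        return self

    def __next__(self):
        self.index += 1
        try:
            return self.data[self.index]
        except IndexError:
            raise StopIteration


class FreshGenerator:
    """Same idea as Generator: each __iter__ call creates a new generator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for x in self.data:
            yield x


def pairs(iterable):
    # The inner loop calls iter() on the same object as the outer loop.
    return [(a, b) for a in iterable for b in iterable]


print(pairs(SharedState([0, 1])))     # [(0, 0), (0, 1)]
print(pairs(FreshGenerator([0, 1])))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

With the shared index, the inner loop exhausts the position that the outer loop also depends on, so the outer loop stops after its first element.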

Other tips

X must be iterable. It must implement __iter__(), which returns an iterator object. The iterator object must implement next() (__next__() in Python 3), which returns the next element on each call or raises StopIteration when there is no next element.

Lists, tuples and generators are all iterable.

Note that the plain for statement uses the same mechanism.
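Written out by hand, the protocol described here looks roughly like the following. It is a sketch in Python 3 of what a for loop does conceptually, not the interpreter's actual implementation; manual_for is a hypothetical helper name.

```python
def manual_for(iterable, action):
    """Call action(item) for each item, using only iter() and next()."""
    iterator = iter(iterable)      # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)  # step 2: ask for the next element
        except StopIteration:      # step 3: stop when the iterator is exhausted
            break
        action(item)


collected = []
manual_for((1, 2, 3), collected.append)
print(collected)  # [1, 2, 3]
```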

In answer to the comments on the question: reading the source is not the best idea in this case. The code responsible for executing compiled bytecode (ceval.c) is not easy to follow for someone who is seeing the Python sources for the first time. Here is the snippet that implements iteration in for loops:

   TARGET(FOR_ITER)
        /* before: [iter]; after: [iter, iter()] *or* [] */
        v = TOP();

        /*
          Here tp_iternext corresponds to next() in Python
        */
        x = (*v->ob_type->tp_iternext)(v); 
        if (x != NULL) {
            PUSH(x);
            PREDICT(STORE_FAST);
            PREDICT(UNPACK_SEQUENCE);
            DISPATCH();
        }
        if (PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(
                            PyExc_StopIteration))
                break;
            PyErr_Clear();
        }
        /* iterator ended normally */
        x = v = POP();
        Py_DECREF(v);
        JUMPBY(oparg);
        DISPATCH();

To find out what actually happens here you need to dive into a bunch of other files that are not much easier to read. So I think that in such cases the documentation and sites like SO are the first place to go, while the source should be checked only for implementation details that are not covered there.
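A lighter alternative to reading ceval.c is to look at the bytecode from Python itself with the standard dis module. The sketch below assumes CPython 3; opcode names are implementation details and may differ between versions, so it only filters for names containing ITER rather than promising a specific listing. The function loop is a throwaway example.

```python
import dis


def loop(xs):
    for x in xs:
        pass


# Collect the opcode names the compiler emitted for the function above.
opnames = [ins.opname for ins in dis.get_instructions(loop)]

# On the CPython versions I am aware of, this includes FOR_ITER,
# the opcode handled by the C snippet shown above.
print([name for name in opnames if "ITER" in name])
```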

X must be an iterable object, meaning it needs to have an __iter__() method.

So, to start a for..in loop, or a list comprehension, first X's __iter__() method is called to obtain an iterator object; then that object's next() method is called for each iteration until StopIteration is raised, at which point the iteration stops.

I'm not sure what your third question means, and how to provide a meaningful answer to your fourth question except that your iterator should not construct the entire list in memory at once.
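As a sketch of what "not constructing the entire list in memory" can look like, the class below produces its values one at a time from a generator. It is written in Python 3; Squares and its limit parameter are hypothetical and only serve the illustration.

```python
from itertools import islice


class Squares:
    """Yields n*n for n in range(limit) without storing the results."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # A generator computes each value only when it is requested.
        for n in range(self.limit):
            yield n * n


# Even with a huge limit, taking five items does only five steps of work.
first_five = list(islice(Squares(10**12), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```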

Maybe this helps (tutorial http://docs.python.org/tutorial/classes.html Section 9.9):

Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method next() which accesses elements in the container one at a time. When there are no more elements, next() raises a StopIteration exception which tells the for loop to terminate.

To answer your questions:

How does the mechanism behind in work?

It is the exact same mechanism as used for ordinary for loops, as others have already noted.

Which functions\methods within X does it call?

As noted in a comment below, it calls iter(X) to get an iterator. If X has a method function __iter__() defined, this will be called to return an iterator; otherwise, if X defines __getitem__(), this will be called repeatedly to iterate over X. See the Python documentation for iter() here: http://docs.python.org/library/functions.html#iter

If X can comply with more than one method, what's the precedence?

I'm not sure what your question is here, exactly, but Python has standard rules for how it resolves method names, and they are followed here. Here is a discussion of this:

Method Resolution Order (MRO) in new style Python classes

How to write an efficient X, so that list comprehension will be quick?

I suggest you read up more on iterators and generators in Python. One easy way to make any class support iteration is to write its __iter__() method as a generator function. Here is a discussion of generators:

http://linuxgazette.net/100/pramode.html

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow