def flatten(lst):
    for element in lst:
        if hasattr(element, "__iter__"):
            yield from flatten(element)
        elif element is not None:
            yield element

new_list = flatten(L)
I'll break this down for you, starting with generators. The yield keyword is sister to return, but it behaves quite differently. Both are used to bring values out of a function into its calling scope, but yield allows you to jump back into the function afterwards and carry on from where it left off! As an example, below is a generator that accepts a list of numbers and produces the square of each number in the list.
def example_generator(number_list):
    for number in number_list:
        yield number**2
>>> gen = example_generator([1,2,3])
>>> type(gen)
<class 'generator'>
>>> next(gen) # next() is used to get the next value from an iterator
1
>>> next(gen)
4
>>> next(gen)
9
>>> next(gen)
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    next(gen)
StopIteration
Generators are one-time use, however. As you can see, once I reached the end of the generator, it raised a StopIteration exception. If I build it again and run through it with a loop, then try to run through it AGAIN...
>>> gen = example_generator([1,2,3]) # remember this is a new generator, we JUST made it
>>> for item in gen:
...     print(item)
1
4
9
>>> for item in gen:
...     print(item)
>>>
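It doesn't do anything the second time. The generator is exhausted. That's the downside. The upside is that a generator produces its values one at a time instead of building them all up front, so it usually needs far less memory than a list, and it can save time when you don't end up needing every value.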
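To make the memory point concrete, here is a rough sketch, assuming CPython. The names numbers, squares_list and squares_gen are made up for illustration, and sys.getsizeof only measures the container itself (not the integers inside it), so treat the numbers it prints as an indication rather than a benchmark.

```python
import sys

def example_generator(number_list):
    for number in number_list:
        yield number ** 2

numbers = range(100_000)  # hypothetical input, just large enough to see a difference

squares_list = [n ** 2 for n in numbers]   # every square is computed and stored right now
squares_gen = example_generator(numbers)   # nothing has been computed yet

# The list has to hold a reference to every result; the generator only
# holds its own paused state.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# The generator still hands out the same values, one at a time.
print(sum(squares_gen) == sum(squares_list))
```

After the sum() call the generator is exhausted, exactly like in the loop example, while the list can still be reused.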
yield also allows you to use another keyword: from. That's what I did above to handle nested lists (hasattr(element,"__iter__") just means that the element has an attribute .__iter__, which means it can be iterated upon using something like a for loop). You give yield from another generator, and it yields each element from THAT generator in turn. For example:
def flatten_lite(lst):
    for element in lst:
        if type(element) is list: # more readable, IMO
            yield from flatten_lite(element)
        else:
            yield element

a = flatten_lite([1,2,3,[4,5,6,[7],8],9])
Here's what it does in turn:
for element in [1,2,3,[4,5,6,[7],8],9]:
    # element == 1
    if element is of type list: # it's not, skip this
    else: yield element # which is 1
    :: NEXT ITERATION ::
    # element == 2, same as before
    :: NEXT ITERATION ::
    # element == 3, same as before
    :: NEXT ITERATION ::
    # element == [4,5,6,[7],8]
    if element is of type list: # it is!!
        yield from flatten_lite([4,5,6,[7],8])
        :: STOP EXECUTION UNTIL WE GET A VALUE FROM THAT NEW GENERATOR ::
        >>> NEW GENERATOR
        for element in [4,5,6,[7],8]:
            # element is 4
            yield 4
            :: THE OUTER GENERATOR YIELDS 4 ::
            :: NEXT ITERATION ::
            # element is 5
            yield 5
            :: THE OUTER GENERATOR YIELDS 5 ::
            :: NEXT ITERATION ::
            # element is 6
            yield 6
            :: THE OUTER GENERATOR YIELDS 6 ::
            :: NEXT ITERATION ::
            # element is [7]
            if element is of type list: # [7] is a list!
                yield from flatten_lite([7])
                :: STOP EXECUTION UNTIL WE GET A VALUE FROM THAT NEW GENERATOR ::
                # etc etc
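If the walkthrough above is hard to follow, it may help to see that, for plain iteration like this, yield from behaves like an inner for loop that yields each item. Below is a small sketch of that idea. flatten_lite_loop and nested are names I made up for the comparison, and yield from does a bit more than this in general (it also forwards send() and throw()), which doesn't matter here.

```python
def flatten_lite(lst):
    for element in lst:
        if type(element) is list:
            yield from flatten_lite(element)
        else:
            yield element

def flatten_lite_loop(lst):
    # Hypothetical twin of flatten_lite: the delegation is written out by hand.
    for element in lst:
        if type(element) is list:
            for inner in flatten_lite_loop(element):
                yield inner  # pass each inner value up to our own caller
        else:
            yield element

nested = [1, 2, 3, [4, 5, 6, [7], 8], 9]
print(list(flatten_lite(nested)))       # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(flatten_lite_loop(nested)))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```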
So basically the code above says (in pseudocode):
flatten is a function that accepts parameter: lst
    for each element in lst:
        if element can be iterated on:
            yield every element in turn from the generator created
            by this function called on the element instead of the
            main list
        if it's not, and isn't None:
            yield element
When you call it, it builds a generator that can be iterated upon. To make it into an actual list, you'll have to do list(flatten(L)), but in most cases you don't need that.
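One caveat worth a sketch: strings also have .__iter__, and each character of a string is itself a string, so flatten as written would keep recursing on any string element until Python raises RecursionError. The variant below is only one possible way around that, assuming you want strings (and bytes) treated as single values. flatten_safe and the sample data in L are made up for illustration.

```python
def flatten_safe(lst):
    # Hypothetical variant of flatten: the str/bytes check is my addition.
    for element in lst:
        if hasattr(element, "__iter__") and not isinstance(element, (str, bytes)):
            yield from flatten_safe(element)
        elif element is not None:
            yield element

L = [1, "ab", [2, None, ("cd", [3])], None]  # made-up sample data
new_list = list(flatten_safe(L))             # list() turns the generator into a real list
print(new_list)  # [1, 'ab', 2, 'cd', 3]
```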
Is that any clearer?