Pythonic way of removing similar items from a list

https://stackoverflow.com/questions/20995966

25-09-2022

Question

I have a list of items from which I want to remove, in every run of consecutive equal values, all items except the first and the last one. For example:

listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]

First three elements "1, 1, 1" are similar, so remove the middle "1".
Next two zeros are unmodified.
One is just one. Leave unmodified.
Four zeros. Remove items in-between the first and the last.

Resulting in:

listOut = [1, 1, 0, 0, 1, 0, 0, 1]

The way of doing this in C++ is very obvious, but it looks very different from the Python coding style. Is there a more Pythonic way, or is that the only way?

Basically, I am just removing redundant points on a graph where the "y" value does not change.

Solution

Use itertools.groupby() to group your values:

from itertools import groupby

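listOut = []
for value, group in groupby(listIn):
    # Each group is one run of equal values; always keep its first item.
    listOut.append(next(group))
    # Keep one more item if the run has any; since the values are equal,
    # it stands in for the last item of the run.
    for i in group:
        listOut.append(i)
        break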

Or, as a generator that yields items lazily instead of building the whole list up front:

from itertools import groupby

def reduced(it):
    for value, group in groupby(it):
        yield next(group)
        for i in group:
            yield i
            break

Demo:

>>> listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
>>> list(reduced(listIn))
[1, 1, 0, 0, 1, 0, 0, 1]
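
For the graph use case mentioned in the question, the items are presumably (x, y) points rather than bare values. In that case taking the first two items of each run is no longer enough, because the first and last point of a run have different x coordinates. A minimal sketch, assuming the points are (x, y) tuples; the name reduce_points and the sample data are hypothetical, not from the original answer:

```python
from itertools import groupby
from operator import itemgetter

_MISSING = object()  # sentinel: "this run had no second point"


def reduce_points(points):
    """Yield the first and last point of every run of equal y values."""
    for _, group in groupby(points, key=itemgetter(1)):
        yield next(group)          # first point of the run
        last = _MISSING
        for last in group:         # exhaust the run, remembering its last point
            pass
        if last is not _MISSING:   # runs of length 1 have no separate last point
            yield last


# Hypothetical sample data: x is simply the index of each y value.
points = list(enumerate([1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]))
print(list(reduce_points(points)))
# [(0, 1), (2, 1), (3, 0), (4, 0), (5, 1), (6, 0), (9, 0), (10, 1)]
```

The y values of the result match listOut from the question, while the x values show that the interior points of each run are the ones dropped.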

Other tips

One-liner:

from functools import reduce

listOut = reduce(lambda x, y: x if x[-1] == y and x[-2] == y else x + [y], listIn[2:], listIn[:2])
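
The reduce one-liner above is compact, but x + [y] copies the accumulated list on every step, so it may get slow on long inputs. The same rule can be written as a plain loop that appends in place. This is only a sketch of the idea; the function name reduced_loop is hypothetical:

```python
def reduced_loop(values):
    """Keep at most two items from each run of consecutive equal values."""
    out = []
    for y in values:
        # Skip y only when the two most recently kept items already equal it.
        if len(out) >= 2 and out[-1] == y and out[-2] == y:
            continue
        out.append(y)
    return out


listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
print(reduced_loop(listIn))  # [1, 1, 0, 0, 1, 0, 0, 1]
```

The explicit length check also makes the loop safe for empty and single-item lists.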

Here is a NumPy-based solution to the problem; it should be a lot faster for large arrays than one based on itertools. If you are doing signal processing of any kind, there is plenty of reason to be using NumPy anyway.

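import numpy as np
a = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1], dtype=int)

change = a[:-1] != a[1:]            # True wherever two neighbours differ
I = np.zeros_like(a, dtype=bool)    # mask of the elements to keep
I[0] = I[-1] = True                 # the two endpoints are always kept
I[:-1] |= change                    # last element of each run
I[1:] |= change                     # first element of each run
print(a[I])                         # [1 1 0 0 1 0 0 1]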
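
Whether the NumPy version is actually faster depends on the input size and on whether the data already lives in an array, so it is worth measuring. A rough timing sketch, assuming random 0/1 data of length 100,000; the function names, the seed, and the data size are arbitrary choices for illustration, and the mask here always keeps both endpoints of the array:

```python
import timeit
from itertools import groupby, islice

import numpy as np


def reduced_itertools(values):
    # Keep at most the first two items of each run of equal values.
    return [v for _, g in groupby(values) for v in islice(g, 2)]


def reduced_numpy(a):
    a = np.asarray(a)
    if a.size == 0:
        return a
    change = a[:-1] != a[1:]          # True wherever two neighbours differ
    keep = np.zeros(a.shape, dtype=bool)
    keep[0] = keep[-1] = True         # endpoints are always kept
    keep[:-1] |= change               # last element of each run
    keep[1:] |= change                # first element of each run
    return a[keep]


rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=100_000)
data_list = data.tolist()

# Both approaches should agree before their timings are compared.
assert reduced_itertools(data_list) == reduced_numpy(data).tolist()

print("itertools:", timeit.timeit(lambda: reduced_itertools(data_list), number=5))
print("numpy:    ", timeit.timeit(lambda: reduced_numpy(data), number=5))
```

The timings are printed rather than quoted here because they vary by machine and NumPy version.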

Licensed under: CC-BY-SA with attribution

Not affiliated with Stack Overflow