Question

I tried searching and couldn't find this exact situation, so apologies if it exists already.

I'm trying to remove duplicates from a list as well as the original item I'm searching for. If I have this:

ls = [1, 2, 3, 3]

I want to end up with this:

ls = [1, 2]

I know that using set will remove duplicates like this:

print set(ls)  # set([1, 2, 3])

But it still retains that 3 element which I want removed. I'm wondering if there's a way to remove the duplicates and original matching items too.


Solution

Use a list comprehension and list.count:

>>> ls = [1, 2, 3, 3]
>>> [x for x in ls if ls.count(x) == 1]
[1, 2]
>>>

Here is a reference on both of those.


Edit:

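@Anonymous made a good point below. The above solution is fine for small lists but may become slow with larger ones, because list.count scans the whole list once for every element, so the total work grows quadratically with the length of the list.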

For large lists, you can do this instead:

>>> from collections import Counter
>>> ls = [1, 2, 3, 3]
>>> c = Counter(ls)
>>> [x for x in ls if c[x] == 1]
[1, 2]
>>>

Here is a reference on collections.Counter.
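
If you need this in more than one place, the Counter approach can be wrapped in a small helper. This is only a sketch: the name keep_unique is made up for illustration, and it assumes the items are hashable (Counter stores them as dictionary keys) and that the input is a list or other sequence that can be iterated twice.

```python
from collections import Counter

def keep_unique(items):
    """Return the items that occur exactly once, in their original order.

    Hypothetical helper; assumes `items` is a sequence of hashable values.
    """
    counts = Counter(items)  # one pass to tally every item
    # second pass keeps only the items whose tally is exactly 1
    return [x for x in items if counts[x] == 1]

print(keep_unique([1, 2, 3, 3]))          # [1, 2]
print(keep_unique(['a', 'b', 'a', 'c']))  # ['b', 'c']
```

Both passes are linear in the length of the list, which is what makes this faster than calling list.count for every element.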

OTHER TIPS

If equal items are contiguous, then you can use groupby, which saves building an auxiliary data structure in memory:

from itertools import groupby, islice

data = [1, 2, 3, 3]
# could also use `sorted(data)` if need be...
new = [k for k, g in groupby(data) if len(list(islice(g, 2))) == 1]
# [1, 2]
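
When equal items are not already next to each other, the groupby approach only works after sorting. A minimal sketch of that variant follows; the function name keep_unique_sorted is hypothetical, and it assumes the items can be compared with each other. Note that sorted builds a full copy of the data and the result comes back in sorted order rather than the original order, so the memory saving mentioned above no longer applies.

```python
from itertools import groupby, islice

def keep_unique_sorted(data):
    """Return the items that occur exactly once, in sorted order.

    Hypothetical helper; assumes the items in `data` are orderable.
    """
    result = []
    # sorting puts equal items next to each other so groupby sees one run per value
    for key, group in groupby(sorted(data)):
        # pull at most two items from the run; a run of length one is unique
        if len(list(islice(group, 2))) == 1:
            result.append(key)
    return result

print(keep_unique_sorted([3, 1, 3, 2]))  # [1, 2]
```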
Licensed under: CC-BY-SA with attribution