Comparing consecutive elements in a list and discarding those that do not meet a condition, using Python

StackOverflow https://stackoverflow.com/questions/22093001

Question

So I have this list in Python that I am showing below, and for the sake of simplicity I'll name it as p:

 [[11 10]
  [12  9]
  [13  9]
  [13 10]
  [14  8]
  [14 10]
  [15  7]
  [15  9]
  [16  8]
  [17  7]
  [18  2]
  [18  8]
  [19  1]
  [19  7]
  [20  1]
  [20  2]
  [21  2]
  [21  4]
  [22  1]
  [22  3]
  [23  4]
  [24  3]
  [25  4]
  [25  6]
  [26  3]
  [26  5]
  [27  5]
  [27  6]
  [28  8]
  [28 10]
  [29  6]
  [30  5]
  [31  7]
  [31  9]
  [32  1]
  [32  2]
  [33  4]
  [33  6]
  [34  3]
  [34  5]] 

What I am trying to do is compare consecutive pairs and keep only the elements that share the same p[0] with a neighbour. In this sense, one can note that the pairs with p[0] = 11, 12, 16, 17, 23, 24, 29, 30 will not survive, and p will become:

[[13  9]
 [13 10]
 [14  8]
 [14 10]
 [15  7]
 [15  9]
 [18  2]
 [18  8]
 [19  1]
 [19  7]
 [20  1]
 [20  2]
 [21  2]
 [21  4]
 [22  1]
 [22  3]
 [25  4]
 [25  6]
 [26  3]
 [26  5]
 [27  5]
 [27  6]
 [28  8]
 [28 10]
 [31  7]
 [31  9]
 [32  1]
 [32  2]
 [33  4]
 [33  6]
 [34  3]
 [34  5]] 

What would be a way of doing this in Python? I'd be glad if somebody could give me an idea.

Solution

Note that the original question was "compare consecutive pairs and then let only the elements with the same p[0]" which means that one should not sort the original list. If the original list is sorted, then pairs which would otherwise not be "consecutive" will be picked up and kept. While the sample list was sorted, I will not assume that to be the case so that this will also handle a case such as

[[13, 5],
 [13, 2],
 [14, 9],
 [13, 6]]

to generate

[[13, 5],
 [13, 2]]

I will show a simple loop to make it easier to understand, although a list comprehension can make it shorter. Note that Counter (from collections) requires Python 2.7 or later.

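prev = None
newp = []
length = len(p) - 1
for i in range(length):
  # keep p[i] if its first value matches the next or the previous element
  if p[i][0] == p[i+1][0] or p[i][0] == prev:
    newp.append(p[i])
  # always record the previous first value, even when p[i] is dropped
  prev = p[i][0]

# the last element only has a previous neighbour to compare with
if length > 0 and p[length][0] == p[length-1][0]:
  newp.append(p[length])

This will create the new list as you want.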
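As a quick check of the neighbour-comparison idea on the unsorted example above, here is a minimal sketch. The function name keep_consecutive and the variable sample are made up for illustration and are not part of the original answer.

```python
def keep_consecutive(rows):
    """Keep rows whose first value equals that of an adjacent row."""
    kept = []
    for i, row in enumerate(rows):
        # compare with the previous row, if there is one
        same_as_prev = i > 0 and rows[i - 1][0] == row[0]
        # compare with the next row, if there is one
        same_as_next = i + 1 < len(rows) and rows[i + 1][0] == row[0]
        if same_as_prev or same_as_next:
            kept.append(row)
    return kept


sample = [[13, 5], [13, 2], [14, 9], [13, 6]]
print(keep_consecutive(sample))  # [[13, 5], [13, 2]]
```

The trailing [13, 6] is dropped because its only neighbour starts with 14, even though 13 occurs earlier in the list.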

OTHER TIPS

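Writing a generator makes it quite readable (you name what you are doing):

def find_pairs(items):
    last = (None, None)
    last_yielded = False  # avoids yielding the same item twice in runs of 3 or more
    for item in items:
        if item[0] == last[0]:
            if not last_yielded:
                yield last
            yield item
            last_yielded = True
        else:
            last_yielded = False
        last = item

print(list(find_pairs(p)))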

Assuming the input list is named lst, and that you want to drop every item whose first element occurs only once in the whole list, you can do this:

from collections import Counter
# count how often each first element occurs anywhere in the list
cnt = Counter(item[0] for item in lst)
result = [p for p in lst if cnt[p[0]] > 1]
print(result)

Output:

[[13, 9], [13, 10], [14, 8], [14, 10], [15, 7], [15, 9], [18, 2], [18, 8], [19, 1], [19, 7], [20, 1], [20, 2], [21, 2], [21, 4], [22, 1], [22, 3], [25, 4], [25, 6], [26, 3], [26, 5], [27, 5], [27, 6], [28, 8], [28, 10], [31, 7], [31, 9], [32, 1], [32, 2], [33, 4], [33, 6], [34, 3], [34, 5]]

But the code above is indeed incorrect -- I overlooked the question's requirement on consecutiveness. For the sake of completeness, let me write the supposedly correct solution:

from itertools import groupby
from operator import itemgetter
result = []
# groupby only groups runs of adjacent items with the same first element
for _, k in groupby(lst, itemgetter(0)):
    k = list(k)
    if len(k) > 1:
        result.extend(k)
print(result)

They should produce the same result for this specific example though. You should test with a trickier input such as

[[11,0],[11,0],[12,0],[13,0],[12,0],[13,0]]

and only the second solution will give the correct answer based on your requirements.
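To make the difference concrete, the following sketch runs both approaches on the trickier input. The variable names tricky, global_result and consecutive_result are chosen here for illustration only.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

tricky = [[11, 0], [11, 0], [12, 0], [13, 0], [12, 0], [13, 0]]

# Global filter: counts every occurrence of a first value, adjacent or not.
counts = Counter(row[0] for row in tricky)
global_result = [row for row in tricky if counts[row[0]] > 1]

# Consecutive filter: groupby only merges runs of adjacent equal keys.
consecutive_result = []
for _, run in groupby(tricky, key=itemgetter(0)):
    run = list(run)
    if len(run) > 1:
        consecutive_result.extend(run)

print(global_result)       # all six rows survive
print(consecutive_result)  # [[11, 0], [11, 0]]
```

The Counter version keeps the 12s and 13s because each occurs twice overall, while groupby sees them as four separate runs of length one.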

Possible solutions using list comprehension:

  1. The idea is to look forward and backward and select an element if either neighbour matches. To avoid an IndexError at the boundaries, guard with i > 0 and i < len(A)-1

    >>> ans = [el for i, el in enumerate(A) if (i > 0 and A[i-1][0] == el[0]) or \
    (i < len(A)-1 and A[i+1][0] == el[0])]

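  2. Prepend and append dummy values to avoid handling of IndexError. Iterate only over the original positions so that the dummies themselves are never selected

    >>> padded = [[None, None]] + A + [[None, None]]
    >>> ans = [padded[i] for i in range(1, len(padded)-1) if padded[i-1][0] == padded[i][0] or \
    padded[i+1][0] == padded[i][0]]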

  3. Use zip to get next and previous element

    >>> padded = [[None, None]] + A + [[None, None]]
    >>> ans = [e2 for e1, e2, e3 in zip(padded, padded[1:], padded[2:]) if e2[0] == e1[0] or \
    e2[0] == e3[0]]
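The zip idea can also be wrapped in a function that leaves the input untouched. This is only a sketch: the name keep_adjacent_matches is hypothetical, and using a fresh object() as the padding value instead of None is an assumption made here so that a row whose first element happens to be None is not matched against the padding.

```python
def keep_adjacent_matches(rows):
    """Return the rows whose first value equals that of a direct neighbour."""
    sentinel = object()  # compares unequal to every real key
    keys = [sentinel] + [row[0] for row in rows] + [sentinel]
    # keys[i] is the key before rows[i]; keys[i + 2] is the key after it
    return [
        row
        for prev_key, row, next_key in zip(keys, rows, keys[2:])
        if row[0] == prev_key or row[0] == next_key
    ]


head = [[11, 10], [12, 9], [13, 9], [13, 10], [14, 8], [14, 10], [15, 7], [15, 9], [16, 8]]
print(keep_adjacent_matches(head))
# [[13, 9], [13, 10], [14, 8], [14, 10], [15, 7], [15, 9]]
```

Because the padded list is built inside the function, calling it repeatedly does not stack extra dummy entries onto the original data.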

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow