Question

I have two files with the contents shown below. In the code below, if an ID matches in file1 and file2, how do I compare the second column of file1 with the corresponding second column of file2, and so on up to the nth column?

    def process(file):
        pt = []
        with open(file) as f:
            for l in f:
                # each line is expected to hold at least 3 tab-separated fields
                parts = l.strip().split('\t')
                if len(parts) < 3:
                    print('error with either your input file or field check parts')
                    break
                else:
                    pt.append(parts)
        return pt

    arr1 = process(file1)
    arr2 = process(file2)
    for arr in arr1:
        if arr[0] in arr2:
            # then match arr1[1] with arr2[1] and so on and get the results
            pass

file1:

ID674097256 voice tech department
ID674097257 NA NA
ID674097399 chat  order processing department

file2:

ID674097212 voice tech department
ID674097257 NA NA
ID674097399 chat  new processing department

Solution

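Use zip to walk both lists in parallel. Note that zip pairs rows by position, so this only finds matching IDs that sit on the same line number in both files.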

for (a1, a2) in zip(arr1, arr2):
    if a1[0] == a2[0]:
        pass  # do something, e.g. compare the remaining columns
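
Once two rows share an ID, the remaining columns can be compared pairwise with a second zip. The following is only a minimal sketch: it assumes each row is a list of strings as returned by process(), that the sample lines split on tabs into three fields, and the helper name compare_columns is hypothetical.

```python
def compare_columns(row1, row2):
    """Return (column_index, value1, value2) for every non-ID column that differs."""
    diffs = []
    # Skip column 0 (the ID). zip stops at the shorter row, so extra
    # trailing columns in the longer row are not reported here.
    for idx, (v1, v2) in enumerate(zip(row1[1:], row2[1:]), start=1):
        if v1 != v2:
            diffs.append((idx, v1, v2))
    return diffs


# Assumed tab-split form of the ID674097399 lines from file1 and file2.
row_a = ["ID674097399", "chat", "order processing department"]
row_b = ["ID674097399", "chat", "new processing department"]
print(compare_columns(row_a, row_b))
# [(2, 'order processing department', 'new processing department')]
```

An empty list means every compared column matched.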

OTHER TIPS

The question is not fully clear to me, but I think you are trying to do something like this:

for arr in arr1:
    for a in arr2:
        if a[0] == arr[0]:
            print(a)
            print(arr)
            # compare the rest of the fields

However, this nested loop checks every row of one file against every row of the other, so it may not be the best option in terms of performance. Consider sorting the files first, and have a look at questions such as "Compare two different files line by line and write the difference in third file - Python".
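
As an alternative to sorting (a swapped-in technique, not something the answer above proposes), one of the files can be indexed by ID in a dictionary so that each lookup avoids a second loop. This is a sketch under the assumption that IDs are unique within file2; if an ID repeats, the last row wins. The names index_by_id and matching_ids are hypothetical, and the sample rows assume the lines split on tabs into three fields.

```python
def index_by_id(rows):
    """Map each row's first column (the ID) to the full row."""
    return {row[0]: row for row in rows}


def matching_ids(rows1, rows2):
    """Yield (row1, row2) pairs that share an ID, in rows1 order."""
    lookup = index_by_id(rows2)
    for row1 in rows1:
        row2 = lookup.get(row1[0])
        if row2 is not None:
            yield row1, row2


# Assumed tab-split form of the sample files from the question.
arr1 = [
    ["ID674097256", "voice", "tech department"],
    ["ID674097257", "NA", "NA"],
    ["ID674097399", "chat", "order processing department"],
]
arr2 = [
    ["ID674097212", "voice", "tech department"],
    ["ID674097257", "NA", "NA"],
    ["ID674097399", "chat", "new processing department"],
]

for r1, r2 in matching_ids(arr1, arr2):
    # True when every column after the ID is identical.
    print(r1[0], r1[1:] == r2[1:])
# ID674097257 True
# ID674097399 False
```

Unlike zip over the two lists, this does not require matching IDs to appear on the same line in both files.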

If I understood you correctly, you need to match identical lines in the two files. This code may be helpful for your task:

>>> s = range(10)
>>> s
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> s2 = range(20)
>>> s2
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> matching = {}
>>> for i, k in zip(s,s2):
...     matching[i] = k
...
>>> matching
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>

This code compares each row of the first array with every row of the second array. If the rows are the same (that is, the lists are equal), the row is put into the list "rows", and duplicate rows are then removed.

    # rows are stored as tuples because lists are unhashable and cannot go into a set
    rows = [tuple(row1) for row1 in arr1 for row2 in arr2 if row1 == row2]
    rows = list(set(rows))
Licensed under: CC-BY-SA with attribution