Question

I have two long vectors of names (list.1, list.2). I want to run a loop to check whether any name in list.2 matches any name in list.1. If it does, I want to append the position of the matching name in list.1 to a vector result.

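 result <- integer(0)
 for (i in seq_along(list.2)) {
   pos <- 0                       # stays 0 if no name in list.1 matches
   for (j in seq_along(list.1)) {
     # grep() returns integer(0) when nothing matches, so length > 0 means a hit
     if (length(grep(list.2[i], list.1[j], ignore.case = TRUE)) > 0) {
       pos <- j
       break
     }
   }
   result <- append(result, pos)  # append() returns a new vector, so reassign it
 }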

The above code is pure brute force, and since my vectors are 5,000 and 60,000 names long, it could run for up to 300,000,000 iterations. How could I improve it?


Solution

which and %in% would probably be good for this task, or match, depending on what you are going for. Note that match returns, for each element of its first argument, the index of its first match in its second argument; that is, if a value occurs several times in the lookup table, only the position of its first occurrence is returned:

set.seed(123)
#  I am assuming these are the values you want to check if they are in the lookup 'table'
list2 <- sample( letters[1:10] , 10 , replace = TRUE )
[1] "c" "h" "e" "i" "j" "a" "f" "i" "f" "e"

#  I am assuming this is the lookup table
list1 <- letters[1:3]
[1] "a" "b" "c"

#  Find which position in the lookup table each value is, NA if no match
match(list2 , list1 )
[1]  3 NA NA NA NA  1 NA NA NA NA
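
To tie this back to the question, here is a minimal sketch of how match, which and %in% might be combined to replace the nested loop. It assumes exact whole-name matching (the original grep call does partial, regex-based matching, which is not the same thing), and it handles case insensitivity by lower-casing both vectors first. The names lookup and queries and their contents are made up for illustration.

```r
# Hypothetical example data (not from the question)
lookup  <- c("Alan", "Bill", "Ted", "Alice", "Carol")  # plays the role of list.1
queries <- c("carol", "Zed", "TED")                    # plays the role of list.2

# For each query, the position of its first exact match in the lookup table.
# Lower-casing both sides makes the comparison case-insensitive;
# nomatch = 0L returns 0 instead of NA when a name is not found.
result <- match(tolower(queries), tolower(lookup), nomatch = 0L)
result
# [1] 5 0 3

# Alternatively: which positions of the lookup table occur anywhere in queries
hits <- which(tolower(lookup) %in% tolower(queries))
hits
# [1] 3 5
```

Both calls are vectorised, so they avoid the explicit double loop over every pair of names.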

OTHER TIPS

This is totally what the set-operations intersect/union/setdiff() are for:

list.1 = c('Alan','Bill','Ted','Alice','Carol')
list.2 = c('Carol','Ted')
intersect(list.1, list.2)
[1] "Ted"   "Carol"

...or if you really want the indices into list.1:

match(intersect(list.1, list.2), list.1)
[1] 3 5
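
For completeness, a short sketch of the other two set operations mentioned above, using the same example vectors. The variable names only.in.1, only.in.2 and all.names are made up for illustration. Keep in mind that these functions discard duplicated values, so they suit "which names are shared" questions better than position lookups on vectors with repeats.

```r
list.1 <- c('Alan', 'Bill', 'Ted', 'Alice', 'Carol')
list.2 <- c('Carol', 'Ted')

# Names in list.1 that do not appear in list.2
only.in.1 <- setdiff(list.1, list.2)
only.in.1
# [1] "Alan"  "Bill"  "Alice"

# Names in list.2 that do not appear in list.1 (none here)
only.in.2 <- setdiff(list.2, list.1)
length(only.in.2)
# [1] 0

# Every distinct name from either vector
all.names <- union(list.1, list.2)
all.names
# [1] "Alan"  "Bill"  "Ted"   "Alice" "Carol"
```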
Licensed under: CC-BY-SA with attribution