Tom, this isn't strictly a data.table problem. Also, it's hard to know exactly what you want without the data you are using. Going on my best guess at what you want, I came up with this solution:
vin.match <- vapply(car.vins, function(x) which.min(adist(x, vin.vins)), integer(1L))
data.frame(car.vins, vin.vins=vin.vins[vin.match], vin.names=vin.names[vin.match])
# car.vins vin.vins vin.names
# 1 abcdekl abcdef NAME1
# 2 abcdeF abcdef NAME1
# 3 laskdjg laskdjf NAME2
# 4 blerghk blerghk NAME3
And here is the data I used (define these before running the code above):
vin.vins <- c("abcdef", "laskdjf", "blerghk")
vin.names <- paste0("NAME", seq_along(vin.vins))
car.vins <- c("abcdekl", "abcdeF", "laskdjg", "blerghk")
This finds, for every value in car.vins, the closest match in vin.vins, as measured by adist (generalized Levenshtein edit distance). If several candidates are tied, which.min picks the first one. I'm not sure data.table is needed for this particular step. If you provide your actual data (or a representative sample), I can provide a more targeted answer.