How to find similar words in vector of strings in R?
https://www.tutorialspoint.com/how-to-find-similar-words-in-vector-of-strings-in-r
10-09-2020
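Sometimes the strings in a vector contain spelling errors, and we want to extract the words that are similar to each other, because similar words are likely to be the correct and misspelled forms of the same word. This can be done by applying agrep, base R's approximate (fuzzy) matching function, to every element of the vector with lapply. With value=TRUE, agrep returns the matching strings themselves rather than their positions.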
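To make the call pattern concrete, here is a minimal sketch using a hypothetical vector named words (an assumption for illustration, not part of the original examples). It shows that the short form lapply(words, agrep, words, value=TRUE) passes each element as agrep's pattern argument and the whole vector as x, and that it is equivalent to an explicit anonymous function. The matches rely on agrep's default max.distance of 0.1, a fraction of the pattern length that is rounded up to a whole number of edits.

```r
# Hypothetical sample vector (not from the examples in this article)
words <- c("apple", "aple", "banana")

# Short form: each element of `words` becomes agrep's `pattern`,
# and the full vector is passed as `x`
short_form <- lapply(words, agrep, words, value = TRUE)

# Equivalent explicit form
explicit_form <- lapply(words, function(p) agrep(p, words, value = TRUE))

# Label each group with the word that was used as the pattern
names(explicit_form) <- words
print(explicit_form)
```

Naming the list elements is optional, but it makes it easier to see which word produced which group of matches.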
Example 1
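# Country names, some of them misspelled
x1 <- c("India", "United Kingdoms", "Indiaa", "Egyypt", "United Kingdom", "Turkey", "Egypt", "Belaarus", "Belarus")
# For each element, return every element of x1 that approximately matches it
lapply(x1, agrep, x1, value = TRUE)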
Output
[[1]]
[1] "India" "Indiaa"

[[2]]
[1] "United Kingdoms" "United Kingdom"

[[3]]
[1] "India" "Indiaa"

[[4]]
[1] "Egyypt" "Egypt"

[[5]]
[1] "United Kingdoms" "United Kingdom"

[[6]]
[1] "Turkey"

[[7]]
[1] "Egyypt" "Egypt"

[[8]]
[1] "Belaarus" "Belarus"

[[9]]
[1] "Belaarus" "Belarus"
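The output above lists the same group once for every word that belongs to it. As a possible follow-up step (a sketch, not part of the original tutorial), base R's unique can be applied to the list so that each group of similar words appears only once. The names groups and unique_groups are illustrative.

```r
x1 <- c("India", "United Kingdoms", "Indiaa", "Egyypt", "United Kingdom",
        "Turkey", "Egypt", "Belaarus", "Belarus")

# One group of similar words per element, as in Example 1
groups <- lapply(x1, agrep, x1, value = TRUE)

# Drop repeated groups so that each cluster is listed once
unique_groups <- unique(groups)
print(unique_groups)
```

This only removes groups that are exactly identical. Groups that overlap without being equal are kept as separate entries.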
Example 2
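# Person names, some of them misspelled
x2 <- c("Al-hadi", "Umair", "Omar", "Alhadi", "Shanti", "Shant", "Umaer", "Peter", "Rahul", "Pattrick", "Peeter", "Rahuls")
# For each element, return every element of x2 that approximately matches it
lapply(x2, agrep, x2, value = TRUE)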
Output
[[1]]
[1] "Al-hadi" "Alhadi"

[[2]]
[1] "Umair" "Umaer"

[[3]]
[1] "Omar"

[[4]]
[1] "Al-hadi" "Alhadi"

[[5]]
[1] "Shanti" "Shant"

[[6]]
[1] "Shanti" "Shant"

[[7]]
[1] "Umair" "Umaer"

[[8]]
[1] "Peter" "Peeter"

[[9]]
[1] "Rahul" "Rahuls"

[[10]]
[1] "Pattrick"

[[11]]
[1] "Peter" "Peeter"

[[12]]
[1] "Rahul" "Rahuls"
Example 3
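# US state names, some of them misspelled
x3 <- c("Alabamaa", "New Yorky", "New Yok", "Alabma", "Florida", "Illinois", "Texas", "Illinoise")
# For each element, return every element of x3 that approximately matches it
lapply(x3, agrep, x3, value = TRUE)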
Output
[[1]]
[1] "Alabamaa"

[[2]]
[1] "New Yorky"

[[3]]
[1] "New Yorky" "New Yok"

[[4]]
[1] "Alabamaa" "Alabma"

[[5]]
[1] "Florida"

[[6]]
[1] "Illinois" "Illinoise"

[[7]]
[1] "Texas"

[[8]]
[1] "Illinois" "Illinoise"
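In this output the matches are not symmetric: "Alabma" finds "Alabamaa", but "Alabamaa" finds only itself. A likely explanation is that agrep looks for the pattern inside the target string, and its default max.distance of 0.1 allows only one edit for an 8-character pattern. The sketch below raises max.distance for that single pattern and uses adist with partial=TRUE to inspect the distances in both directions. The variable names are illustrative, and the right tolerance for real data is something to tune rather than a fixed recommendation.

```r
x3 <- c("Alabamaa", "New Yorky", "New Yok", "Alabma",
        "Florida", "Illinois", "Texas", "Illinoise")

# Default tolerance: 10% of the 8-character pattern, rounded up to 1 edit
default_hits <- agrep("Alabamaa", x3, value = TRUE)

# Allow up to 2 edits so that the shorter misspelling is also found
wider_hits <- agrep("Alabamaa", x3, max.distance = 2, value = TRUE)

# adist(partial = TRUE) reports the substring-based distance used by agrep
d_long_to_short <- adist("Alabamaa", "Alabma", partial = TRUE)[1, 1]
d_short_to_long <- adist("Alabma", "Alabamaa", partial = TRUE)[1, 1]

print(default_hits)
print(wider_hits)
print(c(d_long_to_short, d_short_to_long))
```

A larger max.distance catches more misspellings but also increases the chance of grouping words that are genuinely different, so it is worth checking the results on a sample of the data.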