Question

I have the following code for finding consecutive vowels, but it doesn't give me the right result. Is my code wrong?

sapply(v, function(x){ gsub(".*[0-9]\\s", "", grep("[aeiou]{2}", x, value = TRUE, invert = FALSE)) })

in which v is:

c("Joe 4311 rsfuvgcozbxwlnnfevze", "Clayton 2414 qsncnpvdfpjmvmvbdvce", 
"Addison 25 melmasilbgrurqbezgyu", "Donovan 2013 gozagvswtitjjinrzgup", 
"Sage 540 aamyvegiadwjwpvwtjko", "Zavier 133 cyomwtxftslukvmvpmcl", 
"Maria 1241 ngqjynxnpblcztnlkack", "Mercedes 2400 xcwbxxljspneilwejutw", 
"Micheal 4400 oovhyodyubhqwzdcwybf", "Brylee 2532 sarbmelbeycrnhytbout", 
"Giancarlo 3351 xmocyljxquklbchgmdcj", "Elin 5513 nbjovdtmijpfluzixebu", 
"Ray 2553 snrqrzshlzmmhumzlecl", "Jade 4030 rhibewstyrwdervgqnru", 
"Amelia 5205 lcnvnjhamhzavdfosmae", "Karissa 2030 vhvzyfckgogduqqayzku", 
"Conor 325 sbgfntjejbtwsvidvtnu", "Tripp 454 xmvuhycjnvqgnmorfdrl", 
"River 5120 zcxavkwzhwbvdqadajgh", "Tianna 251 mwoqwzyfddhuunmtiioh", 
"Conner 3543 ngyuzdbeyizfarxuxntz", "Mackenzie 3113 yvycqaquwtfjjtqsdduh", 
"Melody 4422 buagtfiaipniavdnsxhv", "Dallas 5343 blyjvtlpvpqondrdhluu")

Each element of v has the form "NAME SCORE WORD", and we want to count how many lines have two consecutive vowels in WORD.


Solution 2

Here's how to do it in one shot. The leading ^.*[0-9]\\s in this regular expression consumes NAME and the score, up to the space before WORD, so [aeiou]{2} can only match consecutive vowels in the last part.

> (zz <- do.call(rbind, lapply(v, function(x){ 
      grep("^.*[0-9]\\s.*[aeiou]{2}", x, value = TRUE)
      })))
     [,1]                                
[1,] "Sage 540 aamyvegiadwjwpvwtjko"     
[2,] "Mercedes 2400 xcwbxxljspneilwejutw"
[3,] "Micheal 4400 oovhyodyubhqwzdcwybf" 
[4,] "Brylee 2532 sarbmelbeycrnhytbout"  
[5,] "Amelia 5205 lcnvnjhamhzavdfosmae"  
[6,] "Tianna 251 mwoqwzyfddhuunmtiioh"   
[7,] "Melody 4422 buagtfiaipniavdnsxhv"  
[8,] "Dallas 5343 blyjvtlpvpqondrdhluu"  
> length(zz)
[1] 8
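
Since grepl() is vectorised over its input, the lapply()/rbind() wrapper is not strictly needed if all you want is the count. A minimal sketch, assuming a three-element sample taken from v so that it runs on its own (v_sample and has_pair are names made up for this illustration):

```r
# Three entries copied from v in the question
v_sample <- c("Joe 4311 rsfuvgcozbxwlnnfevze",
              "Sage 540 aamyvegiadwjwpvwtjko",
              "Elin 5513 nbjovdtmijpfluzixebu")

# Same pattern as above; grepl() returns one logical value per line
has_pair <- grepl("^.*[0-9]\\s.*[aeiou]{2}", v_sample)

v_sample[has_pair]  # the matching lines
sum(has_pair)       # the number of matching lines
```

On the full v, sum(has_pair) should agree with length(zz) above.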

OTHER TIPS

If you strsplit the text first, you can apply the grep more easily.

v[grep("[aeiou]{2}",sapply(strsplit(v," "),"[",3))]

#[1] "Sage 540 aamyvegiadwjwpvwtjko"     
#[2] "Mercedes 2400 xcwbxxljspneilwejutw"
#[3] "Micheal 4400 oovhyodyubhqwzdcwybf" 
#[4] "Brylee 2532 sarbmelbeycrnhytbout"  
#[5] "Amelia 5205 lcnvnjhamhzavdfosmae"  
#[6] "Tianna 251 mwoqwzyfddhuunmtiioh"   
#[7] "Melody 4422 buagtfiaipniavdnsxhv"  
#[8] "Dallas 5343 blyjvtlpvpqondrdhluu"  
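
To see why the split matters, it may help to compare a whole-line search with a search on the third field only. The sketch below assumes a two-element sample from v; v_sample, whole_line, word and word_only are illustrative names, and vapply() is used in place of sapply() only to make the return type explicit:

```r
# Two entries copied from v in the question
v_sample <- c("Joe 4311 rsfuvgcozbxwlnnfevze",
              "Sage 540 aamyvegiadwjwpvwtjko")

# Whole-line search: "Joe" also matches, because of the "oe" in the name
whole_line <- grepl("[aeiou]{2}", v_sample)

# Keep only the third space-separated field (the WORD), then search that
word <- vapply(strsplit(v_sample, " ", fixed = TRUE), `[`, character(1), 3)
word_only <- grepl("[aeiou]{2}", word)

whole_line  # TRUE TRUE
word_only   # FALSE TRUE
```

The code in the question greps the whole line before stripping the prefix with gsub, so it picks up lines like the first one here.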

I think your life will be much easier if you make your three variables (name, score, word) explicit:

library(stringr)
df <- as.data.frame(str_split_fixed(v, " ", 3))
names(df) <- c("name", "score", "word")

Then extracting the matches is a simple subset:

subset(df, str_detect(word, "[aeiou]{2}"))

##        name score                 word
## 5      Sage   540 aamyvegiadwjwpvwtjko
## 8  Mercedes  2400 xcwbxxljspneilwejutw
## 9   Micheal  4400 oovhyodyubhqwzdcwybf
## 10   Brylee  2532 sarbmelbeycrnhytbout
## 15   Amelia  5205 lcnvnjhamhzavdfosmae
## 20   Tianna   251 mwoqwzyfddhuunmtiioh
## 23   Melody  4422 buagtfiaipniavdnsxhv
## 24   Dallas  5343 blyjvtlpvpqondrdhluu
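
If stringr is not available, roughly the same data frame can be built in base R. This is only a sketch of an alternative, not the approach above: it swaps in read.table() and grepl(), assumes the fields never contain spaces, and uses a three-element sample from v (df_sample and matches are illustrative names):

```r
# Three entries copied from v in the question
v_sample <- c("Joe 4311 rsfuvgcozbxwlnnfevze",
              "Sage 540 aamyvegiadwjwpvwtjko",
              "Elin 5513 nbjovdtmijpfluzixebu")

# Parse the whitespace-separated fields; score is read as a number
df_sample <- read.table(text = v_sample,
                        col.names = c("name", "score", "word"),
                        stringsAsFactors = FALSE)

# Rows whose word contains two consecutive vowels
matches <- subset(df_sample, grepl("[aeiou]{2}", word))

nrow(matches)  # the count the question asks for
```

One side effect of read.table() is that score comes back numeric, which is convenient if you later want to filter or sort on it.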
Licensed under: CC-BY-SA with attribution