Question
I want the index of some variables in a data frame, but my grep()
skills are insufficient.
Say I have this data frame,
( dfn <- data.frame(
a1 = c(3, 3, 0, 3, 0, 0),
a2 = c(1, NA, 0, NA, 1, 4),
a11 = c(0, 3, NA, 1, 3, 1),
a12 = c(0, 3, NA, 1, 3, 3),
a_12 = c(0, 3, NA, 1, NA, NA),
a_1 = c(12, 3, NA, 1, 4, NA)) )
  a1 a2 a11 a12 a_12 a_1
1  3  1   0   0    0  12
2  3 NA   3   3    3   3
3  0  0  NA  NA   NA  NA
4  3 NA   1   1    1   1
5  0  1   3   3   NA   4
6  0  4   1   3   NA  NA
Now, what I want is to grep a1, a2, a11, and a12 (in real life the number after the a
is a consecutive list from 1 to 12). How do I do that? I've tried the two grep calls below, but with no luck.
foo <- grep('a[1:12]$', names(dfn) )
names(dfn[,foo])
[1] "a1" "a2"
I've also tried this,
bar <- grep('a[c(1:12)]$', names(dfn) )
names(dfn[,bar])
[1] "a1" "a2"
What I want is
[1] "a1" "a2" "a11" "a12"
Secondly, can anyone direct me to a good grep()
tutorial? Thanks!
Solution
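you need grep('a[1:12]+', names(dfn)), although this only works by accident: inside square brackets, 1:12 is not a range but the set of characters 1, : and 2.
actually the proper way to do it would be grep('a[1-9]+', names(dfn)), where [1-9] is a real range covering the digits 1 to 9.
the + after the [1-9] means that digits from 1-9 can be repeated any number of times after the a but must appear at least once. Note that this accepts any run of non-zero digits, so names such as a13 or a99 would match too; to match exactly a1 to a12, use grep('^a([1-9]|1[0-2])$', names(dfn)).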
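To see why the original attempts returned only a1 and a2, and how an anchored pattern behaves, here is a minimal sketch. The vector nms copies the column names from the question; the extra names a10, a13 and a0 are made up for illustration and are not in the original data frame.

```r
# Column names from the question
nms <- c("a1", "a2", "a11", "a12", "a_12", "a_1")

# '[1:12]' is a character class: one character that is '1', ':' or '2'.
# Followed by '$', only a single such character may come after the 'a'.
first_try <- grep("a[1:12]$", nms, value = TRUE)

# '[1-9]+' is one or more digits from 1 to 9
loose <- grep("a[1-9]+", nms, value = TRUE)

# Anchored pattern limited to 1..12: a single digit 1-9, or 10, 11, 12.
# The extra names below are hypothetical, added only to test the pattern.
more_nms <- c(nms, "a10", "a13", "a0")
strict <- grep("^a([1-9]|1[0-2])$", more_nms, value = TRUE)

print(first_try)
print(loose)
print(strict)
```

Dropping value = TRUE gives the column indexes instead of the names, which is what the question asks for.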
OTHER TIPS
regmatches(names(dfn),regexpr('a[1-9]{1,2}',names(dfn)))
[1] "a1" "a2" "a11" "a12"
my regular expression is: a followed by a minimum of 1 and a maximum of 2 digits from the set [1-9]
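One thing to keep in mind with regmatches() is that it returns the matched text rather than the whole name, so an unanchored pattern can return fragments of longer names. A small sketch, assuming two made-up names (a123 and data1) that are not in the original data frame:

```r
# Column names from the question plus two hypothetical ones
nms <- c("a1", "a2", "a11", "a12", "a_12", "a_1", "a123", "data1")

# regmatches() extracts the matched substring from each matching name
fragments <- regmatches(nms, regexpr("a[1-9]{1,2}", nms))

# Anchoring the pattern and using grepl() keeps whole names only
whole <- nms[grepl("^a[1-9]{1,2}$", nms)]

print(fragments)
print(whole)
```

Here fragments contains "a12" (cut from a123) and "a1" (cut from data1), while whole contains only the four wanted names.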
You could just do this instead:
names(dfn)[names(dfn) %in% paste0("a",1:12)]
[1] "a1" "a2" "a11" "a12"
If you want the indexes, this will give you that:
which(names(dfn) %in% paste0("a",1:12))
[1] 1 2 3 4
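The indexes can then be used to subset the data frame directly. A minimal sketch, rebuilding dfn from the question; the names wanted, idx and sub are chosen here for illustration:

```r
# Data frame from the question
dfn <- data.frame(
  a1   = c(3, 3, 0, 3, 0, 0),
  a2   = c(1, NA, 0, NA, 1, 4),
  a11  = c(0, 3, NA, 1, 3, 1),
  a12  = c(0, 3, NA, 1, 3, 3),
  a_12 = c(0, 3, NA, 1, NA, NA),
  a_1  = c(12, 3, NA, 1, 4, NA)
)

# Exact names to look for: "a1", "a2", ..., "a12"
wanted <- paste0("a", 1:12)

# Positions of the columns whose names are in 'wanted'
idx <- which(names(dfn) %in% wanted)

# drop = FALSE keeps a data frame even if only one column matches
sub <- dfn[, idx, drop = FALSE]

print(idx)
print(names(sub))
```

Because this compares whole names instead of patterns, names such as a_12 or a13 cannot slip through, and no regular expression is needed.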