Question

I have a character vector of 21 file names called "r1score.list". I made it by listing the text files in a folder. Then I read each text file into a data frame and put those in a list called "r1.score". (You can ignore the function(x){sortertoDF(x)} part; it calls a function I've written elsewhere to manipulate the data.)

r1score.list <- dir(pattern="\\.txt$")

> r1score.list
 [1] "p01_control.txt"   "p02_control.txt"   "p03_control.txt"   "p04_pq.txt"        "p05_pq.txt"        "p06_pq.txt"       
 [7] "p07_doce.txt"      "p08_doce.txt"      "p09_doce.txt"      "p10_dact.txt"      "p11_dact.txt"      "p12_dact.txt"     
[13] "p16_carm.txt"      "p17_carm.txt"      "p18_carm.txt"      "p19_cisplatin.txt" "p20_cisplatin.txt" "p21_cisplatin.txt"
[19] "p22_amsacrine.txt" "p23_amsacrine.txt" "p24_amsacrine.txt"

r1.score <- llply(.data=r1score.list, .fun=function(x){sortertoDF(x)})

Right now, I have named each data frame in r1.score by doing this:

names(r1.score) <- r1score.list

But I want to name each data frame by just the word that comes after the underscore. That is, the first data frame in r1.score should have the name "control", the fourth should have the name "pq", the last should have the name "amsacrine", and so on. I don't want to go through 21 txt files and rename them by hand to do this. Is there a simpler way?

Thank you.


Solution

This is easy to do with regular expressions (the funky pattern that is the first argument to sub):

names(r1.score) <- sub(".*_(.*)\\..*", "\\1", r1score.list)

The second argument is what we replace the text matched by the regular expression with. In this case it is the special symbol \\1, a backreference to the part of the pattern matched inside the parentheses, (.*). If you look carefully you can see that before the parentheses we're matching .*_, which means "anything up to and including an underscore" (because .* is greedy, this runs to the last underscore), and on the other side we're matching \\..*, which means a literal period followed by anything (we need to use \\. because otherwise a period is treated as a wildcard that matches any character).
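
To see the pattern in action without the original files, here is a small sketch that runs it on a few of the file names from the question. The variable names (files, labels, labels_alt) are made up for illustration, and the second approach, using tools::file_path_sans_ext, is just one possible alternative rather than part of the answer above.

```r
# A few file names copied from the question (illustrative subset)
files <- c("p01_control.txt", "p04_pq.txt",
           "p19_cisplatin.txt", "p24_amsacrine.txt")

# Keep only the text between the last underscore and the last period
labels <- sub(".*_(.*)\\..*", "\\1", files)
print(labels)

# Alternative sketch: drop the extension first, then strip everything
# up to and including the last underscore
labels_alt <- sub(".*_", "", tools::file_path_sans_ext(files))
print(identical(labels, labels_alt))
```

Note that with the full set of 21 files each treatment appears three times, so the resulting names repeat, and r1.score[["control"]] would return only the first matching data frame. If unique names are needed, wrapping the result in make.unique() is one option.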

Licensed under: CC-BY-SA with attribution