Question

I need your help again :)

I wrote an R script that generates a heatmap from a given tab-separated txt or xls file. At the moment, I delete all the columns I don't want in the heatmap by hand in the xls file. Now I want to automate this, but I don't know how :(

The interesting columns all start the same in all xls files, followed by an individual name:

xls-file 1: L1_tpm_xxxx L2_tpm_xxxx L3_tpm_xxxx

xls-file 2: L1_tpm_xxxx L2_tpm_xxxx L3_tpm_xxxx L4_tpm_xxxx L5_tpm_xxxx

Any ideas how to select those columns?

Thanking you in anticipation, Philipp


Solution

If you have read your data into a data.frame df, you could use:

df <- df[, grep("^L[[:digit:]]+_tpm", colnames(df)), drop = FALSE]  # drop = FALSE keeps a data.frame even if only one column matches

or you can explicitly write the columns that you want:

df <- df[, c("L1_tpm_xxxx", "L2_tpm_xxxx", "L3_tpm_xxxx")]

etc...
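
To see the pattern-based selection in action, here is a minimal sketch on a toy data frame. The column names (gene_id, L1_tpm_sampleA, and so on) are made up for illustration and stand in for whatever your files actually contain.

```r
# Toy data frame; all column names here are hypothetical.
df <- data.frame(
  gene_id          = c("g1", "g2"),
  L1_tpm_sampleA   = c(1.5, 0.0),
  L2_tpm_sampleB   = c(2.0, 3.1),
  L1_count_sampleA = c(10, 0)
)

# TRUE for every column whose name starts with "L<number>_tpm".
keep <- grepl("^L[[:digit:]]+_tpm", colnames(df))

# drop = FALSE keeps the result a data.frame even if only one column matches.
tpm <- df[, keep, drop = FALSE]

print(colnames(tpm))
```

grepl returns a logical vector and grep returns the matching positions; either works as a column index here.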

The following link is quite useful;-)

OTHER TIPS

If you think the column positions are going to be fixed across Excel sheets, the simplest solution here is to just use column indices. For example, if you use read.table to import a tab-delimited text file as a data.frame, and then decide you'd prefer to only keep the first two columns, you might do something like this:

data <- read.table("path_to_file.txt", header = TRUE, sep = "\t")  # spell out TRUE; T can be reassigned
data <- data[, 1:2]  # keep only the first two columns
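
If you are not sure the positions really are fixed, the following sketch compares selecting by position with selecting by name pattern on the same imported file. It writes a small temporary file as a stand-in for the real export; the file contents and column names are assumptions made for the example.

```r
# Write a small tab-separated file to a temporary path (stand-in for the real file).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "gene_id\tL1_tpm_a\tL2_tpm_b\tL1_count_a",
  "g1\t1.5\t2.0\t10",
  "g2\t0.0\t3.1\t0"
), path)

data <- read.table(path, header = TRUE, sep = "\t")

# Select by position: correct only while the column order never changes.
by_position <- data[, 2:3, drop = FALSE]

# Select by name pattern: does not depend on the column order.
by_pattern <- data[, grepl("^L[[:digit:]]+_tpm", colnames(data)), drop = FALSE]

print(colnames(by_position))
print(colnames(by_pattern))
```

Both selections pick the same columns for this file; only the pattern-based one would still do so if a file had extra or reordered columns.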
Licensed under: CC-BY-SA with attribution