Performing a column-by-column operation in R
Question
Folks,
I have temperature data for every zone in a building, something like this:
Lines <- "Date,Zone01,Zone02
01/01 01:00:00,24.5,21.3
01/01 02:00:00,24.3,21.1
01/01 03:00:00,24.1,21.1
01/01 04:00:00,24.1,20.9
01/01 05:00:00,25.,21.
01/01 06:00:00,26.,21.
01/01 07:00:00,26.6,22.3
01/01 08:00:00,28.,24.
01/01 09:00:00,28.9,26.5
01/01 10:00:00,29.4,29
01/01 11:00:00,30.,32.
01/01 12:00:00,33.,35.
01/01 13:00:00,33.4,36
01/01 14:00:00,35.8,38
01/01 15:00:00,32.3,37
01/01 16:00:00,30.,34.
01/01 17:00:00,29.,33.
01/01 18:00:00,28.,32.
01/01 19:00:00,26.3,30
01/01 20:00:00,26.,28.
01/01 21:00:00,25.9,25
01/01 22:00:00,25.8,21.3
01/01 23:00:00,25.6,21.4
01/01 24:00:00,25.5,21.5
01/02 01:00:00,25.4,21.6
01/02 02:00:00,25.3,21.8"
What I want to do is calculate the 99th percentile of the temperature in every zone. For a single zone I can run this command:
Q=quantile(Lines$Zone01,0.99)
But then I would have to repeat it manually for every column in the dataset. Is there a way to make this command go through all the columns (from the second column onwards)?
Thanks a lot.
Solution
Use a function from the apply family, in this case sapply (assuming Lines has already been read into a data frame):
> sapply(Lines[, -1], quantile, 0.99)
Zone01.99% Zone02.99%
35.20 37.75
Notice that the quantile label (99%) gets appended to each column name. To remove it, pass names=FALSE as an argument to quantile:
> sapply(Lines[, -1], quantile, 0.99, names=FALSE)
Zone01 Zone02
35.20 37.75
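For completeness, here is a minimal sketch of the whole round trip, since the question stores the data as a text string rather than a data frame. It assumes the string is parsed with read.csv(text = ...); the name dat and the shortened five-row sample are illustrative choices, not part of the original post.

```r
# Shortened sample of the question's data (first five rows only, for illustration)
Lines <- "Date,Zone01,Zone02
01/01 01:00:00,24.5,21.3
01/01 02:00:00,24.3,21.1
01/01 03:00:00,24.1,21.1
01/01 04:00:00,24.1,20.9
01/01 05:00:00,25.,21."

# Parse the text into a data frame; 'dat' is a hypothetical name
dat <- read.csv(text = Lines)

# Drop the first (Date) column and compute the 99th percentile of each remaining column
q99 <- sapply(dat[, -1], quantile, probs = 0.99, names = FALSE)
print(q99)
```

Because sapply iterates over the columns of the data frame, the result is a named numeric vector with one entry per zone, however many zone columns there are.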
OTHER TIPS
The plyr package has a nifty function named numcolwise that will operate on each column of your data frame if it is numeric. Something like:
> library(plyr)
> numcolwise(function(x) quantile(x, .99))(dat)
Zone01 Zone02
99% 35.2 37.75
Should do the trick.
Of course you can always use the base apply family:
> apply(dat[, -1], 2, function(x) quantile(x, .99))
Zone01 Zone02
35.20 37.75
Assuming that your data are in a data.frame, you could convert the temperature columns to a matrix and use apply(matrix, 2, quantile, 0.99), where matrix stands for the name of that matrix.
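A rough sketch of the matrix approach, assuming the data are already in a data frame. The names dat and m and the five-row sample are made up for illustration:

```r
# Small illustrative data frame (first five rows of the question's data)
dat <- data.frame(
  Date   = c("01/01 01:00:00", "01/01 02:00:00", "01/01 03:00:00",
             "01/01 04:00:00", "01/01 05:00:00"),
  Zone01 = c(24.5, 24.3, 24.1, 24.1, 25.0),
  Zone02 = c(21.3, 21.1, 21.1, 20.9, 21.0)
)

# Keep only the temperature columns and convert them to a numeric matrix
m <- as.matrix(dat[, -1])

# MARGIN = 2 applies quantile() to each column of the matrix
q99 <- apply(m, 2, quantile, probs = 0.99, names = FALSE)
print(q99)
```

Dropping the Date column before as.matrix matters: converting a data frame that still contains a non-numeric column would produce a character matrix, and quantile would fail on it.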