Performing a column-by-column operation in R

https://stackoverflow.com/questions/7648290

06-02-2021

Question

Folks

I have temperature data for every zone in a building, something like this:

Lines <- "Date,Zone01,Zone02
 01/01  01:00:00,24.5,21.3
 01/01  02:00:00,24.3,21.1
 01/01  03:00:00,24.1,21.1
 01/01  04:00:00,24.1,20.9
 01/01  05:00:00,25.,21.
 01/01  06:00:00,26.,21.
 01/01  07:00:00,26.6,22.3
 01/01  08:00:00,28.,24.
 01/01  09:00:00,28.9,26.5
 01/01  10:00:00,29.4,29
 01/01  11:00:00,30.,32.
 01/01  12:00:00,33.,35.
 01/01  13:00:00,33.4,36
 01/01  14:00:00,35.8,38
 01/01  15:00:00,32.3,37
 01/01  16:00:00,30.,34.
 01/01  17:00:00,29.,33.
 01/01  18:00:00,28.,32.
 01/01  19:00:00,26.3,30
 01/01  20:00:00,26.,28.
 01/01  21:00:00,25.9,25
 01/01  22:00:00,25.8,21.3
 01/01  23:00:00,25.6,21.4
 01/01  24:00:00,25.5,21.5
 01/02  01:00:00,25.4,21.6
 01/02  02:00:00,25.3,21.8"

I want to calculate the 99th percentile of the temperature in every zone. With the data read into a data frame, I can do this for a single column:

Q <- quantile(Lines$Zone01, 0.99)

But then I would have to repeat it manually for every column in the dataset. Is there a way to make this command go through all the columns (from the second column onwards)?

Thanks a lot.

Solution

Use a function from the apply family, in this case sapply:

> sapply(Lines[, -1], quantile, 0.99)
Zone01.99% Zone02.99% 
     35.20      37.75

Notice that the quantile label (99%) gets appended to each column name. To remove it, pass names=FALSE as an argument to quantile:

> sapply(Lines[, -1], quantile, 0.99, names=FALSE)
Zone01 Zone02 
 35.20  37.75
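
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample text from the question is read with read.csv(text = ...); the names dat and q99 are my own choices for illustration, not part of the original answer.

```r
# Sample data from the question, stored as one comma-separated string.
Lines <- "Date,Zone01,Zone02
01/01  01:00:00,24.5,21.3
01/01  02:00:00,24.3,21.1
01/01  03:00:00,24.1,21.1
01/01  04:00:00,24.1,20.9
01/01  05:00:00,25.,21.
01/01  06:00:00,26.,21.
01/01  07:00:00,26.6,22.3
01/01  08:00:00,28.,24.
01/01  09:00:00,28.9,26.5
01/01  10:00:00,29.4,29
01/01  11:00:00,30.,32.
01/01  12:00:00,33.,35.
01/01  13:00:00,33.4,36
01/01  14:00:00,35.8,38
01/01  15:00:00,32.3,37
01/01  16:00:00,30.,34.
01/01  17:00:00,29.,33.
01/01  18:00:00,28.,32.
01/01  19:00:00,26.3,30
01/01  20:00:00,26.,28.
01/01  21:00:00,25.9,25
01/01  22:00:00,25.8,21.3
01/01  23:00:00,25.6,21.4
01/01  24:00:00,25.5,21.5
01/02  01:00:00,25.4,21.6
01/02  02:00:00,25.3,21.8"

# Assumption: parse the string into a data frame (the name `dat` is hypothetical).
dat <- read.csv(text = Lines)

# Drop the first (Date) column, then take the 99th percentile of each
# remaining column; names = FALSE keeps the plain column names.
q99 <- sapply(dat[, -1], quantile, probs = 0.99, names = FALSE)
print(q99)
```

The result is a named numeric vector with one entry per zone, so a single value can be picked out with q99["Zone01"].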

OTHER TIPS

The plyr package has a nifty function named numcolwise that operates on every numeric column of your data frame (here called dat). Something like:

> library(plyr)
> numcolwise(function(x) quantile(x, .99))(dat)
    Zone01 Zone02
99%   35.2  37.75

Should do the trick.

Of course you can always use the base apply family:

> apply(dat[, -1], 2, function(x) quantile(x, .99)) 
Zone01 Zone02 
 35.20  37.75

Assuming that your data are in a data.frame, you could convert the columns with the temperature data to a matrix and use apply(matrix, 2, quantile, 0.99).
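
A rough sketch of the matrix route, under the assumption that the first column holds the dates and every other column is numeric. The small data frame below reuses the first four rows of the sample data, and the names dat, m, q99_matrix and q99_sapply are hypothetical.

```r
# Hypothetical data frame built from the first four rows of the sample data.
dat <- data.frame(
  Date   = c("01/01 01:00:00", "01/01 02:00:00",
             "01/01 03:00:00", "01/01 04:00:00"),
  Zone01 = c(24.5, 24.3, 24.1, 24.1),
  Zone02 = c(21.3, 21.1, 21.1, 20.9)
)

# Exclude the Date column before converting; including it would turn the
# whole matrix into character and quantile would fail on it.
m <- as.matrix(dat[, -1])

# MARGIN = 2 applies quantile to each column of the matrix.
q99_matrix <- apply(m, 2, quantile, 0.99, names = FALSE)

# For comparison: the sapply approach on the same columns.
q99_sapply <- sapply(dat[, -1], quantile, 0.99, names = FALSE)

print(q99_matrix)
print(isTRUE(all.equal(q99_matrix, q99_sapply)))
```

Both calls should agree. apply coerces a data frame to a matrix internally anyway, so the explicit conversion mostly serves to make the column selection visible.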

Licensed under: CC-BY-SA with attribution
