Question

I'm trying to tabulate the counts of two factor vectors (b1 and b2) and map them into a bigger data frame. Summaries of the vectors are below:

> summary(b1)
(4,6] (6,8]  NA's 
   16     3     1 
> summary(b2)
(4,6] (6,8]  NA's 
    9     0    11 

I would like to map the above counts into a bigger dataframe:

  Intervals b1 b2
1  (-Inf,0] NA NA
2     (0,2] NA NA
3     (2,4] NA NA
4     (4,6] NA NA
5     (6,8] NA NA
6    (8,10] NA NA
7   (10,12] NA NA
8 (12, Inf] NA NA

My question: is there a vectorized or more direct way to do the above without resorting to a 'for' loop plus if-else condition checking? It seems like something easily done, but I've been having a mental block and haven't been successful in finding relevant help online. Any help or hint is appreciated. Thanks in advance!

My sample code is attached:

NoOfElement <- 20
MyBreaks <- c(seq(4, 8, by=2))
MyBigBreaks <- c(-Inf, seq(0,12, by=2), Inf)

set.seed(1)
a1 <- rnorm(NoOfElement, 5); a2 <- rnorm(NoOfElement, 4)
b1 <- cut(a1, MyBreaks); b2 <- cut(a2, MyBreaks)

c <- seq(-10, 10)
d <- cut(c, MyBigBreaks)

e <- data.frame( Intervals=levels(d), b1=NA, b2=NA )

Solution

The table function does the tabulation that you need. It returns the counts named by factor level (NA values are dropped by default), and you can compare those names against the column e$Intervals to assign each count to the matching row.

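The assignment below is positional: the counts are written, in order, into the rows of e whose label appears among the table's names. It therefore relies on the factor levels appearing in the same order in e$Intervals as in b1 and b2, which holds here because all of them come from cut with increasing breaks.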

e$b1[e$Intervals %in% names(table(b1))] <- table(b1)
e$b2[e$Intervals %in% names(table(b2))] <- table(b2)
e
##   Intervals b1 b2
## 1  (-Inf,0] NA NA
## 2     (0,2] NA NA
## 3     (2,4] NA NA
## 4     (4,6] 16  9
## 5     (6,8]  3  0
## 6    (8,10] NA NA
## 7   (10,12] NA NA
## 8 (12, Inf] NA NA
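
If you would rather not depend on the level order, one possible variant looks up each interval label by name with match(). This is only a sketch: it rebuilds the question's data so it runs on its own, and count_by_label is a hypothetical helper name introduced here, not something from the question or from base R.

```r
# Rebuild the question's data so this block is self-contained.
MyBreaks    <- seq(4, 8, by = 2)
MyBigBreaks <- c(-Inf, seq(0, 12, by = 2), Inf)

set.seed(1)
a1 <- rnorm(20, 5); a2 <- rnorm(20, 4)
b1 <- cut(a1, MyBreaks); b2 <- cut(a2, MyBreaks)

e <- data.frame(Intervals = levels(cut(seq(-10, 10), MyBigBreaks)),
                stringsAsFactors = FALSE)

# Hypothetical helper: tabulate x, then pick the count for each label by name.
# match() returns NA for labels that are not levels of x, so those rows stay NA.
count_by_label <- function(x, labels) {
  tab <- table(x)
  as.vector(tab)[match(labels, names(tab))]
}

e$b1 <- count_by_label(b1, e$Intervals)
e$b2 <- count_by_label(b2, e$Intervals)
e
```

Because each count is fetched by its label rather than by position, the result should not change if the rows of e are reordered or if b1 and b2 have their levels in a different order.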
Licensed under: CC-BY-SA with attribution