Question

I'm new to R and want to utilize it to directly work with my data. My ultimate goal is to make a histogram / bar plot.

Depth: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Percent: .4, .1, .5, .2, .1, .3, .9, .3, .2, .2, .8

I want to take the Depth vector and bin it into unequal chunks (0, 1-5, 6-8, 9-10), and take the Percent values and somehow sum them together for the matching chunks.

For example:

0 -> .4

1-5 -> 1.2

6-8 -> 1.4

9-10 -> 1.0

The actual data set goes into the thousands, and I feel R might be better suited for this than using C++ to group my data into a smaller table before letting R plot it.

I looked up how to use split and cut, but I'm not sure how to use the data after I cut it into ranges. If I pass "breaks" to cut, I don't know how to include the initial zero value (corresponding to .4 in the example).

Any suggestions or approaches would be appreciated.

Solution

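You're on the right track with cut. Setting right=FALSE makes each interval closed on the left and open on the right, so the breaks below give [0,1), [1,6), [6,9) and [9,11), and the zero depth gets its own bin: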

dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))

cuts <- cut(dat$Depth, breaks=c(0, 1, 6, 9, 11), right=FALSE)
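If you want to check the bins before summing, something like the following sketch should work. It repeats the data and breaks from above so it runs on its own; the label strings passed to labels and the name cuts_labeled are just an illustrative choice, not anything required by cut.

```r
# Same data and breaks as above, repeated so the snippet is self-contained
dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))

cuts <- cut(dat$Depth, breaks = c(0, 1, 6, 9, 11), right = FALSE)

# Interval labels generated by cut: "[0,1)" "[1,6)" "[6,9)" "[9,11)"
levels(cuts)

# Number of Depth values that fall into each bin
table(cuts)

# Optional: replace the interval notation with readable labels
# (these label strings are an assumption chosen to match the question)
cuts_labeled <- cut(dat$Depth, breaks = c(0, 1, 6, 9, 11), right = FALSE,
                    labels = c("0", "1-5", "6-8", "9-10"))
levels(cuts_labeled)
```

Either factor can be passed to the grouping step in the same way; only the displayed bin names differ.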

Then you can use aggregate:

aggregate(dat$Percent, list(cuts), sum)

Or as a one-liner:

aggregate(dat$Percent, 
          list(cut(dat$Depth, 
                   breaks=c(0, 1, 6, 9, 11), 
                   right=FALSE)),
          sum)
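
Since the end goal is a bar plot, here is a minimal sketch of one way to get from the binned sums to a plot with base graphics. It uses tapply instead of aggregate only because tapply returns a named vector that barplot accepts directly; the bin labels, the axis titles and the names bins and sums are assumptions for illustration.

```r
# Same data as above, repeated so the snippet is self-contained
dat <- data.frame(Depth = 0:10,
                  Percent = c(0.4, 0.1, 0.5, 0.2, 0.1, 0.3, 0.9, 0.3, 0.2, 0.2, 0.8))

# Bin the depths; the label strings are an illustrative choice
bins <- cut(dat$Depth, breaks = c(0, 1, 6, 9, 11), right = FALSE,
            labels = c("0", "1-5", "6-8", "9-10"))

# Sum Percent within each bin; the result is named by bin label
sums <- tapply(dat$Percent, bins, sum)

# One bar per bin, with the bar height equal to the summed Percent
barplot(sums, xlab = "Depth", ylab = "Summed Percent")
```

For this data the sums come out to 0.4, 1.2, 1.4 and 1.0, which matches the table in the question.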
Licensed under: CC-BY-SA with attribution