Unsure how to plot a histogram with variable break points from a one column matrix in R

StackOverflow https://stackoverflow.com/questions/22927459

  •  29-06-2023

Question

I have a matrix with approximate dimensions 20000 x 1. I would like to plot its values in a histogram with bins of width 0.01 from -0.05 to +0.15. However, the values in the matrix are fairly arbitrary, for example 0.0123421, 0.0124523, 0.124523, -0.011234, etc. Thus, I need to first count the number of values that fall into each bin, and then plot a histogram. For the numbers I gave, I'd have 2 values between 0.01 and 0.02, 1 between -0.02 and -0.01, and so on, which I need in a histogram. Is there an easy way to do this? I'm relatively new to R, so any help is appreciated!


Solution

As an example illustrating breaks (content summarized from an excellent post on R-bloggers), let's assume that you start with some normally distributed data. In R, you can generate such data with the rnorm() function:

data <- rnorm(n=1000, mean=24.2, sd=2.2)

We can then generate a simple histogram using the following call:

hist(data)

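Now, let's assume that you want coarser or finer bins. There are a number of ways to do this. You could, for example, use the breaks argument of hist(). When breaks is a single number, R treats it only as a suggested number of bins and rounds the breakpoints to "pretty" values, so the actual number of bins may differ. For example: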

hist(data, breaks=20, main="Breaks=20")
hist(data, breaks=5, main="Breaks=5")
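To see what R actually did with the suggested number of bins, you can inspect the histogram object instead of drawing it. This is only a small sketch: the seed and the names h20 and h5 are assumptions made for the illustration, not part of the original answer.

```r
set.seed(42)  # assumed seed, only so the sketch is reproducible
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

# plot = FALSE returns the histogram object without drawing anything
h20 <- hist(data, breaks = 20, plot = FALSE)
h5  <- hist(data, breaks = 5,  plot = FALSE)

# Number of bins R actually used; this may differ from the 20 and 5 requested
length(h20$counts)
length(h5$counts)

# The breakpoints R settled on are rounded to "pretty" values
h20$breaks
```

There is always one more breakpoint than there are bins, and the counts add up to the number of data points.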

Now, if you want exact control over the breakpoints between bins, you can give the breaks argument a vector of breakpoints, like this:

hist(data, breaks=c(17,20,23,26,29,32), main="Breaks is vector of breakpoints")

This dictates exactly the start and end point of each bin. The breakpoints must span the full range of the data; if any value falls outside them, hist() stops with an error. Of course, you could give the breaks vector as a sequence like this to make the code tidier:

hist(data, breaks=seq(17,32,by=3), main="Breaks is vector of breakpoints")
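With randomly generated data, fixed breakpoints such as 17 to 32 can occasionally miss an extreme value, and hist() will then refuse to plot. One possible way around this, sketched below, is to derive the end points from the data. The seed and the names width, lo, hi and brks are assumptions made for the illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

width <- 3  # desired bin width

# Round the data range outwards to multiples of the bin width
lo <- floor(min(data) / width) * width
hi <- ceiling(max(data) / width) * width
brks <- seq(lo, hi, by = width)

# Every value now lies inside the breakpoints, so no value is left uncounted
h <- hist(data, breaks = brks, plot = FALSE)
h$counts
```

The trade-off is that the first and last breakpoints now depend on the sample, so two data sets may not share the same bins unless you fix brks yourself.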

Note that when you give breakpoints, R by default makes the histogram cells right-closed (left-open) intervals of the form (a,b]. You can change this with the right=FALSE option, which makes the intervals of the form [a,b). This matters if you have a lot of points exactly at a breakpoint.
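A tiny example with values sitting exactly on the breakpoints shows the difference. The vectors x and brks are made up for the illustration. Note that with the default include.lowest=TRUE, the outermost breakpoint is still included in the first bin (or the last bin when right=FALSE).

```r
x <- c(1, 2, 2, 3)   # made-up values, two of them exactly on the middle breakpoint
brks <- c(1, 2, 3)

# Default: bins are [1,2] and (2,3], so both 2s land in the first bin
right_closed <- hist(x, breaks = brks, plot = FALSE)$counts
right_closed  # 3 1

# right = FALSE: bins are [1,2) and [2,3], so both 2s land in the second bin
left_closed <- hist(x, breaks = brks, right = FALSE, plot = FALSE)$counts
left_closed   # 1 3
```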

Other tips

hist(x, breaks = seq(-.05, .15, .01))

See ?hist
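Applied to the setup in the question, this might look like the sketch below. The matrix m is a hypothetical stand-in filled with uniform random numbers, since the real data is not shown, and the names x, brks, h, q and hq are likewise assumptions. hist() does the counting itself, so there is no need to tally the bins by hand first.

```r
set.seed(123)  # assumed seed, only so the sketch is reproducible

# Hypothetical stand-in for the 20000 x 1 matrix from the question
m <- matrix(runif(20000, min = -0.05, max = 0.15), ncol = 1)
x <- as.vector(m)  # make the one-column matrix a plain numeric vector

brks <- seq(-0.05, 0.15, by = 0.01)  # 21 breakpoints, 20 bins of width 0.01

# hist() fails if any value lies outside the breakpoints, so check first
stopifnot(min(x) >= min(brks), max(x) <= max(brks))

h <- hist(x, breaks = brks, plot = FALSE)
h$counts                                   # number of values in each bin
plot(h, main = "Bins of width 0.01", xlab = "value")

# The four sample values from the question, counted with the same bins
q <- c(0.0123421, 0.0124523, 0.124523, -0.011234)
hq <- hist(q, breaks = brks, plot = FALSE)
hq$counts[c(4, 7, 18)]  # bins (-0.02,-0.01], (0.01,0.02], (0.12,0.13]: 1 2 1
```

If your real data contains values outside -0.05 to 0.15, you would need to widen the breakpoints or filter those values out before calling hist().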

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow