Question

I have to plot some data using histograms. My data lie in the interval [0,1], with no large concentration at any particular point.

What's a good ratio between number of samples and number of bins (of equal length)?

Solution

I generally use the square root of the number of samples as the number of bins. This is the simplest of the choices listed in the Wikipedia histogram article's discussion of an appropriate number of bins. From that article:

There is no "best" number of bins, and different bin sizes can reveal different features of the data. Some theoreticians have attempted to determine an optimal number of bins, but these methods generally make strong assumptions about the shape of the distribution.

The use of the square root of the number of samples is generally a good place to start if you don't want to make assumptions about the distribution of your data.
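As a rough illustration of the square-root choice, here is a minimal sketch in plain Python. The helper names `sqrt_bin_count` and `equal_width_counts` are made up for this example, the sample data is randomly generated, and rounding the square root up to the next integer is an assumption (rounding to nearest would work just as well as a starting point).

```python
import math
import random


def sqrt_bin_count(n_samples):
    """Square-root choice: number of bins = sqrt(n), rounded up (assumed rounding)."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    return math.ceil(math.sqrt(n_samples))


def equal_width_counts(data, n_bins, lo=0.0, hi=1.0):
    """Count samples in n_bins equal-width bins covering [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in data:
        if not lo <= x <= hi:
            continue  # ignore anything outside the plotted range
        # A sample exactly at hi goes into the last bin.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical sample: 400 uniform values in [0, 1).
random.seed(0)
data = [random.random() for _ in range(400)]

n_bins = sqrt_bin_count(len(data))  # 20 bins for 400 samples
counts = equal_width_counts(data, n_bins)
print(n_bins, counts)
```

If you are using NumPy, `numpy.histogram(data, bins="sqrt")` applies the same kind of rule for you; try a few nearby bin counts as well, since different bin sizes can reveal different features.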

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow