You are basically looking for a histogram with non-uniform bins or a histogram with equal counts.
The simplest case for a non-uniform histogram is to sort the N values in x and split the sorted vector into k bins, so that each bin holds N/k of the samples (you can instead fix the number of samples per bin, c, which gives N = ck).
Instead of a linear spacing of the range of x, you do a linear split of the ordered vector (and thus a non-linear, non-uniform separation of the original range).
In your case it would look like this:
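[sortedX, indeX] = sort(x);      % indeX maps sorted positions back into x
nVals = length(x);               % N
c = 10;                          % samples per bin
nBins = ceil(nVals/c);           % k = N/c (the last bin may hold fewer samples)
% linear split of the sorted vector: first index of each bin, plus an end marker
stepX = [1:c:nVals, nVals+1];
% preallocate the per-bin means
xbin = zeros(nBins,1);
yy = zeros(nBins,1);
% counting and binning on the indexed vector
for i = 1 : length(stepX)-1
    bin = indeX(stepX(i):stepX(i+1)-1);
    xbin(i,1) = mean(x(bin));
    yy(i,1) = mean(y(bin));
end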
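As a side note, the same equal-count binning can probably be done without an explicit loop. The following is only a sketch under assumptions: the data (x, y) are made up for illustration, both are column vectors, and accumarray is used as a substitute for the loop above, not as part of the original approach.

```matlab
% Hypothetical sample data (an assumption, not the asker's data)
x = 10 * rand(95, 1);            % N = 95 values, column vector
y = sin(x);                      % any y paired with x
c = 10;                          % samples per bin

% Rank each value, then map ranks 1..c to bin 1, c+1..2c to bin 2, etc.
[~, order] = sort(x);
binId = zeros(size(x));
binId(order) = ceil((1:numel(x))' / c);

% Per-bin means and counts
xbin   = accumarray(binId, x, [], @mean);
yy     = accumarray(binId, y, [], @mean);
counts = accumarray(binId, 1);   % c per bin, fewer in the last bin if N is not a multiple of c
```

Passing @min or @max instead of @mean to accumarray would give the per-bin extremes in the same way.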
To calculate the actual range (i.e. the edges of the histogram) you can use the midpoint between the max in bin i and the min in bin i+1. You can add something like the following in your loop:
% calculate the range
maxX(i) = max(x(bin));
minX(i) = min(x(bin));
The desired (non-linear) range is then:
rangeX = [min(x) maxX(1:end-1) + (minX(2:end) - maxX(1:end-1))/2 max(x)];
while your original (linear) range is:
rangeX_OP = min(x):0.3:max(x);
You can use histc to verify the equal counts (for rangeX) and the non-equal counts (for rangeX_OP). This is how the counts would look (for random x in a similar range to yours and c = 10 counts per bin). Top is the linear spacing of the range, bottom is the non-linear.
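A minimal, self-contained sketch of that check, assuming made-up random data (100 values in a range of about 3, so N is a multiple of c). The bin extremes are read directly off the sorted vector here instead of being collected in the loop, which is a shortcut for illustration. One thing to be aware of: histc counts values equal to the last edge in an extra final bin, so that entry is folded into the last real bin below.

```matlab
% Hypothetical sample data (an assumption, not the asker's data)
c = 10;                          % samples per bin
x = 3 * rand(1, 100);            % row vector, N = 100
sortedX = sort(x);
nBins = numel(x) / c;            % k = N/c

% Min and max of each equal-count bin, taken from the sorted vector
minX = sortedX(1:c:end);
maxX = sortedX(c:c:end);

% Non-linear edges: midpoints between neighbouring bins, closed by the extremes
rangeX = [min(x), maxX(1:end-1) + (minX(2:end) - maxX(1:end-1))/2, max(x)];
% Linear edges, as in the original question
rangeX_OP = min(x):0.3:max(x);

countsEq = histc(x, rangeX);
% The last histc entry only counts x == rangeX(end); merge it into the last bin
countsEq(end-1) = countsEq(end-1) + countsEq(end);
countsEq(end) = [];

countsLin = histc(x, rangeX_OP);

disp(countsEq);                  % equal counts per bin
disp(countsLin);                 % counts vary from bin to bin
```

Note that with the linear edges, values above the last edge of rangeX_OP are not counted by histc at all, so countsLin need not sum to N.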