Improving on code in question
Your loops (and the nested dot product) can be eliminated with bsxfun
and matrix multiplication as follows:
xcomps = bsxfun(@ge,xy(:,1),int_x);
ycomps = bsxfun(@ge,xy(:,2),int_y);
count_again = double(xcomps).'*double(ycomps); %' 143x143 = 143x1e5 * 1e5x143
count_again_fix = diff(diff(count_again')');
The multiplication step accomplishes the AND and summation done in sum(inds1 .* inds2)
, but without looping over the density matrix. EDIT: If you use single
instead of double
, execution time is nearly halved, but be sure to convert your answer to double
or whatever is required for the rest of the code. On my computer this takes around 0.5 sec.
Note: With rot90(count_again/size(xy,1),2)
you have a CDF, and in rot90(count_again_fix/size(xy,1),2)
you have a PDF.
Using accumarray
Another approach is to use accumarray
to make the joint histogram after we bin the data.
Starting with int_x
, int_y
, K1
, xy
, etc.:
% take (0,1) data onto [1 K1], following A.Dondas approach for easy comparison
ii = floor(xy(:,1)*(K1-eps))+1; ii(ii<1) = 1; ii(ii>K1) = K1;
jj = floor(xy(:,2)*(K1-eps))+1; jj(jj<1) = 1; jj(jj>K1) = K1;
% create the histogram and normalize
H = accumarray([ii jj],ones(1,size(ii,1)));
PDF = H / size(xy,1); % for probabilities summing to 1
On my computer, this takes around 0.01 sec.
The output is the same as A.Donda's converted from sparse to full (full(H)
). Although, as he A.Donda pointed out, it is correct to have the dimensions be K1
xK1
, rather than the size of count_again_fix
in the OPs code that was K1+1
xK1+1
.
To get the CDF, I believe you can simply apply cumsum
to each axis of the PDF.