Question

I'm using a Gaussian Mixture Model to estimate the log-likelihood function (the parameters are estimated by the EM algorithm). I'm using MATLAB. My data is of size 17991402*1, i.e. 17,991,402 data points of one dimension:

When I run gmdistribution.fit(X,2), I get the desired output.

But when I run gmdistribution.fit(X,k) for k>2, the code crashes with the error "OUT OF MEMORY". I have also tried an open-source implementation, which gives me the same problem. Can someone help me out here? I'm basically looking for code that will let me fit different numbers of components on such a large dataset.

Thanks!!!

Was it helpful?

The solution

Is it possible for you to decrease the maximum number of iterations? The default is 100.

OPTIONS = statset('MaxIter',50,'Display','final','TolFun',1e-6);
gmdistribution.fit(X,3,'Options',OPTIONS)

Or you may consider under-sampling the original data, for example as sketched below.
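A minimal sketch of that idea, assuming your data is a column vector X as in the question; the subsample size (1e6), the seed, and the variable names are arbitrary choices, not part of the original answer:

% Draw a random subsample so the EM fit stays within memory
rng(0);                               % for reproducibility (arbitrary seed)
n = numel(X);
idx = randperm(n, min(n, 1e6));       % keep at most one million points
Xs = X(idx);

% Fit the mixture with k components on the subsample
OPTIONS = statset('MaxIter',50,'Display','final','TolFun',1e-6);
obj = gmdistribution.fit(Xs, 3, 'Options', OPTIONS);

% If needed, evaluate the total log-likelihood of the full data
% under the fitted model (a single pass, so it stays cheap in memory)
logL = sum(log(pdf(obj, X)));

Since the data is one-dimensional, a random subsample of around a million points will usually give parameter estimates very close to those from the full 18 million points, at a fraction of the memory cost.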

A general approach to out-of-memory problems in MATLAB is described in this document.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow