Question

This might be a silly question! I have an array P that represents the probability distribution of some data, e.g. [0;0.3;0.7]. How can I determine the type or class of discrete probability distribution that P follows? The original data is unavailable to me.

dfittool and fitdist require the data as input, but I only have its probability distribution. Any ideas?

Solution

It is not possible to determine a priori what kind of distribution some data comes from, especially with as few values as in your example.

If you have an idea of the process that generated your data, you might be able to narrow down which distributions to test. Maybe your data comes from the family of gamma distributions, maybe from the family of Weibull distributions, etc. Then you can fit these general distributions and see whether the fitted parameters suggest a simpler, more common special case (for example, a gamma distribution with shape parameter 1 is an exponential distribution).

For a visual check of how well a certain distribution approximates your data, you can use PROBPLOT.

Once you have identified possible distributions, you can fit them to the data and use the Bayesian Information Criterion (BIC) to compare which fit describes the data best (lower is better). Note that unless you have a very large amount of noise-free data, it is impossible to tell which fit is correct if several candidate distributions have comparably low BIC values.
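To make the BIC comparison concrete, here is a minimal sketch in plain Python (not MATLAB, and not part of the original answer). It assumes that P sits on the support {0, 1, 2} and that it was estimated from a hypothetical number of observations n_obs, which the question does not give. The two candidates (a binomial with 2 trials and an unrestricted categorical distribution) are chosen purely for illustration. BIC is computed as k*ln(n) - 2*ln(L), with the multinomial coefficient dropped because it is the same for every candidate.

```python
from math import comb, log

# pmf from the question; ASSUMED to live on the support {0, 1, 2}
P = [0.0, 0.3, 0.7]

def log_likelihood(q, n_obs):
    """Log-likelihood of expected counts n_obs * P under candidate pmf q.

    Outcomes with zero probability in P have zero count and contribute nothing.
    The multinomial coefficient is omitted (constant across candidates).
    """
    return sum(n_obs * p * log(qk) for p, qk in zip(P, q) if p > 0)

def bic(q, k_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L); lower is better."""
    return k_params * log(n_obs) - 2.0 * log_likelihood(q, n_obs)

# Candidate 1: binomial with 2 trials, one free parameter.
# Its maximum-likelihood estimate is mean / number_of_trials.
p_hat = sum(k * pk for k, pk in enumerate(P)) / 2
binom_q = [comb(2, k) * p_hat**k * (1 - p_hat) ** (2 - k) for k in range(3)]

# Candidate 2: unrestricted categorical distribution on three outcomes,
# two free parameters; its best fit reproduces P exactly.
cat_q = P

def compare(n_obs):
    """Return the BIC of both candidates for a hypothetical sample size."""
    return {
        "binomial": bic(binom_q, 1, n_obs),
        "categorical": bic(cat_q, 2, n_obs),
    }

for n_obs in (20, 1000):  # hypothetical sample sizes
    print(n_obs, compare(n_obs))
```

Under these assumptions the ranking depends on n_obs: with a small hypothetical sample the simpler binomial gets the lower BIC, while with a large one the more flexible categorical model does. That is the point made above: without knowing how much data is behind P, the comparison cannot settle which distribution is correct.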

OTHER TIPS

You have probably seen different probability distributions in lectures or in your reading. All you have to do is plot the given distribution against the candidates. Because the candidate distributions are themselves parametrized, curve fitting or trial and error comes into play. The distribution with the least error, i.e. the best fit, might be the one you are looking for.
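The least-error idea can be sketched as follows, in plain Python rather than MATLAB. The support {0, 1, 2}, the two candidate families (binomial with 2 trials, and a Poisson truncated to the same support), the sum-of-squared-errors measure and the parameter grids are all assumptions made for illustration. They do not come from the question.

```python
from math import comb, exp, factorial

# pmf from the question; ASSUMED to live on the support {0, 1, 2}
P = [0.0, 0.3, 0.7]
K_MAX = len(P) - 1

def binomial_pmf(p):
    """Binomial pmf with K_MAX trials and success probability p."""
    return [comb(K_MAX, k) * p**k * (1 - p) ** (K_MAX - k)
            for k in range(K_MAX + 1)]

def truncated_poisson_pmf(lam):
    """Poisson pmf restricted to 0..K_MAX and renormalised to sum to 1."""
    weights = [exp(-lam) * lam**k / factorial(k) for k in range(K_MAX + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sse(q):
    """Sum of squared differences between the given pmf P and candidate q."""
    return sum((a - b) ** 2 for a, b in zip(P, q))

def grid_fit(pmf, grid):
    """Crude 'trial and error': pick the grid value with the smallest error."""
    best = min(grid, key=lambda theta: sse(pmf(theta)))
    return best, sse(pmf(best))

p_grid = [i / 1000 for i in range(1, 1000)]    # p in (0, 1)
lam_grid = [i / 100 for i in range(1, 2001)]   # lambda in (0, 20]

fits = {
    "binomial (2 trials)": grid_fit(binomial_pmf, p_grid),
    "truncated Poisson": grid_fit(truncated_poisson_pmf, lam_grid),
}

# Report candidates from smallest to largest error
for name, (theta, err) in sorted(fits.items(), key=lambda kv: kv[1][1]):
    print(f"{name}: best parameter = {theta:.3f}, squared error = {err:.5f}")
```

With only three probabilities to match, a small error for one candidate is weak evidence at best. Treat the ranking as a hint about which families to look at, not as an identification of the distribution.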

Licensed under: CC-BY-SA with attribution