Question

I have a mixture of 3 gaussians but no matter how much I tweak the priors I can't get the posterior means to move from their prior values..

k = 3

n1 = 1000
n2 = 1000
n3 = 1000

n = n1+n2+n3

mean1 = 17.3
mean2 = 42.0
mean3 = 31.0

precision = 0.1

sigma = np.sqrt(1 / precision)

print "Standard deviation: %s" % sigma

data1 = np.random.normal(mean1,sigma,n1)
data2 = np.random.normal(mean2,sigma,n2)
data3 = np.random.normal(mean3,sigma,n3)

data = np.concatenate([data1 , data2, data3])

hist(data, bins=200,  color="k", histtype="stepfilled", alpha=0.8)
plt.title("Histogram of the dataset")
plt.ylim([0, None])

with pm.Model() as model:
    dd = pm.Dirichlet('dd', a=np.array([float(n/k) for i in range(k)]), shape=k)
    sd = pm.Uniform('precs', lower=1, upper=5, shape=k)
    means = pm.Normal('means', [25, 30, 35], 0.01, shape=k)
    category = pm.Categorical('category', p=dd, shape=n)

    points = pm.Normal('obs',
                     means[category],
                     sd=sd[category],
                     observed=data)
    tr = pm.sample(100000, step=pm.Metropolis())
    pm.traceplot(tr, vars=['means', 'precs', 'dd'])

output:

Standard deviation: 3.16227766017
 [-----------------100%-----------------] 100000 of 100000 complete in 157.2 sec

As you can see there is no convergence and the means do not move from their initial values histogram of data traceplots not converging

Was it helpful?

Solution

Unfortunately, this is a known issue: https://github.com/pymc-devs/pymc/issues/452 and https://github.com/pymc-devs/pymc/issues/443 which we are working on.

Note that there are other step-methods you can use for the categorical as in the example models in the issues. But even that doesn't lead to convergence.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top