Question

The following model, disaster_model.py, is part of the PyMC tutorial and can be imported into the main code and used as a model:

"""
A model for the disasters data with a changepoint

switchpoint ~ U(0, 110)
early_mean ~ Exp(1.)
late_mean ~ Exp(1.)
disasters[t] ~ Po(early_mean if t <= switchpoint, late_mean otherwise)

"""

from pymc import *
from numpy import array, empty
from numpy.random import randint

__all__ = ['disasters_array', 'switchpoint', 'early_mean', 'late_mean', 'rate', 'disasters']

disasters_array =   array([ 4, 5, 4, 0, 1, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 6, 
                            3, 3, 5, 4, 5, 3, 1, 4, 4, 1, 5, 5, 3, 4, 2, 5, 
                            2, 2, 3, 4, 2, 1, 3, 2, 2, 1, 1, 1, 1, 3, 0, 0, 
                            1, 0, 1, 1, 0, 0, 3, 1, 0, 3, 2, 2, 0, 1, 1, 1, 
                            0, 1, 0, 1, 0, 0, 0, 2, 1, 0, 0, 0, 1, 1, 0, 2, 
                            3, 3, 1, 1, 2, 1, 1, 1, 1, 2, 4, 2, 0, 0, 1, 4, 
                            0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1])

# Define data and stochastics

switchpoint = DiscreteUniform('switchpoint', lower=0, upper=110, doc='Switchpoint[year]')
early_mean = Exponential('early_mean', beta=1.)
late_mean = Exponential('late_mean', beta=1.)

@deterministic(plot=False)
def rate(s=switchpoint, e=early_mean, l=late_mean):
    ''' Concatenate Poisson means '''
    out = empty(len(disasters_array))
    out[:s] = e
    out[s:] = l
    return out

disasters = Poisson('disasters', mu=rate, value=disasters_array, observed=True)

One can now sample from this model with MCMC, using the Metropolis-Hastings algorithm, to obtain the posterior distributions of the parameters:

from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
M.sample(iter=10000, burn=1000, thin=10)

My problem is this: suppose that after this sampling I obtain new data. How can I update my posterior distributions afterwards? In other words, how can I implement online learning with PyMC?


Solution

You would need to specify a new model for the update, because you now have informative priors for the unknown parameters. Specifically, the DiscreteUniform on the switchpoint would become either a Categorical or a Multinomial (with n=1), and the two rate parameters might both be normally distributed. You could fit these priors (using one of several methods) to the posterior samples from the first run of the model. If you plan to update repeatedly, you could easily do this programmatically.
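As a rough illustration of the fitting step, here is a minimal sketch that turns posterior samples into prior parameters using only the Python standard library. The helper names (categorical_prior, normal_prior), the smoothing constant, and the toy sample values are all hypothetical and chosen for illustration; moment matching is just one of the "several methods" mentioned above.

```python
from collections import Counter
from statistics import mean, pvariance


def categorical_prior(samples, lower, upper):
    """Empirical probabilities for each integer in [lower, upper].

    A small pseudo-count (an assumption, not part of the original model)
    keeps values the sampler never visited at a non-zero prior probability.
    """
    pseudo = 0.5  # hypothetical smoothing constant
    counts = Counter(int(s) for s in samples)
    support = range(lower, upper + 1)
    total = sum(counts.get(k, 0) for k in support) + pseudo * len(support)
    return [(counts.get(k, 0) + pseudo) / total for k in support]


def normal_prior(samples):
    """Moment-matched normal: returns (mu, tau) with tau = 1 / variance."""
    mu = mean(samples)
    var = pvariance(samples, mu)
    return mu, 1.0 / var


# Toy stand-ins for posterior samples from the first run (made-up values).
switch_samples = [40, 41, 41, 42, 41, 40]
early_samples = [3.0, 3.2, 2.8, 3.1, 2.9]

p = categorical_prior(switch_samples, 0, 110)  # one probability per year index
mu, tau = normal_prior(early_samples)          # parameters for a normal prior
```

In a real run the sample lists would come from the first model's traces (in PyMC 2, something like M.trace('switchpoint')[:]), and p, mu and tau would then parameterize the Categorical and Normal priors of the new model. Note that a normal prior places some mass on negative rates, so the fitted prior may need checking before it is reused.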

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow