Question

I am having difficulty understanding how to use a GLM with the Poisson family.

import numpy as np
import pandas as pd
import statsmodels.api as sm  # formerly scikits.statsmodels

dataset = pd.DataFrame({'A':np.random.rand(100)*1000, 
                        'B':np.random.rand(100)*100,  
                        'C':np.random.rand(100)*10, 
                        'target':np.random.rand(100)})

X = dataset[['A','B','C']].values
y = dataset[['target']].values
size = 1e5
nbeta = 3

fam = sm.families.Poisson()
glm = sm.GLM(y,X, family=fam)
res = glm.fit()
  • I am using the "target" column as the target. Should I label the target as 0 or 1?
  • Can anyone explain how I can get the predicted values, since Poisson has its own predict function?

Solution

Sourceforge is down right now. When it's back up, you should read through the documentation and examples. There are plenty of usage notes for prediction and GLM.

How to label your target is up to you and is probably a question for Cross Validated. Poisson regression is intended for counts. It can be used on continuous data, but you should know what you're doing.

If your target is 0/1, you want a Logit or Probit model. For a count target, something like the following works. You don't need to convert the pandas objects to numpy.

import numpy as np
import pandas as pd
import statsmodels.api as sm

dataset = pd.DataFrame({'A':np.random.rand(100)*1000, 
                        'B':np.random.rand(100)*100,  
                        'C':np.random.rand(100)*10, 
                        'target':np.random.randint(0, 5, 100)})

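X = dataset[['A','B','C']].copy()  # copy, so adding a column does not touch a view of dataset
X['constant'] = 1  # GLM does not add an intercept on its own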
y = dataset['target']
size = 1e5
nbeta = 3

fam = sm.families.Poisson()
glm = sm.GLM(y,X, family=fam)
res = glm.fit()

predict = res.predict()

Or you could directly use the maximum likelihood estimator for Poisson.

res = sm.Poisson(y, X).fit()
predict = res.predict()
Licensed under: CC-BY-SA with attribution