Using robust linear methods from python module "statsmodels" with weights?

https://stackoverflow.com/questions/21755153

11-10-2022
|

Question

I have some data,y with errors, y_err, measured at x. I need to fit a straight line to this mimicking some code from matlab specifically the fit method with robust "on" and giving the weights as 1/yerr. The matlab documentation says it uses the bisquare method (also know as the TukeyBiweight method). My code so far is..

    rlm_model = sm.RLM(y, x, M=sm.robust.norms.TukeyBiweight())
    rlm_results = rlm_model.fit()
    print rlm_results.params

however I need to find a way of including weights derived from yerr. Hope people can help, this is the first time I have tried to used the statsmodel module.

In response to the first answer: I tried;

y=y*(yerr)
x=x*(yerr)

x=sm.add_constant(x, prepend=False)
rlm_model = sm.RLM(y, x, M=sm.robust.norms.TukeyBiweight())
results=rlm_model.fit()

but sadly this doesnt match the matlab function.

Solution

Weights reflecting heteroscedasticity, that is unequal variance across observations, are not yet supported by statsmodels RLM.

As a workaround, you can divide your y and x by yerr in the call to RLM.

I think, in analogy to weighted least squares, the parameter estimates, their standard errors and other statistics are still correct in this case. But I haven't checked yet.

as reference:

Carroll, Raymond J., and David Ruppert. "Robust estimation in heteroscedastic linear models." The annals of statistics (1982): 429-441.

They also estimate the variance function, but for fixed weights 1/sigma_i the optimization just uses

(y_i - x_i beta) / sigma_i

The weights 1/sigma_i will only be relative weights and will still be multiplied with a robust estimate of the scale of the errors.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow