Question

I am using survreg and expect the predicted values to respect a lower bound of 0, but they are frequently negative. I suspect predict() is returning a linear prediction rather than one from the censored model I am trying to fit. Here is what I have done:

linear.first.stage<-lm(y ~ x, data=clip)

First I fit a linear model to obtain starting values, which speeds up estimation; the survival model fails to converge without this first stage. I then create a survival object, following the example in ?survreg that shows an explicit Tobit regression, adapted below for x and y. In my data set, y can only be observed at non-negative values; when it is positive, it tends to be normally distributed around 200 with an sd of about 20. x may take any value and has no obvious theoretical bound.

surv_y<-Surv(clip$y, clip$y>0,type="left")
first.stage<-survreg(surv_y ~ x,init=(linear.first.stage), dist="gaussian", data=clip)

I run the survival regression, which should be equivalent to a Tobit. To confirm that the two fits really are the same, I ran the following, using tobit() from the AER package:

test<-tobit(y~x, left=0, right=Inf, dist="gaussian", data=clip)
p_test<-predict(test)
p<-predict(first.stage)
plot(p_test-p)

The plot shows a flat line at zero, so on visual inspection the two commands give identical predictions, as they should. However, in both cases values below 0 are predicted. This is a problem because I have stated that the lower bound of observable values is 0. My expectation is that no predicted value falls below 0.

I have tried predicting with types "link", "response", and "linear", to no avail. I assume predict() is producing outcomes as if no censoring were occurring. How do I produce predictions that obey the lower bound of 0?

References:

  1. Running predict() after tobit() in package AER
  2. https://stats.stackexchange.com/questions/11440/standardized-residuals-of-a-tobit-model-in-r

Solution 2

Answer: Tobit is not the right regression type for my data. Its prediction is the latent outcome, that is, what the result would be in the absence of censoring, which is why it can fall below 0.

Clarification: I restructured my estimation process to use a zero-inflated or hurdle model. Tobit is for censored data: it assumes a non-zero result exists, but we only observe 0 because the information is hidden somehow. For example, women's wages should be fit with Tobit, because married women who choose not to work still have a reservation wage and still earn some (invisible) return to effort from unpaid labor of whatever type. Zero-inflated and hurdle models instead treat the result as truly zero: no crimes occurred, or no widgets were produced. They reflect my data more accurately.
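For a continuous outcome with true zeros, one simple way to sketch the hurdle idea is a two-part model in base R: a logistic regression for whether y is positive, and a linear regression fitted only to the positive observations. This is an illustration under assumptions, not necessarily the model the answer's author actually fit; the simulated data frame `dat` and all coefficients in it are made up for the example.

```r
# Two-part (hurdle-style) sketch on simulated data; `dat` is hypothetical.
set.seed(42)
n <- 1000
x <- rnorm(n)

# Part 1 of the data-generating process: is the outcome non-zero at all?
is_pos <- rbinom(n, 1, plogis(0.5 + x))

# Part 2: positive outcomes are roughly normal around 200 with sd 20.
y <- ifelse(is_pos == 1, rnorm(n, mean = 200 + 5 * x, sd = 20), 0)
dat <- data.frame(x = x, y = y)

# Model 1: probability that y is positive.
part1 <- glm(I(y > 0) ~ x, family = binomial, data = dat)

# Model 2: size of y, fitted only where y is positive.
part2 <- lm(y ~ x, data = dat, subset = y > 0)

# Combined expectation: P(y > 0 | x) * E[y | y > 0, x].
p_pos  <- predict(part1, newdata = dat, type = "response")
mu_pos <- predict(part2, newdata = dat)
ey     <- p_pos * mu_pos

summary(ey)
```

Because the first part is a probability and the second part is fitted only to positive values far from zero, the combined prediction stays non-negative in this setup. For count outcomes, a dedicated zero-inflated or hurdle count model would be the more usual choice.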

OTHER TIPS

You probably need to scale the prediction up in the sense that is described here by one of the authors of the package.
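One reading of "scaling the prediction up" is to convert the latent linear predictor into the expected value of the censored outcome. For y = max(0, y*) with y* normal with mean mu and standard deviation sigma, that expectation is mu * Phi(mu / sigma) + sigma * phi(mu / sigma), which is always positive. The sketch below applies this to a survreg fit on simulated data; the data frame `sim` and its coefficients are assumptions made for illustration, and the linked answer may describe the adjustment differently.

```r
library(survival)

# Simulated left-censored data; `sim` is hypothetical.
set.seed(1)
n <- 500
x <- rnorm(n)
y_latent <- 1 + 2 * x + rnorm(n, sd = 2)         # latent outcome, can be negative
sim <- data.frame(x = x, y = pmax(y_latent, 0))  # observed outcome, censored at 0

# Tobit via survreg, following the pattern shown in ?survreg.
fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian", data = sim)

mu    <- predict(fit, type = "lp")  # latent mean, not bounded below by 0
sigma <- fit$scale                  # estimated sd of the latent error

# Expected value of the censored outcome: E[max(0, y*)].
ey <- mu * pnorm(mu / sigma) + sigma * dnorm(mu / sigma)

range(mu)  # latent predictions
range(ey)  # censored-outcome predictions
```

The latent predictions `mu` are what predict() returns by default for a Gaussian fit, which is why negative values appear; `ey` is the quantity that respects the bound at 0.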

Licensed under: CC-BY-SA with attribution