Predict with survreg/tobit goes past bound

https://stackoverflow.com/questions/15865222

02-04-2022

Question

I am using survreg and expect my predicted values to obey a lower bound of 0, but many of them come out negative. I suspect predict is returning a linear fit rather than predictions from the censored model I am trying to estimate. Here is what I have done:

linear.first.stage<-lm(y ~ x, data=clip)

First I fit an ordinary linear model to get starting values and speed up estimation; the survival regression fails to converge without this first stage. I then create a survival object, following the code in ?survreg that gives an explicit example of a Tobit regression, adapted below for my x and y. In my data set, y can only be observed at non-negative values; when it is positive, it tends to be normally distributed around 200 with a standard deviation of about 20. x may take any value and has no theoretical bound that I can think of.

library(survival)
surv_y <- Surv(clip$y, clip$y > 0, type = "left")  # y > 0 marks uncensored observations
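# init expects a numeric vector of starting values, so pass the lm coefficients rather than the lm object
first.stage <- survreg(surv_y ~ x, init = coef(linear.first.stage), dist = "gaussian", data = clip)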

I then run the survival regression, which should be equivalent to a Tobit model. To confirm that the two really are the same, I ran the following:

library(AER)  # provides tobit()
test <- tobit(y ~ x, left = 0, right = Inf, dist = "gaussian", data = clip)
p_test <- predict(test)
p <- predict(first.stage)
plot(p_test - p)  # all differences are zero if the two fits agree

The plot shows a flat line at zero, so on visual inspection the two sets of predictions are identical, as they should be. However, in both cases values below 0 are predicted. This is a problem because I have specified 0 as the lower bound of the observable data. My expectation is that all predicted values must be >0.

I have tried predicting with types "link", "response", and "linear", but to no avail. I assume predict is producing outcomes as if no censoring were occurring. How do I produce predictions that obey the lower bound of 0?


Solution

Answer: Tobit is not the right regression type for this. Tobit predicts the latent outcome, that is, what the result ought to be in the absence of censoring, and that value can fall below the bound.
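A minimal simulated sketch of that behaviour follows. The data and the names y_star, fit, and mu are made up for illustration and are not the clip data from the question. It assumes a Gaussian latent outcome that is left-censored at 0, and shows that predict() on the survreg fit returns the unbounded latent mean.

```r
library(survival)

set.seed(1)
n <- 500
x <- rnorm(n)
y_star <- 1 + 2 * x + rnorm(n)  # latent outcome (assumed data-generating process)
y <- pmax(y_star, 0)            # observed outcome, left-censored at 0

# Same Tobit specification as in ?survreg: event indicator is TRUE when uncensored
fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")

# For dist = "gaussian", the "response" prediction is the latent mean b0 + b1 * x
mu <- predict(fit, type = "response")

min(mu)   # below 0 for small enough x
range(y)  # the observed data never go below 0
```

The negative values are therefore not a fitting failure; they are predictions for the latent variable rather than for the censored one.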

Clarification: I restructured my estimation process to use a zero-inflated or hurdle model. Tobit is for censored data: it says that a non-zero result exists, but we only observe 0 because the information is hidden somehow. For example, women's wages should be fit with Tobit, because married women who choose not to work still have a reservation wage, and still have some (invisible) return to effort from unpaid labor of whatever type. Zero-inflated or hurdle models treat the result as truly zero, as in no crimes occurred or no widgets were produced. They reflected my situation more accurately.
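The answer does not show the model that was actually fitted, so the following is only a sketch of one common variant for a continuous outcome with true zeros: a two-part model in base R. Part one is a logistic regression for whether y is positive; part two is a linear model for the size of y among the positive cases. The simulated data, the coefficients, and the names dat, part1, part2, and ey are all assumptions for illustration. Hurdle and zero-inflated models for counts are a separate, related family.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)

# Assumed data-generating process: true zeros, otherwise roughly N(200, 20)
is_pos <- rbinom(n, 1, plogis(0.5 + x))
y <- ifelse(is_pos == 1, rnorm(n, mean = 200 + 5 * x, sd = 20), 0)
dat <- data.frame(x = x, y = y)

# Part 1: probability that the outcome is non-zero
part1 <- glm(I(y > 0) ~ x, family = binomial, data = dat)

# Part 2: mean of the outcome among the non-zero observations only
part2 <- lm(y ~ x, data = dat, subset = y > 0)

# Combined prediction: P(y > 0 | x) * E[y | y > 0, x]
p_pos <- predict(part1, type = "response")
m_pos <- predict(part2, newdata = dat)
ey <- p_pos * m_pos

summary(ey)
```

Because the probability lies between 0 and 1 and the positive-part mean is far from zero in this setup, the combined prediction stays non-negative here. With a positive part closer to zero, a log-scale or gamma model for part two would be a safer choice.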

Other tips

You probably need to scale the prediction up in the sense that is described here by one of the authors of the package.
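The linked post is not reproduced here, so this may differ from what it describes. One standard way to get a prediction that respects the bound is the mean of the censored outcome itself. For a Gaussian latent variable with mean mu and scale sigma, left-censored at 0, that mean is mu * pnorm(mu / sigma) + sigma * dnorm(mu / sigma). The sketch below uses simulated data, and censored_mean is a hypothetical helper, not a function from survival or AER.

```r
library(survival)

set.seed(1)
n <- 2000
x <- rnorm(n)
y <- pmax(1 + 2 * x + rnorm(n), 0)  # simulated outcome, left-censored at 0
dat <- data.frame(x = x, y = y)

fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian", data = dat)

# Hypothetical helper: E[max(y*, 0)] for y* ~ N(mu, sigma^2),
# with mu taken from the linear predictor and sigma from the fitted scale.
censored_mean <- function(fit, newdata = NULL) {
  mu <- if (is.null(newdata)) {
    predict(fit, type = "lp")
  } else {
    predict(fit, newdata = newdata, type = "lp")
  }
  sigma <- fit$scale
  mu * pnorm(mu / sigma) + sigma * dnorm(mu / sigma)
}

ey <- censored_mean(fit)
min(predict(fit))  # latent prediction: can be negative
min(ey)            # censored mean: never below 0
```

The same helper can be called with newdata = data.frame(x = ...) to predict for new observations. It only covers a lower bound of exactly 0; other bounds would need the formula shifted accordingly.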

License: CC-BY-SA with attribution

Not affiliated with StackOverflow