Question

I'm trying to create an actuarial survival analysis in R (I'm following some worked examples). I think the best way to do this is using the survival package. So something like:

library(survival)
surv.test <- survfit(Surv(TIME, STATUS) ~ 1, data=test)

However, to get the correct answer I need to divide the TIME variable into 365-day intervals, and I can't quite work out how to do this so that it matches the given result.

As far as I can make out, there is no option within the survfit function that will do this. I went through several documentation examples and none of them were trying to create a stairstep type of plot (there is a type='interval' option, but it seems to do something different). So I guess I need to regroup my data before I apply the survival function?

Any ideas?

P.S: In SPSS this would be INTERVAL = THRU 10000 BY 365; in Stata intervals(365) ... connect(stairsteps)

Solution

I am guessing that you want to divide the TIME variable into intervals because you want to plot a Kaplan-Meier curve. In R that isn't necessary: you can just call plot on the survfit object. For example,

s=survfit(Surv(futime, fustat)~rx, data=ovarian)
plot(s)

I think I understand your question a little better now. You are getting a thick black line because you have a lot of censoring, and a + is plotted at every time point where an observation is censored. You can turn this off with mark.time=FALSE. (You can see other options in ?survival:::plot.survfit)

However, if you still want to aggregate by year, simply divide your follow-up time by 365 and round up with ceiling. Here is an example of aggregating at different time levels, with the censoring marks turned off.

par(mfrow=c(1,3))
plot(survfit(Surv(ceiling(futime), fustat)~rx, data=ovarian),col=c('blue','red'),main='Day',mark.time=FALSE)
plot(survfit(Surv(ceiling(futime/30), fustat)~rx, data=ovarian),col=c('blue','red'),main='Month',mark.time=FALSE)
plot(survfit(Surv(ceiling(futime/365), fustat)~rx, data=ovarian),col=c('blue','red'),main='Year',mark.time=FALSE)
par(mfrow=c(1,1))

But I think that plotting the Kaplan-Meier without the censoring symbols will look very nice, and provide more insight.
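
To make the ceiling(futime/365) grouping concrete, here is a minimal sketch of what it does to individual follow-up times. The values in days are made up for illustration and do not come from the original data.

```r
# Follow-up times in days (illustrative values, not from the original post)
days <- c(59, 365, 366, 1227)

# Divide by 365 and round up: each time is assigned to the year it falls in
years <- ceiling(days / 365)
print(years)
# 1 1 2 4
```

Note that a time of exactly 365 days stays in year 1 while 366 days moves to year 2, so the intervals produced this way are closed on the right.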

OTHER TIPS

Hurray, I should be able to post the images now:

1) This is what the basic R survival plot looks like at the moment: [image]

2) And this is what it should look like (SPSS example): [image]

That was exactly what I was missing! Thanks!

Solution:

vas.surv <- survfit(Surv(ceiling(TIME/365), STATUS)~1, conf.type="none", data=vasectomy)
plot(vas.surv, ylim=c(0.975,1), mark.time=FALSE, xlab="Years", ylab="Cumulative Survival")

A nice touch would be to display days on the x-axis instead of years (as in the SPSS example), but I'm not too bothered about this.
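
One possible way to get days on the x-axis is to suppress the default axis and draw your own with base graphics. This is only a sketch: vasectomy_demo is a hypothetical stand-in for the real data, and it assumes that plot passes xaxt through to the underlying base plot call.

```r
library(survival)

# Hypothetical stand-in for the asker's data: TIME in days, STATUS 1 = event
vasectomy_demo <- data.frame(
  TIME   = c(120, 400, 410, 800, 1100, 1500, 1800, 2000),
  STATUS = c(0,   1,   0,   0,   1,    0,    0,    0)
)

# Same model as above: time grouped into whole years, rounded up
fit <- survfit(Surv(ceiling(TIME / 365), STATUS) ~ 1,
               conf.type = "none", data = vasectomy_demo)

# Draw the curve without the default x-axis (assumes xaxt is passed through),
# then put a tick at each year boundary and label it in days
years <- 0:max(fit$time)
plot(fit, mark.time = FALSE, xaxt = "n",
     xlab = "Days", ylab = "Cumulative Survival")
axis(1, at = years, labels = years * 365)
```

The curve itself is still computed on the yearly scale; only the tick labels change.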

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow