After noting that the total of Deaths was 55 and that you said the number of flies was "around 50", I decided the likely assumption was that this was a completely observed process, i.e. every death was recorded and nothing was censored. So you need to replicate the rows with multiple deaths so that there is one row for each death, and assign an event marker of 1 to each. The "long" format is clearly the preferred format. You can then create a Surv-object from the 'Day' and 'event' columns:
?Surv
df3 <- df2[rep(rownames(df2), df2$Deaths), ]
str(df3)
#---------------------
'data.frame': 55 obs. of 3 variables:
$ Exp : Factor w/ 1 level "A": 1 1 1 1 1 1 1 1 1 1 ...
$ Deaths: num 2 2 3 3 3 1 3 3 3 4 ...
$ Day : num 10 10 12 12 12 14 16 16 16 18 ...
#----------------------
df3$event <- 1
str(with(df3, Surv(Day, event) ) )
#------------------
Surv [1:55, 1:2] 10 10 12 12 12 14 16 16 16 18 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "time" "status"
- attr(*, "type")= chr "right"
Note: If this were being done with the coxph function, the expansion to individual lines of data might not have been needed, since that function allows specification of case weights. (I'm guessing that the other regression functions in the survival package would not have needed this to be done either.) In the past Terry Therneau has expressed puzzlement that people are creating Surv-objects outside the formula interface of coxph. The intended use of this Surv-object was not described in sufficient detail to know whether a weighted analysis without expansion would be possible.
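As a rough sketch of what a weighted analysis without expansion might look like, the code below compares a Kaplan-Meier fit on expanded rows with one that passes the death counts through the `weights` argument of `survfit`. The data frame `toy` is a hypothetical stand-in built from the first five distinct Day/Deaths pairs visible in the `str()` output above, not the full `df2`, and the names `toy`, `toy_long`, `fit_long` and `fit_wt` are only for illustration.

```r
library(survival)

# Hypothetical toy data: the first five distinct Day/Deaths pairs seen in the
# str() output above (an assumption; the full df2 is not reproduced here)
toy <- data.frame(Exp    = factor("A"),
                  Deaths = c(2, 3, 1, 3, 4),
                  Day    = c(10, 12, 14, 16, 18))
toy$event <- 1  # completely observed process: every row is a death

# Approach 1: expand to one row per death, as in the answer
toy_long <- toy[rep(seq_len(nrow(toy)), toy$Deaths), ]
fit_long <- survfit(Surv(Day, event) ~ 1, data = toy_long)

# Approach 2: keep the aggregated rows and use the counts as case weights
fit_wt <- survfit(Surv(Day, event) ~ 1, data = toy, weights = Deaths)

# The survival estimates from the two fits should agree
all.equal(fit_long$surv, fit_wt$surv)
```

If the two fits agree on your real data as well, the expansion step is only needed when you want the Surv-object itself in long form.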