Question

I am working with the (I think) very cool Titanic data that is publicly available.

There are two main ways to import it into R:

(1) You can either use the built-in dataset Titanic (library(datasets)) or

(2) you can download it as .csv-file, e.g. here.

Now, the data is aggregated frequency data. I would like to convert the multi-dimensional contingency table into an individual-level data frame.

PROBLEM: If I use the built-in dataset, this is no problem; if I use the imported .csv-file, however, it doesn't work. This is the error message I get:

Error in rep(1:nrow(tablevars), counts) : invalid 'times' argument
In addition: Warning message:
In expand.table(Titanic.table) : NAs introduced by coercion

Why? And what am I doing wrong? Many thanks.

R CODE

#required packages
library(datasets)
library(epitools)

#(1) Expansion of built-in data set
data(Titanic)    
Titanic.raw <- Titanic
class(Titanic.raw) # data is stored as "table"
Titanic.expand <- expand.table(Titanic.raw)

#(2) Expansion of imported data set
Titanic.raw <- read.table("Titanic.csv", header=TRUE, sep=",", row.names=1)
class(Titanic.raw) #data is stored as "data.frame"

Titanic.table <- as.table(as.matrix(Titanic.raw)) 
class(Titanic.table) #data is stored as "table"

Titanic.expand <- expand.table(Titanic.table)

Solution

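I think you probably want xtabs. The imported .csv is presumably in long format (one row per combination of Class, Sex, Age and Survived, plus a Freq column), so as.matrix() turns it into a character matrix rather than a table of counts, which would explain the coercion warning. Watch out: the factor coding differs between the Titanic object and the Titanic.new object built below. By default factor levels are sorted lexicographically, while two of the Titanic factors (Sex and Age) are not: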

 str(Titanic)
 table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Male" "Female"
  ..$ Age     : chr [1:2] "Child" "Adult"
  ..$ Survived: chr [1:2] "No" "Yes"

 Titanic.raw <- read.table("~/Downloads/Titanic.csv", header=TRUE, sep=",", row.names=1)

 str(Titanic.new <-
       xtabs(Freq ~ Class + Sex + Age + Survived, data = Titanic.raw))

 xtabs [1:4, 1:2, 1:2, 1:2] 4 13 89 3 118 154 387 670 0 0 ...
 - attr(*, "dimnames")=List of 4
  ..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
  ..$ Sex     : chr [1:2] "Female" "Male"
  ..$ Age     : chr [1:2] "Adult" "Child"
  ..$ Survived: chr [1:2] "No" "Yes"
 - attr(*, "class")= chr [1:2] "xtabs" "table"
 - attr(*, "call")= language xtabs(formula = Freq ~ Class + Sex + Age + Survived, data = Titanic.raw) 
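
If the level order of the rebuilt table should match the built-in one, the levels can be set explicitly before calling xtabs. This is only a minimal sketch, assuming the .csv has the columns Class, Sex, Age, Survived and Freq; as.data.frame(Titanic) stands in for the imported file here so the example runs without the download:

```r
library(datasets)

# Stand-in for the imported .csv (assumption: long format with a Freq column).
# stringsAsFactors = FALSE mimics the plain character columns read.table returns.
Titanic.raw <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# Impose the level order of the built-in table instead of the alphabetical default.
Titanic.raw$Sex <- factor(Titanic.raw$Sex, levels = c("Male", "Female"))
Titanic.raw$Age <- factor(Titanic.raw$Age, levels = c("Child", "Adult"))

Titanic.new <- xtabs(Freq ~ Class + Sex + Age + Survived, data = Titanic.raw)

# The dimnames and the cell counts should now line up with the built-in Titanic.
dimnames(Titanic.new)$Sex
all(as.vector(Titanic.new) == as.vector(Titanic))
```

Class and Survived are left alone because their alphabetical order already matches the built-in table.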

An 'xtabs' object inherits from class 'table', so you can pass it to the expand.table function.
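
If epitools is not at hand, the expansion can also be sketched in base R by repeating each row of the long-format table Freq times. This is an alternative to expand.table, not a description of what it does internally. As an assumption, as.data.frame(Titanic) again stands in for the imported .csv, and Titanic.long is just an illustrative name:

```r
library(datasets)

# Stand-in for the imported .csv (assumption: long format with a Freq column).
Titanic.raw <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# Rebuild the four-way contingency table from the counts.
Titanic.new <- xtabs(Freq ~ Class + Sex + Age + Survived, data = Titanic.raw)

# Back to long format: one row per cell, with the count in Freq.
Titanic.long <- as.data.frame(Titanic.new)

# Repeat each row Freq times and drop the count column: one row per person.
idx <- rep(seq_len(nrow(Titanic.long)), times = Titanic.long$Freq)
Titanic.expand <- Titanic.long[idx, c("Class", "Sex", "Age", "Survived")]
rownames(Titanic.expand) <- NULL

# The number of rows should equal the total count in the table.
nrow(Titanic.expand) == sum(Titanic.raw$Freq)
```

Cells with a count of zero simply contribute no rows, so they need no special handling.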

Licensed under: CC-BY-SA with attribution