The problem here is apply; see "Details" in ?apply: "If X is not an array [..], apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame)." Then have a look at "Details" in ?as.matrix: "The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column". Thus, although your conversion to numeric works, using apply to 'loop' over the columns to check their class first coerces the data frame to a character matrix, so every column is reported as character.
A small example. First create a toy data frame:
df <- data.frame(x1 = c("a", "b"),
                 x2 = c("Not Available", 2),
                 x3 = c("Not Available", 3),
                 x4 = c(4, "Not available"))
Convert selected columns to numeric as you did in your question, or like this (the "Not Available" strings become NA, with a coercion warning):
df[, 2:4] <- lapply(df[ , 2:4], function(x) as.numeric(x))
str(df)
If the resulting data frame is coerced to a matrix, as apply would do, it is coerced to a character matrix:
str(as.matrix(df))
# chr [1:2, 1:4] "a" "b" NA " 2" NA " 3" " 4" NA
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:4] "x1" "x2" "x3" "x4"
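To see the consequence of that coercion directly, here is a minimal sketch that compares apply and sapply on the same data frame. The object and column names (toy, id, value) are made up for this illustration and are not from the question.

```r
# Hypothetical toy data: one character column, one numeric column
toy <- data.frame(id = c("a", "b"), value = c(1.5, 2.5),
                  stringsAsFactors = FALSE)

# apply() first calls as.matrix(toy), which gives a character matrix,
# so class() is handed character data for every column
via_apply <- apply(toy, 2, class)

# sapply() iterates over the data frame as a list of columns,
# so each column keeps its own type
via_sapply <- sapply(toy, class)

print(via_apply)
print(via_sapply)
```

Both results are named character vectors, but only the sapply version reports value as numeric.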
Instead of using apply to check the class of your columns, you may try:
sapply(df, class)
#          x1          x2          x3          x4
# "character"   "numeric"   "numeric"   "numeric"
str(df)
# 'data.frame': 2 obs. of 4 variables:
# $ x1: chr "a" "b"
# $ x2: num NA 2
# $ x3: num NA 3
# $ x4: num 4 NA
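If you want a stricter variant of the same idea, vapply may be worth a look: it works column by column like sapply, but also checks that every result has the type and length you declare. The following is only a sketch with made-up names (toy, col_classes, numeric_cols), not something taken from the question.

```r
# Hypothetical toy data with character, double, and integer columns
toy <- data.frame(id = c("a", "b"), value = c(1.5, 2.5), count = c(1L, 2L),
                  stringsAsFactors = FALSE)

# First class of each column; character(1) makes vapply() fail loudly
# if any result is not a single string
col_classes <- vapply(toy, function(col) class(col)[1], character(1))

# TRUE for columns that is.numeric() accepts (double and integer)
is_num <- vapply(toy, is.numeric, logical(1))
numeric_cols <- names(toy)[is_num]

print(col_classes)
print(numeric_cols)
```

Taking class(col)[1] is a design choice here: some objects, date-times for example, have more than one class, and the declared character(1) result would otherwise cause an error.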