Question

I have a data frame in which "." is used both as the decimal mark and, on its own, as the NA marker.

A    B    C    D
1    .  1.2    6
1   12    .    3
2   14  1.6    4

To work on this data frame I need to obtain:

A    B    C    D
1   NA  1.2    6
1   12   NA    3
2   14  1.6    4

How can I keep the decimals but convert a standalone "." to NA, as in column C?

Here is the data in a reproducible format:

data <- structure(list(A = c(1L, 1L, 2L), B = c(".", "12", "14"), C = c("1.2", 
    ".", "1.6"), D = c(6L, 3L, 4L)), .Names = c("A", "B", "C", "D"), 
    class = "data.frame", row.names = c(NA, -3L))

Solution

You can use type.convert and specify "." in its na.strings argument:

df <- data ## Create a copy in case you need the original form
df
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

df[] <- lapply(df, function(x) type.convert(as.character(x), na.strings = ".", as.is = TRUE))
df
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4

Note that the argument is na.strings (with a plural "s"), so you can pass several strings to be treated as NA if your data uses more than one.
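As a minimal sketch of passing more than one NA marker, the vector below is hypothetical (it is not part of the question's data) and assumes that ".", "-" and "n/a" all mean "missing":

```r
# Hypothetical column in which ".", "-" and "n/a" all stand for a missing value
x <- c("1.2", ".", "-", "n/a", "3")

# as.is = TRUE is stated explicitly; a non-numeric result would then stay
# character instead of becoming a factor
converted <- type.convert(x, na.strings = c(".", "-", "n/a"), as.is = TRUE)

converted
# [1] 1.2  NA  NA  NA 3.0
class(converted)
# [1] "numeric"
```

The decimal in "1.2" is untouched because na.strings only matches whole fields, not characters inside a value.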

Also, the actual answer to this question might be to simply specify the na.strings argument when you are first reading your data into R, perhaps with read.table or read.csv.

Let's replicate the process of reading a csv from within R:

x <- tempfile()
write.csv(data, x, row.names = FALSE)

read.csv(x)
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

read.csv(x, na.strings = ".")
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4
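The printed output alone does not show that the column types change as well. A small sketch, assuming the same values typed inline through the text argument of read.csv rather than a temporary file (the names as_read and fixed are just illustrative):

```r
# Same values as the example, supplied inline instead of via a file
csv_text <- "A,B,C,D
1,.,1.2,6
1,12,.,3
2,14,1.6,4"

# Without na.strings, the lone "." forces B and C to be read as text
as_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
sapply(as_read, class)   # B and C are "character"

# With na.strings = ".", the lone "." becomes NA and the columns stay numeric
fixed <- read.csv(text = csv_text, na.strings = ".", stringsAsFactors = FALSE)
sapply(fixed, class)     # B is "integer", C is "numeric"
```

stringsAsFactors = FALSE is set explicitly so the result does not depend on the default of the R version in use.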

Other tips

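Assuming your data frame is data:

data[data == "."] <- NA

replaces every standalone "." with NA, although B and C remain character columns. Or convert every column to numeric, which turns "." into NA (with a coercion warning) and, unlike sapply, keeps the result a data frame:

data[] <- lapply(data, function(x) as.numeric(as.character(x)))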
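One thing to keep in mind with the first tip is that replacing "." by NA does not change the column types. A hedged sketch of finishing the job, rebuilding the question's data and borrowing type.convert from the other answer (the copy df is only there to leave data untouched):

```r
# Same example data as in the question
data <- data.frame(A = c(1L, 1L, 2L),
                   B = c(".", "12", "14"),
                   C = c("1.2", ".", "1.6"),
                   D = c(6L, 3L, 4L),
                   stringsAsFactors = FALSE)

df <- data
df[df == "."] <- NA      # lone "." becomes NA; B and C are still character
sapply(df, class)

# Re-detect each column's type; df[] keeps the data frame structure
df[] <- lapply(df, function(x) type.convert(as.character(x), as.is = TRUE))
sapply(df, class)        # B is now "integer", C is "numeric"
```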
Licensed under: CC-BY-SA with attribution