I have a data frame where "." is used both as the decimal marker and, on its own, as the NA marker.

A    B    C    D
1    .  1.2    6
1   12    .    3
2   14  1.6    4

To work on this data frame I need to obtain:

A    B    C    D
1   NA  1.2    6
1   12   NA    3
2   14  1.6    4

How can I keep the decimals but convert a lone "." to NA, as in column C?

Here is the data in a reproducible format:

data <- structure(list(A = c(1L, 1L, 2L), B = c(".", "12", "14"), C = c("1.2", 
    ".", "1.6"), D = c(6L, 3L, 4L)), .Names = c("A", "B", "C", "D"), 
    class = "data.frame", row.names = c(NA, -3L))

Solution

You can use type.convert and specify "." as your na.string:

df <- data ## Create a copy in case you need the original form
df
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

df[] <- lapply(df, function(x) type.convert(as.character(x), na.strings = ".", as.is = TRUE))
df
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4

Note that the argument is na.strings (with a plural "s"), so you can pass several strings to be treated as NA values if you have any.
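
As a minimal sketch of passing several NA markers at once: the vector `x` and the extra markers "-" and "n/a" below are made up for illustration and are not part of the question's data.

```r
# Hypothetical input: ".", "-" and "n/a" all stand for missing values
x <- c("1.2", ".", "-", "n/a", "3")

# as.is = TRUE keeps any non-numeric result as character instead of factor
converted <- type.convert(x, na.strings = c(".", "-", "n/a"), as.is = TRUE)

converted         # numeric vector with NA in positions 2 to 4
is.na(converted)  # TRUE exactly where a marker was found
```

Because every remaining value parses as a number, the result is a numeric vector rather than a character one.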

Also, the actual answer to this question might be to simply specify the na.strings argument when you are first reading your data into R, perhaps with read.table or read.csv.

Let's replicate the process of reading a CSV file from within R:

x <- tempfile()
write.csv(data, x, row.names = FALSE)

read.csv(x)
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

read.csv(x, na.strings = ".")
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4
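
The printed output looks the same whether a column holds numbers or text, so it can be worth checking the column classes after the import. The following is a small sketch that rebuilds the example data and compares the two calls; the names `raw` and `clean` are only for illustration.

```r
# Rebuild the example data from the question
data <- data.frame(A = c(1L, 1L, 2L), B = c(".", "12", "14"),
                   C = c("1.2", ".", "1.6"), D = c(6L, 3L, 4L),
                   stringsAsFactors = FALSE)

x <- tempfile(fileext = ".csv")
write.csv(data, x, row.names = FALSE)

raw   <- read.csv(x)                    # "." is read as text
clean <- read.csv(x, na.strings = ".")  # "." is read as NA

sapply(raw, class)    # B and C are not numeric here
sapply(clean, class)  # B is integer, C is numeric

unlink(x)  # remove the temporary file
```

If B and C show up as numeric types in the second call, the dots were handled at read time and no later conversion is needed.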

Other tips

Assuming your data frame is data:

data[data == "."] <- NA

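should replace the lone dots with NA, although B and C will still be character columns. Or, to convert every column to numeric directly (a lone "." becomes NA, with a coercion warning):

data[] <- lapply(data, as.numeric)  # data[] keeps the data frame; sapply() would return a matrix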
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow