Question

I have a data frame where "." is used both as the decimal mark and, on its own, as the marker for NA.

A    B    C    D
1    .  1.2    6
1   12    .    3
2   14  1.6    4

To work on this data frame I need to obtain:

A    B    C    D
1   NA  1.2    6
1   12   NA    3
2   14  1.6    4

How can I keep the decimals but convert a lone "." (such as the one in column C) to NA?

Here is the data in a reproducible format:

data <- structure(list(A = c(1L, 1L, 2L), B = c(".", "12", "14"), C = c("1.2", 
    ".", "1.6"), D = c(6L, 3L, 4L)), .Names = c("A", "B", "C", "D"), 
    class = "data.frame", row.names = c(NA, -3L))

Solution

You can use type.convert and specify "." in its na.strings argument:

df <- data ## Create a copy in case you need the original form
df
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

df[] <- lapply(df, function(x) type.convert(as.character(x), na.strings = ".", as.is = TRUE))
df
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4

Note that the argument is na.strings (with a plural "s"), so you can pass a vector of several strings to be treated as NA values if your data has more than one.
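As a minimal sketch of passing more than one NA marker, assume a hypothetical vector in which both "." and "-" stand for missing values (the "-" marker is an illustration and is not part of the question's data):

```r
# Hypothetical input: "." and "-" both mean "missing".
x <- c("1.2", ".", "-", "3")

# Every string listed in na.strings becomes NA; the rest is parsed as numbers.
converted <- type.convert(x, na.strings = c(".", "-"), as.is = TRUE)

print(converted)
print(class(converted))
```

The decimal in "1.2" is unaffected because na.strings is matched against whole values, not against characters inside them.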

Also, the simplest answer to this question may be to specify the na.strings argument when you first read your data into R, for example with read.table or read.csv.

Let's replicate the process of reading a csv from within R:

x <- tempfile()
write.csv(data, x, row.names = FALSE)

read.csv(x)
#   A  B   C D
# 1 1  . 1.2 6
# 2 1 12   . 3
# 3 2 14 1.6 4

read.csv(x, na.strings = ".")
#   A  B   C D
# 1 1 NA 1.2 6
# 2 1 12  NA 3
# 3 2 14 1.6 4
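The printed output looks similar in both cases, so it can help to compare the column classes. The sketch below rebuilds the question's data and repeats the round trip through a temporary file; the names raw and clean are only illustrative.

```r
# Rebuild the question's data so the example is self-contained.
data <- data.frame(A = c(1L, 1L, 2L),
                   B = c(".", "12", "14"),
                   C = c("1.2", ".", "1.6"),
                   D = c(6L, 3L, 4L),
                   stringsAsFactors = FALSE)

x <- tempfile(fileext = ".csv")
write.csv(data, x, row.names = FALSE)

# Without na.strings, the "." values keep B and C from being read as numbers.
raw <- read.csv(x, stringsAsFactors = FALSE)
# With na.strings = ".", B and C are read as numeric columns containing NA.
clean <- read.csv(x, na.strings = ".")

print(sapply(raw, class))
print(sapply(clean, class))

unlink(x)  # remove the temporary file
```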

Other tips

Assuming your data frame is data:

data[data == "."] <- NA

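This replaces every lone "." with NA, although B and C remain character columns. Alternatively, coerce every column to numeric, which turns "." into NA (with a coercion warning). Assigning into data[] keeps the data frame structure, whereas sapply on its own would return a matrix:

data[] <- lapply(data, as.numeric)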
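To check what the comparison-based replacement actually produces, here is a small sketch that works on a copy under the hypothetical name out. It assumes the data from the question, and borrows type.convert for the follow-up conversion step.

```r
# Rebuild the question's data so the example is self-contained.
data <- data.frame(A = c(1L, 1L, 2L),
                   B = c(".", "12", "14"),
                   C = c("1.2", ".", "1.6"),
                   D = c(6L, 3L, 4L),
                   stringsAsFactors = FALSE)

out <- data
out[out == "."] <- NA  # only the values change here, not the column types
classes_before <- sapply(out, class)

# Re-detect each column's type now that the "." markers are gone.
out[] <- lapply(out, function(col) type.convert(as.character(col), as.is = TRUE))
classes_after <- sapply(out, class)

print(classes_before)
print(classes_after)
print(out)
```

After the second step, out is still a data frame and B and C can be used in arithmetic.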
Licensed under CC-BY-SA with attribution.
Not affiliated with StackOverflow.