ignore/remove NA values in read.csv

https://stackoverflow.com/questions/15808356

01-04-2022

Question

I have a CSV file, shown below, that I read into R using read.csv; column C has 12 empty values out of 30. I want to work out the max of each column, but the R function "max" returns "NA" when used on column C. How do I get R to ignore the empty/NA values? I cannot see an "rm.na" option in read.csv.

data <- read.csv("test.csv")

data

A   B   C   
1   5   6
15  2   3
8   3   3
7   5   4
5   3   8
4   1   4
5   3   4
2   2   10
4   3   8
6   5   2
1   4   4
10  8   4
0   6   0
7   3   8
5   3   3
13  12  13
6   0   0
0   0   2
5   2   NA
7   3   NA
1   8   NA
11  1   NA
1   4   NA
0   7   NA
4   5   NA
3   10  NA
2   0   NA
6   4   NA
0   19  NA
1   5   NA

> max(data$C)
[1] NA

Solution 2

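You have two options that I can think of:

apply(data, 2, max, na.rm = TRUE)  # ignores the NAs within each column when computing its max

apply(na.omit(data), 2, max)  # drops every row containing an NA first, then computes the max of each column

The second option also discards the A and B values in the dropped rows, so their maxima can change (in the data above, B's maximum of 19 sits in a row where C is NA).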
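To make the difference between the two options concrete, here is a minimal sketch. The data frame `df` is a small hypothetical stand-in, not the asker's test.csv, and `sapply` is used in place of `apply(..., 2, ...)` only because it works on the columns directly without converting to a matrix.

```r
# Hypothetical stand-in for the question's data (not the real test.csv)
df <- data.frame(A = c(1, 15, 8, 0),
                 B = c(5, 2, 3, 19),
                 C = c(6, 3, NA, NA))

# Option 1: skip NAs column by column; every row still contributes to A and B
max_ignore <- sapply(df, max, na.rm = TRUE)

# Option 2: drop every row that has an NA anywhere, then take the column maxima
max_complete <- sapply(na.omit(df), max)

print(max_ignore)    # A = 15, B = 19, C = 6
print(max_complete)  # A = 15, B = 5,  C = 6 (the row holding B = 19 was dropped)
```

If the goal is simply the maximum of each column, the first form is probably what you want, since it does not throw away valid values in the other columns.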

Other suggestions

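    data <- na.omit(data)

then

    max(data)

Note that na.omit drops every row containing an NA, and max on a whole data frame returns a single value, the largest across all columns, not one maximum per column.

If you do not wish to change the data frame, then

    max(na.omit(data))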

I'd suggest removing the NAs after reading, as others have suggested. If, however, you insist on reading only the non-NA lines, you can use the command-line tool grep to remove them and create a new file:

grep -v NA file_with_NA.csv > file_without_NA.csv

This only works if the missing values are literally written as NA in the file, and it drops every line that contains "NA" anywhere. If you run Linux or macOS, you already have this tool. On Windows, you have to install MinGW or Cygwin to get it.
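If you would rather not depend on external tools, roughly the same filtering can be done inside R. This is only a sketch: the temporary files stand in for file_with_NA.csv and file_without_NA.csv, and their contents are made up for illustration.

```r
# Hypothetical file contents standing in for file_with_NA.csv
src <- tempfile(fileext = ".csv")
writeLines(c("A,B,C", "1,5,6", "15,2,3", "5,2,NA", "7,3,"), src)

# In numeric columns, both the string "NA" and a blank field are read as NA
df <- read.csv(src)

# Keep only the rows that have no missing value in any column
clean <- df[complete.cases(df), ]

# Write the filtered rows out; dst plays the role of file_without_NA.csv
dst <- tempfile(fileext = ".csv")
write.csv(clean, dst, row.names = FALSE)

print(nrow(clean))  # 2
```

Unlike the grep approach, this also catches blank fields, because read.csv has already turned them into NA.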

You should be able to use

max(x,na.rm=TRUE)
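As a small illustration of why there is no NA-removal switch needed in read.csv itself: blank numeric fields are already read in as NA, and the NA handling happens in max. The inline CSV text below is a made-up sample, not the asker's file.

```r
# Made-up sample with two blank fields in column C
csv_text <- "A,B,C\n1,5,6\n15,2,3\n5,2,\n7,3,"
df <- read.csv(text = csv_text)

print(is.na(df$C))             # FALSE FALSE TRUE TRUE
print(max(df$C))               # NA, because one missing value poisons the result
print(max(df$C, na.rm = TRUE)) # 6
```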

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow