Remove any row in a dataframe where the length of a zipcode is not equal to 5 digits

StackOverflow https://stackoverflow.com/questions/23647050

  •  22-07-2023

Question

I have a dataframe with zipcodes:

address <- as.data.frame(matrix(c('1111 Spam Street', '12 Foo Bar', '666 Dead End', 95524, 94118, 9021), ncol=2))
address$V2 <- as.numeric(as.character(address$V2))

Which looks like this:

    V1                   V2
1   1111 Spam Street    95524
2   12 Foo Bar          94118
3   666 Dead End         9021

Unfortunately, the last zipcode is incorrect and I would like to remove that row and end up with just this:

    V1                   V2
1   1111 Spam Street    95524
2   12 Foo Bar          94118

My attempt newaddress <- address[length(address$V2) != 5, ] is obviously wrong, because it is looking at the length of the column, not the values inside the column.
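The difference between the two functions can be seen on the sample values. This is a minimal sketch; the vector name `zips` is only a stand-in for `address$V2` and is not part of the original question.

```r
# Hypothetical stand-in for address$V2 from the example above
zips <- c(95524, 94118, 9021)

# length() counts the elements of the vector, so it returns a single number
n_elements <- length(zips)

# nchar() is vectorised: it returns the character count of each value
n_digits <- nchar(zips)

print(n_elements)
print(n_digits)
```

Because `length(address$V2) != 5` evaluates to a single `TRUE`, R recycles it across every row, so the attempt returns the data frame unchanged.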

How can I remove any row in a dataframe where there is a numeric value in a column which is not 5 digits in length?

Any advice is appreciated, and I apologize in advance for such a simple question.

Solution

This should do it:

newaddress <- address[nchar(address$V2) == 5, ]  # also removes rows with more than 5 digits
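One caveat, offered as a sketch rather than as part of the original answer: `nchar()` converts numbers to strings first, and R may write large doubles in scientific notation, so counting characters on a numeric column can misjudge some values. Formatting the number explicitly and matching exactly five digits avoids relying on that conversion. The helper name `is_five_digits` is hypothetical.

```r
# Hypothetical helper: TRUE where a numeric value is written with exactly five digits
is_five_digits <- function(x) {
  # "%.0f" writes the number in fixed notation with no decimal places
  grepl("^[0-9]{5}$", sprintf("%.0f", x))
}

# Same sample data as in the question, built directly as a data frame
address <- data.frame(
  V1 = c("1111 Spam Street", "12 Foo Bar", "666 Dead End"),
  V2 = c(95524, 94118, 9021)
)

newaddress <- address[is_five_digits(address$V2), ]
print(newaddress)
```

Note that a numeric column cannot keep leading zeros, so if such zipcodes matter it may be safer to store the column as character from the start.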

EDIT after comment by @Matt:

Assuming the values in address$V2 are integer, you can also do the following:

address[address$V2 >= 10000 & address$V2 < 100000, ]
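If `V2` can contain `NA` (for example, when `as.numeric()` could not parse a value), indexing with a logical vector that contains `NA` produces rows filled with `NA`. A possible workaround, assuming such rows should simply be dropped, is to wrap the condition in `which()`. The fourth row below is made up for illustration.

```r
address <- data.frame(
  V1 = c("1111 Spam Street", "12 Foo Bar", "666 Dead End", "No Zip Lane"),
  V2 = c(95524, 94118, 9021, NA)  # the NA row is a made-up example
)

# which() returns only the positions where the condition is TRUE, skipping NA
keep <- which(address$V2 >= 10000 & address$V2 < 100000)
newaddress <- address[keep, ]
print(newaddress)
```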

Other suggestions

Using dplyr:

library(dplyr)

# %.% was dplyr's early chaining operator and has since been removed; use %>% instead
address %>% filter(nchar(V2) == 5)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow