Use colClasses in your read.csv call to read them in as character or factor: read.csv(*, colClasses="factor").
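A minimal sketch of the colClasses approach, assuming a one-column file. The inline csv_text and the column name zip are made up for illustration and stand in for the real file:

```r
# Hypothetical CSV content with a single column named "zip"
csv_text <- "zip\n00601\n00602\n00607\n"

# Default behaviour: the column is converted to a number, dropping the zeros
default_read <- read.csv(text = csv_text)
class(default_read$zip)

# With colClasses the values are kept exactly as written in the file
factor_read <- read.csv(text = csv_text, colClasses = "factor")
levels(factor_read$zip)
```

A single unnamed colClasses value is recycled across all columns, so for a file with mixed columns you would probably pass one class per column instead.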
How to read numeric values as factors in R?
Question
I have a data frame A which has a numeric column like:
zip code
00601
00602
00607
and so on.
If I read this into R using read.csv, the values are read as numeric. I want them as factors.
I tried converting them back to factor using
A <- as.factor(A)
But this removes the leading zeroes and makes A look like
zip code
601
602
607
I do not want this. I want to keep the leading zeroes.
Solution
OTHER TIPS
You may need to add leading zeros, as in this post. This first converts the column to character class. You can then change it to a factor, which keeps the leading zeros.
Example
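A <- data.frame("zip code"=c(00601,00602,00607))  # numeric literals, so the leading zeros are already gone
class(A$zip.code)  # numeric
A$zip.code <- sprintf("%05d", A$zip.code)  # pad back to 5 digits; the result is character
class(A$zip.code)  # character
A$zip.code <- as.factor(A$zip.code)
class(A$zip.code)  # factor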
Resulting in:
> A$zip.code
[1] 00601 00602 00607
Levels: 00601 00602 00607
Writing A as a .csv file
write.csv(A, "tmp.csv")
results in
"","zip.code"
"1","00601"
"2","00602"
"3","00607"
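One thing worth checking when reading such a file back: as far as I know, read.csv strips the quotes first and only then guesses the column type, so the quoted values can still come back as numbers unless colClasses is given. A sketch of the round trip, assuming a temporary file and row.names = FALSE (both my choices, not part of the answer above):

```r
# Rebuild the factor column from the example above
A <- data.frame(zip.code = factor(c("00601", "00602", "00607")))

# Write to a temporary file so nothing is left in the working directory
tmp <- tempfile(fileext = ".csv")
write.csv(A, tmp, row.names = FALSE)

# Plain read: the type is guessed after the quotes are stripped
plain <- read.csv(tmp)
class(plain$zip.code)

# Read with colClasses: values are kept as written, zeros included
kept <- read.csv(tmp, colClasses = "factor")
levels(kept$zip.code)

unlink(tmp)
```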
Everything without a text qualifier is (attempted to be) read as numeric, so the issue is basically knowing how your data (in this case 00607) is stored in the flat text file. If it is stored without a text qualifier, you can either follow the suggestion of @Hong Ooi or use
read.csv(*, colClasses="character")
and then convert each column accordingly (in case you don't want/need all of them to be factors). Once you have a character vector (a data.frame column), converting it to a factor is straightforward:
> zipCode <- c("00601", "00602", "00607")
> factor(zipCode)
[1] 00601 00602 00607
Levels: 00601 00602 00607
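Putting the two steps together, here is a rough sketch of reading everything as character and then converting column by column. The column names zip and count and the values in csv_text are invented for the example:

```r
# Hypothetical two-column file: one code column, one real number column
csv_text <- "zip,count\n00601,10\n00602,20\n00607,30\n"

# Read every column as character so nothing is converted behind your back
raw <- read.csv(text = csv_text, colClasses = "character")

# Convert each column to the type it should actually have
raw$zip <- factor(raw$zip)          # codes: keep the leading zeros
raw$count <- as.integer(raw$count)  # counts: real numbers

str(raw)
```

This costs one extra line per column, but every conversion is explicit, which helps when only some columns should become factors.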