Domanda

I want to import a table (.txt file) in R with read.table(). One column in my table is an ID with nine numerals - some ids begin with a 0, other with 1 or 2.

R truncates the first 0 (012345678 becomes 12345678) which leads to problems when using this ID to merge another table.

Can someone give me a hint how to solve the problem?

È stato utile?

Soluzione

As said in Ben's answer, colClasses is the easier way to do it. Here is an example:

read.table(text = 'col1 col2
           0012 0001245',
           head=T,
           colClasses=c('character','numeric'))

  col1 col2
1 0012 1245      ## col1 keep 00 but not col2

Altri suggerimenti

A reproducible example would be nice, but: use the colClasses argument to read.table() to specify that you want this column to be read as a character variable, not numeric. Or make them back into character variables after reading them in, using sprintf to pad the numbers with leading zeros. (The former is probably easier.)

Here is a for loop to add leading zeros to rows based on a condition. Although this is a post-hoc solution (adding leading 0's after reading the table), it worked for me so thought I'd share:

Let's take the example of a column of zip codes. All values should contain 5 digits (e.g. 01234), but R removes leading zeros (so '01234' becomes '1234'). You can add a trailing zero to all cells that contain only 4 characters with this code:

for (i in 1:nrow(df)){
  if(nchar(df$zipCode[i])<5){
    df$zipCode[i]<- paste0('0',df$zipCode[i])
  }
}
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top