Question

I created a file called test2.txt with the following information:

col1 col2 col3 col4
1    A    B 
2    A    B 
3    A    B 
4    A    B 
5    A    B 
6    A    B 
7    A         C
8    A         C

When reading with the following command:

test.ws=read.table(paste(inputDir,'test2.txt',sep=''),fill=T,header=T)

I get the following:

  col1 col2 col3 col4
1    1    A    B   NA
2    2    A    B   NA
3    3    A    B   NA
4    4    A    B   NA
5    5    A    B   NA
6    6    A    B   NA
7    7    A    C   NA
8    8    A    C   NA

The columns are shifted to the left. What gives?!

I tried the following:

> count.fields(paste(inputDir,'test.txt',sep=''))
[1] 4 3 3 3 3 3 3 4 4

And it's telling me that the number of tabs is different, but it isn't! What am I to do with this information? It's worth a mention that, when importing this .txt file into Excel, it reads the tabs correctly and doesn't skip or shift any columns.

I tried to do this assigning column names separately, but that didn't work:

colNames=names(test.ws)
test.ws=read.table(paste(inputDir,'test2.txt',sep=''),skip=1,fill=T,header=T,col.names=colNames)

Yields:

Warning message:
In read.table(paste(inputDir, "test2.txt", sep = ""), skip = 1,  :
  header and 'col.names' are of different lengths

I found a similar issue online: https://stat.ethz.ch/pipermail/r-help/2008-July/166676.html. That question wasn't answered.

Solution

If the data are tab-separated, set the separator explicitly: sep="\t". Otherwise the default applies (from the help on read.table):

If sep = "" (the default for read.table) the separator is ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.

and so multiple consecutive tabs are being treated as a single delimiter. That is why the empty col3 field in rows 7 and 8 disappears and C slides left into col3.
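
To see the collapsing directly, count.fields can be run twice on the same lines, once with the default separator and once with sep = "\t". This is only a sketch: it assumes the file really is tab-delimited with an empty col3 in rows 7 and 8, and `lines` is a hypothetical in-memory stand-in for test2.txt.

```r
# Hypothetical stand-in for test2.txt, built with real tab characters.
lines <- c(
  "col1\tcol2\tcol3\tcol4",
  paste(1:6, "A", "B", sep = "\t"),      # rows 1-6: three fields
  paste(7:8, "A", "", "C", sep = "\t")   # rows 7-8: empty col3, then col4
)

# Default sep = "": any run of whitespace is one delimiter,
# so the two adjacent tabs in rows 7-8 count as a single boundary.
n_default <- count.fields(textConnection(lines))

# sep = "\t": every tab is a boundary, so the empty field is counted.
n_tab <- count.fields(textConnection(lines), sep = "\t")

print(n_default)
print(n_tab)
```

Under these assumptions, only the second call reports four fields for rows 7 and 8.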

Alternatively, use read.delim instead of read.table; its defaults (sep="\t", header=TRUE, fill=TRUE) are better suited to tab-separated data.
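
A minimal sketch comparing the three calls, again assuming the file is genuinely tab-delimited. The `lines` vector and the names `squashed`, `tabbed` and `delimited` are made up for illustration; with a real file, pass the path as the first argument instead of text =.

```r
# Hypothetical stand-in for test2.txt, built with real tab characters.
lines <- c(
  "col1\tcol2\tcol3\tcol4",
  paste(1:6, "A", "B", sep = "\t"),
  paste(7:8, "A", "", "C", sep = "\t")
)

# Default sep = "": whitespace runs collapse, so "C" lands in col3.
squashed <- read.table(text = lines, header = TRUE, fill = TRUE)

# Explicit sep = "\t": the empty col3 is kept and "C" stays in col4.
tabbed <- read.table(text = lines, header = TRUE, fill = TRUE, sep = "\t")

# read.delim already defaults to sep = "\t", header = TRUE, fill = TRUE.
delimited <- read.delim(text = lines)

print(squashed)
print(tabbed)
```

Adding na.strings = c("", "NA") to either of the last two calls should turn the empty fields into NA rather than empty strings, if that is preferred.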

OTHER TIPS

Maybe you have fixed-width columns? If so, read.fwf can split each line by character position:

read.fwf(textConnection("col1 col2 col3 col4
1    A    B 
2    A    B 
3    A    B 
4    A    B 
5    A    B 
6    A    B 
7    A         C
8    A         C"),widths = rep(5,4))

     V1    V2    V3   V4
1 col1  col2  col3  col4
2 1     A        B  <NA>
3 2     A        B  <NA>
4 3     A        B  <NA>
5 4     A        B  <NA>
6 5     A        B  <NA>
7 6     A        B  <NA>
8 7     A              C
9 8     A              C
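
In the output above the header line is read as a data row. If the file does turn out to be fixed width, one option is to skip that line and supply the names separately. This is a sketch under the same assumption of 5-character columns; `txt` and `fixed` are illustrative names, and strip.white = TRUE is added here to trim the padding.

```r
# Hypothetical fixed-width stand-in for the file: 5 characters per column.
txt <- c(
  "col1 col2 col3 col4",
  sprintf("%d    A    B ", 1:6),
  sprintf("%d    A         C", 7:8)
)

fixed <- read.fwf(
  textConnection(txt),
  widths = rep(5, 4),
  skip = 1,                                       # drop the header line
  col.names = c("col1", "col2", "col3", "col4"),  # name the columns by hand
  strip.white = TRUE                              # trim padding spaces
)

print(fixed)
```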