read.delim() - errors "more columns than column names" and "header and 'col.names' are of different lengths"

StackOverflow https://stackoverflow.com/questions/7284146

  •  19-01-2021

Preliminary information. OS: Windows XP Professional Version 2002 Service Pack 3; R version: R 2.12.2 (2011-02-25)

I am attempting to read a 30,000 row by 80 column, tab-delimited text file into R using the read.delim() function. This file does have column headers, with the following naming convention: "_". The code that I use to attempt to read the data is:

cc <- c("integer", "character", "integer", rep("character", 3), 
        rep("integer", 73))

example_data <- read.delim(file = 'C:/example.txt', row.names = FALSE,
                           col.names = TRUE, as.is = TRUE, colClasses = cc)

After I submit this command, I receive the following error message:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
more columns than column names
In addition: Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote,  :
  header and 'col.names' are of different lengths

Information that may be important - from column 8 until column 80 the count of zeros in each column is as follows:

column 08: 29,000 zeros
column 13: 15,000 zeros
column 19: 500 zeros
column 43: 15,000 zeros
columns 65-80: 29,000 zeros for each column

Can anyone help identify reasons that I am receiving the above error messages? Any help will be greatly appreciated.


Solution

The cause of the problem is your use of the col.names=TRUE argument. col.names is meant for supplying the column names of the resulting data frame yourself, so it must be a vector with one name per column in the input. A single TRUE is a vector of length 1, which is why R reports more columns than column names.
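A minimal sketch of the failure, assuming a hypothetical three-column file written to a temporary path (the file contents and column names are made up for illustration and stand in for C:/example.txt):

```r
# Hypothetical stand-in for the real file: a header line plus two data rows
tmp <- tempfile(fileext = ".txt")
writeLines(c("id_a\tname_b\tcount_c", "1\tx\t10", "2\ty\t20"), tmp)

# col.names = TRUE is a vector of length 1, so read.table has one name
# for three columns and stops; the error handler captures the message
result <- tryCatch(
  read.delim(tmp, col.names = TRUE),
  error = function(e) conditionMessage(e)
)

# In an English locale this is the "more columns than column names"
# message from the question
print(result)
```

The accompanying warning about header and 'col.names' having different lengths comes from the same mismatch: the header line has three fields, while col.names has one element.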

If you want read.delim to take column names from the file, drop col.names and use header=TRUE (which is already the default for read.delim). You may also wish to reconsider row.names=FALSE, as again this argument is intended as a specification of the row names rather than an instruction about whether to read them from the file.

More information is available on the help page for read.delim.
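As a rough illustration of the corrected call, here is a sketch on a small hypothetical file; the three columns, their names, and the colClasses vector are assumptions for the example and would need to match the real 80-column file:

```r
# Hypothetical three-column, tab-delimited file
tmp <- tempfile(fileext = ".txt")
writeLines(c("id_a\tname_b\tcount_c", "1\tx\t10", "2\ty\t20"), tmp)

# One class per column in the file
cc <- c("integer", "character", "integer")

# Option 1: take the names from the file's first line.
# row.names is left out, so rows are numbered automatically.
from_header <- read.delim(tmp, header = TRUE, as.is = TRUE, colClasses = cc)

# Option 2: supply your own names, one per column.
# header = TRUE still consumes the file's header line.
custom <- read.delim(tmp, header = TRUE, as.is = TRUE, colClasses = cc,
                     col.names = c("id", "name", "count"))

str(from_header)
str(custom)
```

If row numbering needs to be forced explicitly, row.names = NULL does that; row.names = FALSE is not one of the documented forms of the argument.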

Other tips

I also had this error recently, and it disappeared after I converted the file to comma- or semicolon-delimited format and read it with read.csv / read.csv2. I know this is not a fulfilling answer, but you might check that out.

If you want to read the data as a character matrix, first convert your file to .csv format and use read.csv. Don't pass any argument other than the file name, e.g.:

read.csv("filepath")
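Note that a bare read.csv call returns a data frame with automatic type conversion, not a character matrix. If a character matrix is really the goal, one possible sketch (on a hypothetical CSV file, with colClasses and as.matrix added as assumptions beyond the tip above) is:

```r
# Hypothetical CSV version of the data
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id_a,name_b,count_c", "1,x,10", "2,y,20"), tmp_csv)

# Read every column as character, then convert the data frame to a matrix
df <- read.csv(tmp_csv, colClasses = "character")
m <- as.matrix(df)

str(m)
```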
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow