Option 1: I have limited experience with fixed-width (fwf) data in "real life" situations, but for large CSV files I have found the count.fields
function very helpful. Try this:
table(cnts <-
count.fields(paste0(path, filename), sep=",", quote="", comment.char=""))
Then you can search cnts for the line numbers with outlier values. For instance, if you noticed that there were only 10-20 lines with a field count of 47 while the rest had 48, you might print out those locations:
which(cnts == 47)
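Putting Option 1 together, here is a minimal sketch on a throwaway file (the temp file and the expected field count of 3 are invented for illustration):

# Build a small CSV where line 3 is missing a field
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "d,e,f", "g,h", "i,j,k"), tmp)
# Count fields per line, tabulate, then locate the short line
cnts <- count.fields(tmp, sep = ",", quote = "", comment.char = "")
table(cnts)        # 2 fields: 1 line; 3 fields: 3 lines
which(cnts != 3)   # returns 3, the offending line number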
Option 2: I'm pretty sure I have seen solutions to this using sed and grep at a system level for counting field separators. I cobbled this together from some *NIX forums, and it gives me a table of field counts in a well-structured four-line file:
fct <- table(system("awk -F ',' '{print NF}' A.csv", intern=TRUE))
fct
# 3    (fields per line)
# 4    (number of lines with that count)
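That table tells you how many lines have each field count, but not where they are. If you also need locations, awk's NR can print the offending line numbers directly, still without reading the data into R (a sketch, assuming here that well-formed lines have 3 fields, matching the table above):

# Line numbers of any rows whose field count is not 3
bad <- system("awk -F ',' 'NF != 3 {print NR}' A.csv", intern = TRUE)
bad   # character(0) here, since A.csv is well structured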
And it took 6 seconds to count the fields in a 1.2 million-record dataset, and none of the data were brought into R:
system.time( fct <- table(system("awk -F ',' '{print NF}' All.csv", intern=TRUE)) )
#    user  system elapsed
#   6.597   0.215   6.552
You can get the count of lines with:
sum(fct)
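This works because each line contributes exactly one entry to fct. For an independent cross-check (assuming a Unix-like system, as the awk calls above already do), wc reports the same number without bringing anything into R:

# Line count straight from the shell
as.integer(system("wc -l < All.csv", intern = TRUE))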