You can use read.table, but you should use count.fields or some kind of regex to figure out the correct number of columns first. Using Robert's "text" sample data:
Cols <- max(sapply(gregexpr("+", text, fixed = TRUE), length))+1
## Cols <- max(count.fields(textConnection(text), sep = "+"))
read.table(text = text, comment.char="", header = FALSE,
col.names=paste0("V", sequence(Cols)),
fill = TRUE, sep = "+")
# V1 V2 V3
# 1 paragemcard-resp insufcardioresp
# 2 dpco pneumonia
# 3 posopperfulceragastrica ards
# 4 pos op hematoma #rim direito expontanea
# 5 miopatiaduchenne-erb insuf.resp
# 6 dpco dhca #femur
# 7 posde#subtroncantéricaesqª complicepidural
# 8 dpco asma
Also, possibly useful: the "stringi" library makes counting elements easy (as an alternative to the gregexpr step above).
library(stringi)
Cols <- max(stri_count_fixed(text, "+") + 1)
Why the need for the "Cols" step? read.table and family decide how many columns to use either by (1) the maximum number of fields detected within the first 5 rows of data, or (2) the length of the col.names argument. In your example, the row with the most fields is the sixth row, so directly using read.csv or read.table would result in incorrectly wrapped data.
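To see that wrapping in action, here is a minimal, self-contained sketch (with made-up data, not Robert's sample) where the widest row comes after the fifth line:

```r
## Hypothetical data: five 2-field rows, then one 4-field row.
## read.table inspects only the first 5 lines to pick the column
## count, so it settles on 2 columns here.
demo <- c("a+b", "a+b", "a+b", "a+b", "a+b", "a+b+c+d")

## With fill = TRUE, the extra fields of the sixth row do not get
## their own columns; they wrap onto an additional row instead.
res <- read.table(text = demo, sep = "+", fill = TRUE, header = FALSE)
res
dim(res)   # 7 rows x 2 columns: "c" and "d" were wrapped to row 7
```

Supplying col.names of the right length (the "Cols" step above) forces the full column count and avoids this.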