How to split unequal columns in R

https://stackoverflow.com//questions/9638131

10-12-2019

Question

I have a data set that should contain 14 columns, but when I read it into R it comes in as two columns: everything after the first column is read in as a single column, with the values run together and separated by "."

I read it in using:

dat <- read.table ("/data/GER.female.RAWMACH", header = F, sep = "\t")

Below I have provided the output:

head (dat)

V1     V2
TRAIT  MARKER..........ALLELES..FREQ1....RSQR...EFFECT1..OR......STDERR..WALDCHISQ.PVALUE.....LRCHISQ.LRPVAL.NCASES.NCONTROLS
CASE   rs7 T A .9104 .0001 -3.944 0.019 19.634 0.0403 0.8408 0.0403 0.8409 260 446
CASE   rs6 A C .9114 .0002 -2.552 0.078 14.349 0.0316 0.8589 0.0316 0.8589 260 446
CASE   rs9 C T .8444 .0001 2.772 15.985 15.076 0.0338 0.8541 0.0338 0.8542 260 446
CASE   rs5 G A .9164 .0001 -3.683 0.025 18.039 0.0417 0.8382 0.0417 0.8383 260 446
CASE   rs2 T C .5168 .0001 -2.466 0.085 10.811 0.0520 0.8195 0.0520 0.8196 260 446
CASE   rs1 T G .8229 .0002 -1.727 0.178 12.241 0.0199 0.8878 0.0199 0.8878 260 446

I have tried a few things (rewriting the table, colsplit) with no success. What am I missing?

I appreciate any suggestions you may have!

Solution

You thought you had a tab-separated file, but it isn't one. You also DO have a header row. Drop sep = "\t" so that read.table falls back to its default whitespace separator, and set header = TRUE.
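To see the difference, here is a minimal sketch on a made-up miniature of the file. The text in txt is hypothetical (four columns instead of 14, with one tab after the first field and runs of spaces elsewhere), chosen only to reproduce the symptom; the real file may be laid out differently.

```r
# Hypothetical sample: a tab after the first field, spaces between the rest.
txt <- "TRAIT\tMARKER   FREQ1  RSQR\nCASE\trs7      .9104  .0001\nCASE\trs6      .9114  .0002\n"

# sep = "\t" splits only at the tab, so the remaining fields stay glued
# together in one column, and the header line is treated as data.
bad <- read.table(text = txt, header = FALSE, sep = "\t")
dim(bad)   # 3 rows, 2 columns

# The default sep = "" splits on any run of whitespace (spaces or tabs),
# and header = TRUE takes the column names from the first line.
good <- read.table(text = txt, header = TRUE)
dim(good)  # 2 rows, 4 columns
str(good)
```

For the file in the question, the equivalent call would be dat <- read.table("/data/GER.female.RAWMACH", header = TRUE).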

OTHER TIPS

It's hard to say for sure without more information, but the best fix is almost certainly to load the table correctly in the first place. Unless the file really is structured the way R is showing it, the import call is wrong. Look at the documentation for read.table and related functions, in particular the sep and header arguments. Getting those right should clear up the import without any after-the-fact cleanup.
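One way to check how a file is actually delimited before choosing sep is to look at the raw lines and count fields under each candidate separator. The sketch below writes a small hypothetical sample to a temporary file (the contents are made up for illustration) and uses base R's readLines and count.fields.

```r
# Write a hypothetical two-line sample to a temporary file.
path <- tempfile()
writeLines(c("TRAIT\tMARKER   FREQ1  RSQR",
             "CASE\trs7      .9104  .0001"), path)

# Inspect the raw text; tabs show up as "\t" in the printed strings.
readLines(path, n = 2)

# Number of fields per line when splitting on tabs only.
n_tab <- count.fields(path, sep = "\t")
n_tab  # 2 2

# Number of fields per line when splitting on any whitespace (the default).
n_ws <- count.fields(path)
n_ws   # 4 4
```

If the whitespace count matches the number of columns you expect and the tab count does not, the file is not purely tab-separated.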

Licensed under: CC-BY-SA with attribution
