Reading a text file with the right colClasses

Question

This is a typical data cleaning problem. In my experience, data preparation consumes about 80% of the project time in a typical analytical task.

Given your data sample, try the following:

Use read.csv() with the argument quote="". This disables quote processing, so every quote mark is read as an ordinary character and the rows are split on commas only. You will have to remove the quote marks afterwards.
Use a regular expression to remove any garbage characters in the numeric columns (e.g. " or _) and then coerce the result to numeric.
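
As a small illustration of the second step, here is a sketch of what such a substitution does to a few raw values taken from the sample. The vector name `raw` is only for illustration. Note that the character class also removes minus signs, so this approach is only safe if the columns cannot contain genuine negative numbers.

```r
# Raw values as they might look after reading with quote = ""
raw <- c("\"2", "-", "-0", "_78", "1755\"")

# Strip hyphens, underscores and double quotes, then convert to numeric
cleaned <- as.numeric(gsub("[-_\"]", "", raw))

# A lone "-" becomes an empty string, which as.numeric() turns into NA
cleaned
```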

First, recreate your sample as a string:

data <- "
WK,MND,CS,SHP,RevCY,RevLY,TCY,TLY,ACY,ALY
\"2,JAN,GER,\"\"Victoria's Secrets\"\",29307,25419,841,768,2320,1755\"
2,JAN,KAP,Brand Shop,2027,-,95,0,175,-0
2,JAN,KAP,Kapp‚ Drugstore West,89768,78824,3309,3052,6197,5634
2,JAN,KAP,Kapp‚ P&C Centraal,680019,640951,8709,8116,19450,18385
2,JAN,KAP,Kapp‚ Sunglasses Centraal,49216,43940,464,421,550,478
2,JAN,KAP,Kapp‚ Sunglasses Schengen,25721,26592,306,318,333,378
2,JAN,KAP,Kapp‚ Sunglasses West,50280,53089,477,510,566,_78
"

Now read the data:

x <- read.csv(text=data, quote="", header=TRUE)
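
Since stray characters can turn up in any column, one option is to read every column as character first and convert afterwards. The following is a minimal sketch, assuming a shortened two-row version of the sample (named `sample_txt` here purely for illustration). It relies on the standard `colClasses` argument of read.csv().

```r
# Shortened, illustrative copy of the sample data
sample_txt <- "WK,MND,CS,SHP,RevCY,RevLY,TCY,TLY,ACY,ALY
\"2,JAN,GER,\"\"Victoria's Secrets\"\",29307,25419,841,768,2320,1755\"
2,JAN,KAP,Brand Shop,2027,-,95,0,175,-0"

# quote = "" keeps quote marks as literal characters;
# colClasses = "character" stops R from guessing the column types
y <- read.csv(text = sample_txt, quote = "", header = TRUE,
              colClasses = "character")

# Inspect the raw strings before cleaning
str(y)
```

Reading everything as character means the later gsub() step always sees plain strings, whatever the default for string columns is in your version of R.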

Start the cleaning process:

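# Columns that should be numeric: WK and RevCY through ALY
numericCols <- c(1, 5:10)
# Strip hyphens, underscores and quotes, then convert (this also drops minus signs)
x[numericCols] <- lapply(x[numericCols], function(col) as.numeric(gsub("[-_\"]", "", col)))
x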

The result:

  WK MND  CS                       SHP  RevCY  RevLY  TCY  TLY   ACY   ALY
1  2 JAN GER    ""Victoria's Secrets""  29307  25419  841  768  2320  1755
2  2 JAN KAP                Brand Shop   2027     NA   95    0   175     0
3  2 JAN KAP      Kapp‚ Drugstore West  89768  78824 3309 3052  6197  5634
4  2 JAN KAP        Kapp‚ P&C Centraal 680019 640951 8709 8116 19450 18385
5  2 JAN KAP Kapp‚ Sunglasses Centraal  49216  43940  464  421   550   478
6  2 JAN KAP Kapp‚ Sunglasses Schengen  25721  26592  306  318   333   378
7  2 JAN KAP     Kapp‚ Sunglasses West  50280  53089  477  510   566    78
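
The SHP column still carries the doubled quote marks. A possible follow-up, sketched here on a two-row data frame that mimics part of the result above (the name `res` and its contents are illustrative only), is to strip them with another gsub() and then confirm the column types:

```r
# Illustrative stand-in for two columns of the cleaned data
res <- data.frame(
  SHP   = c("\"\"Victoria's Secrets\"\"", "Brand Shop"),
  RevCY = c(29307, 2027),
  stringsAsFactors = FALSE
)

# Remove every literal double quote from the text column
res$SHP <- gsub("\"", "", res$SHP, fixed = TRUE)

# Check that the text column is character and the figures are numeric
str(res)
```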