
I have multiple files in a folder, which I read with:

library(data.table)
files <- list.files("PATH", pattern = "\\.csv$", full.names = TRUE)
for (i in seq_along(files)) {  # seq_along() visits every file, not only the last one
    df <- fread(files[i], header = TRUE, sep = ";", stringsAsFactors = FALSE)
}

I am aware that I am overwriting the df object on every iteration. What I want to achieve is to have my data formatted like this example data:

> (str(StockPriceReturns))
'zoo' series from 2000-04-03 to 2013-03-28
  Data: num [1:3246, 1:30] NA NA NA NA NA NA NA NA NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:30] "Bajaj.Auto" "BHEL" "Bharti.Airtel" "Cipla" ...
  Index:  Date[1:3246], format: "2000-04-03" "2000-04-04" "2000-04-05" "2000-04-06" ...
> (head(StockPriceReturns))
           Bajaj.Auto       BHEL Bharti.Airtel     Cipla Coal.India   Dr.Reddy
2000-04-03         NA  4.9171044            NA  6.810041         NA -3.2541653
2000-04-04         NA -8.3348496            NA -3.368606         NA -8.3353739
2000-04-05         NA  0.3305788            NA  0.836825         NA  0.2616345
2000-04-06         NA -2.7605266            NA -2.466056         NA -1.8941289
2000-04-07         NA  3.2543548            NA  7.690426         NA  7.6961041
2000-04-10         NA  3.3107586            NA  6.154276         NA  6.4769648

At the moment, my data looks like this:

> (dput(head(df1,10)))
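structure(list(Name = c("C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT", 
"C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT", 
"C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT", 
"C-QUADRAT INVESTMENT", "C-QUADRAT INVESTMENT"), Date = c("01.01.2002", 
"02.01.2002", "03.01.2002", "04.01.2002", "07.01.2002", "08.01.2002", 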
"09.01.2002", "10.01.2002", "11.01.2002", "14.01.2002"), Price = c("na", 
"na", "na", "na", "na", "na", "na", "na", "na", "na"), Currency = c("E", 
"E", "E", "E", "E", "E", "E", "E", "E", "E"), CDax = c("-0,260460226", 
"-1,827437365", "-0,814370143", "0,861279951", "-0,339133689", 
"-1,034650372", "0,713336597", "0,52727784", "2,893518519", "0,05790388"
), `Total Price Returns` = c("na", "na", "na", "na", "na", "na", 
"na", "na", "na", "na"), AbnormalReturns = c("na", "na", "na", 
"na", "na", "na", "na", "na", "na", "na")), .Names = c("Name", 
"Date", "Price", "Currency", "CDax", "Total Price Returns", "AbnormalReturns"
), class = c("data.table", "data.frame"), row.names = c(NA, -10L
), .internal.selfref = <pointer: 0x00000000002a0788>)
                    Name       Date Price Currency         CDax
 1: C-QUADRAT INVESTMENT 01.01.2002    na        E -0,260460226
 2: C-QUADRAT INVESTMENT 02.01.2002    na        E -1,827437365
 3: C-QUADRAT INVESTMENT 03.01.2002    na        E -0,814370143
 4: C-QUADRAT INVESTMENT 04.01.2002    na        E  0,861279951
 5: C-QUADRAT INVESTMENT 07.01.2002    na        E -0,339133689
 6: C-QUADRAT INVESTMENT 08.01.2002    na        E -1,034650372
 7: C-QUADRAT INVESTMENT 09.01.2002    na        E  0,713336597
 8: C-QUADRAT INVESTMENT 10.01.2002    na        E   0,52727784
 9: C-QUADRAT INVESTMENT 11.01.2002    na        E  2,893518519
10: C-QUADRAT INVESTMENT 14.01.2002    na        E   0,05790388
    Total Price Returns AbnormalReturns
 1:                  na              na
 2:                  na              na
 3:                  na              na
 4:                  na              na
 5:                  na              na
 6:                  na              na
 7:                  na              na
 8:                  na              na
 9:                  na              na
10:                  na              na

I can do this for the univariate case, where I convert my data.frame to a zoo object (and then that object to an xts object) with this function:

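library(zoo)

dfToZoo <- function(df) {
    # look the date column up by name, so its position in the table does not matter
    date <- as.Date(df[["Date"]], format = '%d.%m.%Y')
    #TODO check that the value column is named correctly
    with(df, zoo(TotalReturns, date))
}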

However, how can I do this for the multivariate case?


One possibility is to read your data with read.zoo and reshape it with the split argument, which produces one column per value of the column you split on.

library(zoo)

# some data
df <- data.frame(Date = 20140101 + 0:2,
                 Name = rep(c("Bajaj.Auto", "BHEL", "Bharti.Airtel"), each = 3),
                 CDAX = rnorm(9))
#       Date          Name       CDAX
# 1 20140101    Bajaj.Auto  0.4020118
# 2 20140102    Bajaj.Auto -0.7317482
# 3 20140103    Bajaj.Auto  0.8303732
# 4 20140101          BHEL -1.2080828
# 5 20140102          BHEL -1.0479844
# 6 20140103          BHEL  1.4411577
# 7 20140101 Bharti.Airtel -1.0158475
# 8 20140102 Bharti.Airtel  0.4119747
# 9 20140103 Bharti.Airtel -0.3810761

# convert to zoo object. Use 'split' to reshape, and 'format' date if necessary.
z <- read.zoo(file = df, format = "%Y%m%d", split = "Name")

# ‘zoo’ series from 2014-01-01 to 2014-01-03
#   Data: num [1:3, 1:3] 0.402 -0.732 0.83 -1.016 0.412 ...
# - attr(*, "dimnames")=List of 2
#   ..$ : NULL
#   ..$ : chr [1:3] "Bajaj.Auto" "Bharti.Airtel" "BHEL"
#   Index:  Date[1:3], format: "2014-01-01" "2014-01-02" "2014-01-03"

#            Bajaj.Auto Bharti.Airtel      BHEL
# 2014-01-01  0.4020118    -1.0158475 -1.208083
# 2014-01-02 -0.7317482     0.4119747 -1.047984
# 2014-01-03  0.8303732    -0.3810761  1.441158 
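
The same idea can be applied to a folder of semicolon-separated files like the ones in the question. The following is only a sketch under a few assumptions: the two temporary files, the names FUND.A and FUND.B, and the shortened return values are made up for illustration, and base R's read.csv2 is used in place of fread because it already defaults to sep = ";" and dec = ",". Each file is read into a list element (so nothing is overwritten), the tables are stacked into one long data.frame, and read.zoo then splits it by Name.

```r
library(zoo)

# Hypothetical sample files standing in for the real folder ("PATH" in the question).
tmp_dir <- file.path(tempdir(), "returns_csv")
dir.create(tmp_dir, showWarnings = FALSE)
dates <- c("01.01.2002", "02.01.2002", "03.01.2002")
writeLines(c("Name;Date;Price;CDax",
             paste("FUND.A", dates, "na", c("-0,26", "-1,83", "-0,81"), sep = ";")),
           file.path(tmp_dir, "fund_a.csv"))
writeLines(c("Name;Date;Price;CDax",
             paste("FUND.B", dates, "na", c("0,86", "-0,34", "-1,03"), sep = ";")),
           file.path(tmp_dir, "fund_b.csv"))

# Read every file into its own list element instead of overwriting one object.
# read.csv2 expects ";" as separator and "," as decimal mark; "na" becomes NA.
files  <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv2, na.strings = "na", stringsAsFactors = FALSE)

# Stack the per-file tables into one long data.frame.
long <- do.call(rbind, tables)

# Keep the index column first, then split the value column by Name.
z <- read.zoo(long[, c("Date", "Name", "CDax")],
              format = "%d.%m.%Y", split = "Name")
```

If an xts object is needed afterwards, xts::as.xts(z) should convert the result, assuming the xts package is installed.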
Licensed under: CC-BY-SA with attribution