Question

I wish to perform date manipulation on an imported csv file whose first column is a date column in the format dd/mm/yyyy. I wish to use either R or Octave for this because, after the date manipulation, I will need to do various matrix/vector operations on the rest of the data, dependent upon these dates.

The dates in the imported csv file will NOT include weekends, and there will invariably be some other missing dates as well. What I wish to do is check the file and insert all these missing dates plus weekends, so that the date column is completely contiguous from beginning to end, with "dummy" empty values written in the resultant matrix for the inserted dates. Which of R or Octave should I use for ease of doing this? I know that using Octave to do this would be very tricky, but I don't know about R. Ultimately all dates and data will be written to another named text file for subsequent plotting in Gnuplot.

Additionally, if someone could give hints as to which date functions I would need to use, how to approach this problem etc. that would be great.


Solution

It sounds as though you are dealing with financial data. The R packages zoo, xts, and quantmod should probably be reviewed because they offer worked solutions to common data processing tasks in this area. There are other packages that define financial calendars. There is an R-SIG mailing list devoted to this topic as well. Even if you are dealing with some other real-world scenario that has data restricted to non-holiday weekdays, you are still going to find useful functionality in those packages for the task you (rather vaguely) have outlined.

Doing a search on SO for "[r] finance calendar" brings up this potentially relevant hit as well as several others.

OTHER TIPS

You can manipulate dates in either, so it mostly boils down to personal preference for the language.

It's been a while since I used Octave, but I use R and MATLAB regularly, and of the two I personally prefer R for data manipulation (and data munging tasks generally). If you choose R, the lubridate package is a good place to start.

I have never used Octave, but I use R for data manipulation, particularly csv files with a date in the first column, and so far I am happy with it. For working with dates I suggest the strptime function. After you load your csv file into a data frame, convert the date strings to dates. This is an example:

# if Date is in the first column; the format matches the dd/mm/yyyy dates in the question
df$Date<-strptime(as.character(df[,"Date"]),tz="CET",format="%d/%m/%Y")

You can then extract the day, the month and the year using

year<-format(df$Date,"%Y")
month<-format(df$Date,"%m")
day<-format(df$Date,"%d")

There are many more, depending on your problem. I just tried to give you a starting point. Good luck!
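
For date-only values like those in the question, as.Date may be simpler than strptime. The following is a minimal base-R sketch, not a definitive solution, that lists the calendar days missing from a date column. The sample vector is made up for illustration; a real column would come from read.csv.

```r
# Hypothetical sample of dd/mm/yyyy strings (illustrative values only).
raw <- c("26/01/2011", "28/01/2011", "31/01/2011")
d <- as.Date(raw, format = "%d/%m/%Y")

# Every calendar day from the first to the last observed date.
all_days <- seq(min(d), max(d), by = "day")

# Days that do not appear in the original column (weekends and other gaps).
missing_days <- all_days[!(all_days %in% d)]
print(format(missing_days, "%d/%m/%Y"))
```

Working with Date objects rather than strings means that differences, sequences and comparisons all behave as calendar arithmetic.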

Assuming that the data looks like:

date,attr1,attr2,attr3
"23/01/2011",1,2,3
"24/01/2011",4,5,6
"25/01/2011",7,8,9
"26/01/2011",10,11,12
"28/01/2011",13,45,55
"31/01/2011",2,2,2

Then you can try the following:

data<-read.csv("yourfile.csv",stringsAsFactors=FALSE)
# It is not easy to insert new rows in a data frame, so split the dates from the data.
# %Y is the four-digit year; %y would only read two digits.
dates<-as.Date(data[[1]],format="%d/%m/%Y")
data<-as.matrix(data[,2:ncol(data)])
rows<-nrow(data)
for(i in 1:(rows-1)){
  gap<-as.integer(dates[i+1]-dates[i])
  if (gap>1){
    for (j in 1:(gap-1)){
      # append each missing date with a dummy row of -1 values
      dates<-c(dates,dates[i]+j)
      data<-rbind(data,rep(-1,ncol(data)))
    }
  }
}
# The new rows were appended at the end, so restore chronological order.
ord<-order(dates)
dates<-format(dates[ord],"%d/%m/%Y")
data<-data[ord,,drop=FALSE]
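
As an alternative to the loop, the same gap filling can be sketched with seq and merge in base R. This is only a sketch under stated assumptions: the sample rows are inlined from the example data so that it runs on its own, inserted days get NA rather than -1, and the output file name is hypothetical (a temporary file here).

```r
# Sample data from the example, inlined so the sketch is self-contained.
csv_text <- 'date,attr1,attr2,attr3
"23/01/2011",1,2,3
"24/01/2011",4,5,6
"25/01/2011",7,8,9
"26/01/2011",10,11,12
"28/01/2011",13,45,55
"31/01/2011",2,2,2'
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
df$date <- as.Date(df$date, format = "%d/%m/%Y")

# One row per calendar day between the first and last observed dates.
full <- data.frame(date = seq(min(df$date), max(df$date), by = "day"))

# Left join: days absent from the file get NA in every value column.
filled <- merge(full, df, by = "date", all.x = TRUE)

# Write the dates back as dd/mm/yyyy; the output path is an assumption.
filled$date <- format(filled$date, "%d/%m/%Y")
out_file <- tempfile(fileext = ".txt")
write.table(filled, out_file, sep = ",", quote = FALSE, row.names = FALSE, na = "NA")
```

Using NA as the dummy value keeps the inserted rows distinguishable from real data. In Gnuplot, a line such as set datafile missing "NA" should let the plot skip those entries.
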
Licensed under: CC-BY-SA with attribution