Question

I am trying to figure out how endpoints selects the times when my data is only weakly regular: that is, some observations are missing. I have 1 minute returns with some minutes missing and I am trying to convert to 5 minute intervals. How will endpoints decide which times to keep? The call I use is:

endpoints(ret_1_min_xts, k=5, "minutes")

My series looks like this, for example:

1986-02-04 09:32:00 1
1986-02-04 09:33:00 2
1986-02-04 09:34:00 3
1986-02-04 09:35:00 4
1986-02-04 09:36:00 5
1986-02-04 09:37:00 6
1986-02-04 09:38:00 7
1986-02-04 09:39:00 8
1986-02-04 09:40:00 9
1986-02-04 09:41:00 10
1986-02-04 09:42:00 11
1986-02-04 09:45:00 12
...

with the call to endpoints returning:

1986-02-04 09:34:00
1986-02-04 09:39:00 
1986-02-04 09:42:00 
1986-02-04 09:49:00
1986-02-04 09:54:00
...

I tried to read the source code of endpoints, but the function appears to be written in C and called via .Call; am I understanding that correctly? If someone could explain the methodology used, that would be very helpful.


Solution

As answered in the comments above, and taken directly from the endpoints.c source code, the function returns this:

c(0,which(diff(_x%/%on%/%k+1) != 0),NROW(_x))

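where _x = .index(my_xts) (the timestamps in seconds since the epoch), on is the length of one period in seconds (60 for "minutes"), and k is the number of periods per interval.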

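What does this actually do? With respect to my call to endpoints:

The expression inside diff first strips the seconds (integer division by on turns each timestamp into whole minutes) and then groups the minutes into k-minute buckets (a second integer division by k). The buckets are aligned to the clock, not to the first observation, so with k=5 they run 09:30 to 09:34, 09:35 to 09:39, and so on. diff is non-zero wherever two consecutive observations fall into different buckets, and which returns the row positions where that happens. In effect, this returns the index of the last observation present in each 5-minute interval (k=5 in my call), which is why 09:42 is selected for the 09:40 to 09:44 interval when 09:43 and 09:44 are missing.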
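To make the arithmetic concrete, here is a minimal sketch in base R that applies the same expression to the twelve timestamps shown in the question. It does not call xts; the names times, x, bucket and ep are illustrative only, and the timestamps are assumed to be in UTC so that the result is reproducible.

```r
# Timestamps from the question, assumed to be UTC for this illustration.
times <- as.POSIXct(
  c("1986-02-04 09:32:00", "1986-02-04 09:33:00", "1986-02-04 09:34:00",
    "1986-02-04 09:35:00", "1986-02-04 09:36:00", "1986-02-04 09:37:00",
    "1986-02-04 09:38:00", "1986-02-04 09:39:00", "1986-02-04 09:40:00",
    "1986-02-04 09:41:00", "1986-02-04 09:42:00", "1986-02-04 09:45:00"),
  tz = "UTC")

x  <- as.numeric(times)  # seconds since the epoch, standing in for .index(my_xts)
on <- 60                 # seconds per minute
k  <- 5                  # minutes per bucket

# Bucket id for every row: whole minutes, then whole 5-minute blocks.
bucket <- x %/% on %/% k + 1

# Row positions where the next row starts a new bucket, padded with 0 and the row count.
ep <- c(0, which(diff(bucket) != 0), length(x))

ep                               # 0 3 8 11 12
format(times[ep[-1]], "%H:%M")   # "09:34" "09:39" "09:42" "09:45"
```

The final value is 09:45 here only because this sample is truncated at that row; in the full series the last observation of that bucket is 09:49, as in the output shown in the question.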

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow