Question

I’m trying to create a custom function that uses the maxdrawdown function from the tseries package. I want the function to operate on the correct range of values for each instrument, but even though this is probably a newbie question, I can’t see a solution.

Here’s what my dataframe looks like:

> subSetTrades
   Instrument  EntryTime   ExitTime AccountValue
1         JPM 2007-03-01 2007-04-10         6997
2         JPM 2007-04-10 2007-05-29         7261
3         JPM 2007-05-29 2007-07-18         7545
4         JPM 2007-07-18 2007-07-19         7614
5         JPM 2007-07-19 2007-08-22         7897
6         JPM 2007-08-22 2007-08-28         7678
7         JPM 2007-08-28 2007-09-17         7587
8         JPM 2007-09-17 2007-10-17         7752
9         JPM 2007-10-17 2007-10-29         7717
10        JPM 2007-10-29 2007-11-02         7423
11        KFT 2007-04-13 2007-05-14         6992
12        KFT 2007-05-14 2007-05-21         6944
13        KFT 2007-05-21 2007-07-09         7069
14        KFT 2007-07-09 2007-07-16         6919
15        KFT 2007-07-16 2007-07-27         6713
16        KFT 2007-07-27 2007-09-07         6820
17        KFT 2007-09-07 2007-10-12         6927
18        KFT 2007-10-12 2007-11-28         6983
19        KFT 2007-11-28 2007-12-18         6957
20        KFT 2007-12-18 2008-02-20         7146

If I manually calculate the values I want my function to output, the results are correct:

> # Apply the function to the dataframe
> with(subSetTrades, tapply(AccountValue, Instrument, MDD_Duration))
JPM KFT 
106  85 
> # Check the function for JPM
> maxdrawdown(subSetTrades[1:10,4])$from
[1] 5
> maxdrawdown(subSetTrades[1:10,4])$to
[1] 10
> # Get the entry time for JPM on row 5
> # Get the exit time for JPM on row 10
> # Calculate the time difference
> difftime(subSetTrades[10,3], subSetTrades[5,2], units='days')
Time difference of 106 days
> # Check the calculations for the other Instrument
> maxdrawdown(subSetTrades[11:20,4])$from
[1] 3
> maxdrawdown(subSetTrades[11:20,4])$to
[1] 5
> # Get the exit time on row 5 for KFT, get the entry time for KFT on row 3,
> # and calculate the time difference
> difftime(subSetTrades[15,3], subSetTrades[13,2])
Time difference of 67 days

As the example above shows, my custom function (MDD_Duration) gives the right value for JPM but the wrong value for KFT: the result should be 67, not 85. The function MDD_Duration is the following:

MDD_Duration <- function(x){
    require(tseries)
    # Get starting point
    mdd_Start <- maxdrawdown(x)$from
    mdd_StartDate <- subSetTrades$EntryTime[mdd_Start]
    # Get the endpoint
    mdd_End <- maxdrawdown(x)$to
    mdd_EndDate <- subSetTrades$ExitTime[mdd_End]
    return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
}

Manually retracing the steps of this custom function shows that the problem lies in the ‘from’ and ‘to’ row numbers (i.e. R needs to offset the values for KFT by the number of rows of the instrument that precedes it, in this case JPM). For a possible solution, R needs to do something like:

Get the ‘from’ value of the maxdrawdown function if this instrument is the first (i.e. at the top of the list). However, if the current instrument is the second (or third, etc.), then take into account the length of the previous instruments. So, if instrument JPM has a length of 10, the search for the values of KFT should start at +10. And the search for the ‘from’ and ‘to’ values for instrument 3 should start at the length of instrument 1 + the length of instrument 2.

I tried using nrow inside the function (which seems the obvious solution to this problem), but that resulted in errors regarding ‘argument of length 0’, even though nrow was used correctly (i.e. the same statement outside the function did work). I also tried to subset the data inside the function, which didn’t work either. Any ideas are highly welcome. :)


Solution

split is your friend here. If I modify your function so that it expects a data frame with the three variables of interest (AccountValue, EntryTime, ExitTime) like this:

MDD_Duration <- function(x){
    # Requires the tseries package to be loaded, e.g. library(tseries)
    # Get starting point
    mdd_Start <- maxdrawdown(x$AccountValue)$from
    mdd_StartDate <- x$EntryTime[mdd_Start]
    # Get the endpoint
    mdd_End <- maxdrawdown(x$AccountValue)$to
    mdd_EndDate <- x$ExitTime[mdd_End]
    return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
}

Then we can apply it to the split version of your data:

> sapply(split(subSetTrades[,-1], subSetTrades[,1]), MDD_Duration)
JPM KFT 
106  67
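
As an aside, the 85 from the question can be reproduced by hand: tapply passes only the KFT values to the function, so ‘from’ and ‘to’ are positions within the KFT group, yet the original function uses them to index the whole data frame and lands on JPM rows. The sketch below illustrates that under some assumptions: max_dd is a hypothetical base-R stand-in for the ‘from’/‘to’ part of tseries::maxdrawdown (so the snippet runs without the package), and trades is a rebuilt copy of subSetTrades.

```r
# Rebuild the question's data (each exit date equals the next entry date).
entry_jpm <- as.Date(c("2007-03-01", "2007-04-10", "2007-05-29", "2007-07-18", "2007-07-19",
                       "2007-08-22", "2007-08-28", "2007-09-17", "2007-10-17", "2007-10-29"))
entry_kft <- as.Date(c("2007-04-13", "2007-05-14", "2007-05-21", "2007-07-09", "2007-07-16",
                       "2007-07-27", "2007-09-07", "2007-10-12", "2007-11-28", "2007-12-18"))
trades <- data.frame(
  Instrument   = rep(c("JPM", "KFT"), each = 10),
  EntryTime    = c(entry_jpm, entry_kft),
  ExitTime     = c(entry_jpm[-1], as.Date("2007-11-02"),
                   entry_kft[-1], as.Date("2008-02-20")),
  AccountValue = c(6997, 7261, 7545, 7614, 7897, 7678, 7587, 7752, 7717, 7423,
                   6992, 6944, 7069, 6919, 6713, 6820, 6927, 6983, 6957, 7146)
)

# Hypothetical stand-in for the from/to positions of the maximum drawdown.
max_dd <- function(x) {
  drawdown <- cummax(x) - x            # distance below the running peak
  to <- which.max(drawdown)            # position of the deepest trough
  from <- which.max(x[seq_len(to)])    # position of the peak before it
  list(from = from, to = to)
}

kft_rows <- trades[trades$Instrument == "KFT", ]
idx <- max_dd(kft_rows$AccountValue)   # positions relative to the KFT rows

# Group-relative positions applied to the full data frame hit JPM rows.
wrong <- difftime(trades$ExitTime[idx$to], trades$EntryTime[idx$from], units = "days")

# The same positions applied to the KFT subset give the intended dates.
right <- difftime(kft_rows$ExitTime[idx$to], kft_rows$EntryTime[idx$from], units = "days")

print(wrong)
print(right)
```

This is why handing the function the whole per-instrument data frame, rather than only the AccountValue vector, avoids any need to offset row numbers.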

It might be helpful to see what split is doing to your data:

> split(subSetTrades[,-1], subSetTrades[,1])
$JPM
    EntryTime   ExitTime AccountValue
1  2007-03-01 2007-04-10         6997
2  2007-04-10 2007-05-29         7261
3  2007-05-29 2007-07-18         7545
4  2007-07-18 2007-07-19         7614
5  2007-07-19 2007-08-22         7897
6  2007-08-22 2007-08-28         7678
7  2007-08-28 2007-09-17         7587
8  2007-09-17 2007-10-17         7752
9  2007-10-17 2007-10-29         7717
10 2007-10-29 2007-11-02         7423

$KFT
    EntryTime   ExitTime AccountValue
11 2007-04-13 2007-05-14         6992
12 2007-05-14 2007-05-21         6944
13 2007-05-21 2007-07-09         7069
14 2007-07-09 2007-07-16         6919
15 2007-07-16 2007-07-27         6713
16 2007-07-27 2007-09-07         6820
17 2007-09-07 2007-10-12         6927
18 2007-10-12 2007-11-28         6983
19 2007-11-28 2007-12-18         6957
20 2007-12-18 2008-02-20         7146

So as long as you have a function that accepts and works with a data frame (i.e. a subset of your data set), we can use split to form the subsets and lapply or sapply to apply our function to those subsets.
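
One detail worth knowing when choosing between the two: sapply simplifies the result to a plain named numeric vector, so the difftime class (and its units) is dropped, whereas lapply returns a list that keeps each difftime intact. A minimal sketch, assuming a made-up data frame toy and a hypothetical helper span that are not part of your data:

```r
# Made-up trades for two instruments (illustration only).
toy <- data.frame(
  Instrument = c("A", "A", "A", "B", "B"),
  EntryTime  = as.Date(c("2020-01-01", "2020-01-10", "2020-01-20",
                         "2020-02-01", "2020-02-05")),
  ExitTime   = as.Date(c("2020-01-10", "2020-01-20", "2020-01-31",
                         "2020-02-05", "2020-02-15"))
)

# Hypothetical per-instrument summary: days from first entry to last exit.
span <- function(d) difftime(max(d$ExitTime), min(d$EntryTime), units = "days")

pieces <- split(toy, toy$Instrument)   # one data frame per instrument

as_list <- lapply(pieces, span)        # list of difftime objects
as_vec  <- sapply(pieces, span)        # named numeric vector, class dropped

print(as_list$A)
print(as_vec)
```

If the units matter downstream, lapply (or converting explicitly with as.numeric(..., units = "days")) makes that choice visible.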

You might want to incorporate this into your function MDD_Duration():

MDD_Duration2 <- function(x){
    FUN <- function(x) {
        # Get starting point
        mdd_Start <- maxdrawdown(x$AccountValue)$from
        mdd_StartDate <- x$EntryTime[mdd_Start]
        # Get the endpoint
        mdd_End <- maxdrawdown(x$AccountValue)$to
        mdd_EndDate <- x$ExitTime[mdd_End]
        return(difftime(mdd_EndDate, mdd_StartDate, units='days'))
    }
    sapply(split(x, droplevels(x[, "Instrument"])), FUN)
}

Here we use the new (in R 2.12.x) function droplevels on x[, "Instrument"] so that unused factor levels are dropped before splitting, which allows the function to work even if we have a single level of data or operate on a subset of the data:

> MDD_Duration2(subSetTrades)
JPM KFT 
106  67 
> MDD_Duration2(subSetTrades[1:10,])
JPM 
106
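
To see why droplevels matters, here is a small sketch, assuming Instrument is stored as a factor and using a made-up data frame toy. Without it, split keeps an empty piece for every unused level, and the inner function would then be called on a zero-row data frame.

```r
# Made-up data; Instrument is deliberately a factor.
toy <- data.frame(
  Instrument   = factor(c("JPM", "JPM", "KFT", "KFT")),
  AccountValue = c(100, 90, 80, 85)
)

jpm_only <- toy[1:2, ]   # the subset still carries the unused level "KFT"

with_unused    <- split(jpm_only, jpm_only$Instrument)
without_unused <- split(jpm_only, droplevels(jpm_only$Instrument))

# split() also has a drop argument that discards empty groups.
via_drop <- split(jpm_only, jpm_only$Instrument, drop = TRUE)

print(names(with_unused))
print(nrow(with_unused$KFT))
print(names(without_unused))
```

Note that droplevels has methods for factors and data frames only, so if Instrument is a character column (the default when building data frames in R 4.0 and later), it would need to be wrapped in factor() first, or split's drop = TRUE could be used instead.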
Licensed under: CC-BY-SA with attribution