Question

I am trying to find a vectorized way of updating the values of vector b based on the values of vector a. The problem I have is this:

> # Vector a is the "driver": if there is a 1 or -1 in vector a,
> # a -1 or 1 needs to follow in vector b. The challenge arises when
> # a 1 or -1 in a is followed by two or more -1 or 1 values in b:
> # all but the first of those values in b should be set to 0 if the
> # values in a do not change
> a <- c(0, 1, 0, 0, 0, 0, 0,-1, 0, 0, 1, 1,-1,-1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 0, 0)
> b <- c(0, 0,-1, 0,-1, 0, 0, 0, 0, 1, 1,-1,-1, 1, 1, 0, 0,-1, 0, 0, 1, 0,-1,-1, 0,-1, 0)
> a
 [1]  0  1  0  0  0  0  0 -1  0  0  1  1 -1 -1  0  0  1  0  0 -1  0  1  0  0  0  0  0
> b
 [1]  0  0 -1  0 -1  0  0  0  0  1  1 -1 -1  1  1  0  0 -1  0  0  1  0 -1 -1  0 -1  0
> 
> # I need a vectorized function(a, b), if possible, that changes b
> # based on a as shown below (removing some repeated values in b)
> b[5] <- 0
> b[11] <- 0
> b[24] <- 0
> b[26] <- 0
> a
 [1]  0  1  0  0  0  0  0 -1  0  0  1  1 -1 -1  0  0  1  0  0 -1  0  1  0  0  0  0  0
> b
 [1]  0  0 -1  0  0  0  0  0  0  1  0 -1 -1  1  1  0  0 -1  0  0  1  0 -1  0  0  0  0

Any help or hint on how to do this in a vectorized way would be highly appreciated.

I tried "standard" approaches using rle, cumsum, diff, ...

# I tried to play around with
test <- data.frame(
        a=a,
        b=b,
        a.plus.b=a + b,
        diff.a.plus.b=c(0, diff(a + b)),
        cumsum.a.plus.b=cumsum(a + b),
        diff.cumsum.a.plus.b=c(0, diff(cumsum(a + b)))
)
test 

rle(b)
rle(b)$values
rle(b)$lengths

Edit: Based on David's request to be clearer about what I am trying to do, I will explain the problem at length.

I am building simplified trading backtesting functionality (since quantstrat is too complex and too slow for my needs).

The problem above (at the top of the message) arises when I get an entry signal in vector a, with values 1 (go long) or -1 (go short). After an entry signal, three things can happen (recorded in vector b):
- a time stop is hit (exit at the end of day: b==-1 if long and b==1 if short),
- a profit target is reached (again b==-1, b==1) or
- a stop loss is triggered (again b==-1, b==1).

So vector b represents the possible events/exits after each entry (there are no overlapping trades: one closes before another is entered). Sometimes a trade goes directly in my favour and we immediately hit the profit target. Great. Sometimes we hit the stop before we hit the profit target. Sometimes neither the stop nor the profit target is hit by the end of the day, so we are left with the end-of-day exit.

I need to remove all but the first exit event after each entry (a==1 or a==-1). Since not all of them can/will happen, just the first (from a time perspective) should stay and I should remove the subsequent ones.

Let me give an example. We enter a long trade at 9:31 (on the close of the first one-minute bar of regular trading hours). So a becomes:

a <- c(1, 0, 0, 0, 0, ..., 0)

We always exit at the close of the last minute bar (time stop), so we add the last possible exit to b:

b <- c(0, 0, 0, 0, 0, ...,-1)

We also know (in the backtest) that our profit target would already be reached on the close of the bar at 9:35, so we add this fact to b (b[5] <- -1):

b <- c(0, 0, 0, 0,-1, ...,-1)

And we also know (in the backtest) that a stop would trigger at 9:33, so we add this to b (b[3] <- -1), which now becomes:

b <- c(0, 0,-1, 0,-1, ...,-1)

So, since my profit target will never be reached (the stop is hit first) and we will not be in the trade at the market close, I should set b[5] <- 0 and b[length(b)] <- 0, that is, remove all but the first exit trigger in b after the entry (a==1). Then b should become:

b <- c(0, 0,-1, 0, 0, ..., 0)
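A minimal sketch of this single-trade case, assuming a hypothetical 8-bar day (the real vectors have one element per minute bar, and the names entry, exits, first_exit and b_clean are only illustrative):

```r
# Hypothetical 8-bar day: entry at bar 1, stop at bar 3,
# profit target at bar 5, time stop at the last bar
a <- c(1, 0, 0, 0, 0, 0, 0, 0)
b <- c(0, 0, -1, 0, -1, 0, 0, -1)

entry <- which(a != 0)[1]              # bar of the entry signal
exits <- which(b != 0)                 # all candidate exit bars
first_exit <- exits[exits > entry][1]  # earliest exit after the entry

# zero every candidate exit except the earliest one
b_clean <- b
b_clean[setdiff(exits, first_exit)] <- 0
b_clean
# [1]  0  0 -1  0  0  0  0  0
```

This only handles one entry per vector; the full problem has several entries in the same vector.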

I need to process this for, say, a thousand days in the past...

I hope this clarifies what I am trying to do.


Solution

I'm not sure if I really understand what you're trying to do, but if I do understand, I think I have a vectorized solution for you.

> f <- function(a,b){
+   b[unique(c(which(a[-length(a)] == 0 & b[-1] != 0) + 1,which(b[-length(b)] == b[-1] & b[-1] != 0)))] <- 0
+   return(b)
+ }
> f(a,b)
 [1]  0  0 -1  0  0  0  0  0  0  0  0  0 -1  0  1  0  0 -1  0  0  1  0  0  0  0  0  0

Here was my rationale. I think you want to set values of b to zero based on two different scenarios:

1) When non-zero values of b repeat. If so this should find those indices:

which(b[-length(b)] == b[-1] & b[-1] != 0)

2) When non-zero values of b occur when the previous index of a was zero. If so this should do the trick:

which(a[-length(a)] == 0 & b[-1] != 0) + 1

Hopefully I didn't misunderstand your goals here.
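To see how these two rules compare with the four indices the question wants cleared (5, 11, 24 and 26), it may help to look at each index set separately. This is just a sketch using the a and b from the question; the names rule1, rule2 and wanted are only for illustration:

```r
a <- c(0, 1, 0, 0, 0, 0, 0,-1, 0, 0, 1, 1,-1,-1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 0, 0)
b <- c(0, 0,-1, 0,-1, 0, 0, 0, 0, 1, 1,-1,-1, 1, 1, 0, 0,-1, 0, 0, 1, 0,-1,-1, 0,-1, 0)
n <- length(b)

# rule 1: a non-zero b value equal to its right-hand neighbour
# (this gives the index of the first element of each repeated pair)
rule1 <- which(b[-n] == b[-1] & b[-1] != 0)
rule1
# [1] 10 12 14 23

# rule 2: a non-zero b value whose previous a value is zero
rule2 <- which(a[-n] == 0 & b[-1] != 0) + 1
rule2
# [1]  5 10 11 24 26

# indices flagged by either rule that the question does not want cleared
wanted <- c(5, 11, 24, 26)
setdiff(union(rule1, rule2), wanted)
# [1] 10 12 14 23
```

So, on this data, everything that rule 1 flags is something the question wants to keep, and rule 2 flags one extra index (10).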

EDIT:

Second try here. I'm still pretty sure that I don't understand what you're trying to do since my solution still flags b[10] (which you say it shouldn't), but from what you're writing the best I can understand is that you want to make the following changes:

Non-zero values of "b" that follow zero values of "a" must be set to zero.

Since this rule incorrectly flags b[10], can you please tell me why that is incorrect? I think the problem will need to be phrased that way for me to give you a solution, since the finance talk just sounds like gibberish to me.

Anyway, here is the vectorized solution for the rule I listed:

> f <- function(a,b) {
+   b[which(b != 0)[which(!which(b != 0) %in% (which(a[-length(a)] != 0) + 1))]] <- 0
+   return(b)
+ }
> f.indices <- function(a,b) which(b != 0)[which(!which(b != 0) %in% (which(a[-length(a)] != 0) + 1))]
> f(a,b)
 [1]  0  0 -1  0  0  0  0  0  0  0  0 -1 -1  1  1  0  0 -1  0  0  1  0 -1  0  0  0  0
> f.indices(a,b)
[1]  5 10 11 24 26
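The same rule can probably be written more readably with a shifted copy of a. This is only a sketch of an equivalent formulation, where a.prev is a helper name introduced here for illustration:

```r
a <- c(0, 1, 0, 0, 0, 0, 0,-1, 0, 0, 1, 1,-1,-1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 0, 0)
b <- c(0, 0,-1, 0,-1, 0, 0, 0, 0, 1, 1,-1,-1, 1, 1, 0, 0,-1, 0, 0, 1, 0,-1,-1, 0,-1, 0)

# a shifted right by one position; the leading 0 is padding for element 1
a.prev <- c(0, a[-length(a)])

# non-zero b values whose previous a value is zero
idx <- which(b != 0 & a.prev == 0)
idx
# [1]  5 10 11 24 26

b2 <- b
b2[idx] <- 0
```

It flags the same indices as f.indices(a,b) on this data, including b[10].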

EDIT: Third try is the charm...

Now operating under the assumption that the goal is to set all non-zero values of b to zero except for the first one that follows a non-zero value of a. I'm not sure if/how that can be fully vectorized, but here is a quick solution:

> a <- c(0, 1, 0, 0, 0, 0, 0,-1, 0, 0, 1, 1,-1,-1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 0, 0)
> b <- c(0, 0,-1, 0,-1, 0, 0, 0, 0, 1, 1,-1,-1, 1, 1, 0, 0,-1, 0, 0, 1, 0,-1,-1, 0,-1, 0)
> 
> f <- function(a,b){
+   #non-zero b indices
+   nz.b <- which(b != 0)
+   #non-zero a indices
+   nz.a <- which(a != 0)  
+   #start with every non-zero b index as a candidate for removal
+   nz.b.rm <- nz.b
+   for(i in nz.a){
+     #keep (take off the removal list) the first non-zero b index after i
+     nz.b.rm <- nz.b.rm[!nz.b.rm %in% nz.b[nz.b > i][1]] 
+   }
+   #print the non-zero b indices that are not the first to follow a non-zero a
+   print(paste0("Indices Removed: ",paste(nz.b.rm,collapse=",")))
+   #set those entries to zero rather than dropping them, so b keeps its length
+   b[nz.b.rm] <- 0
+   return(b)
+ }
> 
> b.new <- f(a,b)
[1] "Indices Removed: 5,11,24,26"
> b.new
 [1]  0  0 -1  0  0  0  0  0  0  1  0 -1 -1  1  1  0  0 -1  0  0  1  0 -1  0  0  0  0
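
For what it's worth, the loop can probably be avoided. The sketch below works under the same assumption about the goal: it labels every position with the number of non-zero a values strictly before it, then keeps only the first non-zero b within each label and sets the rest to zero. The name keep_first_exit and its internal variables are made up for illustration, and it has only been checked against the example vectors:

```r
keep_first_exit <- function(a, b) {
  stopifnot(length(a) == length(b))
  # number of non-zero a values strictly before each position
  entries.before <- cumsum(a != 0) - (a != 0)
  # non-zero b values that come after at least one non-zero a
  exits <- which(b != 0 & entries.before > 0)
  # first exit for each distinct count of preceding entries
  keep <- exits[!duplicated(entries.before[exits])]
  # zero every other non-zero b value
  b[setdiff(which(b != 0), keep)] <- 0
  b
}

a <- c(0, 1, 0, 0, 0, 0, 0,-1, 0, 0, 1, 1,-1,-1, 0, 0, 1, 0, 0,-1, 0, 1, 0, 0, 0, 0, 0)
b <- c(0, 0,-1, 0,-1, 0, 0, 0, 0, 1, 1,-1,-1, 1, 1, 0, 0,-1, 0, 0, 1, 0,-1,-1, 0,-1, 0)

b.vec <- keep_first_exit(a, b)
which(b != b.vec)
# [1]  5 11 24 26
```

On the example data this zeroes indices 5, 11, 24 and 26, the same ones the loop reports.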
Licensed under: CC-BY-SA with attribution