Question

Is there an easy way to generate two time-series with a fixed correlation, for instance 0.5?

Does anyone know a solution in R or Python? Thanks!

Solution

This question is more general than time-series, I think. What you are asking for is a 2-d random variable with a known covariance. With r=0.5, std1=1 and std2=2, the covariance is r*std1*std2 = 1 and the variances are std1^2 = 1 and std2^2 = 4, which translates to a covariance matrix of [[1,1],[1,4]]. Therefore, if we assume the data follow a multivariate normal distribution, we can generate such a random variable:

In [42]:
import numpy as np
val=np.random.multivariate_normal((0,0),[[1,1],[1,4]],1000)
In [43]:

np.corrcoef(val.T)
Out[43]:
array([[ 1.      ,  0.488883],
       [ 0.488883,  1.      ]])
In [44]:

np.cov(val.T)
Out[44]:
array([[ 1.03693888,  0.96490767],
       [ 0.96490767,  3.75671707]])
In [45]:

val=np.random.multivariate_normal((0,0),[[1,1],[1,4]],10)
In [46]:

np.corrcoef(val.T)
Out[46]:
array([[ 1.        ,  0.56807297],
       [ 0.56807297,  1.        ]])
In [48]:

val[:,0]
Out[48]:
array([-0.77425116,  0.35758601, -1.21668939, -0.95127533, -0.5714381 ,
        0.87530824,  0.9594394 ,  1.30123373,  1.92511929,  0.98070711])
In [49]:

val[:,1]
Out[49]:
array([-1.75698285,  2.24011423, -3.5129411 , -1.33889305,  2.32720257,
        0.53750133,  3.23935645,  2.96819425, -0.72551024,  3.0743096 ])

As this example shows, when the sample size is small the sample correlation can deviate considerably from the target r=0.5: the 10-sample run gives about 0.57, while the 1000-sample run gives about 0.49.
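To make the two points above easier to reuse, here is a minimal sketch that builds the covariance matrix from r, std1 and std2, and then repeats the sampling at several sizes to see how much the sample correlation scatters. The helper names `covariance_from_correlation` and `sample_correlation`, the seed, the sample sizes and the 200 repetitions are my own choices for illustration, not part of the session above, and it uses the `np.random.default_rng` generator rather than the legacy `np.random.multivariate_normal` call.

```python
import numpy as np


def covariance_from_correlation(r, std1, std2):
    """Build the 2x2 covariance matrix for a target correlation r."""
    cov12 = r * std1 * std2  # off-diagonal term: r * std1 * std2
    return np.array([[std1 ** 2, cov12],
                     [cov12, std2 ** 2]], dtype=float)


def sample_correlation(cov, n, rng):
    """Draw n bivariate normal samples and return their sample correlation."""
    val = rng.multivariate_normal((0, 0), cov, size=n)
    return np.corrcoef(val.T)[0, 1]


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

# Same parameters as in the answer: r=0.5, std1=1, std2=2
cov = covariance_from_correlation(0.5, 1, 2)
print(cov)

# Repeat the experiment to see how the sample correlation scatters
spread = {}
mean_r = {}
for n in (10, 100, 10000):
    rs = np.array([sample_correlation(cov, n, rng) for _ in range(200)])
    spread[n] = rs.std()
    mean_r[n] = rs.mean()
    print(n, mean_r[n], spread[n])
```

The scatter of the sample correlation should shrink as n grows, so for short series it is worth checking `np.corrcoef` on the generated data rather than assuming it equals the target.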

Licensed under: CC-BY-SA with attribution