Question

I have a variable with a given distribution (normal in the example below).

set.seed(32)    
var1 = rnorm(100,mean=0,sd=1)

I want to create a variable (var2) that is correlated with var1, with a linear correlation coefficient (roughly or exactly) equal to "Corr". The slope of the regression between var1 and var2 should (roughly or exactly) equal 1.

Corr = 0.3

How can I achieve this?

I wanted to do something like this:

decorelation = rnorm(100,mean=0,sd=1-Corr)
var2 = var1 + decorelation

But of course when running:

cor(var1,var2)

The result is not close to Corr!


Solution

I did something similar a while ago. I am pasting some code that is for 3 correlated variables but it can be easily generalized to something more complex.

Create the target correlation matrix first:

cor_Matrix <- matrix(c(1.00, 0.90, 0.20,
                       0.90, 1.00, 0.40,
                       0.20, 0.40, 1.00),
                     nrow = 3, ncol = 3, byrow = TRUE)

This can be any valid correlation matrix (symmetric, with ones on the diagonal and no negative eigenvalues).
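Not every matrix with entries between -1 and 1 is a usable correlation matrix, so it can be worth checking before going further. A minimal base R sketch of such a check follows; the helper name `is_valid_cor` and the tolerance are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: TRUE if m is symmetric, has a unit diagonal,
# and has no eigenvalue below -tol (i.e. is positive semi-definite).
is_valid_cor <- function(m, tol = 1e-8) {
  isSymmetric(m) &&
    all(abs(diag(m) - 1) < tol) &&
    all(eigen(m, symmetric = TRUE, only.values = TRUE)$values > -tol)
}

cor_Matrix <- matrix(c(1.00, 0.90, 0.20,
                       0.90, 1.00, 0.40,
                       0.20, 0.40, 1.00),
                     nrow = 3, ncol = 3, byrow = TRUE)

is_valid_cor(cor_Matrix)
```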

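library(psych)

# Extract all 3 unrotated principal components of the correlation matrix
fit <- principal(cor_Matrix, nfactors = 3, rotate = "none")

fit$loadings

# Plain 3 x 3 loading matrix; loadings %*% t(loadings) reproduces cor_Matrix
loadings <- matrix(fit$loadings[1:3, 1:3], nrow = 3, ncol = 3, byrow = FALSE)
loadings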

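# Create three independent standard-normal variables (one per row)

cases <- t(replicate(3, rnorm(3000)))  # edited: changed to 3000 cases from 150 cases

# Mix the independent variables through the loadings to induce the correlations
multivar <- loadings %*% cases
T_multivar <- t(multivar)  # back to one row per case

var <- as.data.frame(T_multivar)

cor(var)  # should be close to cor_Matrix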
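If installing psych is not an option, a similar result can be sketched in base R by swapping the principal-component loadings for a Cholesky factor. This is a different factorization from the one used above, not the same method. The names `Z`, `R`, `X` and `n` and the seed are chosen here for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

cor_Matrix <- matrix(c(1.00, 0.90, 0.20,
                       0.90, 1.00, 0.40,
                       0.20, 0.40, 1.00),
                     nrow = 3, ncol = 3, byrow = TRUE)

n <- 3000
# Independent standard-normal draws, one column per variable
Z <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# chol() returns an upper-triangular R with t(R) %*% R equal to cor_Matrix,
# so the columns of Z %*% R have population correlation cor_Matrix
R <- chol(cor_Matrix)
X <- Z %*% R

round(cor(X), 2)  # sample correlations, close to cor_Matrix
```

As with the loadings approach, the sample correlations only approximate the target; the match improves as the number of cases grows.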

Again, this can be generalized. Your approach listed above does not create a multivariate data set.
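For the two-variable case in the question, one possible sketch (not part of the code above) keeps the var1 + noise idea but fixes the noise scale. If var2 = var1 + e and e is uncorrelated with var1, the regression slope of var2 on var1 is 1 and the correlation is sd(var1) / sqrt(sd(var1)^2 + sd(e)^2). Setting sd(e) = sd(var1) * sqrt(1 / Corr^2 - 1) therefore gives the target correlation. This assumes Corr is strictly between 0 and 1; the names `noise` and `e` are introduced here for illustration.

```r
set.seed(32)
var1 <- rnorm(100, mean = 0, sd = 1)
Corr <- 0.3

noise <- rnorm(100)
# Keep only the part of the noise that is uncorrelated with var1 in this sample
e <- residuals(lm(noise ~ var1))
# Rescale so that cor(var1, var1 + e) equals Corr
e <- e * sd(var1) * sqrt(1 / Corr^2 - 1) / sd(e)

var2 <- var1 + e

cor(var1, var2)           # matches Corr up to floating-point error
coef(lm(var2 ~ var1))[2]  # slope, equal to 1 up to floating-point error
```

Skipping the residual step and drawing e directly with that standard deviation gives the same correlation and slope only approximately, since the sample correlation between var1 and e is then not exactly zero.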

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow