Question

I'm running simulations that I want to analyse with ANOVA, and I'm having trouble figuring out how to set it up.

I have a 2^5 factorial design, so I have 32 runs of my experiment with the appropriate +/- values outlining the 32 possible combinations.

However, I also have ~500 replicates for each combination. I'm gathering the data through simulations, so I just have 500 different runs for each of the combinations.

This results in 32 vectors of 500 response values. I also have a design matrix/vectors saving all of the +/- values used as the 5 different factors.

I think this would be relatively straightforward without the replicates (make factor variables for the factors, fit the model, do the anova, etc.), but I'm confused about how to set up my data matrix to deal with the replicates. Should I have a 500x32 matrix for my data? Take averages of my responses?

Thanks

Solution 2

Your matrix should have 6 columns and 2^5 * 500 = 16000 rows. The first column is your response variable, which I'm assuming is a continuous numerical variable. The next five columns each represent a treatment.

Say you are measuring plant height response to fertilizer and light. Fertilizer is either added or not added, and light is either high or low. Your response is plant height. The first treatment column is "Fertilizer", and contains 1's and 0's. The second treatment column is "Light", and contains 1's and 0's.

Note that by "treatment" I do not mean "High Light" and "Low Light" --- the column is just for "Light".

To do the anova, you would do something like:

aov(Height~Fertilizer*Light, data=Data)

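Basically, the first column in Data would be a single vector made by stacking the "32 vectors of 500 response values" end to end (i.e., c() or unlist() those 32 vectors; rbind() would give a 32 x 500 matrix rather than one column), and the next 5 columns should be your "design matrix/vectors saving all of the +/- values used as the 5 different factors", with each design row repeated 500 times so that it lines up with its responses.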
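As a rough sketch of that layout for the five-factor case: the object names `design` and `responses`, the factor names A to E, and the simulated values are all hypothetical stand-ins for whatever your simulation produces. The response column ends up last here rather than first, which makes no difference to the model formula.

```r
# Hypothetical stand-ins for the asker's objects:
#   design    - 32 x 5 matrix of -1/+1 settings, one row per combination
#   responses - list of 32 numeric vectors, 500 replicates each
set.seed(1)
n_rep <- 500
design <- as.matrix(expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                                D = c(-1, 1), E = c(-1, 1)))
responses <- lapply(seq_len(nrow(design)), function(i) {
  # Made-up response: a main effect of A plus a B:C interaction, plus noise
  rnorm(n_rep, mean = 2 * design[i, "A"] + design[i, "B"] * design[i, "C"])
})

# Repeat each design row once per replicate, in the same order as the responses
Data <- as.data.frame(design[rep(seq_len(nrow(design)), each = n_rep), ])
Data[] <- lapply(Data, factor)   # treat each +/- column as a two-level factor
Data$y <- unlist(responses)      # stack the 32 vectors into one long column

# Full factorial model: all main effects and interactions
fit <- aov(y ~ A * B * C * D * E, data = Data)
summary(fit)
```

The key step is `rep(..., each = n_rep)`, which keeps row i of the design aligned with the 500 values in `responses[[i]]`.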

I hope that this description helps to organize your thoughts.

OTHER TIPS

You almost certainly want a data frame with 32x500 = 16000 rows, and 6 columns (5 for the covariates, and 1 for the response).

The data frame would be laid out like this:

x1 x2 x3 x4 x5 y
 0  0  0  0  0 *
 0  0  0  0  0 *
 ...

 1  0  0  0  0 *
 1  0  0  0  0 *
 ...

 0  1  0  0  0 *
 0  1  0  0  0 *
 ...

where each covariate pattern is replicated 500 times. You can generate this with

df <- expand.grid(1:500, x1=0:1, x2=0:1, x3=0:1, x4=0:1, x5=0:1)[, -1]

and then tack your response on the side.
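A minimal sketch of that last step, assuming your responses are already in one long vector ordered the same way as the rows of `df` (the replicate index varies fastest, then x1, then x2, and so on). The response here is simulated placeholder data, not anything from the question.

```r
# Build the 16000-row design: the unnamed 1:500 column is the replicate
# index, which varies fastest and is dropped again with [, -1]
df <- expand.grid(1:500, x1 = 0:1, x2 = 0:1, x3 = 0:1, x4 = 0:1, x5 = 0:1)[, -1]

# Placeholder response (assumption): replace with your stacked simulation output
set.seed(42)
df$y <- rnorm(nrow(df), mean = 1 + 0.5 * df$x1)

# Convert the 0/1 columns to factors before fitting
factor_cols <- paste0("x", 1:5)
df[factor_cols] <- lapply(df[factor_cols], factor)

# All main effects and interactions of the five factors
fit <- aov(y ~ x1 * x2 * x3 * x4 * x5, data = df)
summary(fit)
```

If your responses are stored in a different order, reorder them (or the rows of `df`) before attaching the column; otherwise the values will be matched to the wrong factor combinations.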

Licensed under: CC-BY-SA with attribution