Question

I am new to R and to programming itself, having used XL and Miner for data analysis, so please excuse me if the problem seems too basic.

I have 4 data frames : farm1, farm2, farm3, farm4

`farm1<-structure(list(a = c(-0.700315674269212, 0.174376310290089, -0.802953642024395, 
-0.282317708655969, 0.198528974423857, 0.836114237945342, 0.983599830924647, 
1.14907220855077, -0.471945076669, -0.947783585965569), b = c(-0.0456355425554554, 
-0.301284883241843, 0.460328270868957, -0.496976686442155, -0.0325366991757349, 
0.458486775369624, -0.597532470372807, -0.648309589555456, 2.14749512128352, 
0.245124871567864), c = c(28.4681916252671, 31.5059762466411, 
36.5396753644422, 32.0019564063665, 33.6858689252592, 30.3833642979702, 
31.7212812595004, 33.2019595830279, 33.0727170129226, 31.4977963355712
), d = c(68.8195032459844, 68.3337594834099, 67.4836963601874, 
60.2779662871057, 67.0529412957513, 62.0801084450559, 63.0332790311212, 
57.9849455014888, 61.9213678477396, 51.4985302058811), e = c(5L, 
8L, 8L, 8L, 8L, 7L, 6L, 6L, 8L, 8L), f = c(17L, 12L, 12L, 13L, 
14L, 10L, 13L, 11L, 12L, 13L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm2<-structure(list(a = c(-0.164523596253587, -0.253361680136508, 
0.696963375404737, 0.556663198673657, -0.68875569454952, -0.70749515696212, 
0.36458196213683, 0.768532924515416, -0.112346212150228, 0.881107726454215
), b = c(-0.568668732818502, -0.135178615123832, 1.1780869965732, 
-1.52356680042976, 0.593946187628422, 0.332950371213518, 1.06309983727636, 
-0.304183923634301, 0.370018809916288, 0.267098790772231), c = c(33.1943176411012, 
30.1639208202477, 33.0233590742733, 28.6119107117576, 36.2990711051031, 
37.9411996955176, 30.8983355706005, 28.8675961210504, 33.7091588823272, 
31.5948361883575), d = c(78.4097065630287, 63.764559983601, 68.1384361747047, 
64.168012952684, 59.5403607467056, 65.1327537970861, 53.1702482266538, 
72.7933291693773, 64.9195200292714, 77.0356700221729), e = c(7L, 
8L, 9L, 9L, 7L, 8L, 9L, 7L, 10L, 7L), f = c(11L, 12L, 13L, 12L, 
12L, 14L, 12L, 15L, 13L, 14L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm3<-structure(list(a = c(-0.54252003099165, 1.20786780598317, 1.16040261569495, 
0.700213649514998, 1.58683345454085, 0.558486425565304, -1.27659220845804, 
-0.573265414236886, -1.22461261489836, -0.473400636439312), b = c(0.0601604404345152, 
-0.588894486259664, 0.531496192632572, -1.51839408178679, 0.306557860789766, 
-1.53644982353759, -0.300976126836611, -0.528279904445006, -0.652094780680999, 
-0.0568967778473925), c = c(30.1388999683276, 32.1263476194327, 
29.2672350543427, 32.4740863172122, 30.0362460682435, 37.3018618081179, 
34.1501224280516, 34.7305226884857, 33.152556073479, 37.0465282415583
), d = c(60.1855812763061, 61.2301316178366, 72.59369343125, 
60.0958218801378, 62.7557155383882, 61.6431524233481, 62.080042788709, 
62.3253201821406, 66.965129987607, 62.9360171063824), e = c(9L, 
8L, 6L, 9L, 8L, 9L, 8L, 9L, 8L, 6L), f = c(10L, 9L, 12L, 11L, 
12L, 15L, 14L, 12L, 13L, 9L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm4<-structure(list(a = c(-1.91435942568001, 1.17658331201856, -1.664972436212, 
-0.463530401472386, -1.11592010504285, -0.750819001193448, 2.08716654562835, 
0.0173956196932517, -1.28630053043433, -1.64060553441858), b = c(-1.23132342155804, 
0.983895570053379, 0.219924803660651, -1.46725002909224, 0.521022742648139, 
-0.158754604716016, 1.4645873119698, -0.766081999604665, -0.430211753928547, 
-0.926109497377437), c = c(33.350561303818, 31.9443205018561, 
31.0457948763685, 29.2119135576389, 27.5376190695755, 28.774423110153, 
35.0000864111417, 30.1361999156095, 27.8467194578465, 37.6078718672707
), d = c(66.5506022642347, 62.5681173945218, 70.3508982922541, 
69.3185359082496, 60.2845417106131, 77.2366147872428, 62.4698378191539, 
55.4530320987231, 63.1336023882747, 65.2452300353941), e = c(5L, 
9L, 8L, 8L, 8L, 9L, 8L, 9L, 9L, 7L), f = c(12L, 15L, 10L, 12L, 
7L, 13L, 10L, 15L, 9L, 12L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

All of them are similar in structure. I am trying to run correlation exercises and am facing the following problems:

1) Run correlation within each data frame between the variables 'a', 'b', 'c', 'd', 'e', 'f'.

2) Run the above exercise across 4 data frames (comparing variables a, b, c, d, e,f within each data frame) and display the result in a table for farm1, farm2, farm3 and farm4. Instead of running the same commands 4 times, can I specify the function and apply it at once on all 4 data frames?

Each data frame relates to a unique farm and cannot be merged.

I referred to the following posts and picked up a few things, but I am unable to address my problem exactly.
https://stackoverflow.com/search?q=data+frame+correlation, Calculate correlation for more than two variables?, Calculate Correlations of Pairs of Columns in a Data Frame in R, Calculate correlation by aggregating columns of data frame, Pairwise Correlation Table


Solution

Very hard to give you a good answer without a reproducible example.

  1. Use mget to collect your data frames into a single named list. A list is suitable for the *apply family of functions; I am using lapply here.
  2. cor can be applied to a matrix. Subset each data frame by column and convert the result to a matrix. This assumes the selected columns are numeric; otherwise, convert them first, for example with as.numeric.

Applying this, we get this one-liner:

cols <- letters[1:6] ## "a", "b", ..., "f"
## mget returns a named list of every object whose name matches 'farm'
lapply(mget(ls(pattern='farm')),
     function(x) cor(as.matrix(x[, cols])))
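
The call above returns one correlation matrix per farm. For the second part of the question (one table covering all farms), a possible sketch is to take each pair of variables once and put the farms side by side as columns. The helper name `farm_cor_table` and the stand-in data `demo` are assumptions made for illustration; they are not part of the original answer or the question's data.

```r
# Hypothetical helper: one row per variable pair, one column per data frame.
farm_cor_table <- function(farms, cols) {
  # Correlation matrix for each data frame in the named list
  cor_list <- lapply(farms, function(x) cor(as.matrix(x[, cols])))
  # All unordered pairs of column names, one pair per row
  pairs <- t(combn(cols, 2))
  # Look up each pair in every correlation matrix (indexing by dimnames)
  values <- do.call(cbind, lapply(cor_list, function(m) m[pairs]))
  data.frame(var1 = pairs[, 1], var2 = pairs[, 2], values,
             stringsAsFactors = FALSE)
}

# Stand-in data for illustration only, NOT the data from the question
set.seed(1)
demo <- list(
  farmA = data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10)),
  farmB = data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
)
result <- farm_cor_table(demo, c("a", "b", "c"))
print(result)
```

With the data frames from the question loaded, the equivalent call would be `farm_cor_table(mget(paste0("farm", 1:4)), letters[1:6])`, which should give 15 rows (one per pair of a to f) and one column per farm. The farms are never merged; each column is computed from its own data frame.
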
Licensed under: CC-BY-SA with attribution