Question

I am running coarsened exact matching (CEM) via the package MatchIt as a pre-processing step and want to use the matched data in further analyses. In looking at summary statistics for the matched data, I noticed that means extracted from the matched dataset differ from the MatchIt summary output. For example, using the lalonde dataset:

library(MatchIt)
library(doBy)
data(lalonde)

m.out <- matchit(treat ~ age + educ + black + hispan + married + nodegree + re74 + re75, data = lalonde, method = "cem")
summary(m.out)   #Means from MatchIt summary output:

Summary of balance for matched data: 

             Means Treated   Means Control 
 age         21.5441         21.1781 
 educ        10.2941         10.3827 
 black       0.8676          0.8676 
 hispan      0.0588          0.0588 
 married     0.0441          0.0441 
 nodegree    0.6176          0.6176 
 re74        456.1345        622.8740 
 re75        350.6728        520.7135 

m.dat<-match.data(m.out)
ExtractedMeans<-summaryBy(age+educ+black+hispan+married+nodegree+re74+re75 ~ treat, data = m.dat, FUN=function(x) { c(Mean=mean(x)) } )
ExtractedMeans   #Means extracted manually from matched data:

treat         1          0 
age.Mean      21.544    19.628 
educ.Mean     10.294     9.7179 
black.Mean    0.8676    0.60256 
hispan.Mean   0.0588    0.10256 
married.Mean  0.0441    0.07692 
nodegree.Mean 0.6176    0.75641 
re74.Mean     456.13    609.61 
re75.Mean     350.67    464.22 

The means for the control group extracted manually from the matched data are not consistent with the MatchIt summary output. Does anybody know what is going on here? I posted this question to the MatchIt gmane email list last week but have not received a response. Thank you for any help.

Solution

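The summaryBy function (from doBy) computes plain, unweighted means, so it ignores the weights column that match.data() adds to the matched data. With CEM, a matched stratum can contain different numbers of treated and control units, so the control units carry weights that rebalance them against the treated units, and summary(m.out) reports weighted means. If you multiply each variable by the weights before averaging, you get the same means that the package displays. This shortcut works here because the control weights sum to the number of matched control units; the general form is weighted.mean(x, w). As an example, take your code and do this: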

> tapply(m.dat$age, m.dat$treat, mean)
       0        1 
19.62821 21.54412

> tapply(m.dat$age*m.dat$weights, m.dat$treat, mean)
       0        1 
21.17811 21.54412

So the weighted means match the MatchIt summary output.
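
To apply the same idea to every covariate at once, a small base-R helper along these lines may be useful. This is only a sketch: the function name weighted_group_means and the toy data frame are hypothetical, made up for illustration, and the helper assumes the data has a group column and a weights column like the ones match.data() returns.

```r
# Weighted mean of each covariate within each treatment group (base R only).
# Assumes `dat` looks like match.data() output: a group column (default
# "treat") and a weights column (default "weights") next to the covariates.
weighted_group_means <- function(dat, vars, group = "treat", w = "weights") {
  groups <- split(dat, dat[[group]])
  out <- lapply(groups, function(g) {
    # one weighted mean per covariate, named after the covariate
    vapply(vars, function(v) weighted.mean(g[[v]], g[[w]]), numeric(1))
  })
  do.call(cbind, out)  # rows = covariates, columns = group levels
}

# Hypothetical toy data (NOT from lalonde). The control weights sum to 3,
# the number of control rows, mimicking how the matched weights behave above.
toy <- data.frame(
  treat   = c(1, 1, 0, 0, 0),
  age     = c(20, 30, 18, 22, 40),
  educ    = c(10, 12, 9, 11, 12),
  weights = c(1, 1, 0.5, 0.5, 2)
)

res <- weighted_group_means(toy, vars = c("age", "educ"))
print(res)
```

With the matched data from the question, a call such as weighted_group_means(m.dat, c("age", "educ", "black", "hispan", "married", "nodegree", "re74", "re75")) should reproduce the control means from summary(m.out), since it uses the same weights. Any later analysis on m.dat (for example a regression) would likewise need to pass the weights column.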

Licensed under: CC-BY-SA with attribution