Bar plot of means w/ SD in R
02-07-2021
Question
Sounds like a trivial one, but some research didn't come up with an elegant solution: I have a data frame structured with a categorical variable (GROUP) and a continuous read-out variable (bloodpressure). How can I make a simple bar plot showing the mean for each group with its standard deviation? There are multiple groups: A, B, C, D. How can I perform an ANOVA post-hoc analysis within the data frame? How does it work with the Mann-Whitney U test? Can I mark the significance level in the bar plot? How can I streamline this operation to multiple continuous variables (dia_bloodpressure, sys_bloodpressure, mean_bloodpressure) and sink() the output to different files (named after the variable)?
Solution
After some research I came across the agricolae package, which provides multiple-group comparisons. The resulting objects can be passed on to a decent plotting function for group-wise bar graphs +/- SD or SEM. Unfortunately, I found no way to add markers of significance between groups in the plots.
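For the ANOVA, post-hoc and Mann-Whitney parts of the question, here is a minimal sketch in base R (stats only, not agricolae). The data are simulated and purely hypothetical, and run_tests(), vars and out_dir are names I made up for illustration. It fits a one-way ANOVA per variable, runs Tukey's HSD as the post-hoc test, runs pairwise Wilcoxon rank-sum (Mann-Whitney U) tests with Holm correction, and sinks the printed results to one text file per variable.

```r
# Hypothetical example data: four groups, three read-out variables
set.seed(1)
n <- 10
dat <- data.frame(
  GROUP = factor(rep(c("A", "B", "C", "D"), each = n)),
  dia_bloodpressure  = rnorm(4 * n, mean = rep(c(80, 82, 88, 90), each = n), sd = 6),
  sys_bloodpressure  = rnorm(4 * n, mean = rep(c(120, 125, 135, 140), each = n), sd = 8),
  mean_bloodpressure = rnorm(4 * n, mean = rep(c(93, 96, 104, 107), each = n), sd = 6)
)

# Run ANOVA, Tukey HSD and pairwise Mann-Whitney U tests for one variable
run_tests <- function(var, data) {
  f <- reformulate("GROUP", response = var)  # builds e.g. sys_bloodpressure ~ GROUP
  fit <- aov(f, data = data)
  list(
    anova  = summary(fit),
    tukey  = TukeyHSD(fit),
    wilcox = pairwise.wilcox.test(data[[var]], data$GROUP,
                                  p.adjust.method = "holm")
  )
}

vars <- c("dia_bloodpressure", "sys_bloodpressure", "mean_bloodpressure")
out_dir <- tempdir()  # assumption: replace with your own output folder
results <- list()

for (v in vars) {
  results[[v]] <- run_tests(v, dat)
  sink(file.path(out_dir, paste0(v, ".txt")))  # redirect console output to <variable>.txt
  print(results[[v]])
  sink()                                       # restore console output
}
```

Afterwards results[["sys_bloodpressure"]]$tukey$GROUP holds the pairwise differences with adjusted p-values, and results[["sys_bloodpressure"]]$wilcox$p.value the matrix of adjusted Mann-Whitney p-values.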
Other suggestions
After some more programming in R, I stumbled upon another nice package suitable for medical research: psych.
Considering the question above, describe() and describeBy() give a statistical overview of a data frame; describeBy() breaks it down by a grouping variable.
The function error.bars.by() is an advanced plotting function for group means with error bars. By default it draws confidence intervals; setting sd=TRUE switches to +/- SD.
The package offers many functions for covariate analysis, which are useful in psychological research but might also help in medical and marketing research.
A possible code snippet:
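library(psych)
# Example data with missing values
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)
y <- c(2, 3, NA, 3, 4, NA, 2, 3, NA, 2)
group <- rep(factor(LETTERS[1:2]), 5)
df <- data.frame(x, y, group)
df
# Summary and mean per group (base R)
by(df$x, df$group, summary)
by(df$x, df$group, mean, na.rm = TRUE)  # without na.rm = TRUE, group B would be NA
sd(df$x)                # result: NA, because x contains a missing value
sd(df$x, na.rm = TRUE)  # result: 2.738613
# SD per group for several variables at once
v <- c("x", "y")        # or: v <- colnames(df)[1:2]
sapply(v, function(i) tapply(df[[i]], df$group, sd, na.rm = TRUE))
# Descriptive statistics and bar plot per group (psych)
describeBy(df$x, df$group)
# add sd = TRUE to draw +/- SD instead of the default confidence intervals
error.bars.by(df$x, df$group, bars = TRUE)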
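If you also want a significance marker in the plot, which the question asked for, one option is to draw the bars yourself with base graphics. The following is only a sketch on simulated, hypothetical data: the groups compared (A vs. D), the star thresholds and the bracket position are my own choices, not something the packages above provide.

```r
# Hypothetical example data: four groups, one read-out variable
set.seed(42)
dat <- data.frame(
  GROUP = factor(rep(LETTERS[1:4], each = 10)),
  bloodpressure = rnorm(40, mean = rep(c(120, 125, 135, 140), each = 10), sd = 8)
)

# Mean and SD per group
m <- tapply(dat$bloodpressure, dat$GROUP, mean, na.rm = TRUE)
s <- tapply(dat$bloodpressure, dat$GROUP, sd, na.rm = TRUE)
top <- max(m + s)

# barplot() returns the x positions of the bar midpoints
mids <- barplot(m, ylim = c(0, top * 1.2),
                xlab = "GROUP", ylab = "bloodpressure (mean +/- SD)")

# Error bars: arrows with flat heads on both ends
arrows(mids, m - s, mids, m + s, angle = 90, code = 3, length = 0.05)

# Mann-Whitney U test between groups A and D (unadjusted p-value)
ad <- droplevels(subset(dat, GROUP %in% c("A", "D")))
p <- wilcox.test(bloodpressure ~ GROUP, data = ad)$p.value

# Translate the p-value into a conventional star label
stars <- if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else "n.s."

# Bracket from bar A to bar D with the label above it
y <- top * 1.08
segments(mids[1], y, mids[4], y)
text(mean(mids[c(1, 4)]), y, stars, pos = 3)
```

With more than one comparison in the same plot, the p-values should be adjusted for multiple testing first, for example with p.adjust().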