I think my question is best understood with a little code:
#Load data
b <- structure(list(s1 = c(18.5, 24, 17.2, 19.9, 18), s2 = c(26.3,
25.3, 24, 21.2, 24.5), s3 = c(20.6, 25.2, 20.8, 24.7, 22.9),
s4 = c(25.5, 19.9, 22.6, 17.5, 20.4)), .Names = c("s1", "s2",
"s3", "s4"), row.names = c(NA, -5L), class = "data.frame")
# Model A
# One way (the wrong way) to test whether s1, s2, s3, s4 differ:
summary(aov(s1~s2+s3+s4, data=b))
# R does not complain here - but I don't know what I am doing. I guess I am trying
# to explain the variance in s1 with the variables s2, s3 and s4.
# I am not sure how this actually differs from a proper ANOVA (see below).
# Also, I don't understand why the Sum of Squares for s3 is much larger than the
# Sum of Squares for s2 and s4.
# Model B
# The correct way to do it (requires reshape)
# install.packages('reshape')
# library(reshape)
summary(aov(value ~ variable, data = melt(b)))
# This is correct - here I am testing whether the levels of 'variable'
# (i.e. the four groups s1..s4) explain variation in 'value'.
# Doing
TukeyHSD(aov(value ~ variable, data = melt(b)))
# shows me that s1 is significantly different from s2.
# My way of thinking is that this result should also be evident from "model A".
# What does the Sum of Squares in model A mean - and why is it so big for s3?
So, as the comments in the code above say: I am asking for an explanation of how and why Model A is wrong.
Solution
Model A is not ANOVA. You are modelling one response variable (s1) using s2, s3 and s4 as numeric predictors - that is multiple linear regression, not a comparison of group means. The sum of squares for s3 is so big because s3 is the predictor most strongly correlated with s1; this becomes apparent if you look at the correlation matrix: cor(b).
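To make this concrete, here is a short sketch (re-creating the data frame b from above so the snippet runs on its own):

```r
# Re-create the data frame from the question
b <- data.frame(s1 = c(18.5, 24, 17.2, 19.9, 18),
                s2 = c(26.3, 25.3, 24, 21.2, 24.5),
                s3 = c(20.6, 25.2, 20.8, 24.7, 22.9),
                s4 = c(25.5, 19.9, 22.6, 17.5, 20.4))

# Model A is ordinary multiple regression: aov() is essentially a
# wrapper around lm() with a different summary method, so the fitted
# coefficients of the two calls agree
fitA <- lm(s1 ~ s2 + s3 + s4, data = b)
summary(fitA)

# The correlations with s1 show why s3 dominates the sums of squares:
# s3 is by far the strongest linear correlate of s1
round(cor(b)["s1", c("s2", "s3", "s4")], 2)
```

Note also that summary(aov(...)) reports sequential (Type I) sums of squares, so each predictor's SS depends on the order in which the terms appear in the formula; with correlated predictors, reordering the terms changes the table.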
You cannot compare it with model B. There, s1 is neither a predictor nor a response: after melting, all twenty measurements are stacked into a single response column (value), and the original column names (s1, s2, s3, s4) become the levels of a grouping factor (variable) that serves as the predictor.
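For contrast, model B can also be run without the reshape package: base R's stack() does the same reshaping for this simple case (its output columns are named values and ind rather than value and variable). A sketch:

```r
# Re-create the data frame from the question
b <- data.frame(s1 = c(18.5, 24, 17.2, 19.9, 18),
                s2 = c(26.3, 25.3, 24, 21.2, 24.5),
                s3 = c(20.6, 25.2, 20.8, 24.7, 22.9),
                s4 = c(25.5, 19.9, 22.6, 17.5, 20.4))

# stack() turns the four columns into one 'values' column plus a
# factor 'ind' recording which column each value came from
long <- stack(b)

# One-way ANOVA: does group membership (s1..s4) explain variation
# in the measurements?
fitB <- aov(values ~ ind, data = long)
summary(fitB)
TukeyHSD(fitB)   # pairwise group comparisons, e.g. s2 - s1
```

This is the model in which a Tukey comparison of s1 against s2 makes sense, because the group labels are a single factor rather than four separate numeric columns.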