Euclidean distance between vectors grouped by other variable in SPSS, R or Excel

Question 1

As an example of how I would approach this in SPSS, first let's read the example data into SPSS.

data list list (",") / case group val1 val2 val3 val4 (6F1.0).
begin data
1,1,3,5,6,8
2,1,2,7,5,4
3,2,1,3,6,8
4,2,5,4,3,7
5,1,8,6,5,3
end data.
dataset name orig.

Then we can use SPLIT FILE and PROXIMITIES to get the distance matrix by group. Note that, as you mentioned in the comments to flodel's answer, this produces a separate dataset we need to work with. (Also note that case practically never matters in SPSS syntax, e.g. split file and SPLIT FILE are equivalent.)

sort cases by group.
split file by group.
dataset declare dist.
PROXIMITIES val1, val2, val3, val4
/STANDARDIZE = Z
/MEASURE = EUCLID
/PRINT = NONE
/MATRIX = OUT('dist').

Nearly everything in SPSS operates on a rectangular dataset that resembles an R data.frame, so SPLIT FILE functionally replaces most of the different *ply functions in R. This is very convenient, but less flexible in general. So now we need to aggregate the distances in the dist dataset I saved the results to. We first sum across rows, and then sum by group via an AGGREGATE command.

dataset activate dist.
*VAR1 to VAR3 covers the largest group here (3 cases) - widen the range for bigger groups.
compute dist_sum = SUM(VAR1 to VAR3).
*it appears SPSS keeps empty cases - we do not want them in the aggregation.
select if MISSING(dist_sum) = 0.
DATASET DECLARE dist_agg.
AGGREGATE
  /OUTFILE='dist_agg'
  /BREAK=group
  /dist_sum = SUM(dist_sum)
  /N_Cases=N.
dataset activate dist_agg.
compute mean_dist = dist_sum /(N_Cases*(N_Cases - 1)).

Here I save the aggregated results into another dataset named dist_agg. Because SPSS (annoyingly) saves the full distance matrix, each pair is counted twice in the sum, so the denominator for the mean is not n*(n-1)/2 (as in the equivalent R syntax) but n*(n-1), assuming you do not want to count the diagonal elements towards the mean. Then we can merge these back into the orig data file via a MATCH FILES command.

*merge back into the original dataset.
dataset activate orig.
match files file = *
/table = 'dist_agg'
/by group.
exe.

*clean out old datasets if you like.
dataset close dist.
dataset close dist_agg.

The flexibility of R to go back and forth between matrix and data.frame objects makes SPSS a bit more clunky for this job. I could write a much more concise program to do this in SPSS's MATRIX language, but to do it across groups in MATRIX is a pain in the butt (compared to R's *ply syntax).
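As a cross-check on the n*(n-1) denominator, here is a small R sketch using group 1 of the example data (cases 1, 2 and 5). It assumes R's scale() stands in for the within-group Z standardization; the object names are only illustrative.

```r
# Group 1 of the example data (cases 1, 2 and 5), one row per case.
g1 <- matrix(c(3, 5, 6, 8,
               2, 7, 5, 4,
               8, 6, 5, 3),
             nrow = 3, byrow = TRUE)
n <- nrow(g1)

d    <- dist(scale(g1))  # lower triangle only: n*(n-1)/2 distances
full <- as.matrix(d)     # full symmetric matrix with a zero diagonal

# Mean over the unique pairs (identical to mean(d)).
mean_from_pairs <- sum(d) / (n * (n - 1) / 2)
# Mean from the full matrix: every pair appears twice, the diagonal adds 0.
mean_from_full  <- sum(full) / (n * (n - 1))

stopifnot(isTRUE(all.equal(mean_from_pairs, mean_from_full)))
```

Both denominators give the same mean, so dividing the full-matrix sum by n*(n-1) is the right adjustment.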

Question 2

Yes, it is probably easier in R...

Your data:

dat <- data.frame(case  = 1:5, 
                  group = c(1, 1, 2, 2, 1),
                  val1  = c(3, 2, 1, 5, 8),
                  val2  = c(5, 7, 3, 4, 6),
                  val3  = c(6, 5, 6, 3, 5),
                  val4  = c(8, 4, 8, 7, 3))

A short solution:

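library(plyr)
vals <- c("val1", "val2", "val3", "val4")
# Scale only the value columns: ddply also passes the constant group column in x,
# and scaling it yields NaN, which dist() compensates for by inflating every distance.
ddply(dat, "group", function(x) c(mean.ED = mean(dist(scale(x[vals])))))
#   group  mean.ED
# 1     1 2.791629
# 2     2 2.828427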

Question 3

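Here is a much simpler solution using base R. Note that it computes distances on the raw values; wrap x in scale() if you want them standardized within each group, as in the other answers.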

# columns 3:6 are val1..val4 (2:5 would include group and drop val4)
d <- by(dat[, 3:6], dat$group, function(x) dist(x))

sapply(d,mean)
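If you also want the group means attached to each case, as the MATCH FILES step does in the SPSS answer, a minimal base R sketch could look like the following. It assumes within-group standardization via scale(), and the column name mean_dist is just an illustrative choice.

```r
dat <- data.frame(case  = 1:5,
                  group = c(1, 1, 2, 2, 1),
                  val1  = c(3, 2, 1, 5, 8),
                  val2  = c(5, 7, 3, 4, 6),
                  val3  = c(6, 5, 6, 3, 5),
                  val4  = c(8, 4, 8, 7, 3))
vals <- paste0("val", 1:4)

# One mean pairwise distance per group, on values standardized within the group.
grp_mean <- sapply(split(dat[vals], dat$group),
                   function(x) mean(dist(scale(x))))

# Look up each row's group mean by group label and store it as a new column.
dat$mean_dist <- unname(grp_mean[as.character(dat$group)])
dat
```

Indexing the named vector by the group label avoids a separate merge() call and keeps the original row order.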