Question

I want to perform a k-means clustering analysis on a set of 10 data points, each of which has an array of 4 numeric values associated with it. I'm using the Pearson correlation coefficient as the distance metric. I have done the first two steps of the k-means clustering algorithm, which were:

1) Select a set of initial centres of k clusters. [I selected two initial centres at random]

2) Assign each object to the cluster with the closest centre. [I used the Pearson correlation coefficient as the distance metric -- See below]

Now I need help understanding the 3rd step in the algorithm:

3) Compute the new centres of the clusters:

$$C(S) = \frac{1}{n} \sum_{X \in S} X$$

where X, in this case, is a 4-dimensional vector and n is the number of data points in the cluster.

How would I go about calculating C(S) for, say, the following data?

# Cluster 1
A   10  15  20  25  # randomly chosen centre
B   21  33  21  23
C   43  14  23  23
D   37  45  43  49
E   40  43  32  32

# Cluster 2
F  100  102 143 212 # randomly chosen centre
G  303  213 212 302
H  102  329 203 212
I  32   201 430 48
J  60   99  87  34

The last step of the k-means algorithm is to repeat steps 2 and 3 until no object changes cluster, which is simple enough.

I need help with step 3, computing the new centres of the clusters. If someone could go through and explain how to compute the new centre of just one of the clusters, that would help me immensely.

Was it helpful?

Solution 2

Step 3 corresponds to calculating the mean of each cluster. For cluster 1, the new cluster centre is (B + C + D + E) / 4, which is (35.25, 33.75, 29.75, 31.75): sum each component separately over all the points in the cluster, then divide by the number of points in the cluster.
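As a minimal sketch in Python (point values copied from the question, with the old centre A excluded as described below):

# Cluster 1 members, excluding the old centre A.
cluster1 = [
    [21, 33, 21, 23],  # B
    [43, 14, 23, 23],  # C
    [37, 45, 43, 49],  # D
    [40, 43, 32, 32],  # E
]

# New centre: sum each of the 4 components separately,
# then divide by the number of points in the cluster.
n = len(cluster1)
new_centre = [sum(point[d] for point in cluster1) / n for d in range(4)]
print(new_centre)  # [35.25, 33.75, 29.75, 31.75]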

The old cluster centre (A for cluster 1) is usually not part of the calculation of the new cluster centre.

Other tips

Don't throw other distance functions into k-means.

K-means is designed to minimize the "sum of squares", not distances! By minimizing the sum of squares it coincidentally minimizes squared Euclidean (and hence Euclidean) distance, but this may not hold for other distances, so k-means may stop converging when used with arbitrary distance functions.

Again: k-means does not minimize arbitrary distances. It minimizes the "sum of squares", which happens to agree with squared Euclidean distance.
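To make that objective concrete, here is a small sketch of the quantity k-means minimizes; sum_of_squares is an illustrative helper, not a library function:

def sum_of_squares(cluster, centre):
    """Within-cluster sum of squared deviations from the centre --
    the objective k-means actually minimizes."""
    return sum(
        (x - c) ** 2
        for point in cluster
        for x, c in zip(point, centre)
    )

The component-wise mean is exactly the point that minimizes this quantity, which is why the mean update in step 3 is only guaranteed to decrease the objective for squared Euclidean distance.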

If you want an algorithm that is well-defined for arbitrary distance functions, consider using k-medoids (Wikipedia), a k-means variant. PAM is guaranteed to converge with arbitrary distance functions.
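For contrast, a hedged sketch of the medoid update that PAM-style algorithms use instead of the mean; both function names here are illustrative. Because the medoid is an actual data point, the update is well-defined for any dissimilarity, including one derived from the Pearson correlation:

def pearson_distance(a, b):
    """1 - Pearson correlation: one common way to turn the
    correlation coefficient into a dissimilarity (assumes the
    vectors are not constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = sum((x - mean_a) ** 2 for x in a) ** 0.5
    sd_b = sum((y - mean_b) ** 2 for y in b) ** 0.5
    return 1 - cov / (sd_a * sd_b)

def medoid(cluster, distance):
    """The point in `cluster` with the smallest total distance to
    all other cluster points -- this replaces the mean in k-medoids."""
    return min(cluster, key=lambda c: sum(distance(c, p) for p in cluster))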

For each cluster of n-dimensional points, calculate the n-dimensional centre of mass to get the centroid. In your example the points are 4-dimensional, so the centre of mass is the mean along each of the 4 dimensions. For cluster 1 the centroid is (30.20, 30.00, 27.80, 30.40); for example, the mean in the first dimension is calculated as (10+21+43+37+40)/5 = 30.20.
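A sketch of the same calculation with NumPy (keeping, as this answer does, all five points of cluster 1, including A, in the mean):

import numpy as np

cluster1 = np.array([
    [10, 15, 20, 25],  # A
    [21, 33, 21, 23],  # B
    [43, 14, 23, 23],  # C
    [37, 45, 43, 49],  # D
    [40, 43, 32, 32],  # E
])

# Centre of mass: the mean along each of the 4 dimensions.
centroid = cluster1.mean(axis=0)
print(centroid)  # [30.2  30.   27.8  30.4]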

See the Wikipedia article on K-Means clustering for more information.
