Is this a valid algorithm?
Yes, the algorithm is correct.
Given a node R
that is to be considered the root of the spanning tree, the depth of a node N
in the spanning tree is at least the length of the shortest path from R
to N
in the graph, so the sum of the depths is at least the sum of the lengths of the shortest paths (from R
).
The tree constructed by the algorithm connects each node to R
with one of the shortest paths, so the sum of the depths is the sum of the distances, which we saw above is a lower bound.
As a small optimisation, if the number of nodes is at least 3, no nodes with degree 1 need to be considered as the root of the tree. (For a tree rooted at a node R
with degree 1, consider the same graph, viewed as a tree rooted at R
's neighbour. The depth of R
increases by 1, the depth of all other nodes decreases by 1, so the sum of depths decreases by number_of_nodes - 2
.)