To make distance and similarity metrics work, create one column per word in the vocabulary, then set each column to 1 or 0 depending on whether the corresponding word occurs in the sample. E.g.
                               G  T  SO Y
google, test, stackoverflow => 1, 1, 1, 0
test, google                => 1, 1, 0, 0
stackoverflow, yahoo        => 0, 0, 1, 1
etc.
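A minimal sketch of this encoding in plain Python; the helper name `encode` and the sample data are just the example above, not part of any library:

```python
vocabulary = ["google", "test", "stackoverflow", "yahoo"]

def encode(sample, vocabulary):
    """Return a 0/1 vector with one position per vocabulary word."""
    words = set(sample)
    return [1 if word in words else 0 for word in vocabulary]

samples = [
    ["google", "test", "stackoverflow"],
    ["test", "google"],
    ["stackoverflow", "yahoo"],
]

vectors = [encode(s, vocabulary) for s in samples]
# vectors == [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
```

In practice you would let a library build the vocabulary for you (e.g. scikit-learn's `CountVectorizer` with `binary=True` produces the same kind of matrix), but the idea is no more than this.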
The squared Euclidean distance between the first two vectors is now
(1 - 1)² + (1 - 1)² + (1 - 0)² + (0 - 0)² = 1
which makes intuitive sense as the vectors differ in exactly one position. Similarly, the squared distance between the final two vectors is four: they differ in every one of the four positions, so this is the maximal squared distance in this space.
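The two distance computations above can be checked directly; `squared_euclidean` is just an illustrative one-liner, not a library function:

```python
def squared_euclidean(u, v):
    """Sum of squared per-coordinate differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

d1 = squared_euclidean([1, 1, 1, 0], [1, 1, 0, 0])  # first two vectors
d2 = squared_euclidean([1, 1, 0, 0], [0, 0, 1, 1])  # final two vectors
# d1 == 1, d2 == 4
```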
This encoding is an extension of the "one-hot" or "one-of-K" coding, and it's a staple of machine learning on text (although few textbooks care to spell it out).