View normalized cross correlation as a normalized vector dot product. If the angle between two vectors is zero, their dot product will be 1; if they are facing in the opposite direction, then their dot product with be negative 1. This is idea is actually direct if you take the array and stack the column end to end. The result is essentially a dot product between two vectors.
Also just as a personal anecdote: what confused me about template matching at first, was intuitively I believed element wise subtraction of two images would be a good metric for image similarity. When I first saw cross correlation, I wondered why it used element wise multiplication. Then I realized that the later operation is the same thing as a vector dot product. The vector dot product, as I mentioned before, indicates when two vectors are pointing in the same direction. In your case, the two vectors are normalized first; hence why the range is from -1 to 1. If you want to read more about the implementation, "Fast Normalized Cross Correlation" by J.P. Lewis is a classical paper on the subject.