To set the stage, I will use the golf course example data you linked:
import numpy as np
A=np.matrix((4,4,3,4,4,3,4,2,5,4,5,3,5,4,5,4,4,5,5,5,2,4,4,4,3,4,5))
A=A.reshape((3,9)).T
This gives you the original table with 9 rows and 3 columns, holding the scores of 3 players on 9 holes:
matrix([[4, 4, 5],
[4, 5, 5],
[3, 3, 2],
[4, 5, 4],
[4, 4, 4],
[3, 5, 4],
[4, 4, 3],
[2, 4, 4],
[5, 5, 5]])
Now the singular value decomposition:
U, s, V = np.linalg.svd(A)
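As a quick sanity check on what the decomposition returns, here is a minimal sketch using plain arrays instead of np.matrix. The name Vt and the use of full_matrices=False are my own choices for illustration, not something the linked example uses; with the default full_matrices=True, U would be 9x9 instead of 9x3.

```python
import numpy as np

# Same golf scores as above, as a plain ndarray (9 holes x 3 players).
A = np.array([4, 4, 3, 4, 4, 3, 4, 2, 5,
              4, 5, 3, 5, 4, 5, 4, 4, 5,
              5, 5, 2, 4, 4, 4, 3, 4, 5], dtype=float).reshape(3, 9).T

# Compact SVD: U is 9x3, s has 3 entries, Vt is 3x3.
# NumPy returns the right singular vectors as rows, hence the name Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)

# Multiplying the three factors back together should recover A
# up to floating-point rounding.
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))
```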
The most important thing to investigate is the vector s of singular values:
array([ 21.11673273, 2.0140035 , 1.423864 ])
It shows that the first value is much bigger than the others, indicating that the truncated SVD that keeps only this one value represents the original matrix A quite well. To calculate this representation, you take the first column of U, multiplied by the first row of V, multiplied by the first singular value. (Note that np.linalg.svd returns the right singular vectors as the rows of V, that is, V here is already the transposed matrix.) This is what the last cited command in R does. Here is the same in Python:
U[:,0]*s[0]*V[0,:]
And here is the result of this product:
matrix([[ 3.95411864, 4.64939923, 4.34718814],
[ 4.28153222, 5.03438425, 4.70714912],
[ 2.42985854, 2.85711772, 2.67140498],
[ 3.97540054, 4.67442327, 4.37058562],
[ 3.64798696, 4.28943826, 4.01062464],
[ 3.69694905, 4.3470097 , 4.06445393],
[ 3.34185528, 3.92947728, 3.67406114],
[ 3.09108399, 3.63461111, 3.39836128],
[ 4.5599837 , 5.36179782, 5.0132808 ]])
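To put a number on "quite well", one option is to compare the squared singular values and to measure the residual directly. The sketch below is my own illustration with plain arrays; the names Vt, A1, energy and residual are not part of the linked example. It relies on the standard fact that the Frobenius norm of the residual of a rank-1 truncation equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

# Same golf scores as above, as a plain ndarray (9 holes x 3 players).
A = np.array([4, 4, 3, 4, 4, 3, 4, 2, 5,
              4, 5, 3, 5, 4, 5, 4, 4, 5,
              5, 5, 2, 4, 4, 4, 3, 4, 5], dtype=float).reshape(3, 9).T

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-1 approximation: outer product of the first left and right
# singular vectors, scaled by the first singular value.
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])

# Share of the squared Frobenius norm carried by each singular value.
energy = s**2 / np.sum(s**2)

# Frobenius norm of what the rank-1 approximation leaves out.
residual = np.linalg.norm(A - A1)

print(np.round(energy, 4))
print(residual, np.sqrt(s[1]**2 + s[2]**2))  # the two should agree
```

With the singular values shown above, the first component carries roughly 98.7% of the squared Frobenius norm, which is why a single term already reproduces the table so closely.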
Concerning the vector factors U[:,0] and V[0,:]: figuratively speaking, U[:,0] can be seen as a representation of each hole's difficulty, while V[0,:] encodes each player's strength.
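To make this reading concrete, here is a small sketch that ranks holes and players by their entries in the first singular vectors. It is only an illustration under two assumptions of mine: the names hole_factor and player_factor are hypothetical, and I take absolute values because singular vectors are only defined up to a joint sign flip (for an all-positive matrix like this one, the entries of the first pair share one sign).

```python
import numpy as np

# Same golf scores as above, as a plain ndarray (9 holes x 3 players).
A = np.array([4, 4, 3, 4, 4, 3, 4, 2, 5,
              4, 5, 3, 5, 4, 5, 4, 4, 5,
              5, 5, 2, 4, 4, 4, 3, 4, 5], dtype=float).reshape(3, 9).T

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Remove the arbitrary overall sign so larger always means "more strokes".
hole_factor = np.abs(U[:, 0])      # one entry per hole
player_factor = np.abs(Vt[0, :])   # one entry per player

# Zero-based indices of the extreme holes and players.
hardest_hole = int(np.argmax(hole_factor))
easiest_hole = int(np.argmin(hole_factor))
highest_scoring_player = int(np.argmax(player_factor))

print(hardest_hole, easiest_hole, highest_scoring_player)
```

Keep in mind that higher golf scores are worse, so a large entry in player_factor points to a player who tends to need more strokes, and a large entry in hole_factor points to a hole on which everyone needs more strokes.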