Question

First, look at this example:

>>> import torch as t
>>> x = t.randn(512)
>>> w = t.randn(512, 500000)
>>> (x @ w).var()
tensor(513.9548)

It makes sense that the variance is close to 512: each of the 500000 outputs is a dot product of two 512-element vectors whose entries are sampled from a distribution with mean 0 and standard deviation 1.
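
To spell out that reasoning, here is a short derivation sketch. The symbols y_j, x_i and w_{ij} are my own notation for the j-th output, the i-th input entry and the corresponding weight. It assumes all entries are independent with mean 0 and variance 1, so the 512 products are uncorrelated and their variances add.

```latex
y_j = \sum_{i=1}^{512} x_i \, w_{ij}
\qquad\Longrightarrow\qquad
\operatorname{Var}(y_j)
  = \sum_{i=1}^{512} \operatorname{Var}(x_i \, w_{ij})
  = \sum_{i=1}^{512} \mathbb{E}[x_i^2] \, \mathbb{E}[w_{ij}^2]
  = 512 \cdot 1 \cdot 1
  = 512
```

In the experiment x is drawn only once, so the measured value is really the squared norm of that particular x, which is itself close to 512.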

However, I wanted the variance to be 1, and consequently the standard deviation to be 1 as well, since the standard deviation is the square root of the variance.

To do this, I tried the following:

>>> x = t.randn(512)
>>> w = t.randn(512, 500000) * (1/512)
>>> (x @ w).var()
tensor(0.0021)

However, the variance is now 512 / 512 / 512 (about 0.002) instead of 512 / 512 = 1.
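
That number is consistent with the standard rule that multiplying a random variable by a constant c multiplies its variance by c squared. A small worked version, where y stands for one unscaled output with variance 512 (my notation, not from the session above):

```latex
\operatorname{Var}(c \, y) = c^2 \operatorname{Var}(y),
\qquad
c = \tfrac{1}{512}
\;\Longrightarrow\;
\operatorname{Var}(c \, y) = \frac{512}{512^2} = \frac{1}{512} \approx 0.00195
```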

To get the variance I wanted, I had to scale by 1 / sqrt(512) instead:

>>> x = t.randn(512)
>>> w = t.randn(512, 500000) * (1 / (512 ** .5))
>>> (x @ w).var()
tensor(1.0216)
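
The three experiments can be compared side by side with a rough sketch like the one below. It uses only the Python standard library rather than PyTorch, and far fewer columns (1000 instead of 500000) so that it runs quickly. The helper name `output_variance` and the fixed seed are my own choices for illustration. Because every call reuses the same seed, the three runs see the same random draws and differ only by the scale factor.

```python
import random
import statistics


def output_variance(n, m, scale, seed=0):
    """Sample variance of the m entries of x @ w, where w is scaled by `scale`."""
    rng = random.Random(seed)
    # Input vector x with n standard-normal entries.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outputs = []
    for _ in range(m):
        # One column of w: n standard-normal weights, each multiplied by scale.
        column = [rng.gauss(0.0, 1.0) * scale for _ in range(n)]
        outputs.append(sum(xi * wi for xi, wi in zip(x, column)))
    return statistics.variance(outputs)


n, m = 512, 1000
scales = {"1": 1.0, "1/n": 1.0 / n, "1/sqrt(n)": n ** -0.5}
variances = {name: output_variance(n, m, s) for name, s in scales.items()}

for name, value in variances.items():
    print(f"scale = {name:>9}: variance = {value:.4f}")
```

If the c-squared rule holds, the variance for scale 1/sqrt(n) should be the unscaled variance divided by n, and the variance for scale 1/n should be the unscaled variance divided by n squared.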

Why is that the case?

There is no accepted solution.

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange