Question

I am new to this field, so please be gentle with terminology. In the original paper, "Understanding the difficulty of training deep feedforward neural networks", I don't understand how equation 15 is obtained. It states that the initialization in eq. 1:

$$ W_{ij} \sim U\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right] $$

gives rise to a variance with the following property:

$$ n \, \mathrm{Var}[W] = \frac{1}{3} $$

where $n$ is the size of the layer.

How is this last equation (15) obtained?

Thanks!!
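One way to see where the $1/3$ comes from, assuming only the standard variance formula for a uniform distribution, $\mathrm{Var}\left(U[a,b]\right) = \frac{(b-a)^2}{12}$: with $a = -\frac{1}{\sqrt{n}}$ and $b = \frac{1}{\sqrt{n}}$,

$$ \mathrm{Var}[W] = \frac{\left(2/\sqrt{n}\right)^2}{12} = \frac{4}{12\,n} = \frac{1}{3n} \quad\Longrightarrow\quad n \, \mathrm{Var}[W] = \frac{1}{3} $$

The sketch below checks this numerically by sampling. It is not taken from the paper: the function name `scaled_variance`, the sample count, and the seed are illustrative assumptions.

```python
import random


def scaled_variance(n, samples=200_000, seed=0):
    """Estimate n * Var[W] for W ~ U[-1/sqrt(n), 1/sqrt(n)] by sampling.

    `samples` and `seed` are arbitrary choices made for this illustration.
    """
    rng = random.Random(seed)
    bound = 1.0 / n ** 0.5
    draws = [rng.uniform(-bound, bound) for _ in range(samples)]
    mean = sum(draws) / samples
    # Population variance of the sampled weights.
    var = sum((w - mean) ** 2 for w in draws) / samples
    return n * var


if __name__ == "__main__":
    # Each estimate should land close to 1/3, whatever the value of n.
    for n in (10, 100, 1000):
        print(n, round(scaled_variance(n), 3))
```

The estimate should stay near $1/3$ for every $n$, because the factor $n$ cancels the $1/n$ in the variance.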

No accepted solution

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange