Some notes about "Negative eigenvalues of the Hessian in deep neural networks"

2019/7/21 17:31:56

Here is part of Appendix C, "Optimal step sizes", of [1]:
$$\theta_{t+1}=\theta_t-\alpha H(\theta_t)^{-1}g(\theta_t) \qquad (1)$$

$\alpha$: learning rate, also called the step size in [1].
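To make (1) concrete, here is a minimal numerical sketch of one such update step on a toy quadratic loss (the loss, the matrix `A`, the starting point, and the step size are my own assumptions, not taken from [1]):

```python
import numpy as np

# Toy quadratic loss L(theta) = 0.5 * theta^T A theta, so that
# g(theta) = A @ theta and H(theta) = A (a constant Hessian).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])      # assumed symmetric positive definite example

def g(theta):
    return A @ theta            # gradient of the toy loss

def H(theta):
    return A                    # Hessian of the toy loss

alpha = 1.0                     # learning rate / step size
theta = np.array([1.0, -2.0])   # arbitrary starting point

# One step of update (1): theta <- theta - alpha * H(theta)^{-1} g(theta).
theta_next = theta - alpha * np.linalg.solve(H(theta), g(theta))
print(theta_next)               # with alpha = 1 this lands exactly on the minimum [0, 0]
```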

A quotation from [1]: "if we project the gradient in the basis of eigenvectors, we get:"
$$g(\theta)=\sum_{i=1}^{N}\left[g(\theta)^{T}v_i\right]v_i \qquad (2)$$
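Equation (2) has the shape of expanding a vector in an orthonormal basis: if the $v_i$ are orthonormal eigenvectors of some symmetric matrix, then $\sum_i [g^T v_i] v_i$ reconstructs $g$ exactly. Here is a quick numerical sanity check of that identity (the symmetric matrix and the vector below are my own toy values, not from [1]):

```python
import numpy as np

# A symmetric 2x2 matrix standing in for a Hessian, and a vector
# standing in for a gradient (both assumed toy values).
H = np.array([[2.0, 1.0],
              [1.0, 3.0]])
g = np.array([0.5, -1.0])

# For a symmetric matrix, eigh returns an orthonormal set of eigenvectors.
eigvals, V = np.linalg.eigh(H)   # columns of V are the eigenvectors v_i

# Right-hand side of (2): sum_i [g^T v_i] v_i, where each g^T v_i is a scalar.
reconstruction = sum((g @ V[:, i]) * V[:, i] for i in range(V.shape[1]))
print(np.allclose(reconstruction, g))  # True: (2) re-expresses g in the eigenbasis
```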
I cannot understand (2), so I designed a simple example:

Let $g(\theta)=\begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}$.

Then its eigenvalues and eigenvectors are:
$$\lambda_1=2+\sqrt{3},\quad v_1=(\sqrt{3},\,1)$$

$$\lambda_2=2-\sqrt{3},\quad v_2=(-\sqrt{3},\,1)$$
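These eigenpairs are easy to verify with numpy (`eig` returns unit-norm eigenvectors, so the columns come back as scaled versions of $(\sqrt{3},\,1)$ and $(-\sqrt{3},\,1)$, possibly in a different order):

```python
import numpy as np

M = np.array([[2.0, 3.0],
              [1.0, 2.0]])      # the example matrix g(theta) above

eigvals, eigvecs = np.linalg.eig(M)
print(eigvals)                  # approx. 3.732 and 0.268, i.e. 2 + sqrt(3) and 2 - sqrt(3)

# Check M v = lambda v for each eigenpair (columns of eigvecs).
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(M @ v, lam * v))   # True, True
```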

$$g^{T}(\theta)=\begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}_{2\times 2}$$

The dimension of $g^{T}(\theta)$ is $2\times 2$, which is not compatible with the role of $v_1$ and $v_2$ in (2): the product $g^{T}(\theta)v_i$ comes out as a $2\times 1$ vector rather than a scalar, so the term $[g^{T}(\theta)v_i]v_i$ in (2) cannot be computed as written.
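To make the mismatch concrete, here is what numpy produces if $g(\theta)$ is taken to be the $2\times 2$ matrix above (my own illustration; it shows $g^{T}(\theta)v_i$ coming out as a vector, not the scalar coefficient that (2) seems to require):

```python
import numpy as np

g = np.array([[2.0, 3.0],
              [1.0, 2.0]])          # g(theta) taken as the 2x2 matrix above
v1 = np.array([np.sqrt(3), 1.0])    # eigenvector from the example (unnormalized)

coeff = g.T @ v1                    # shape (2,): a vector, not a scalar
print(coeff.shape)                  # (2,)
# Multiplying coeff * v1 is only elementwise broadcasting, so the term
# [g^T v_1] v_1 in (2) is not a scalar multiple of v_1 here.
```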

Could you tell me where I am wrong?
Thanks~~!

Reference:
[1] Negative eigenvalues of the Hessian in deep neural networks
