# Some notes on "Negative eigenvalues of the Hessian in deep neural networks"

Posted 2019/7/21 17:31:56 · Category: Tutorials

Here is an excerpt from Appendix C, "Optimal step sizes", of [1]:

$$\theta_{t+1}=\theta_t-\alpha H(\theta_t)^{-1}g(\theta_t)\tag{1}$$

where $\alpha$ is the learning rate, also called the step size in [1].
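To make (1) concrete, here is a minimal NumPy sketch of the update on a toy quadratic loss. The names (`A`, `grad`, `newton_step`) and the quadratic loss are my own illustration, not from [1]; for a quadratic the Hessian is the constant matrix `A`.

```python
import numpy as np

# Toy quadratic loss f(theta) = 0.5 * theta^T A theta, whose Hessian is A
# everywhere. A is symmetric positive definite, so H^{-1} exists.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

def grad(theta):
    # g(theta) = A theta for this quadratic
    return A @ theta

def newton_step(theta, alpha=1.0):
    # Update rule (1): theta - alpha * H^{-1} g(theta),
    # computed via a linear solve instead of forming H^{-1} explicitly.
    H = A
    return theta - alpha * np.linalg.solve(H, grad(theta))

theta = np.array([4.0, -2.0])
theta = newton_step(theta)   # with alpha = 1, a full Newton step
print(np.allclose(theta, 0.0))  # True: one step lands at the minimum
```

With $\alpha=1$ the step solves the quadratic exactly in one iteration, which is why [1] discusses how the *optimal* step size along each eigendirection depends on the corresponding eigenvalue.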

Quotation from [1]: "if we project the gradient in the basis of eigenvectors, we get:"
$$g(\theta)=\sum_{i=1}^N\left[g(\theta)^Tv_i\right]v_i\tag{2}$$
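A quick numerical check of (2), reading $g(\theta)$ as an $N$-dimensional gradient *vector* and the $v_i$ as the orthonormal eigenvectors of a symmetric Hessian $H$ (the setting of [1]); the variable names below are my own:

```python
import numpy as np

# Check identity (2): g = sum_i [g^T v_i] v_i, where the v_i are the
# orthonormal eigenvectors of a symmetric matrix H.
rng = np.random.default_rng(0)
N = 4
M = rng.standard_normal((N, N))
H = (M + M.T) / 2                  # symmetric, so eigh returns orthonormal v_i
g = rng.standard_normal(N)         # gradient as an N-vector

eigvals, V = np.linalg.eigh(H)     # columns of V are v_1 ... v_N
g_reconstructed = sum((g @ V[:, i]) * V[:, i] for i in range(N))

print(np.allclose(g, g_reconstructed))  # True: the eigenbasis is complete
```

The identity holds because the orthonormal eigenvectors of a symmetric matrix form a complete basis of $\mathbb{R}^N$, so summing the projections onto all of them reconstructs the vector.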
I cannot understand (2), so I designed a simple example:

let $g(\theta)=\begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}$

Then its eigenvalues and eigenvectors are:
$\lambda_{1}=2+\sqrt{3},v_1=(\sqrt{3},1)$

$\lambda_2=2-\sqrt{3},v_2 =(-\sqrt{3},1)$
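These eigenpairs can be verified numerically; a small NumPy check (`A` below is the 2x2 example matrix above):

```python
import numpy as np

# Verify the hand-computed eigenpairs of the 2x2 example matrix.
A = np.array([[2.0, 3.0],
              [1.0, 2.0]])
lam1, lam2 = 2 + np.sqrt(3), 2 - np.sqrt(3)
v1 = np.array([np.sqrt(3), 1.0])
v2 = np.array([-np.sqrt(3), 1.0])

print(np.allclose(A @ v1, lam1 * v1))  # True: A v1 = lambda1 v1
print(np.allclose(A @ v2, lam2 * v2))  # True: A v2 = lambda2 v2
```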

$g(\theta)^T=\begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix}$

The dimension of $g(\theta)^T$ is 2x2, which is NOT compatible with $v_1$ and $v_2$, so $[g(\theta)^Tv_i]v_i$ in (2) can NOT be computed.

Could you tell me where I am wrong?
Thanks!

References:
[1] Negative eigenvalues of the Hessian in deep neural networks
