Here is part of Appendix C, “optimal step sizes”, of:
Quotation: “if we project the gradient in the basis of eigenvectors, we get:”
I cannot understand (2), so I designed a simple example:
Its eigenvalues and eigenvectors are:
The dimension of is 2x2, which is NOT compatible with and , so
in (2) can NOT be computed.
Could you tell me where I am wrong?
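For checking the shapes numerically, here is a minimal NumPy sketch. It assumes the quoted projection is the usual expansion of the gradient as a sum of (g^T v_i) v_i over the eigenvectors v_i. The matrix `H` and the vector `g` are made-up values for illustration only, not taken from the paper or from my example. In this reading, the 2x2 array holds the eigenvectors as columns, each v_i is a 2-vector, and each g^T v_i is a scalar.

```python
import numpy as np

# Hypothetical symmetric 2x2 Hessian and gradient, chosen only for illustration.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])
g = np.array([3.0, -1.0])

# eigh is meant for symmetric matrices; column V[:, i] is the
# eigenvector that belongs to eigvals[i].
eigvals, V = np.linalg.eigh(H)

# Each coefficient g^T v_i is a scalar, so coeffs has shape (2,).
coeffs = V.T @ g

# Summing coefficient * eigenvector rebuilds the gradient.
g_rebuilt = sum(coeffs[i] * V[:, i] for i in range(len(coeffs)))

print(V.shape, V[:, 0].shape, coeffs.shape)  # (2, 2) (2,) (2,)
print(np.allclose(g_rebuilt, g))             # True
```

If the shapes are read this way, the 2x2 object is the whole matrix of eigenvectors, while the sum only ever uses one column at a time.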
Negative eigenvalues of the Hessian in deep neural networks