Convergence rate and Hessian spectra

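  • Remember: if the Hessian is positive semidefinite everywhere, the function is convex (positive definite everywhere implies strictly convex) => negative eigenvalues are bad, since they mark directions of negative curvature where the loss is non-convex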
  • Large eigenvalues of the
    Metrics for flatness: Some metrics, such as the maximum Hessian eigenvalue, measure the worst-case loss increase under an adversarial perturbation to the weights [10, 16], while other proposed metrics, such as the Hessian trace, measure the expected loss increase under random perturbations to the weights.
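
To make the two flatness metrics concrete, here is a minimal sketch that estimates both from Hessian-vector products alone. It rests on a second-order approximation around a minimum (gradient zero): for a perturbation δ with ‖δ‖ ≤ ρ, the worst-case loss increase is about ½ ρ² λ_max(H), and for δ ~ N(0, σ² I) the expected increase is about ½ σ² tr(H). The 3×3 matrix H, the helper names (hvp, max_eigenvalue, hutchinson_trace), and the choice of power iteration and Hutchinson's estimator are all assumptions made for illustration and are not taken from the notes. For a real network, hvp would come from automatic differentiation rather than an explicit matrix.

```python
import math
import random

# Hypothetical symmetric positive definite Hessian of a toy loss at a minimum.
H = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 0.5]]

def hvp(v):
    """Hessian-vector product H @ v (stand-in for an autodiff HVP)."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def max_eigenvalue(hvp_fn, dim, iters=200, seed=0):
    """Power iteration: converges to the largest-magnitude eigenvalue,
    which equals the maximum eigenvalue when the Hessian is positive definite."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        hv = hvp_fn(v)
        norm = math.sqrt(dot(hv, hv))
        v = [x / norm for x in hv]
    return dot(v, hvp_fn(v))  # Rayleigh quotient of the unit vector v

def hutchinson_trace(hvp_fn, dim, samples=2000, seed=0):
    """Hutchinson estimator: E[z^T H z] = tr(H) for Rademacher vectors z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        total += dot(z, hvp_fn(z))
    return total / samples

lam_max = max_eigenvalue(hvp, 3)      # sharpness: worst-case direction
trace_est = hutchinson_trace(hvp, 3)  # average curvature over random directions

rho, sigma = 0.1, 0.1
worst_case_increase = 0.5 * rho**2 * lam_max
expected_increase = 0.5 * sigma**2 * trace_est
print(lam_max, trace_est, worst_case_increase, expected_increase)
```

For this toy matrix the exact values are λ_max = (7 + √5)/2 and tr(H) = 7.5. The power-iteration estimate should match λ_max closely, while the Hutchinson estimate is only accurate up to sampling noise. Both routines touch the Hessian only through hvp, so they still work when the full matrix is too large to form.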

[Figure: hessian_spectrum_vit_resnet.png]