Hi, a question about the inconsistency of the dynamics of tr(FIM) and tr(H). #11

wizard1203 · 2021-12-11T02:36:21Z

Hi, thanks for your awesome work!

I noticed that in the paper "PyHessian: Neural Networks Through the Lens of the Hessian", tr(H) keeps increasing during training.

In the paper "Hessian-based Analysis of Large Batch Training and Robustness to Adversaries", the dominant eigenvalue of the Hessian w.r.t. the weights can decrease during small-batch training.

And in the paper "Critical Learning Periods in Deep Networks", the trace of the FIM first increases and then decreases.

Is there a relationship between these quantities? Are these results inconsistent with each other?
