Activity–weight duality in feed-forward neural networks reveals two co-determinants for generalization. Yu Feng, Wei Zhang, et al. Nature Machine Intelligence, 2023.
Phases of learning dynamics in artificial neural networks in the absence or presence of mislabeled data. Yu Feng, Yuhai Tu. Machine Learning: Science and Tech., 2021.
The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima. Yu Feng, Yuhai Tu. PNAS, 2021.