Workshop paper

Energy Minimization for Training Dense Associative Memory

Abstract

Dense Associative Memories (DenseAMs) are modern generalizations of Hopfield networks with high-capacity, energy-based retrieval dynamics, but the best training principle for these models remains unclear. Contrastive divergence (CD) is theoretically well motivated but requires expensive iterative negative sampling, while backpropagating a reconstruction loss through long inference trajectories is also costly and does not directly leverage the explicit energy objective. Inspired by the Hebbian learning rule in classical Hopfield networks, we propose training DenseAMs by direct energy minimization. For DenseAMs with translation-invariant kernel energies, we show that the partition function is independent of the memory parameters, so maximum likelihood estimation (MLE) reduces exactly to minimizing the energy of the data. This yields a sampling-free training rule that preserves an explicit energy formulation. We demonstrate the method in both ambient and latent space, where a stop-gradient coupling with an autoencoder enables stable joint training and memory synthesis from latent noise.
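To see how the partition-function claim can hold, here is one illustrative derivation. It assumes a log-sum-exp energy at unit temperature over a translation-invariant kernel K with memories \rho_i; this specific form and notation are an assumption for exposition, not necessarily the exact energy used in the paper.

```latex
% Assumed energy: log-sum-exp over a translation-invariant kernel K
% with memory vectors \rho_1, \dots, \rho_N (notation assumed here).
E(x) \;=\; -\log \sum_{i=1}^{N} e^{-K(x - \rho_i)}
\qquad\Longrightarrow\qquad
Z \;=\; \int e^{-E(x)} \, dx
  \;=\; \sum_{i=1}^{N} \int e^{-K(x - \rho_i)} \, dx
  \;=\; N \int e^{-K(u)} \, du ,
% by the change of variables u = x - \rho_i in each term.
```

Under this assumption, Z depends only on the kernel K and the number of memories N, not on the memory locations \{\rho_i\}; since -\log p(x) = E(x) + \log Z, maximizing likelihood over the memories is then exactly minimizing the mean energy of the data.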
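The resulting sampling-free rule can be sketched numerically. The following is a minimal NumPy illustration, assuming a Gaussian (translation-invariant) kernel and a log-sum-exp energy; the energy form, learning rate, and toy data are all assumptions for demonstration, not the paper's actual setup.

```python
import numpy as np

def energy(x, memories):
    """Log-sum-exp energy with a translation-invariant Gaussian kernel:
    E(x) = -log sum_i exp(-||x - rho_i||^2 / 2)."""
    d2 = np.sum((x[None, :] - memories) ** 2, axis=1) / 2.0
    m = d2.min()  # stabilize the log-sum-exp
    return float(m - np.log(np.sum(np.exp(-(d2 - m)))))

def energy_grad(x, memories):
    """Gradient w.r.t. the memories: dE/d rho_i = -softmax(-d2)_i * (x - rho_i)."""
    d2 = np.sum((x[None, :] - memories) ** 2, axis=1) / 2.0
    w = np.exp(-(d2 - d2.min()))
    w /= w.sum()
    return -w[:, None] * (x[None, :] - memories)

rng = np.random.default_rng(0)
data = np.array([1.0, -1.0]) + 0.1 * rng.normal(size=(64, 2))  # toy data cluster
memories = rng.normal(size=(4, 2))  # randomly initialized memory vectors

mean_energy = lambda m: float(np.mean([energy(x, m) for x in data]))
e_before = mean_energy(memories)
for _ in range(200):
    g = np.mean([energy_grad(x, memories) for x in data], axis=0)
    memories -= 0.5 * g  # descend the data energy; no negative samples needed
e_after = mean_energy(memories)
```

Because the partition function does not depend on the memories under this energy, the loop above needs no negative phase: descending the data energy alone is (up to a constant) descending the negative log-likelihood.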