Soft-Masked Diffusion Language Models
Michael Hersche, Samuel Moor, et al.
ICLR 2026
The ability to generalize compositionally is central to intelligent behavior. While recent work shows that networks can generalize compositionally under certain conditions, most studies focus on simple compositional tasks, such as tasks that are purely linguistic (or otherwise unimodal) or that lack temporal structure. Here we investigate how representational dynamics shape compositional generalization in recurrent neural network (RNN) models during cognitive tasks with evolving temporal structure, providing insight into neural computation during flexible reasoning. We trained RNNs on the Concrete Permuted Rule Operations (C-PRO) task, a compositional cognitive task established for humans that requires integrating information across task phases, and assessed how different learning regimes shaped generalization and representational dynamics. By systematically varying model initializations, we generated RNNs spanning a wide range of compositional generalization performance, from 38% to 90%. Analysis of high-performing models revealed nontrivial temporal dynamics in their task representations, highlighting the importance of selectively engaging the right features at the appropriate task phase for generalization. Our findings show that successful compositional generalization requires orchestrating structured intermediate representations that are dynamically composed, yielding complex, feature-specific representational dynamics and providing testable principles for how neural systems enable flexible reasoning.
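The following is a minimal sketch of this experimental logic, assuming a PyTorch setup: a vanilla RNN receives rule cues in an early phase and a stimulus in a later phase, the recurrent initialization gain is swept to produce models with different generalization behavior, and accuracy is measured on held-out rule combinations. The task encoding, network size, gain values, and helper names (make_trial, RuleRNN) are illustrative assumptions, not the paper's actual C-PRO configuration.

```python
# Sketch: train RNNs on a toy compositional rule task under varied
# initialization gains; evaluate on held-out rule combinations.
# Sizes, encoding, and gains are illustrative assumptions.
import itertools
import torch
import torch.nn as nn

N_RULE_A, N_RULE_B, HIDDEN, T_CUE, T_STIM = 4, 4, 64, 5, 5

def make_trial(a, b):
    """One trial: rule cues in an early phase, a stimulus in a later phase."""
    x = torch.zeros(T_CUE + T_STIM, N_RULE_A + N_RULE_B + 2)
    x[:T_CUE, a] = 1.0                      # cue for rule A
    x[:T_CUE, N_RULE_A + b] = 1.0           # cue for rule B
    stim = torch.randint(0, 2, (1,)).item()
    x[T_CUE:, -2 + stim] = 1.0              # one-hot stimulus
    y = (stim + a + b) % 2                  # toy composed rule -> binary target
    return x, torch.tensor(y)

class RuleRNN(nn.Module):
    def __init__(self, gain):
        super().__init__()
        self.rnn = nn.RNN(N_RULE_A + N_RULE_B + 2, HIDDEN, batch_first=True)
        # Systematically vary the recurrent-weight initialization gain.
        nn.init.orthogonal_(self.rnn.weight_hh_l0, gain=gain)
        self.readout = nn.Linear(HIDDEN, 2)

    def forward(self, x):
        h, _ = self.rnn(x)
        return self.readout(h[:, -1])       # decision from the final time step

combos = list(itertools.product(range(N_RULE_A), range(N_RULE_B)))
train_combos = [c for c in combos if (c[0] + c[1]) % 4 != 0]
test_combos = [c for c in combos if c not in train_combos]  # held-out compositions

def evaluate(model, cs, n=50):
    with torch.no_grad():
        correct = 0
        for _ in range(n):
            a, b = cs[torch.randint(len(cs), (1,)).item()]
            x, y = make_trial(a, b)
            correct += (model(x.unsqueeze(0)).argmax(-1) == y).item()
    return correct / n

for gain in [0.5, 1.0, 1.5]:
    torch.manual_seed(0)
    model = RuleRNN(gain)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(2000):
        a, b = train_combos[torch.randint(len(train_combos), (1,)).item()]
        x, y = make_trial(a, b)
        loss = nn.functional.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        opt.zero_grad(); loss.backward(); opt.step()
    print(f"gain={gain}: held-out compositional accuracy {evaluate(model, test_combos):.2f}")
```

In a fuller analysis one would also record the hidden-state trajectories across the cue and stimulus phases, since that is where the phase-specific engagement of task features described in the abstract would be visible.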