Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, et al.
NeurIPS 2025
Learning molecular representations robust to 3D rotations typically relies on symmetry-aware architectures or extensive augmentation. Here, we show that contrastive multimodal pretraining alone can induce SO(3) invariance in molecular embeddings. We jointly train a 3D electron density encoder, based on a VQGAN, and a SMILES-based transformer encoder on 855k molecules, using CLIP-style and SigLIP objectives to align the volumetric and symbolic modalities. Because SMILES embeddings are rotation-invariant, the contrastive loss implicitly enforces rotation consistency in the 3D encoder. To assess geometric generalization, we introduce a benchmark of 1,000 molecules with five random SO(3) rotations each. Our model retrieves rotated variants with 77% Recall@10 (vs. 9.8% for a unimodal baseline) and organizes the latent space by chemical properties, achieving per-functional-group Recall@10 above 98% and a Davies–Bouldin index of 2.35 (vs. 34.46 for the baseline). Fine-tuning with rotated data reveals a trade-off between retrieval precision and pose diversity. These results demonstrate that contrastive multimodal pretraining can yield symmetry-aware molecular representations without explicit equivariant design.
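The alignment mechanism the abstract describes can be sketched as a symmetric CLIP-style contrastive objective: embeddings from the two encoders are L2-normalized, a pairwise similarity matrix is scored against the identity matching, and cross-entropy is applied in both retrieval directions. This is a minimal NumPy sketch of that standard loss, not the paper's actual implementation; the function name and temperature value are illustrative assumptions.

```python
import numpy as np

def clip_style_loss(z_3d, z_smiles, temperature=0.07):
    """Symmetric InfoNCE loss over two modality embeddings.

    z_3d:     (B, D) embeddings, e.g. from a 3D density encoder (hypothetical).
    z_smiles: (B, D) embeddings, e.g. from a SMILES transformer (hypothetical).
    Row i of each matrix is assumed to describe the same molecule.
    """
    # L2-normalize so similarities are cosine similarities.
    z_3d = z_3d / np.linalg.norm(z_3d, axis=1, keepdims=True)
    z_smiles = z_smiles / np.linalg.norm(z_smiles, axis=1, keepdims=True)

    # (B, B) similarity matrix; the diagonal holds the matching pairs.
    logits = z_3d @ z_smiles.T / temperature
    n = logits.shape[0]

    def xent_diag(l):
        # Cross-entropy with the diagonal (index i matches index i) as targets.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the 3D->SMILES and SMILES->3D directions.
    return (xent_diag(logits) + xent_diag(logits.T)) / 2
```

Because the SMILES branch produces the same embedding for every rotated copy of a molecule, minimizing this loss pulls all rotated 3D inputs toward one shared target, which is the implicit rotation-consistency pressure the abstract refers to.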
Rares Christian, Pavithra Harsha, et al.
NeurIPS 2025
Saiteja Utpala, Alex Gu, et al.
NAACL 2024
Tian Gao, Amit Dhurandhar, et al.
NeurIPS 2025