Vittorio Castelli, Lawrence Bergman, et al.
Knowledge-Based Systems
We present a multimodal, multitask foundation model for automated interpretation of IR and NMR spectra toward molecular structure elucidation. The model jointly ingests NMR and IR spectra (optionally alongside molecular formula information) and generates the corresponding structure as a SMILES string. To overcome the scarcity of paired experimental datasets, we pretrain at scale on simulated multimodal spectra and then fine-tune on a smaller set of experimental measurements. A multitask formulation (predicting from each single modality as well as from combined inputs) forces the network to learn modality-specific cues and their synergies while remaining robust to missing modalities. Across experimental benchmarks, the approach achieves up to 96% Top-1 accuracy and performs on par with expert human chemists. We further show that incorporating unpaired spectral data can improve performance, offering a practical route to leverage heterogeneous laboratory archives. Overall, multimodal foundation models provide a scalable path to faster, more accurate, and more reproducible spectroscopic interpretation in routine chemical workflows.
Jinho Hwang, Wei Zhang, et al.
CLOUD 2014
Shuang Chen, Herbert Freeman
International Journal of Pattern Recognition and Artificial Intelligence
Robert Farrell, Rajarshi Das, et al.
AAAI-SS 2010