Multi-modal contrastive learning for chemical structure elucidation with VibraCLIP

Author

Rocabert-Oriols, Pau

Lo Conte, Camilla

López, Núria

Heras-Domingo, Javier

Publication date

2025-11-11



Abstract

Identifying molecular structures from vibrational spectra is central to chemical analysis but remains challenging due to spectral ambiguity and the limitations of single-modality methods. While deep learning has advanced various spectroscopic characterization techniques, leveraging the complementary nature of infrared (IR) and Raman spectroscopies remains largely underexplored. We introduce VibraCLIP, a contrastive learning framework that embeds molecular graphs, IR and Raman spectra into a shared latent space. A lightweight fine-tuning protocol ensures generalization from theoretical to experimental datasets. VibraCLIP enables accurate, scalable, and data-efficient molecular identification, linking vibrational spectroscopy with structural interpretation. This tri-modal design captures rich structure–spectra relationships, achieving Top-1 retrieval accuracy of 81.7% and reaching 98.9% Top-25 accuracy with molecular mass integration. By integrating complementary vibrational spectroscopic signals with molecular representations, VibraCLIP provides a practical framework for automated spectral analysis, with potential applications in fields such as synthesis monitoring, drug development, and astrochemical detection.
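The Top-1 and Top-25 figures quoted in the abstract are retrieval accuracies: a spectrum embedding is used as a query, candidate molecule embeddings are ranked by similarity, and a hit is counted when the true molecule appears among the first k candidates. The abstract does not state how similarity is scored, so the minimal sketch below assumes cosine similarity in the shared latent space. The function names and the toy vectors are hypothetical illustrations, not the authors' code or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_accuracy(spectrum_embs, molecule_embs, k):
    """Fraction of queries whose true match (same index) ranks in the top k.

    Assumes spectrum_embs[i] and molecule_embs[i] describe the same molecule.
    Cosine similarity is an assumption; the abstract does not name the metric.
    """
    hits = 0
    for i, query in enumerate(spectrum_embs):
        scores = [cosine(query, cand) for cand in molecule_embs]
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(spectrum_embs)


# Toy, made-up 2-D embeddings (not data from the paper).
molecules = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
spectra = [[0.9, 0.1], [0.8, 0.6], [-0.9, 0.2]]

top1 = top_k_accuracy(spectra, molecules, k=1)  # 2 of 3 queries hit
top2 = top_k_accuracy(spectra, molecules, k=2)  # all 3 queries hit
```

In this toy case the second spectrum is closer to the first molecule than to its own, so it misses at k=1 and is recovered at k=2, which is why accuracy can only rise as k grows.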

Document Type

Article

Document version

Published version

Language

English

CDU Subject

54 - Chemistry. Crystallography. Mineralogy

Subject

Chemistry

Pages

10 p.

Publisher

RSC

Grant Agreement Number

Institute of Chemical Research of Catalonia (ICIQ) Summer Fellow Program

Department of Research and Universities of the Generalitat de Catalunya (grant reference: SGR-01155)

PID2024-157556OB-I00 funded by MICIU/AEI/10.13039/501100011033/FEDER, UE

Documents

d5dd00269a.pdf

1.046 MB

Rights

Attribution 4.0 International
