Multi-modal contrastive learning for chemical structure elucidation with VibraCLIP

Author

Rocabert-Oriols, Pau

Lo Conte, Camilla

López, Núria

Heras-Domingo, Javier

Publication date

2025-11-11



Abstract

Identifying molecular structures from vibrational spectra is central to chemical analysis but remains challenging due to spectral ambiguity and the limitations of single-modality methods. While deep learning has advanced various spectroscopic characterization techniques, leveraging the complementary nature of infrared (IR) and Raman spectroscopies remains largely underexplored. We introduce VibraCLIP, a contrastive learning framework that embeds molecular graphs, IR and Raman spectra into a shared latent space. A lightweight fine-tuning protocol ensures generalization from theoretical to experimental datasets. VibraCLIP enables accurate, scalable, and data-efficient molecular identification, linking vibrational spectroscopy with structural interpretation. This tri-modal design captures rich structure–spectra relationships, achieving Top-1 retrieval accuracy of 81.7% and reaching 98.9% Top-25 accuracy with molecular mass integration. By integrating complementary vibrational spectroscopic signals with molecular representations, VibraCLIP provides a practical framework for automated spectral analysis, with potential applications in fields such as synthesis monitoring, drug development, and astrochemical detection.
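The shared latent space and Top-k retrieval described in the abstract can be illustrated with a minimal NumPy sketch. It is not the VibraCLIP implementation: the abstract does not specify the loss, so the symmetric InfoNCE (CLIP-style) objective, the equal averaging over the three modality pairs, the temperature of 0.07, and all function names below are assumptions made for illustration. The inputs stand in for encoder outputs, where row i of each array belongs to the same molecule. The molecular mass integration mentioned in the abstract is not modeled.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(a, b, temperature=0.07):
    """CLIP-style contrastive loss for two modalities with paired rows.

    Matching pairs sit on the diagonal of the similarity matrix; the loss
    is the cross-entropy toward the diagonal, averaged over both directions.
    """
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a
    return 0.5 * (loss_ab + loss_ba)


def tri_modal_loss(graph, ir, raman, temperature=0.07):
    """Average the pairwise losses over the three modality pairs (assumed weighting)."""
    return (
        symmetric_info_nce(graph, ir, temperature)
        + symmetric_info_nce(graph, raman, temperature)
        + symmetric_info_nce(ir, raman, temperature)
    ) / 3.0


def top_k_accuracy(query, candidates, k):
    """Fraction of queries whose true match (same row index) ranks in the top k."""
    sims = l2_normalize(query) @ l2_normalize(candidates).T
    top_k = np.argsort(-sims, axis=1)[:, :k]
    hits = (top_k == np.arange(len(query))[:, None]).any(axis=1)
    return hits.mean()
```

In this sketch, a spectrum embedding would be passed as `query` and the embeddings of the candidate molecular graphs as `candidates`; Top-1 and Top-25 accuracy correspond to `k=1` and `k=25`.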

Document type

Article

Document version

Published version

Language

English

UDC subjects

54 - Chemistry

Keywords

Chemistry

Pages

10 p.

Published by

RSC

Grant agreement number

Institute of Chemical Research of Catalonia (ICIQ) Summer Fellow Program

Department of Research and Universities of the Generalitat de Catalunya for funding through grant (reference: SGR-01155)

PID2024-157556OB-I00 funded by MICIU/AEI/10.13039/501100011033/FEDER, UE

Documents

d5dd00269a.pdf

1.046 MB

Rights

Attribution 4.0 International

This item appears in the following collection(s)

Papers [1240]