Multi-modal contrastive learning for chemical structure elucidation with VibraCLIP

Author

Rocabert-Oriols, Pau

Lo Conte, Camilla

López, Núria

Heras-Domingo, Javier

Publication date

2025-11-11



Abstract

Identifying molecular structures from vibrational spectra is central to chemical analysis but remains challenging due to spectral ambiguity and the limitations of single-modality methods. While deep learning has advanced various spectroscopic characterization techniques, leveraging the complementary nature of infrared (IR) and Raman spectroscopies remains largely underexplored. We introduce VibraCLIP, a contrastive learning framework that embeds molecular graphs, IR and Raman spectra into a shared latent space. A lightweight fine-tuning protocol ensures generalization from theoretical to experimental datasets. VibraCLIP enables accurate, scalable, and data-efficient molecular identification, linking vibrational spectroscopy with structural interpretation. This tri-modal design captures rich structure–spectra relationships, achieving Top-1 retrieval accuracy of 81.7% and reaching 98.9% Top-25 accuracy with molecular mass integration. By integrating complementary vibrational spectroscopic signals with molecular representations, VibraCLIP provides a practical framework for automated spectral analysis, with potential applications in fields such as synthesis monitoring, drug development, and astrochemical detection.
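The abstract describes two ideas that a short sketch can make concrete: aligning three modalities (molecular graph, IR spectrum, Raman spectrum) in one latent space with a contrastive objective, and scoring the result by Top-k retrieval. The NumPy code below is only an illustration under stated assumptions, not the authors' implementation: the symmetric InfoNCE (CLIP-style) loss, the equal-weight sum over the three modality pairs, the temperature of 0.07, and all function names are hypothetical choices made here, and the encoders that would produce the embeddings are omitted.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE loss for two batches of paired embeddings.

    Row i of `a` and row i of `b` are assumed to describe the same molecule.
    """
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b direction
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a direction
    return 0.5 * (loss_ab + loss_ba)


def tri_modal_loss(graph, ir, raman, temperature=0.07):
    """Assumption: equal-weight sum of the three pairwise contrastive losses."""
    return (
        clip_loss(graph, ir, temperature)
        + clip_loss(graph, raman, temperature)
        + clip_loss(ir, raman, temperature)
    )


def top_k_accuracy(query, candidates, k):
    """Fraction of queries whose true match (same row index) ranks in the top k."""
    sims = l2_normalize(query) @ l2_normalize(candidates).T
    top_k = np.argsort(-sims, axis=1)[:, :k]
    hits = (top_k == np.arange(len(query))[:, None]).any(axis=1)
    return hits.mean()
```

In this sketch, a spectrum embedding would serve as the query and the graph embeddings as the candidate pool, so Top-1 accuracy counts how often the correct molecule is the nearest neighbour and Top-25 how often it appears among the 25 nearest.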

Document type

Article

Document version

Published version

Language

English

UDC subjects

54 - Chemistry

Keywords

Chemistry

Pages

10 p.

Published by

RSC

Grant agreement number

Institute of Chemical Research of Catalonia (ICIQ) Summer Fellow Program

Department of Research and Universities of the Generalitat de Catalunya (grant reference: SGR-01155)

PID2024-157556OB-I00 funded by MICIU/AEI/10.13039/501100011033/FEDER, UE

Documents

d5dd00269a.pdf

1.046 MB

Rights

Attribution 4.0 International


This item appears in the following collection(s)

Papers [1240]