Title:
Correspondence between audio and visual deep models for musical instrument detection in video recordings

Author(s):
Slizovskaia, Olga; Gómez Gutiérrez, Emilia, 1975-; Haro Ortega, Gloria

Note:
Paper presented at the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), held 23-27 October 2017 in Suzhou, China.

Abstract:
This work investigates cross-modal connections between audio and video sources in the task of musical instrument recognition. We also address the understanding of the representations learned by convolutional neural networks (CNNs), and we study feature correspondences between the audio and visual components of a multimodal CNN architecture. For each instrument category, we select the most activated neurons and investigate cross-correlations between neurons from the audio and video CNNs that activate for the same instrument category. We analyse two training schemes for multimodal applications and perform a comparative analysis and visualisation of model predictions.

Acknowledgements:
This work is partly supported by the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502). We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan X GPU, and the WiMIR society for covering the registration expenses.

Rights:
© Olga Slizovskaia, Emilia Gomez, Gloria Haro. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Olga Slizovskaia, Emilia Gomez, Gloria Haro. “Correspondence between audio and visual deep models for musical instrument detection in video recordings”, 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.
https://creativecommons.org/licenses/by/4.0/

Document type:
Conference object; Article - Published version

Publisher:
International Society for Music Information Retrieval (ISMIR)