Title:
|
Timbre analysis of music audio signals with convolutional neural networks
|
Author:
|
Pons Puig, Jordi; Slizovskaia, Olga; Gómez Gutiérrez, Emilia, 1975-; Serra, Xavier
|
Abstract:
|
Paper presented at EUSIPCO 2017: 25th European Signal Processing Conference, held from 28 August to 2 September 2017 in Kos, Greece. |
Abstract:
|
The focus of this work is to study how to efficiently tailor Convolutional Neural Networks (CNNs) towards learning timbre representations from log-mel magnitude spectrograms. We first review the trends in designing CNN architectures. Through this literature overview we discuss the crucial points to consider for efficiently learning timbre representations using CNNs. From this discussion we propose a design strategy meant to capture the relevant time-frequency contexts for learning timbre, which permits using domain knowledge for designing architectures. In addition, one of our main goals is to design efficient CNN architectures, which reduces the risk of these models over-fitting, since the number of CNN parameters is minimized. Several architectures based on the design principles we propose are successfully assessed for different research tasks related to timbre: singing voice phoneme classification, musical instrument recognition and music auto-tagging. |
Abstract:
|
This work is partially supported by: the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502), the CompMusic project (ERC grant agreement 267583) and the CASAS Spanish research project (TIN2015-70816-R). |
Subject(s):
|
-Artificial intelligence -- Musical applications -Sound -- Computer processing |
Rights:
|
© EURASIP. The paper is provided after EURASIP's permission.
|
Document type:
|
Conference Object Article - Published version |
Published by:
|
European Association for Signal Processing (EURASIP)
|