dc.contributor
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.contributor
Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla
dc.contributor.author
Nogueiras Rodríguez, Albino
dc.contributor.author
Moreno Bilbao, M. Asunción
dc.identifier
Nogueiras, A.; Moreno, M. NaniBD: a set of tools for transcribing and validating speech databases. In: International Conference on Language Resources and Evaluation. "LREC 1998: 1st International Conference on Language Resources and Evaluation: proceedings". Granada: European Language Resources Association (ELRA), 1998.
dc.identifier
https://hdl.handle.net/2117/22969
dc.description.abstract
This paper describes NaniBD, a set of tools designed for
transcribing and validating speech databases, developed at the
Signal Processing Group (GPS) of the Department of Signal
Theory and Communications of the Polytechnic University of
Catalonia (TSC/UPC). It was developed to meet the need for a
revision system to validate and annotate the Spanish corpus of
SpeechDat (II) in the speech processing environment available
at GPS. Despite this, NaniBD is designed as a general-purpose
system that could fit any other database, language or speech
processing system. So far,
the system has been used to revise some 200,000 speech files
from three different corpora. In this paper we focus on the
actual implementation used in the transcription of a Catalan
corpus compatible with the SpeechDat (II) specifications. The
corpus comprises 1000 speakers, each contributing 44 files.
In this application, we use speech-noise detection,
automatic recognition of spontaneous prompts, digit and letter
to text translation and access to an external database in order to
minimise the amount of time spent by human operators in the
revision procedure.
dc.description.abstract
Peer Reviewed
dc.description.abstract
Postprint (published version)
dc.format
application/pdf
dc.publisher
European Language Resources Association (ELRA)
dc.relation
http://www.coli.uni-saarland.de/~regneri/courses/res4cl-07/papers/Nog98b.pdf
dc.rights
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.subject
UPC subject areas::Telecommunications engineering::Radiocommunication and electromagnetic exploration::Remote sensing
dc.subject
UPC subject areas::Telecommunications engineering::Signal processing::Speech and acoustic signal processing
dc.subject
Remote sensing
dc.subject
Automatic speech recognition
dc.subject
Reconeixement automàtic de la parla
dc.title
NaniBD: a set of tools for transcribing and validating speech databases
dc.type
Conference report