Language technologies: question answering in speech transcripts

To access the full-text documents, please follow this link: http://hdl.handle.net/2117/13011

Title:	Language technologies: question answering in speech transcripts
Author:	Turmo Borras, Jorge; Surdeanu, Mihai; Galibert, Olivier; Rosset, Sophie
Other authors:	Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics; Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
Abstract:	The Question Answering (QA) task consists of providing short, relevant answers to natural language questions. Most QA research has focused on extracting information from text sources, providing the shortest relevant text in response to a question. For example, the correct answer to the question "How many groups participate in the CHIL project?" is "16", whereas the response to "Who are the partners in CHIL?" is a list of them. This simple example illustrates the two main advantages of QA over current search engines: first, the input is a natural language question rather than a keyword query; and second, the answer provides the desired information content and not simply a potentially large set of documents or URLs that the user must plow through. One of the aims of the CHIL project was to provide information about what has been said during interactive seminars. Since the information must be located in speech data, the QA systems have to be able to deal with transcripts (manual or automatic) of spontaneous speech. This is a departure from much of the QA research carried out by natural language groups, who have typically developed techniques for written texts, which are assumed to have a correct syntactic and semantic structure. The structure of spoken language is different from that of written language, and some of the anchor points used in processing, such as punctuation, must be inferred and are therefore error prone. Other spoken language phenomena include disfluencies, repetitions, restarts and corrections. When automatic processing is used to create the speech transcripts, an additional challenge is dealing with recognition errors. The response can be a short string, as in text-based QA, or an audio segment containing the response. This chapter summarizes the CHIL efforts devoted to QA for spoken language carried out at UPC and at CNRS-LIMSI.
Research at UPC adapted a QA system developed for written texts to manually and automatically created speech transcripts, whereas at LIMSI an interactive oral QA system developed for the French language was adapted to the English language. CHIL organized the pilot track on Question Answering in Speech Transcripts (QAst), as part of CLEF 2007, in order to compare and evaluate QA technology on both manually and automatically produced transcripts of spontaneous speech.
Abstract:	Peer Reviewed
Subjects:	-UPC subject areas::Computer science::Artificial intelligence::Natural language -Question-answering systems -Natural language processing (Computer science) -Oral question answering -Speech transcripts
Rights:
Document type:	Article - Published version; Chapter or part of a book
Published by:	Springer-Verlag

Related documents

Other documents by the same author

TALP-UPC at TREC 2005: Experiments using voting scheme among three heterogeneous QA systems

Ferrés Domènech, Daniel; Kanaan Izquierdo, Samir; González Pellicer, Edgar; Ageno Pulido, Alicia; Fuentes Fort, Maria; Rodríguez Hontoria, Horacio; Surdeanu, Mihai; Turmo Borras, Jorge

A Bootstrapping architecture for time expression recognition in unlabelled corpora via syntactic-semantic patterns

Poveda Poveda, Jordi; Surdeanu, Mihai; Turmo Borras, Jorge

Using Evolutive Summary Counters for Efficient Cooperative Caching in Search Engines

Domínguez Sal, David; Aguilar Saborit, Josep; Surdeanu, Mihai; Larriba Pey, Josep

SVMs for the temporal expression chunking problem

Poveda Poveda, Jordi; Surdeanu, Mihai

Projective dependency parsing with perceptron

Carreras Pérez, Xavier; Surdeanu, Mihai; Màrquez Villodre, Lluís