Study and comparison of rule-based and statistical Catalan-Spanish machine translation systems

Costa-jussà, Marta R.; Farrús, Mireia; Mariño Acebal, José B.; Fonollosa, José A Rodriguez

Author

Costa-jussà, Marta R.

Farrús, Mireia

Mariño Acebal, José B.

Fonollosa, José A Rodriguez

Publication date

2017-04-27T17:20:23Z

2012

Abstract

Machine translation systems can be classified into rule-based and corpus-based approaches according to their core methodology. Since both paradigms have been widely used in recent years, one aim of the research community is to understand how these systems differ in translation quality. To this end, this paper reports a study and comparison of several Catalan-Spanish machine translation systems: two rule-based and two corpus-based (specifically, statistical) systems, all of them freely available on the web. Translation quality is analysed in two domains: journalistic and medical. The systems are evaluated using standard automatic measures as well as by native human evaluators. In addition to these traditional evaluation procedures, this paper reports a novel linguistic evaluation, which provides information about the errors encountered at the orthographic, morphological, lexical, semantic and syntactic levels. Results show that while rule-based systems perform better at the orthographic and morphological levels, statistical systems tend to commit fewer semantic errors. Furthermore, results show that all the evaluations performed are characterised by some degree of correlation, and that human evaluators tend to be especially critical of semantic and syntactic errors.

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness through the Juan de la Cierva fellowship program. The authors also thank the Barcelona Media Innovation Centre for its support and permission to publish this research.

Document type

Article

Published version

Language

English

Subjects and keywords

Rule-based machine translation; Statistical machine translation; Catalan; Spanish

Published by

Slovak Academy of Sciences

Related documents

Computing and Informatics. 2012;31(2):245-70.

This item appears in the following collection(s)

Recerca: articles, congressos, llibres [21054]
