Efficient finetuning strategies for multilingual neural machine translation

Other authors

Universitat Politècnica de Catalunya. Departament de Ciències de la Computació

Escolano Peinado, Carlos

Publication date

2024-01-25

Abstract

Driven by the ambition to eliminate language barriers worldwide, Machine Translation has become a central area of interest in today's artificial intelligence research. Despite significant advancements, the concentration of research and resources has been predominantly on a highresource languages. This discrepancy in language coverage points out a critical gap in the field. Recent breakthroughs in Machine Translation have seen the emergence of Multilingual Large Pre-Trained models, which have set new benchmarks across the field by enabling low-resource languages benefit from zero-shot translation. However, these models obtain high performances at the cost of requiring huge amounts of data and hardware resources. The focus of this thesis is to explore and formulate a fine-tuning strategy for a multilingual machine translation model, such as M2M100. Specifically, the project aims to extend their linguistic capabilities by incorporating new low-resource languages by fine-tuning specific language adapters using Low-Rank Adaptation methods. To evaluate the performance of the strategy, state-of-the-art techniques and evaluation metrics are employed, considering factors like scalability, catastrophic forgetting and zero-shot translation. The implemented approach has successfully developed a M2M100 language translator in a low-resource context resulting in a SacreBLEU score of 5.6 when just training a 13 % of the parameters while the full fine-tuning methodology reach a score of 7.43. Furthermore, this framework demonstrates its capability to produce more advanced and efficient machine translation models, which can deliver high-quality translations with reduced computational demands.

Document Type

Master thesis

Language

English

Publisher

Universitat Politècnica de Catalunya

Recommended citation

This citation was generated automatically.

Rights

Open Access

This item appears in the following Collection(s)