Evaluation of Interpretability Methods in Extractive NLP

Other authors

Universitat Politècnica de Catalunya. Departament de Ciències de la Computació

Escolano Peinado, Carlos

Ferrando Monsonís, Javier

Publication date

2023-06-30

Abstract

In the context of advanced Natural Language Processing (NLP), this study delves into interpretability, aiming to understand the logic behind models' decision-making processes. It focuses on three NLP tasks: Question Answering (QA), Text Summarization (TS), and Error Detection (ED). BERT and DistilBERT models were fine-tuned for each task. The research introduces tailor-made datasets, with special emphasis on the ED task for misspelling detection to enhance transparency. Additionally, novel faithfulness metrics are introduced to assess interpretability methods. A variety of gradient-based methods and one attention-based method, Aggregation of Layer-wise Token-to-token Interactions (ALTI), were evaluated. ALTI outperformed the gradient-based methods in all tasks, albeit at a higher computational cost. Integrated Gradients (IG) showed the highest variability, performing well on shorter ED sequences but less so on longer QA and TS sequences. The work stresses the need to broaden the task spectrum in interpretability evaluations and the importance of ensuring the robustness of emerging metrics.
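For readers unfamiliar with the Integrated Gradients method mentioned above, the following is a minimal, self-contained sketch of the attribution rule it implements, applied to a toy differentiable function rather than a fine-tuned BERT model. The function `f` and its gradient are illustrative stand-ins, not part of the thesis; in practice the gradient would come from a framework such as PyTorch.

```python
import numpy as np

def integrated_gradients(f, grad_f, x, baseline=None, steps=50):
    """Approximate Integrated Gradients attributions for input x.

    IG_i(x) = (x_i - x'_i) * integral over alpha in [0, 1] of
              d f(x' + alpha * (x - x')) / d x_i,
    approximated here with a midpoint Riemann sum over `steps` points.
    """
    if baseline is None:
        baseline = np.zeros_like(x)
    diff = x - baseline
    # Midpoints of the straight-line path from baseline to input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * diff) for a in alphas])
    return diff * grads.mean(axis=0)

# Toy model: f(x) = sum(x_i^2), with analytic gradient 2x.
f = lambda x: float(np.sum(x ** 2))
grad_f = lambda x: 2.0 * x

x = np.array([1.0, -2.0, 3.0])
attr = integrated_gradients(f, grad_f, x)
```

A useful sanity check is the completeness axiom of IG: the attributions sum to `f(x) - f(baseline)`, which is one way faithfulness of the approximation can be verified. The per-step gradient evaluations along the interpolation path are also the source of the method's computational cost on long sequences.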

Document type

Master thesis

Language

English

Published by

Universitat Politècnica de Catalunya


Rights

Open Access
