Evaluation of Interpretability Methods in Extractive NLP

Other authors

Universitat Politècnica de Catalunya. Departament de Ciències de la Computació

Escolano Peinado, Carlos

Ferrando Monsonís, Javier

Publication date

2023-06-30

Abstract

This study addresses interpretability in Natural Language Processing (NLP), aiming to understand the logic behind models' decision-making. It focuses on three NLP tasks: Question Answering (QA), Text Summarization (TS), and Error Detection (ED). BERT and DistilBERT models were fine-tuned for each task. The research introduces tailor-made datasets, with special emphasis on a misspelling-based dataset for the ED task designed to enhance transparency. In addition, novel faithfulness metrics are introduced to assess interpretability methods. A variety of gradient-based methods and an attention-based method, Aggregation of Layer-wise Token-to-token Interactions (ALTI), were evaluated. ALTI outperformed the gradient-based methods on all tasks, despite its higher computational cost. Integrated Gradients (IG) showed the highest variability, performing well on shorter ED sequences but less so on longer QA and TS sequences. The work stresses the need to broaden the task spectrum in interpretability evaluations and to ensure the robustness of emerging metrics.
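For context, Integrated Gradients (one of the gradient-based methods evaluated above) attributes a model's output to its input features by averaging gradients along a straight-line path from a baseline to the input. The following is a minimal NumPy sketch of that idea; the function names and the toy model are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients attributions with a Riemann sum.

    grad_fn: gradient of the model output w.r.t. its input (hypothetical).
    x, baseline: input and reference point (e.g. zero embeddings).
    """
    alphas = np.linspace(0.0, 1.0, steps)
    # Average the gradients along the straight path from baseline to x.
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0
    )
    # Attribution: elementwise (x - baseline) times the averaged gradient.
    return (x - baseline) * avg_grad

# Toy model f(x) = sum(x**2), whose gradient is 2x.
x = np.array([1.0, 2.0])
attr = integrated_gradients(lambda z: 2 * z, x, np.zeros_like(x))
# For this f, the attributions converge to x**2, i.e. [1.0, 4.0].
```

In practice the gradient would come from a fine-tuned model's embedding layer rather than a closed-form function, but the path-integration step is the same.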

Document Type

Master thesis

Language

English

Publisher

Universitat Politècnica de Catalunya

Rights

Open Access
