Exploring the Capacity of Large Language Models to Assess the Chronic Pain Experience: Algorithm Development and Validation

Amidei, Jacopo; Kaltenbrunner, Andreas; Ferreira de Sa, Jose Gregorio; Albajes, Klara; Nieto, Rubén; Serrat, Mayte

To access the full text documents, please follow this link: https://hdl.handle.net/11351/13053

Author

Amidei, Jacopo

Kaltenbrunner, Andreas

Ferreira de Sa, Jose Gregorio

Albajes, Klara

Nieto, Rubén

Serrat, Mayte

Other authors

Institut Català de la Salut

[Amidei J, Kaltenbrunner A, Ferreira De Sá JG] AI and Data for Society Research Group, Internet Interdisciplinary Institute, Universitat Oberta de Catalunya, Barcelona, Spain. [Nieto R] eHealth Lab Research Group, Faculty of Psychology and Educational Sciences, Universitat Oberta de Catalunya, Barcelona, Spain. [Serrat M] Unitat d’Expertesa en Síndromes de Sensibilització Central, Servei de Reumatologia, Vall d’Hebron Hospital Universitari, Barcelona, Spain. Escola Universitària de Fisioteràpia, Escoles Universitàries Gimbernat, Barcelona, Spain. [Albajes K] Psyclinic Mental Health, Barcelona, Spain

Vall d'Hebron Barcelona Hospital Campus

Publication date

2025-05-08T11:09:14Z

2025

Abstract

Automated assessment; Chronic pain; Large language models

Background: Chronic pain, which affects more than 20% of the global population, severely harms individuals and carries economic ramifications at both the health and social levels. Accordingly, tools that enhance pain assessment can considerably benefit people suffering from pain and society at large. In this context, assessment methods based on individuals’ personal experiences, such as written narratives (WNs), offer relevant insights into pain from a personal perspective. This approach can uncover subjective, intricate, and multifaceted aspects that standardized questionnaires can overlook. However, evaluating WNs can be time-consuming for clinicians. Therefore, a tool that uses WNs while reducing the time required for their evaluation could significantly benefit pain assessment.

Objective: This study is the first evaluation of the potential of applying large language models (LLMs) to assist clinicians in assessing patients’ pain expressed through WNs.

Methods: We performed an experiment based on 43 WNs written by people with fibromyalgia and qualitatively evaluated in a prior study. Focusing on pain severity and disability, we prompted GPT-4 (with the temperature parameter set to 0 or 1) to assign scores to these WNs and to explain each score. We then quantitatively compared GPT-4’s scores with experts’ scores of the same narratives, using statistical measures such as Pearson correlations, root mean squared error, the weighted version of the Gwet agreement coefficient, and Krippendorff α. Additionally, 2 experts specialized in chronic pain conducted a qualitative analysis of the score explanations to assess their accuracy and the potential applicability of GPT’s analysis to future pain narrative evaluations.

Results: Our analysis reveals that GPT-4’s performance in assessing pain narratives was promising. GPT-4 was comparable in terms of agreement with experts (with a weighted percentage agreement higher than 0.95), correlations with standardized measurements (for example, in the range of 0.43 to 0.49 between the Revised Fibromyalgia Impact Questionnaire and GPT-4 with temperature 1), and low error rates (root mean squared error of 1.20 for severity and 1.44 for disability). Moreover, experts generally deemed the ratings provided by GPT-4, as well as the score explanations, to be adequate. However, we observed that GPT has a slight tendency to overestimate pain severity and disability, with a lower SD than the expert estimates.

Conclusions: These findings underline the potential of LLMs in facilitating the assessment of WNs of people with fibromyalgia, offering a novel approach to understanding and evaluating patient pain experiences. Integrating automated assessments through LLMs presents opportunities for streamlining and enhancing the assessment process, paving the way for improved patient care and tailored interventions in the chronic pain management field.
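As an illustration of the quantitative comparison described in the Methods, the sketch below computes two of the named measures, the Pearson correlation and the root mean squared error, between a set of expert scores and a set of model scores. It is a minimal sketch and not the authors' code: the function names and the score lists are hypothetical, and the numbers are made up for demonstration rather than taken from the study. The Gwet coefficient and Krippendorff α are omitted because they need more machinery than fits here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root mean squared error between two equal-length score lists."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical toy scores for five narratives; NOT data from the study.
expert_scores = [6, 7, 5, 8, 4]
model_scores = [7, 7, 6, 8, 5]

print(f"Pearson r: {pearson(expert_scores, model_scores):.2f}")
print(f"RMSE:      {rmse(expert_scores, model_scores):.2f}")
```

A high correlation together with a nonzero error, as in this toy example, is the kind of pattern that would correspond to a rater who ranks narratives similarly to the experts while scoring them slightly higher.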

Universitat Oberta de Catalunya supported this study by covering the fees associated with the publication of this manuscript.

Document Type

Article

Published version

Language

English

Subjects and keywords

Questionnaires; Algorithms; Fibromyalgia; Chronic pain; Pain - Measurement; PHENOMENA AND PROCESSES::Mathematical Concepts::Algorithms; DISEASES::Pathological Conditions, Signs and Symptoms::Signs and Symptoms::Neurologic Manifestations::Pain::Chronic Pain; DISEASES::Musculoskeletal Diseases::Muscular Diseases::Fibromyalgia; ANALYTICAL, DIAGNOSTIC AND THERAPEUTIC TECHNIQUES, AND EQUIPMENT::Diagnosis::Diagnostic Techniques and Procedures::Physical Examination::Neurologic Examination::Pain Measurement; ANALYTICAL, DIAGNOSTIC AND THERAPEUTIC TECHNIQUES, AND EQUIPMENT::Investigative Techniques::Epidemiologic Methods::Data Collection::Surveys and Questionnaires

Publisher

JMIR Publications

Related items

Journal of Medical Internet Research;27

https://doi.org/10.2196/65903

Rights

Attribution 4.0 International

http://creativecommons.org/licenses/by/4.0/

This item appears in the following Collection(s)

Articles científics - HVH [3439]
