TY - JOUR
T1 - Diagnostic performance of ChatGPT-4.0 in histopathological description analysis of oral and maxillofacial lesions
T2 - a comparative study with pathologists
AU - Cuevas-Nunez, Maria
AU - Silberberg, Valentina Ignacia Alvarez
AU - Arregui, Maria
AU - Jham, Bruno C.
AU - Ballester-Victoria, Rosa
AU - Koptseva, Inessa
AU - de Tejada, María José Biosca Gómez
AU - Posada-Caez, Rodolfo
AU - Manich, Victor Gil
AU - Bara-Casaus, Javier
AU - Fernández-Figueras, Maria Teresa
N1 - Publisher Copyright: © 2024 Elsevier Inc.
PY - 2025/4
Y1 - 2025/4
N2 - Objective: To evaluate the diagnostic performance of ChatGPT-4.0 in histopathological diagnoses of oral and maxillofacial lesions and compare its performance with pathologists. Study Design: A retrospective analysis of 102 histopathological descriptions was conducted. Data, including site, age and sex, were anonymized from the General University Hospital's Department of Pathology. ChatGPT-4.0 provided diagnoses, which were categorized as correct, similar, or different compared to pathologists' diagnoses. Descriptive statistics, Chi-squared tests, correlation, and regression analyses were used to assess accuracy and the influence of age and gender. Results: ChatGPT-4.0 correctly diagnosed 61 out of 102 cases, yielding an accuracy of 59.8%. The distribution of diagnostic scores did not significantly deviate from expectations (Chi-squared Statistic: 0.0, P = 1.0). A moderate negative correlation between age and diagnostic scores (r = −0.33) was observed, with age significantly predicting scores (P = .001). No significant difference was found between genders (P = .26). ChatGPT-4.0 performed worst with granuloma and inflammation cases (100% incorrect) and best with mucocele cases (93.3% correct). Conclusion: ChatGPT-4.0 shows moderate accuracy in histopathological diagnosis of oral and maxillofacial lesions, with performance varying by lesion type. Improvements are needed to enhance its clinical reliability.
UR - https://www.scopus.com/pages/publications/85212868463
U2 - 10.1016/j.oooo.2024.11.087
DO - 10.1016/j.oooo.2024.11.087
M3 - Article
C2 - 39709300
AN - SCOPUS:85212868463
SN - 2212-4403
VL - 139
SP - 453
EP - 461
JO - Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology
JF - Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology
IS - 4
ER -