The immunohistochemical (IHC) evaluation of human epidermal growth factor receptor 2 (HER2) for the diagnosis of breast cancer is still qualitative and subject to a high degree of inter-observer variability, and thus requires complementary techniques such as fluorescence in situ hybridization (FISH) to resolve the diagnosis. Automatic algorithms that classify IHC biomarkers more reliably are crucial for typing the tumor and selecting therapy for each patient. The present study aims to demonstrate that an explainable Machine Learning (ML) model for the classification of HER2 photomicrographs can identify criteria that improve the value of IHC analysis. We trained a supervised ML model based on logistic regression with 393 IHC microscopy images from 131 patients to discriminate between upregulated and normal expression of the HER2 protein. Two sets of labels were used as training outputs: the pathologists' diagnoses (IHC only) and the final diagnosis complemented with FISH (IHC + FISH). Basic performance metrics and receiver operating characteristic curve analysis were used together with an explainability algorithm based on Shapley Additive exPlanations (SHAP) values to understand the differences between the two training schemes. The model discriminated upregulated from normal HER2 expression with better performance when the training output was the IHC + FISH final diagnosis (IHC vs. IHC + FISH: area under the curve, 0.94 vs. 0.81). According to the interpretation of the SHAP values, this may be explained by the greater analytical weight of the membrane distribution criteria relative to the global intensity of the signal. The classification model thus improved its performance when the training output was the final diagnosis, downplaying the weight of the intensity of the IHC signal. This suggests that, to improve pathological diagnosis before FISH consultation, it is necessary to emphasize subcellular patterns of staining.
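The analysis described above (logistic regression, area under the ROC curve, SHAP-based attribution) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the two features (`membrane_pattern` and `global_intensity`) are hypothetical stand-ins for image descriptors, and the helper names are invented for illustration. The attribution uses the closed form that SHAP values take for a linear model when features are treated as independent, expressed in log-odds: the weight times the feature's deviation from its mean.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by full-batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """Area under the ROC curve as the probability that a random positive
    outranks a random negative (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def linear_shap(w, means, x):
    """Per-feature attribution in log-odds for a linear model, assuming
    independent features: w_j * (x_j - mean_j)."""
    return [wj * (xj - mj) for wj, xj, mj in zip(w, x, means)]


# Synthetic data (assumption): the label depends on the hypothetical
# membrane_pattern feature only; global_intensity is pure noise.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
X = [[rng.gauss(2 * y - 1, 1.0), rng.gauss(0.0, 1.0)] for y in labels]

# Simple hold-out split: first 140 samples for training, last 60 for testing.
X_train, y_train = X[:140], labels[:140]
X_test, y_test = X[140:], labels[140:]

w, b = fit_logistic(X_train, y_train)
scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X_test]
auc_value = auc(scores, y_test)

# Global importance: mean absolute attribution per feature over the test set.
means = [sum(col) / len(col) for col in zip(*X_train)]
shap_rows = [linear_shap(w, means, x) for x in X_test]
importance = [sum(abs(v) for v in col) / len(col) for col in zip(*shap_rows)]

print("AUC:", round(auc_value, 3))
print("mean |SHAP| (membrane_pattern, global_intensity):", importance)
```

Running the same procedure twice, once per label vector (IHC only and IHC + FISH), and comparing the resulting AUC values and mean absolute attributions would mirror the kind of comparison reported in the abstract. In practice one would use real image features and established libraries rather than this hand-rolled version.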
Bibliographic note
Publisher Copyright:
© 2023 Spandidos Publications. All rights reserved.