TY - JOUR
T1 - Prediction of molecular subtypes of breast cancer using BI-RADS features based on a “white box” machine learning approach in a multi-modal imaging setting
AU - Wu, Mingxiang
AU - Zhong, Xiaoling
AU - Peng, Quanzhou
AU - Xu, Mei
AU - Huang, Shelei
AU - Yuan, Jialin
AU - Ma, Jie
AU - Tan, Tao
N1 - Publisher Copyright: © 2019
PY - 2019/5
Y1 - 2019/5
N2 - Purpose: To develop and validate an interpretable and repeatable machine learning approach for predicting molecular subtypes of breast cancer from clinical meta-information together with mammography and MRI images. Methods: We retrospectively assessed 363 breast cancer cases (Luminal A 151, Luminal B 96, HER2 76, and BLBC 40). Eighty-two features defined in the BI-RADS lexicon were visually described. A decision tree model with the Chi-squared automatic interaction detector (CHAID) algorithm was applied for feature selection and classification. Ten-fold cross-validation was performed to assess the performance (i.e., accuracy, positive predictive value, sensitivity, and F1-score) of the decision tree model. Results: Seven of the 82 variables were selected by the decision tree and used as features for classifying molecular subtypes: mass margin and calcification on mammography; mass margin, types of kinetic curves in the delayed phase, mass internal enhancement characteristics, and non-mass enhancement distribution on MRI; and breastfeeding history. The decision tree model accuracy was 74.1%. The sensitivity, positive predictive value, and F1-score were, respectively, 79.47%, 75.47%, and 77.42% for Luminal A; 64.58%, 55.86%, and 59.90% for Luminal B; 81.58%, 95.38%, and 87.94% for HER2; and 62.50%, 89.29%, and 73.53% for BLBC. Conclusions: We applied a complete “white box” machine learning method to predict the molecular subtype of breast cancer based on BI-RADS feature descriptions in a multi-modal setting. Combining BI-RADS features from both mammography and MRI improves prediction accuracy and robustness. Because BI-RADS is widely applicable and accepted, the proposed method can be applied broadly regardless of variability in imaging vendors and settings.
KW - Breast cancer
KW - Decision tree
KW - MRI
KW - Machine learning
KW - Mammography
KW - Molecular subtype
UR - http://www.scopus.com/inward/record.url?scp=85063602941&partnerID=8YFLogxK
U2 - 10.1016/j.ejrad.2019.03.015
DO - 10.1016/j.ejrad.2019.03.015
M3 - Article
C2 - 31005170
AN - SCOPUS:85063602941
SN - 0720-048X
VL - 114
SP - 175
EP - 184
JO - European Journal of Radiology
JF - European Journal of Radiology
ER -