TY - GEN
T1 - On the Correlations between Performance of Deep Networks and Its Robustness to Common Image Perturbations in Medical Image Interpretation
AU - Chong, Chak Fong
AU - Fang, Xinyi
AU - Yang, Xu
AU - Luo, Wuman
AU - Wang, Yapeng
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Robustness to common image perturbations is crucial for deep learning models that interpret medical images, because images in clinical applications may come from different institutions and contain perturbations absent from the training data, which degrades interpretation performance. In this paper, we investigate how the robustness of 28 ImageNet models correlates with their performance, under 6 image perturbation types over 10 severity levels, on the CheXpert chest X-ray (CXR) classification dataset. The results demonstrate that: (1) A model with higher ImageNet accuracy tends, after fine-tuning on CheXpert for CXR classification, to be more robust on perturbed CXRs. (2) A model with higher CXR classification performance after fine-tuning on CheXpert is not necessarily more robust on perturbed CXRs; this depends on the severity level of the perturbations. Under stronger perturbations, models with lower CXR performance tend to be more robust instead. (3) Model architecture may be a key factor in robustness. For instance, regardless of model size, EfficientNet and EfficientNetV2 models tend to be more robust, while ResNet models tend to be more vulnerable. Our work can help in selecting or designing robust models for medical image interpretation, improving their suitability for clinical applications.
KW - Chest Radiograph
KW - Image Perturbation
KW - Medical Image Interpretation
KW - Robustness Comparison
UR - http://www.scopus.com/inward/record.url?scp=85185220150&partnerID=8YFLogxK
U2 - 10.1109/DICTA60407.2023.00065
DO - 10.1109/DICTA60407.2023.00065
M3 - Conference contribution
AN - SCOPUS:85185220150
T3 - 2023 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2023
SP - 426
EP - 433
BT - 2023 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2023
Y2 - 28 November 2023 through 1 December 2023
ER -