Abstract
Breast cancer significantly impacts women's health, and ultrasound is crucial for lesion assessment. To enhance diagnostic accuracy, computer-aided detection (CAD) systems have attracted considerable interest. This study introduces a prospective deep learning architecture called “Multi-modal Multi-task Network” (3MT-Net). 3MT-Net combines clinical data, B-mode ultrasound, and color Doppler ultrasound. We designed the AM-CapsNet network, specifically tailored to extract crucial tumor features from ultrasound images. To incorporate clinical data into 3MT-Net, we employ cascaded cross-attention to fuse information from the three distinct sources. To preserve pertinent information when fusing high-dimensional and low-dimensional data, we adopt the idea of ensemble learning and design an optimization algorithm that assigns weights to the different modalities. Finally, 3MT-Net performs binary classification of benign and malignant lesions as well as pathological subtype classification. We retrospectively collected data from nine medical centers. To assess the broad applicability of 3MT-Net, we created two separate test sets and conducted extensive experiments. We also compared 3MT-Net with the industrial-grade CAD product S-Detect. The AUC of 3MT-Net surpasses that of S-Detect by 1.4% to 3.8%.
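The idea of assigning weights to modalities can be illustrated with a minimal sketch. It is not the optimization algorithm of 3MT-Net, which the abstract does not specify. It assumes each modality branch already outputs a malignancy probability per case, and it swaps in a plain grid search over non-negative weights summing to one, scored by validation AUC. The function names (`auc`, `fuse`, `search_weights`) and the toy numbers are hypothetical.

```python
from itertools import product

def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fuse(probs_by_modality, weights):
    """Weighted average of the per-modality probabilities for each case."""
    return [sum(w * p for w, p in zip(weights, case)) for case in zip(*probs_by_modality)]

def search_weights(probs_by_modality, labels, steps=10):
    """Grid search on the weight simplex (assumed stand-in for the paper's optimizer)."""
    best_w, best_auc = None, -1.0
    for combo in product(range(steps + 1), repeat=len(probs_by_modality)):
        if sum(combo) != steps:          # keep only weights that sum to 1
            continue
        w = tuple(c / steps for c in combo)
        score = auc(labels, fuse(probs_by_modality, w))
        if score > best_auc:
            best_w, best_auc = w, score
    return best_w, best_auc

# Toy validation data (made up for illustration): 1 = malignant, 0 = benign.
labels   = [1, 0, 1, 0, 1, 0]
b_mode   = [0.9, 0.4, 0.6, 0.5, 0.7, 0.2]
doppler  = [0.8, 0.3, 0.4, 0.6, 0.9, 0.1]
clinical = [0.6, 0.5, 0.7, 0.3, 0.5, 0.4]

weights, fused_auc = search_weights([b_mode, doppler, clinical], labels)
print(weights, fused_auc)
```

Because the grid includes the corner points where a single modality carries all the weight, the fused AUC on the validation data is never lower than the best single-modality AUC. Whether that gain carries over to held-out data depends on the size of the validation set.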
Original language | English |
---|---|
Pages (from-to) | 1-12 |
Number of pages | 12 |
Journal | IEEE Journal of Biomedical and Health Informatics |
Publication status | Accepted/In press - 2024 |
Keywords
- Breast
- Breast cancer
- ensemble learning
- Feature extraction
- multi-task learning
- multimodal
- Multitasking
- Pathology
- Ultrasonic imaging
- ultrasound imaging
- Vectors