TY - JOUR
T1 - HTC-Net
T2 - A hybrid CNN-transformer framework for medical image segmentation
AU - Tang, Hui
AU - Chen, Yuanbin
AU - Wang, Tao
AU - Zhou, Yuanbo
AU - Zhao, Longxuan
AU - Gao, Qinquan
AU - Du, Min
AU - Tan, Tao
AU - Zhang, Xinlin
AU - Tong, Tong
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2024/2
Y1 - 2024/2
N2 - Automated medical image segmentation is a crucial step in clinical analysis and diagnosis, as it can improve diagnostic efficiency and accuracy. Deep convolutional neural networks (DCNNs) have been widely used in the medical field, achieving excellent results. The high complexity of medical images poses a significant challenge for many networks in balancing local and global information, resulting in unstable segmentation outcomes. To address the challenge, we designed a hybrid CNN-Transformer network to capture both the local and global information. More specifically, deep convolutional neural networks are introduced to exploit the local information. At the same time, we designed a trident multi-layer fusion (TMF) block for the Transformer to fuse contextual information from higher-level (global) features dynamically. Moreover, considering the inherent characteristic of medical image segmentation (e.g., irregular shapes and discontinuous boundaries), we developed united attention (UA) blocks to focus on important feature learning. To evaluate the effectiveness of our proposed approach, we performed experiments on two publicly available datasets, ISIC-2017, and Kvasir-SEG, and compared our results with state-of-the-art approaches. The experimental results demonstrate the superior performance of our approach. The codes are available at https://github.com/Tanghui2000/HTC-Net.
AB - Automated medical image segmentation is a crucial step in clinical analysis and diagnosis, as it can improve diagnostic efficiency and accuracy. Deep convolutional neural networks (DCNNs) have been widely used in the medical field, achieving excellent results. The high complexity of medical images poses a significant challenge for many networks in balancing local and global information, resulting in unstable segmentation outcomes. To address the challenge, we designed a hybrid CNN-Transformer network to capture both the local and global information. More specifically, deep convolutional neural networks are introduced to exploit the local information. At the same time, we designed a trident multi-layer fusion (TMF) block for the Transformer to fuse contextual information from higher-level (global) features dynamically. Moreover, considering the inherent characteristic of medical image segmentation (e.g., irregular shapes and discontinuous boundaries), we developed united attention (UA) blocks to focus on important feature learning. To evaluate the effectiveness of our proposed approach, we performed experiments on two publicly available datasets, ISIC-2017, and Kvasir-SEG, and compared our results with state-of-the-art approaches. The experimental results demonstrate the superior performance of our approach. The codes are available at https://github.com/Tanghui2000/HTC-Net.
KW - Attention
KW - Contextual information
KW - Deep convolutional neural networks
KW - Medical image segmentation
UR - http://www.scopus.com/inward/record.url?scp=85174729668&partnerID=8YFLogxK
U2 - 10.1016/j.bspc.2023.105605
DO - 10.1016/j.bspc.2023.105605
M3 - Article
AN - SCOPUS:85174729668
SN - 1746-8094
VL - 88
JO - Biomedical Signal Processing and Control
JF - Biomedical Signal Processing and Control
M1 - 105605
ER -