HTC-Net: A hybrid CNN-transformer framework for medical image segmentation

Hui Tang, Yuanbin Chen, Tao Wang, Yuanbo Zhou, Longxuan Zhao, Qinquan Gao, Min Du, Tao Tan, Xinlin Zhang, Tong Tong

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Automated medical image segmentation is a crucial step in clinical analysis and diagnosis, as it can improve diagnostic efficiency and accuracy. Deep convolutional neural networks (DCNNs) have been widely used in the medical field, achieving excellent results. However, the high complexity of medical images makes it difficult for many networks to balance local and global information, resulting in unstable segmentation outcomes. To address this challenge, we designed a hybrid CNN-Transformer network that captures both local and global information. More specifically, deep convolutional neural networks are introduced to exploit local information. At the same time, we designed a trident multi-layer fusion (TMF) block for the Transformer to dynamically fuse contextual information from higher-level (global) features. Moreover, considering the inherent characteristics of medical image segmentation (e.g., irregular shapes and discontinuous boundaries), we developed united attention (UA) blocks to focus on learning important features. To evaluate the effectiveness of the proposed approach, we performed experiments on two publicly available datasets, ISIC-2017 and Kvasir-SEG, and compared our results with state-of-the-art approaches. The experimental results demonstrate the superior performance of our approach. The code is available at https://github.com/Tanghui2000/HTC-Net.

Original language: English
Article number: 105605
Journal: Biomedical Signal Processing and Control
Volume: 88
Publication status: Published - Feb 2024

Keywords

  • Attention
  • Contextual information
  • Deep convolutional neural networks
  • Medical image segmentation
