TY - JOUR
T1 - A universal parameter-efficient fine-tuning approach for stereo image super-resolution
AU - Zhou, Yuanbo
AU - Xue, Yuyang
AU - Zhang, Xinlin
AU - Deng, Wei
AU - Wang, Tao
AU - Tan, Tao
AU - Gao, Qinquan
AU - Tong, Tong
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2025/7/1
Y1 - 2025/7/1
AB - Despite advances in the pre-training then fine-tuning strategy for low-level vision tasks, the increasing size of models presents significant challenges for this paradigm, particularly in terms of training time and memory consumption. In addition, unsatisfactory results may occur when pre-trained single-image models are applied directly to a multi-image domain. In this paper, we propose an efficient method for transferring a pre-trained single-image super-resolution transformer network to the domain of stereo image super-resolution (SteISR) using a parameter-efficient fine-tuning approach. Specifically, we introduce stereo adapters and spatial adapters, which are incorporated into the pre-trained single-image super-resolution transformer network. Only the inserted adapters are then trained on stereo datasets. Compared with the classical full fine-tuning paradigm, our method reduces training time and memory consumption by 57% and 15%, respectively. Moreover, it requires training only 4.8% of the original model parameters while achieving state-of-the-art performance on four commonly used SteISR benchmarks. This technology is expected to improve stereo image resolution in fields such as medical imaging and autonomous driving, thereby indirectly enhancing the accuracy of depth estimation and object recognition tasks.
KW - Autonomous driving
KW - Parameter-efficient fine-tuning
KW - Stereo image super-resolution
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=105001823808&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2025.110703
DO - 10.1016/j.engappai.2025.110703
M3 - Article
AN - SCOPUS:105001823808
SN - 0952-1976
VL - 151
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 110703
ER -