Boosting Light Field Spatial Super-Resolution via Masked Light Field Modeling

Da Yang, Hao Sheng, Sizhe Wang, Shuai Wang, Zhang Xiong, Wei Ke

Research output: Article, peer-reviewed

Abstract

Light field (LF) imaging benefits a wide range of applications through the geometric information it captures. However, because sensor resolution is limited, LF cameras sacrifice spatial resolution to obtain sufficient angular resolution. Hence LF spatial super-resolution (LFSSR), which relies heavily on extracting inter-view and intra-view correlations, is widely studied. In this paper, a self-supervised pre-training scheme, named masked LF modeling (MLFM), is proposed to boost the learning of inter-intra view correlation for better super-resolution performance. To achieve this, we first introduce a transformer structure, termed LFormer, to establish direct inter-view correlations inside the 4D LF. Compared with traditional disentangling operations for LF feature extraction, LFormer avoids unnecessary loss in the angular domain. It therefore performs better in learning the cross-view mapping among pixels with MLFM pre-training. Then, by cascading LFormers as the encoder, an LFSSR network, LFormer-Net, is designed, which comprehensively extracts inter-intra view high-frequency information. Finally, LFormer-Net is pre-trained with MLFM by introducing a Spatially-Random Angularly-Consistent Masking (SRACM) module. With a high masking ratio, MLFM pre-training effectively improves the performance of LFormer-Net. Extensive experiments on public datasets demonstrate the effectiveness of MLFM pre-training and LFormer-Net. Our approach outperforms state-of-the-art LFSSR methods numerically and visually on both small- and large-disparity datasets.
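
To make the masking idea concrete, the following is a minimal sketch of one literal reading of "spatially random, angularly consistent": a random subset of spatial patches is hidden, and the same spatial pattern is reused for every angular view. It is not the authors' SRACM implementation, which the abstract does not specify. The function name `sracm_mask`, the patch size, the 0.75 masking ratio, and the (U, V, H, W) array layout are all assumptions chosen for illustration.

```python
import numpy as np


def sracm_mask(u, v, h, w, patch=4, ratio=0.75, seed=0):
    """Hypothetical helper: boolean mask of shape (u, v, h, w).

    True marks a masked pixel. The patch size and ratio are
    illustrative assumptions, not values taken from the paper.
    """
    if h % patch or w % patch:
        raise ValueError("h and w must be multiples of patch")
    rng = np.random.default_rng(seed)
    gh, gw = h // patch, w // patch
    n_patches = gh * gw
    n_masked = int(round(ratio * n_patches))

    # Spatially random: choose which spatial patches to hide.
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True

    # Expand each per-patch decision to patch x patch pixels.
    spatial = flat.reshape(gh, gw).repeat(patch, axis=0).repeat(patch, axis=1)

    # Angularly consistent: reuse the same spatial mask for every view.
    return np.broadcast_to(spatial, (u, v, h, w)).copy()
```

Under these assumptions, a 4D light field array `lf` of shape (U, V, H, W) could be masked with `np.where(mask, 0.0, lf)`, after which a network would be trained to reconstruct the hidden pixels.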

Original language: English
Pages (from-to): 1317-1330
Number of pages: 14
Journal: IEEE Transactions on Computational Imaging
Volume: 10
Publication status: Published - 2024
