Abstract
Light field (LF) imaging benefits a wide range of applications through the geometry information it captures. However, due to restricted sensor resolution, LF cameras sacrifice spatial resolution for sufficient angular resolution. Hence, LF spatial super-resolution (LFSSR), which relies heavily on extracting inter-intra view correlation, is widely studied. In this paper, a self-supervised pre-training scheme, named masked LF modeling (MLFM), is proposed to boost the learning of inter-intra view correlation for better super-resolution performance. To achieve this, we first introduce a transformer structure, termed LFormer, to establish direct inter-view correlations inside the 4D LF. Compared with traditional disentangling operations for LF feature extraction, LFormer avoids unnecessary loss in the angular domain. Therefore, it performs better in learning the cross-view mapping among pixels with MLFM pre-training. Then, by cascading LFormers as the encoder, an LFSSR network, LFormer-Net, is designed, which comprehensively performs inter-intra view high-frequency information extraction. Finally, LFormer-Net is pre-trained with MLFM by introducing a Spatially-Random Angularly-Consistent Masking (SRACM) module. With a high masking ratio, MLFM pre-training effectively improves the performance of LFormer-Net. Extensive experiments on public datasets demonstrate the effectiveness of MLFM pre-training and LFormer-Net. Our approach outperforms state-of-the-art LFSSR methods numerically and visually on both small- and large-disparity datasets.
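The abstract names the SRACM module but does not give its details. The following is a minimal sketch of one plausible reading of "spatially random, angularly consistent" masking: a random set of spatial patches is chosen once, and the same spatial mask is applied to every angular view. The function name `sracm_mask`, the parameters `patch` and `ratio`, the value 0.75 for the "high masking ratio", and zero-filling of masked pixels are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def sracm_mask(lf, patch=4, ratio=0.75, rng=None):
    """Mask a 4D light field of shape (U, V, H, W).

    Hypothetical sketch: patches are picked at random in the spatial
    domain, and the same spatial mask is reused for all U x V views.
    Returns the masked light field and the (H, W) boolean mask
    (True marks a masked pixel).
    """
    rng = np.random.default_rng() if rng is None else rng
    U, V, H, W = lf.shape
    if H % patch or W % patch:
        raise ValueError("H and W must be multiples of the patch size")

    gh, gw = H // patch, W // patch          # patch grid size
    n_patches = gh * gw
    n_masked = int(round(ratio * n_patches))

    # Spatially random: choose which patches to hide, without replacement.
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True

    # Expand the patch-level mask to pixel resolution.
    mask = flat.reshape(gh, gw).repeat(patch, axis=0).repeat(patch, axis=1)

    # Angularly consistent: the (H, W) mask broadcasts over every view.
    masked = np.where(mask, 0.0, lf)         # zero-filling is an assumption
    return masked, mask

# Example: a synthetic 5x5-view light field with 32x32 pixels per view.
lf = np.random.default_rng(0).random((5, 5, 32, 32))
masked_lf, mask = sracm_mask(lf, patch=4, ratio=0.75,
                             rng=np.random.default_rng(1))
```

Under these assumptions, a pre-training objective would ask the network to reconstruct the hidden pixels of `lf` from `masked_lf`. Because the same locations are hidden in every view, a masked pixel cannot simply be copied from the same position in a neighboring view.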
| Original language | English |
|---|---|
| Pages (from-to) | 1317-1330 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Computational Imaging |
| Volume | 10 |
| DOIs | |
| Publication status | Published - 2024 |
Keywords
- Light field spatial super-resolution
- inter-intra view correlation construction
- masked light field modeling
- self-supervised pre-training
- transformer
Press/Media
- New Computational Imaging Findings from Beihang University Reported (Boosting Light Field Spatial Super-resolution Via Masked Light Field Modeling), 11/10/24