TY - JOUR
T1 - 3MOS
T2 - a multi-source, multi-resolution, and multi-scene optical-SAR dataset with insights for multi-modal image matching
AU - Ye, Yibin
AU - Teng, Xichao
AU - Yang, Hongrui
AU - Chen, Shuo
AU - Sun, Yuli
AU - Bian, Yijie
AU - Tan, Tao
AU - Li, Zhang
AU - Yu, Qifeng
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
N2 - Optical-SAR image matching is a fundamental task for remote sensing applications. While existing methods perform well on popular datasets such as SEN1-2 and WHU-SEN-City, their generalizability across diverse data sources, spatial resolutions, and scenes remains insufficiently investigated, hindering the practical adoption of optical-SAR matching in downstream tasks. To address this gap, our study proposes 3MOS, the first multi-source, multi-resolution, and multi-scene optical-SAR dataset. The dataset consists of 113k optical-SAR image pairs, with SAR data collected from five satellites at resolutions ranging from 3.5 m to 12.5 m, and is further categorized into eight scenes, such as urban, rural, and plains, through a simple but practical classification strategy. Based on this dataset, the performance of optical-SAR matching methods was evaluated on data with diverse characteristics. Extensive experiments yielded two findings. 1) None of the state-of-the-art methods achieved consistently superior performance across different sources, resolutions, and scenes, revealing significant generalization challenges for diverse downstream task data. 2) The training data distribution significantly affected the matching performance of deep-learning models, highlighting the domain adaptation challenge in optical-SAR image matching. Furthermore, the practical utility of the dataset was validated through multimodal change detection experiments, demonstrating its substantial value for a wide range of downstream applications.
AB - Optical-SAR image matching is a fundamental task for remote sensing applications. While existing methods perform well on popular datasets such as SEN1-2 and WHU-SEN-City, their generalizability across diverse data sources, spatial resolutions, and scenes remains insufficiently investigated, hindering the practical adoption of optical-SAR matching in downstream tasks. To address this gap, our study proposes 3MOS, the first multi-source, multi-resolution, and multi-scene optical-SAR dataset. The dataset consists of 113k optical-SAR image pairs, with SAR data collected from five satellites at resolutions ranging from 3.5 m to 12.5 m, and is further categorized into eight scenes, such as urban, rural, and plains, through a simple but practical classification strategy. Based on this dataset, the performance of optical-SAR matching methods was evaluated on data with diverse characteristics. Extensive experiments yielded two findings. 1) None of the state-of-the-art methods achieved consistently superior performance across different sources, resolutions, and scenes, revealing significant generalization challenges for diverse downstream task data. 2) The training data distribution significantly affected the matching performance of deep-learning models, highlighting the domain adaptation challenge in optical-SAR image matching. Furthermore, the practical utility of the dataset was validated through multimodal change detection experiments, demonstrating its substantial value for a wide range of downstream applications.
KW - Image matching
KW - Image registration
KW - Multi-modal images
KW - Optical image
KW - Synthetic aperture radar (SAR) image
UR - https://www.scopus.com/pages/publications/105018586243
U2 - 10.1007/s44267-025-00091-0
DO - 10.1007/s44267-025-00091-0
M3 - Article
AN - SCOPUS:105018586243
SN - 2097-3330
VL - 3
JO - Visual Intelligence
JF - Visual Intelligence
IS - 1
M1 - 19
ER -