Cross-View Geo-Localization via Learning Correspondence Semantic Similarity Knowledge

Guanli Chen, Guoheng Huang, Xiaochen Yuan, Xuhang Chen, Guo Zhong, Chi Man Pun

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Cross-view geo-localization aims to estimate the accurate geographic location of a ground image by retrieving its match from a geo-tagged aerial image database. Existing approaches focus on independent two-branch models that learn fine-grained representations of each perspective, neglecting to learn more discriminative representations through interaction between the perspectives. In this paper, we propose the GeoSSK method, which adapts the learning process of the model by learning local semantic similarity information between aerial and ground pairs via a new interaction module. We then transfer the semantic similarity knowledge learned during the interaction process to a student model through knowledge distillation. Specifically, we design a Cross-fusion Interaction Module (CIM) based on cross-attention, which learns local semantic similarity information between perspectives to adjust the learning of the model. Because complex environments contain visual distractions, we adjust the degree of interaction between perspectives using the Contribution Factor (CF) of each local representation to the global representation. In addition, we introduce Semantic Similarity Knowledge Distillation (SSKD) between teacher and student for cross-view geo-localization: the interaction learning model serves as the teacher and transfers its semantic similarity knowledge to the student. We also design an Incorrect Knowledge Filter (IKF) to filter out incorrect knowledge from the teacher. Experimental results demonstrate the effectiveness and competitive performance of GeoSSK.
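
The abstract does not say how SSKD or the IKF are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the "semantic similarity knowledge" is a ground-to-aerial similarity matrix whose true matches lie on the diagonal, that distillation uses a KL divergence between row-wise softmax distributions, and that the filter drops rows where the teacher's top-1 retrieval is wrong. The function names, the temperature parameter, and the toy matrices are all hypothetical.

```python
import math


def softmax(row, temperature=1.0):
    """Numerically stable softmax over one row of similarity scores."""
    scaled = [v / temperature for v in row]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filtered_similarity_distillation(teacher_sim, student_sim, temperature=1.0):
    """Mean KL(teacher || student) over rows the teacher ranks correctly.

    teacher_sim[i][j] and student_sim[i][j] are similarity scores between
    ground image i and aerial image j; the true match of row i is column i.
    Rows where the teacher's top-1 is not the true match are skipped
    (an assumed stand-in for an incorrect-knowledge filter).
    Returns (mean loss, number of rows kept).
    """
    kept_losses = []
    for i, (t_row, s_row) in enumerate(zip(teacher_sim, student_sim)):
        top1 = max(range(len(t_row)), key=lambda j: t_row[j])
        if top1 != i:
            continue  # teacher is wrong here, so do not distill this row
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        kl = sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
        kept_losses.append(kl)
    if not kept_losses:
        return 0.0, 0
    return sum(kept_losses) / len(kept_losses), len(kept_losses)


# Toy similarity matrices (made-up numbers, for illustration only).
teacher = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.7, 0.1, 0.3],  # teacher ranks aerial 0 above the true match: filtered
]
student = [
    [0.5, 0.3, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.2, 0.6],
]
loss, kept = filtered_similarity_distillation(teacher, student)
print(f"rows kept: {kept}, distillation loss: {loss:.4f}")
```

In this sketch the third row contributes nothing to the loss, which illustrates the general idea of not propagating a teacher's mistakes to the student; the actual filtering criterion used by GeoSSK may differ.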

Original language: English
Title of host publication: MultiMedia Modeling - 31st International Conference on Multimedia Modeling, MMM 2025, Proceedings
Editors: Ichiro Ide, Ioannis Kompatsiaris, Changsheng Xu, Keiji Yanai, Wei-Ta Chu, Naoko Nitta, Michael Riegler, Toshihiko Yamasaki
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 220-233
Number of pages: 14
ISBN (Print): 9789819620531
DOIs
Publication status: Published - 2025
Event: 31st International Conference on Multimedia Modeling, MMM 2025 - Nara, Japan
Duration: 8 Jan 2025 to 10 Jan 2025

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15520 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Multimedia Modeling, MMM 2025
Country/Territory: Japan
City: Nara
Period: 8/01/25 to 10/01/25

Keywords

  • cross-attention
  • geo-localization
  • image retrieval
  • knowledge distillation
