TY - JOUR
T1 - HA-Pos
T2 - Hierarchical Prompt-Guided Adaptive Detection for Cross-view Visual Positioning System
AU - Zheng, Jiehao
AU - Huang, Guoheng
AU - Chen, Xiaoyong
AU - Fang, Haoran
AU - Zhao, Kaiqi
AU - Yuan, Xiaochen
AU - Ling, Bingo Wing Kuen
AU - Tsang, Kim Fung
AU - Chen, Guanli
AU - Pun, Chi Man
N1 - Publisher Copyright: © 1975-2011 IEEE.
PY - 2026
Y1 - 2026
N2 - With the rapid proliferation of Location-Based Services (LBS), achieving high-precision self-positioning on consumer-grade mobile devices—such as smartphones and civil drones—remains a critical challenge, particularly in GPS-denied or multipath-prone urban environments. This paper proposes HA-Pos, a novel hierarchical adaptive prompting mechanism enhancing the Cross-view Visual Positioning System (CVPS) for consumer electronics. The proposed method enables target specification via a user-defined click on a query image captured by a consumer terminal, subsequently locating that object within corresponding satellite reference imagery. Unlike traditional methods struggling with cross-view geometric distortions, HA-Pos incorporates a Hierarchical Prompt Query Encoder (HPQE). This encoder provides precise spatial guidance across various depth stages, significantly bolstering the ability to distinguish target objects from distractors. Building upon this, a Geometric Adaptive Decoupled Head (GAD-Head) is designed to improve geometric adaptability and positioning accuracy. The GAD-Head integrates deformable convolutions as a Deformation-Aware Module (DAM) to effectively capture geometric variations while independently optimizing regression and classification tasks. Extensive experiments demonstrate that HA-Pos achieves state-of-the-art performance on the CVOGL benchmark dataset.
AB - With the rapid proliferation of Location-Based Services (LBS), achieving high-precision self-positioning on consumer-grade mobile devices—such as smartphones and civil drones—remains a critical challenge, particularly in GPS-denied or multipath-prone urban environments. This paper proposes HA-Pos, a novel hierarchical adaptive prompting mechanism enhancing the Cross-view Visual Positioning System (CVPS) for consumer electronics. The proposed method enables target specification via a user-defined click on a query image captured by a consumer terminal, subsequently locating that object within corresponding satellite reference imagery. Unlike traditional methods struggling with cross-view geometric distortions, HA-Pos incorporates a Hierarchical Prompt Query Encoder (HPQE). This encoder provides precise spatial guidance across various depth stages, significantly bolstering the ability to distinguish target objects from distractors. Building upon this, a Geometric Adaptive Decoupled Head (GAD-Head) is designed to improve geometric adaptability and positioning accuracy. The GAD-Head integrates deformable convolutions as a Deformation-Aware Module (DAM) to effectively capture geometric variations while independently optimizing regression and classification tasks. Extensive experiments demonstrate that HA-Pos achieves state-of-the-art performance on the CVOGL benchmark dataset.
KW - Cross-view Visual Positioning
KW - geometry adaptability
KW - hierarchical prompting
UR - https://www.scopus.com/pages/publications/105034107740
U2 - 10.1109/TCE.2026.3676565
DO - 10.1109/TCE.2026.3676565
M3 - Article
AN - SCOPUS:105034107740
SN - 0098-3063
JO - IEEE Transactions on Consumer Electronics
JF - IEEE Transactions on Consumer Electronics
ER -