Measuring the State-Observation-Gap in POMDPs: An Exploration of Observation Confidence and Weighting Algorithms

Yide Yu, Yan Ma, Yue Liu, Dennis Wong, Kin Lei, José Vicente Egas-López

Research output: Conference contribution (peer-reviewed)

Abstract

The objective of this study is to measure the discrepancy between states and observations within the context of the Partially Observable Markov Decision Process (POMDP). The gap between states and observations is formulated as a State-Observation-Gap (SOG) problem, represented by the symbol Δ, where states and observations are treated as sets. The study also introduces the concept of Observation Confidence (OC) which serves as an indicator of the reliability of the observation, and it is established that there is a positive correlation between OC and Δ. To calculate the cumulative entropy λ of rewards in ⟨ o, a, · ⟩, we propose two weighting algorithms, namely Universal Weighting and Specific Weighting. Empirical and theoretical assessments carried out in the Cliff Walking environment attest to the effectiveness of both algorithms in determining Δ and OC.

Original language: English
Title of host publication: Artificial Intelligence Applications and Innovations - 19th IFIP WG 12.5 International Conference, AIAI 2023, Proceedings
Editors: Ilias Maglogiannis, Lazaros Iliadis, John MacIntyre, Manuel Dominguez
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 137-148
Number of pages: 12
ISBN (Print): 9783031341106
Publication status: Published - 2023
Event: 19th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2023 - León, Spain
Duration: 14 Jun 2023 to 17 Jun 2023

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 675 IFIP
ISSN (Print): 1868-4238
ISSN (Electronic): 1868-422X

Conference

Conference: 19th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2023
Country/Territory: Spain
City: León
Period: 14/06/23 to 17/06/23
