TY - GEN
T1 - Exploring the Capabilities and Limitations of Large Language Models for Zero-Shot Human-Robot Interaction
AU - Olaiya, Kelvin
AU - Delnevo, Giovanni
AU - Lam, Chan Tong
AU - Pau, Giovanni
AU - Salomoni, Paola
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Human-robot interaction (HRI) is an evolving field with a growing emphasis on enabling robots to understand and perform tasks based on natural language commands. Recently, Large Language Models (LLMs) have emerged as a promising tool for such tasks, offering the potential to enable zero-shot learning and flexible interaction without task-specific training. In this paper, we explore the use of LLMs for zero-shot navigation and exploration tasks in robotic systems, specifically evaluating their performance with the PR2 Clearpath and Khepera IV robots in a simulated environment. Our findings demonstrate promising results, particularly in the LLM's ability to exhibit exploratory behavior and iterative reasoning when faced with ambiguous or incomplete visual input. These capabilities suggest a strong potential for LLMs in human-robot interaction. However, challenges were also identified, such as difficulties with target recognition, object misidentification, hallucination of information, and issues with movement execution, highlighting the need for improvements in these areas for real-world applications.
KW - AI for Robotics
KW - Human-Robot Interaction
KW - Large Language Models
KW - Zero-Shot Learning
UR - https://www.scopus.com/pages/publications/105032726424
U2 - 10.1109/ISCC65549.2025.11325771
DO - 10.1109/ISCC65549.2025.11325771
M3 - Conference contribution
AN - SCOPUS:105032726424
T3 - Proceedings - IEEE Symposium on Computers and Communications
BT - 30th IEEE Symposium on Computers and Communications, ISCC 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE Symposium on Computers and Communications, ISCC 2025
Y2 - 2 July 2025 through 5 July 2025
ER -