TY - GEN
T1 - Natural Language and LLMs in Human-Robot Interaction
T2 - 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, ICHORA 2025
AU - Olaiya, Kelvin
AU - Delnevo, Giovanni
AU - Ceccarini, Chiara
AU - Lam, Chan Tong
AU - Pau, Giovanni
AU - Salomoni, Paola
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Natural language provides an intuitive and accessible way for humans to communicate with robots, fostering more natural and flexible interaction across a range of tasks. This study investigates how effectively users can command a robot using natural language within a simulated environment. By employing Gemini Flash 2.0 as the underlying Large Language Model (LLM) to interpret and translate user prompts into executable plans, we explore both the strengths and limitations of this approach. The experiments evaluated user-generated prompts across multiple predefined tasks, revealing a spectrum of outcomes - from successful task completions to errors such as misinterpretations, spatial failures, and hallucinated behaviors where the robot acted on non-existent information. The results highlight how different communication strategies, combining directive and conversational phrasing, influenced task performance. This work contributes to advancing Human-Robot Interaction (HRI) design by emphasizing the potential of LLM-powered systems while addressing the challenges of ambiguity and error resilience in user-driven command structures.
KW - AI for Robotics
KW - Human-Robot Interaction
KW - Large Language Models
KW - Zero-Shot Learning
UR - https://www.scopus.com/pages/publications/105008421973
U2 - 10.1109/ICHORA65333.2025.11016850
DO - 10.1109/ICHORA65333.2025.11016850
M3 - Conference contribution
AN - SCOPUS:105008421973
T3 - ICHORA 2025 - 2025 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, Proceedings
BT - ICHORA 2025 - 2025 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2025 through 24 May 2025
ER -