
The Convergent Validity of Computer Operating Systems’ Usability Evaluation by Popular Generative Artificial Intelligence (AI) Robots

Research output: Chapter, peer-reviewed

Abstract

This article examines the convergent validity of, and thus the consistency between, usability evaluations of computer operating systems (OSs) by several popular generative artificial intelligence (AI) robots. A total of 18 popular OS versions were included in the study, drawn from the three leading OS families: Windows, macOS, and Linux. Usability was evaluated on eight major dimensions: (1) effectiveness, (2) efficiency, (3) learnability, (4) memorability, (5) safety, (6) utility, (7) ergonomics, and (8) accessibility. In experiments with a handful of generative AI robots, Microsoft’s Copilot, Google’s PaLM, and Meta’s Llama each managed to assign rating scores to the eight dimensions. For each of these three robots, the minimum, maximum, range, and standard deviation of the rating scores for each dimension were computed across the OS versions. For each OS version, the difference in rating score between each pair of robots was calculated for each dimension. The mean absolute difference, along with the minimum, maximum, range, and standard deviation of the differences, was then calculated for each dimension and each robot pair across the OS versions. A paired-sample t-test was applied to each dimension for the rating score difference between each robot pair over the versions. Finally, Cronbach’s coefficient alpha (α) of the rating scores was computed for each dimension across all three robots and all versions. These computations were intended to establish whether each robot discriminated among the OS versions in evaluating each dimension, whether any robot erratically and/or systematically overrated or underrated any dimension relative to another robot over the OS versions, and whether the three robots showed high convergent validity (and thus consistency) in evaluating each dimension across the OS versions.
Alongside other ancillary results, the study found that the convergent validity of the three robots was high for all eight dimensions, so their evaluations can be trusted at least to some extent.
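The computations described in the abstract can be sketched for a single usability dimension as follows. This is a minimal illustration, not the authors' code: the rating values are invented placeholders rather than the study's data, the function names are hypothetical, and only the t statistic and its degrees of freedom are computed (looking up the p-value is left out). Cronbach's alpha is computed with the standard formula, treating the three robots as "items" and the OS versions as cases.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical ratings for ONE dimension (placeholder numbers, not study data).
# Each list holds one robot's scores for the same OS versions, in the same order.
ratings = {
    "Copilot": [8, 7, 9, 6, 8, 7],
    "PaLM":    [7, 7, 8, 6, 9, 6],
    "Llama":   [8, 6, 9, 5, 8, 7],
}

def describe(values):
    """Minimum, maximum, range, and sample standard deviation of a score list."""
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sd": stdev(values),
    }

def paired_t(a, b):
    """Paired-sample t statistic and degrees of freedom for the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1

def cronbach_alpha(raters):
    """Cronbach's alpha with raters as items: k/(k-1) * (1 - sum(item var) / var(totals))."""
    k = len(raters)
    item_variance_sum = sum(variance(scores) for scores in raters)
    totals = [sum(per_version) for per_version in zip(*raters)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

if __name__ == "__main__":
    # Per-robot spread: does each robot discriminate among the OS versions?
    for robot, scores in ratings.items():
        print(robot, describe(scores))

    # Pairwise differences: does one robot over- or underrate relative to another?
    for (name_a, a), (name_b, b) in combinations(ratings.items(), 2):
        diffs = [x - y for x, y in zip(a, b)]
        t_stat, df = paired_t(a, b)
        print(name_a, "vs", name_b,
              "mean |diff| =", mean(abs(d) for d in diffs),
              "t =", t_stat, "df =", df)

    # Agreement across all three robots for this dimension.
    print("Cronbach's alpha =", cronbach_alpha(list(ratings.values())))
```

In this reading, a small range or standard deviation for a robot would suggest it barely distinguishes the OS versions, a significant paired t statistic would point to systematic over- or underrating by one robot relative to another, and a high alpha would indicate convergence among the three robots.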

Original language: English
Title of host publication: Applied Human Factors and Ergonomics International
Publisher: AHFE International
Pages: 305-315
Number of pages: 11
DOIs
Publication status: Published - 2024

Publication series

Name: Applied Human Factors and Ergonomics International
120
ISSN (electronic): 2771-0718
