
EVALUATING POPULAR MOOC PLATFORMS BY GENERATIVE ARTIFICIAL INTELLIGENCE (AI) ROBOTS: HOW CONSISTENT ARE THE ROBOTS?

Research output: Paper · peer-reviewed

4 Citations (Scopus)

Abstract

This article investigates the consistency among several popular generative AI robots in evaluating massive open online course (MOOC) platforms. The four robots examined in the study were Claude+, GPT-4, Sage, and Dragonfly. Each was tasked with awarding rating scores to 31 currently popular MOOC platforms on eight major dimensions: (1) content/course quality, (2) pedagogical design, (3) learner support, (4) technology infrastructure, (5) social interaction, (6) learner engagement, (7) instructor support, and (8) cost-effectiveness. Only Claude+'s and Dragonfly's rating scores turned out to be amenable to statistical analysis. For each of the two robots, the minimum, maximum, range, and standard deviation of the rating scores for each of the eight dimensions were computed across all 31 MOOC platforms. The rating score difference for each dimension between the two robots was then calculated for each platform, and the mean of the absolute values, the minimum, the maximum, the range, and the standard deviation of these differences were calculated across all platforms. A paired-sample t-test was applied to each dimension for the rating score difference between the two robots over all platforms. Finally, a correlation coefficient of the rating scores was computed for each dimension between the two robots across all MOOC platforms. These computations reveal whether the two robots discriminated among platforms when evaluating each dimension, whether either robot systematically underrated or overrated any dimension relative to the other, and whether the two robots were consistent in evaluating each dimension across the platforms.
It was found that discrimination was prominent in the evaluation of all dimensions except Dragonfly's rating of learner support, technology infrastructure, and instructor support. Compared with Dragonfly, Claude+ systematically underrated all dimensions (p < 0.000 < 0.05) except cost-effectiveness, which it systematically overrated (p = 0.003 < 0.05). The evaluation by the two robots was consistent only for the dimensions content/course quality, pedagogical design, and learner engagement, with correlation coefficients ranging from 0.445 to 0.632 (p from 0.000 to 0.012 < 0.05). Consistency implies at least partial trustworthiness of the evaluation of these MOOC platforms by either of these two popular generative AI robots, by analogy with the concept of convergent validity for an operationalized instrument measuring an abstract construct.
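The per-dimension statistics described above (mean absolute difference, paired-sample t statistic, and Pearson correlation between the two robots' scores across platforms) can be sketched in pure Python. This is an illustrative sketch only: the function name `paired_stats` and the sample scores are hypothetical, not the paper's actual ratings or code.

```python
import math
from statistics import mean, stdev

def paired_stats(a, b):
    """Compare two robots' rating scores for one dimension across N
    MOOC platforms: a and b are equal-length lists of scores.
    Returns (mean absolute difference, paired t statistic, Pearson r)."""
    assert len(a) == len(b) and len(a) > 2
    n = len(a)
    d = [x - y for x, y in zip(a, b)]          # per-platform differences
    mad = mean(abs(x) for x in d)              # mean absolute difference
    # Paired-sample t statistic with df = n - 1
    t = mean(d) / (stdev(d) / math.sqrt(n))
    # Pearson correlation between the two robots' scores
    ma, mb = mean(a), mean(b)
    r = (sum((x - ma) * (y - mb) for x, y in zip(a, b))
         / math.sqrt(sum((x - ma) ** 2 for x in a)
                     * sum((y - mb) ** 2 for y in b)))
    return mad, t, r

# Hypothetical scores for one dimension on three platforms:
mad, t, r = paired_stats([1, 2, 3], [1, 2, 4])
```

A negative t would indicate that the first robot rated the dimension lower on average (as reported for Claude+ relative to Dragonfly on most dimensions), while a significant positive r would indicate consistency between the two robots.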

Original language: English
Pages: 329-336
Number of pages: 8
Publication status: Published - 2023
Event: 20th International Conference on Cognition and Exploratory Learning in Digital Age, CELDA 2023 - Madeira Island, Portugal
Duration: 21 Oct 2023 – 23 Oct 2023

Conference

Conference: 20th International Conference on Cognition and Exploratory Learning in Digital Age, CELDA 2023
Country/Territory: Portugal
City: Madeira Island
Period: 21/10/23 – 23/10/23

