EXPLORING RANKING CONSISTENCY OF GENERATIVE AI IN MOOC PLATFORM EVALUATION: A NON-PARAMETRIC APPROACH

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper extends a prior study on the consistency of generative Artificial Intelligence (AI) models in evaluating Massive Open Online Course (MOOC) platforms. While the original work focused on the consistency of direct numerical scores, this research investigates the consistency of the rankings derived from those scores. When evaluating platforms, the relative order (i.e., which platform is better than another) is often more critical to a decision-maker than the absolute scores, which may be subject to systematic biases. This study analyzes the scores of 31 MOOC platforms across eight dimensions as evaluated by two AI models, Claude+ and Dragonfly. A suite of non-parametric statistical methods is employed, including Spearman’s Rank Correlation Coefficient (ρ), Kendall’s Tau (τ), and the top-weighted Rank-Biased Overlap (RBO), to measure the concordance of the platform rankings produced by each model. The Wilcoxon Signed-Rank Test is used to assess systematic differences in scoring. Results indicate a moderate to strong monotonic correlation in rankings for dimensions such as (2) pedagogical design, (1) content/course quality, and (6) learner engagement, reinforcing the original study’s findings of consistency. However, the RBO analysis reveals that this agreement is weaker for the top-ranked platforms, providing a more nuanced understanding of AI evaluation consistency. The systematic scoring bias found in the original study is also reaffirmed here. This rank-based analysis offers a robust alternative to score-based comparisons, mitigating the effects of differing internal scoring scales and highlighting the practical utility of AI evaluations for comparative decision-making. By shifting the focus from absolute scores to relative rankings, this study underscores the practical value of generative AI as a decision-support tool in educational technology evaluation.
The findings not only enhance methodological rigor in AI-based assessments but also provide actionable insights for learners and institutions navigating an increasingly complex MOOC landscape.
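The rank-based workflow the abstract describes can be sketched in a few lines of Python. The scores and platform names below are purely illustrative (the study itself covers 31 platforms across eight dimensions); `scipy.stats` supplies Spearman's ρ, Kendall's τ, and the Wilcoxon signed-rank test, while RBO is implemented directly in its simple truncated (lower-bound, non-extrapolated) form:

```python
from scipy.stats import spearmanr, kendalltau, wilcoxon

# Hypothetical scores for five platforms from the two models
# (illustrative values only, not data from the study).
claude_scores    = [8.5, 7.0, 9.0, 6.5, 8.0]
dragonfly_scores = [8.0, 7.5, 8.5, 6.0, 9.0]

# Monotonic agreement between the two score lists.
rho, _ = spearmanr(claude_scores, dragonfly_scores)
tau, _ = kendalltau(claude_scores, dragonfly_scores)

# Paired test for a systematic scoring difference between the models.
w_stat, p_value = wilcoxon(claude_scores, dragonfly_scores)

def rbo(ranking_a, ranking_b, p=0.9):
    """Truncated Rank-Biased Overlap (lower bound, no extrapolation).

    Agreement at shallow depths is weighted more heavily, so
    disagreement among top-ranked items lowers the score most;
    p controls how top-weighted the measure is.
    """
    depth = min(len(ranking_a), len(ranking_b))
    total = 0.0
    for d in range(1, depth + 1):
        overlap = len(set(ranking_a[:d]) & set(ranking_b[:d]))
        total += (p ** (d - 1)) * overlap / d
    return (1 - p) * total

# Derive rankings (best platform first) from the raw scores.
platforms = ["P1", "P2", "P3", "P4", "P5"]
rank_a = [name for _, name in sorted(zip(claude_scores, platforms), reverse=True)]
rank_b = [name for _, name in sorted(zip(dragonfly_scores, platforms), reverse=True)]

print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
print(f"Wilcoxon p-value = {p_value:.3f}")
print(f"RBO (p = 0.9) = {rbo(rank_a, rank_b):.3f}")
```

Comparing ρ/τ (whole-ranking agreement) against RBO (top-weighted agreement) is what lets the analysis show that overall concordance can be moderate to strong even while agreement on the top-ranked platforms is weaker.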

Original language: English
Title of host publication: Proceedings of the 22nd International Conference on Cognition and Exploratory Learning in the Digital Age, CELDA 2025
Editors: Demetrios G. Sampson, Dirk Ifenthaler, Pedro Isaias, Luis Rodrigues
Publisher: IADIS Press
Pages: 53-60
Number of pages: 8
ISBN (Electronic): 9789898704726
DOIs
Publication status: Published - 2025
Event: 22nd International Conference on Cognition and Exploratory Learning in the Digital Age, CELDA 2025 - Porto, Portugal
Duration: 1 Nov 2025 – 3 Nov 2025

Publication series

Name: Proceedings of the 22nd International Conference on Cognition and Exploratory Learning in the Digital Age, CELDA 2025

Conference

Conference: 22nd International Conference on Cognition and Exploratory Learning in the Digital Age, CELDA 2025
Country/Territory: Portugal
City: Porto
Period: 1/11/25 – 3/11/25

Keywords

  • Generative AI
  • MOOC Platforms
  • Ranking Consistency
  • Relative Rankings
