TY - GEN
T1 - ARTIFICIAL INTELLIGENCE (AI) ASSISTANTS’ EVALUATION OF ENVIRONMENTAL, SOCIAL, AND GOVERNANCE (ESG)
T2 - 21st International Conference on Applied Computing 2024, AC 2024 and 23rd International Conference on WWW/Internet 2024, ICWI 2024
AU - Chan, Victor K.Y.
N1 - Publisher Copyright: © 2024 Proceedings of the International Conferences on Applied Computing and WWW/Internet 2024. All rights reserved.
PY - 2024
Y1 - 2024
AB - This paper investigates how consistent, and thus reliable, popular generative artificial intelligence (AI) assistants are in evaluating the environmental, social, and governance (ESG) performance of the top companies/stocks in the S&P 500. The three assistants employed in the study were Meta Llama, Google PaLM, and Microsoft Copilot, each of which was independently asked to award rating scores for the three ESG performance components, namely (1) Environmental, (2) Social, and (3) Governance, of the top 40 companies/stocks in the S&P 500. For each assistant, the minimum, maximum, range, and standard deviation of the rating scores for each component were calculated across all 40 companies/stocks. The rating score difference for each component between each pair of assistants was computed for each company/stock. The mean of the absolute value, the minimum, the maximum, the range, and the standard deviation of these differences for each component and each assistant pair were then calculated across all the companies/stocks. A paired-sample t-test was applied to the rating score differences of each component for each assistant pair over all the companies/stocks. Finally, Cronbach’s coefficient alpha of the rating scores was computed for each component across the three assistants and all the companies/stocks. These results were intended to indicate whether the three assistants discriminated between companies/stocks in evaluating each component, whether each assistant erratically or systematically overrated or underrated any component relative to each other assistant, and whether the three assistants were consistent and reliable in evaluating each component across the companies/stocks. Apart from some ancillary results, the three assistants were found to be marginally consistent and thus reliable, at least in a sense analogous to convergent validity and internal consistency, in evaluating all three components of the top 40 companies/stocks in the S&P 500.
KW - Environmental
KW - ESG
KW - Generative Artificial Intelligence (AI)
KW - Governance
KW - S&P 500
KW - Social
UR - http://www.scopus.com/inward/record.url?scp=85214142053&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85214142053
T3 - Proceedings of the International Conferences on Applied Computing and WWW/Internet 2024
SP - 285
EP - 292
BT - Proceedings of the International Conferences on Applied Computing and WWW/Internet 2024
A2 - Miranda, Paula
A2 - Isaias, Pedro
A2 - Rodrigues, Luis
PB - IADIS Press
Y2 - 26 October 2024 through 28 October 2024
ER -