TY - GEN
T1 - Performance Evaluation of Text Embeddings with Online Consumer Reviews in Retail Sectors
AU - Pang, Patrick Cheong Iao
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Analyzing online consumer reviews is one of the many popular applications of natural language processing in retail sectors. Text embedding models transform the textual content of reviews into numerical representations for downstream analytic tasks, and they therefore play an essential role in review analytics. As review analytics is increasingly used in industry, more empirical research is needed to investigate how well text embeddings capture the thoughts and attitudes of customers. In this study, we examined four commonly used text embeddings, namely TF-IDF, word2vec, sent2vec and BERT, to evaluate their performance in predicting the ratings and the sentiments of online consumer reviews. Drawing on the results, we highlight the strengths of these text embeddings and their desirable use cases. Our findings reveal that BERT and sent2vec can produce stable results in predicting the ratings of retail reviews in general. In addition, word2vec is more suitable for identifying negative sentiment within reviews. From a practical perspective, it is worth analyzing reviews from different product categories separately to achieve better results.
AB - Analyzing online consumer reviews is one of the many popular applications of natural language processing in retail sectors. Text embedding models transform the textual content of reviews into numerical representations for downstream analytic tasks, and they therefore play an essential role in review analytics. As review analytics is increasingly used in industry, more empirical research is needed to investigate how well text embeddings capture the thoughts and attitudes of customers. In this study, we examined four commonly used text embeddings, namely TF-IDF, word2vec, sent2vec and BERT, to evaluate their performance in predicting the ratings and the sentiments of online consumer reviews. Drawing on the results, we highlight the strengths of these text embeddings and their desirable use cases. Our findings reveal that BERT and sent2vec can produce stable results in predicting the ratings of retail reviews in general. In addition, word2vec is more suitable for identifying negative sentiment within reviews. From a practical perspective, it is worth analyzing reviews from different product categories separately to achieve better results.
KW - business analytics
KW - online consumer review
KW - retail
KW - text embedding
UR - http://www.scopus.com/inward/record.url?scp=85139121951&partnerID=8YFLogxK
U2 - 10.1109/ICIS54925.2022.9882478
DO - 10.1109/ICIS54925.2022.9882478
M3 - Conference contribution
AN - SCOPUS:85139121951
T3 - Proceedings - 2022 IEEE/ACIS 22nd International Conference on Computer and Information Science, ICIS 2022
SP - 170
EP - 175
BT - Proceedings - 2022 IEEE/ACIS 22nd International Conference on Computer and Information Science, ICIS 2022
A2 - Yao, Zheng'an
A2 - Xu, Simon
A2 - Ma, Jixin
A2 - Du, Wencai
A2 - Lui, Wei
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE/ACIS International Conference on Computer and Information Science, ICIS 2022
Y2 - 26 June 2022 through 28 June 2022
ER -