TY - GEN
T1 - Performance Comparison of Deep Learning Text Embeddings in Sentiment Analysis Tasks with Online Consumer Reviews
AU - Yang, Ziyi
AU - Pang, Patrick Cheong Iao
AU - Kan, Ho Yin
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/12/23
Y1 - 2022/12/23
N2 - To investigate how different natural language processing models perform on different data, this paper uses consumer reviews from two well-known Internet retailing websites, Yelp and Zappos, and combines four text embedding methods (word2vec, GloVe, BERT, and GPT-2) with two text classification methods (SVM and Neural Network (NN)) to compare the performance of these combinations of text mining techniques. The results show that BERT is the best-performing text embedding method overall on both datasets, with both SVM and NN. NN also outperforms SVM in overall text classification. As an exploratory experiment, we aim to provide a three-dimensional comparison to find the most suitable algorithm for consumer review data; the implication is that BERT and NN can achieve satisfactory results in most scenarios.
AB - To investigate how different natural language processing models perform on different data, this paper uses consumer reviews from two well-known Internet retailing websites, Yelp and Zappos, and combines four text embedding methods (word2vec, GloVe, BERT, and GPT-2) with two text classification methods (SVM and Neural Network (NN)) to compare the performance of these combinations of text mining techniques. The results show that BERT is the best-performing text embedding method overall on both datasets, with both SVM and NN. NN also outperforms SVM in overall text classification. As an exploratory experiment, we aim to provide a three-dimensional comparison to find the most suitable algorithm for consumer review data; the implication is that BERT and NN can achieve satisfactory results in most scenarios.
KW - business analytics
KW - deep learning
KW - online consumer reviews
KW - text embedding
UR - http://www.scopus.com/inward/record.url?scp=85152119695&partnerID=8YFLogxK
U2 - 10.1145/3582197.3582198
DO - 10.1145/3582197.3582198
M3 - Conference contribution
AN - SCOPUS:85152119695
T3 - ACM International Conference Proceeding Series
SP - 1
EP - 7
BT - Proceedings of the 2022 10th International Conference on Information Technology
PB - Association for Computing Machinery
T2 - 10th International Conference on Information Technology: IoT and Smart City, ICIT 2022
Y2 - 23 December 2022 through 26 December 2022
ER -