A Comparative Analysis of Text Mining Methodologies for Online Consumer Reviews

Authors

  • Sonya Zhang, California State Polytechnic University, Pomona
  • Daniela Lopez, California State Polytechnic University, Pomona
  • Lyndon Lam, California State Polytechnic University, Pomona
  • Abhishek Yenumula, California State Polytechnic University, Pomona
  • Tony Vu, California State Polytechnic University, Pomona

DOI:

https://doi.org/10.33423/jabe.v25i7.6724

Keywords:

business, economics, online consumer reviews, Yelp, machine learning, natural language processing, topic modeling, LDA, sentiment analysis, text classification, CNN, Word2vec, GloVe

Abstract

Extracting meaningful insights from the sheer volume of Online Consumer Reviews (OCRs) has been challenging. We aim to identify the most effective methodologies for text mining of OCRs, covering topic extraction, topic classification, and sentiment analysis. Through a comprehensive review of recent research on text mining applied to OCRs, we found that LDA2Vec can enhance the effectiveness of conventional Latent Dirichlet Allocation (LDA) for topic extraction. Additionally, the combination of Convolutional Neural Networks (CNN) and GloVe word embeddings demonstrated the best performance for topic classification, while CNN and Support Vector Machines (SVM) outperformed other algorithms for sentiment analysis. Furthermore, the spaCy Natural Language Processing (NLP) library proved to be a more effective choice for text pre-processing than the Natural Language Toolkit (NLTK). We then applied these refined models to a Yelp reviews dataset, assessed their performance against conventional models, and provided a comprehensive discussion of the results and limitations. The insights gained from this study can be valuable for developing effective models for OCR analysis.

Published

2023-12-31

How to Cite

Zhang, S., Lopez, D., Lam, L., Yenumula, A., & Vu, T. (2023). A Comparative Analysis of Text Mining Methodologies for Online Consumer Reviews. Journal of Applied Business and Economics, 25(7). https://doi.org/10.33423/jabe.v25i7.6724

Section

Articles