A Comparative Analysis of Text Mining Methodologies for Online Consumer Reviews
DOI:
https://doi.org/10.33423/jabe.v25i7.6724
Keywords:
business, economics, online consumer reviews, Yelp, machine learning, natural language processing, topic modeling, LDA, sentiment analysis, text classification, CNN, Word2vec, GloVe
Abstract
Extracting meaningful insights from the sheer volume of Online Consumer Reviews (OCRs) remains challenging. We aim to identify the most effective methodologies for text mining of OCRs, covering topic extraction, topic classification, and sentiment analysis. Through a comprehensive review of recent research on text mining applied to OCRs, we found that LDA2Vec can enhance the effectiveness of conventional LDA for topic extraction. Additionally, the combination of Convolutional Neural Networks (CNN) and GloVe demonstrates the best performance for topic classification, while CNN and SVM outperform other algorithms for sentiment analysis. Furthermore, the spaCy Natural Language Processing (NLP) library proves to be a more effective choice for text pre-processing than the Natural Language Toolkit (NLTK). We then applied these refined models to a Yelp reviews dataset, assessed their performance against conventional models, and discussed the results and limitations. The insights gained from this study can be valuable for developing effective models for OCR analysis.