Task 1.2

JadenHoch edited this page Aug 6, 2024 · 3 revisions

Executive Summary

I split the Yelp review dataset into two groups: positive reviews (4-5 stars) and negative reviews (1-3 stars). The goal of the topic modeling is to identify the key differences in talking points (topics) between the two groups. To do this, I applied two topic modeling techniques to each group: Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF). The resulting topics show which aspects reviewers discuss in positive and in negative reviews, and so which aspects are associated with a restaurant receiving high ratings versus low ratings.
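The split and the two models described above could be sketched roughly as follows. This is a minimal illustration, not the code used for this task: it assumes scikit-learn as the library, reviews given as hypothetical `(stars, text)` pairs, and arbitrary values for the number of topics and terms. The helper names `split_by_rating` and `top_terms_per_topic` are made up for this example.

```python
def split_by_rating(reviews, threshold=3):
    """Split (stars, text) pairs into negative and positive review texts.

    Reviews with stars > threshold (4-5 stars) count as positive,
    the rest (1-3 stars) as negative.
    """
    negative, positive = [], []
    for stars, text in reviews:
        if stars > threshold:
            positive.append(text)
        else:
            negative.append(text)
    return negative, positive


def top_terms_per_topic(texts, method="nmf", n_topics=5, n_terms=10, seed=0):
    """Fit LDA or NMF on the texts and return the top terms of each topic.

    n_topics, n_terms and seed are illustrative defaults (assumptions),
    not values taken from the original analysis.
    """
    # Imported here so the split helper above works without scikit-learn.
    from sklearn.decomposition import NMF, LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    if method == "lda":
        # LDA models word counts, so raw term counts are used.
        vectorizer = CountVectorizer(stop_words="english")
        model = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    elif method == "nmf":
        # NMF is commonly applied to TF-IDF weighted features.
        vectorizer = TfidfVectorizer(stop_words="english")
        model = NMF(n_components=n_topics, random_state=seed)
    else:
        raise ValueError("method must be 'lda' or 'nmf'")

    doc_term = vectorizer.fit_transform(texts)
    model.fit(doc_term)
    vocab = vectorizer.get_feature_names_out()

    # Each row of components_ holds one topic's weight for every term.
    return [
        [vocab[i] for i in weights.argsort()[::-1][:n_terms]]
        for weights in model.components_
    ]
```

Calling `split_by_rating` on the full review list and then `top_terms_per_topic` once per group and per method would give four sets of topics to compare side by side.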

Observations for low ratings (<= 3 stars)

Negative topics are mostly related to food and location. These two aspects appear to be the main areas of complaint in low-rated reviews.

Observations for high ratings (> 3 stars)

Positive reviews also frequently mention food and location, but in a favorable manner. In my opinion, food and location appear to be the most important factors influencing a restaurant's rating.

[Figure: topics in positive review ratings]