Helping Restaurants Improve Their Yelp Ratings

The Problem

On average, the Yelp review website gets over 178 million unique visitors every month. Studies show (and common sense says) that bad reviews drive away customers and good ones attract them. All sorts of businesses are listed on Yelp, but the site is most commonly used for reviewing restaurants. It is critical for restaurants to do what they can to maximize their ratings on Yelp to gain a competitive advantage.

On this page we analyze data from the Yelp Dataset about restaurant chains in North America to gain insights into what individual restaurants can do to bring up their average Yelp rating. We consider the following chains: Starbucks, McDonald's, Subway, and Taco Bell.

The Data

The scatter plot below (left) shows each location's average Yelp rating against the number of ratings for that location. (Try selecting a subset of the points by clicking on the plot!) The histogram below (right) shows the number of locations in each state or province. The Yelp Dataset includes data for only a handful of US states and three Canadian provinces.
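
As a rough sketch of how each point in the scatter plot could be derived, the snippet below groups individual review records by location and computes the mean star rating and the number of ratings. The record layout (assumed here to carry `business_id` and `stars` fields), the function name, and the sample records are all assumptions for illustration; the page does not describe how its own aggregation was done.

```python
from collections import defaultdict


def summarize_locations(reviews):
    """Return {business_id: (average_stars, number_of_ratings)}."""
    totals = defaultdict(lambda: [0.0, 0])  # business_id -> [sum of stars, count]
    for review in reviews:
        entry = totals[review["business_id"]]
        entry[0] += review["stars"]
        entry[1] += 1
    return {bid: (total / count, count) for bid, (total, count) in totals.items()}


# Made-up records, for illustration only.
sample_reviews = [
    {"business_id": "loc_a", "stars": 4},
    {"business_id": "loc_a", "stars": 2},
    {"business_id": "loc_b", "stars": 5},
]
summary = summarize_locations(sample_reviews)
```

Each value in `summary` is one (average rating, number of ratings) pair, which corresponds to one point in a plot of this kind.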

Pick a restaurant chain: 

The map below shows the locations of the restaurants in the different cities. The markers are color-coded by average Yelp rating (green is high, yellow is medium, red is low). Clicking a marker shows extra information about that location in the next section, below.
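
The color coding can be pictured as a simple threshold rule. The sketch below is only an illustration: the function name and the cut-off values 2.5 and 3.5 are hypothetical, since the page does not state where it draws the line between low, medium, and high.

```python
def marker_color(avg_rating, low=2.5, high=3.5):
    """Map an average rating to a marker color.

    The cut-offs `low` and `high` are hypothetical, not the page's values.
    """
    if avg_rating >= high:
        return "green"
    if avg_rating >= low:
        return "yellow"
    return "red"


colors = [marker_color(r) for r in (4.0, 3.0, 1.5)]
```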

Please click on one of the markers in the map above to get more information. Scroll to zoom (data for several cities are available).

An Attempted Solution

We build a model of how various attributes affect a restaurant's average Yelp rating, using a random forest regressor. The model learns which combinations of attributes tend to yield a good rating. Then, for each restaurant, given its current attributes, we check which changes would most increase the model's predicted rating. The top 10 recommended changes are shown in the second table (below).
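
One way to picture the search for recommended changes is sketched below. It assumes that changes are tried one attribute at a time and ranked by how much they raise the predicted rating; the page does not say whether this is exactly how its search works. The names `recommend_changes`, `toy_predict`, and the weights inside `toy_predict` are hypothetical, and the toy predictor merely stands in for a fitted random forest regressor so that the example runs on its own.

```python
def recommend_changes(predict, attributes, candidate_values, top_n=10):
    """Rank single-attribute changes by the gain in predicted rating.

    predict          -- callable mapping an attribute dict to a predicted rating
    attributes       -- current attributes of one restaurant
    candidate_values -- {attribute name: list of values it may take}
    """
    baseline = predict(attributes)
    gains = []
    for name, values in candidate_values.items():
        for value in values:
            if attributes.get(name) == value:
                continue  # same as the current value, so not a change
            modified = dict(attributes, **{name: value})
            gains.append((predict(modified) - baseline, name, value))
    gains.sort(key=lambda g: g[0], reverse=True)  # largest gain first
    return gains[:top_n]


# Toy stand-in for a fitted model (hypothetical weights, illustration only).
def toy_predict(attrs):
    rating = 3.0
    if attrs.get("BikeParking"):
        rating += 0.5
    if attrs.get("OutdoorSeating"):
        rating += 0.25
    return rating


current = {"BikeParking": False, "OutdoorSeating": False}
options = {"BikeParking": [True, False], "OutdoorSeating": [True, False]}
ranked = recommend_changes(toy_predict, current, options)
```

With a real model, `predict` would wrap the fitted regressor's prediction step, including whatever encoding turns the attribute dictionary into a feature vector.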

Selected location:

(please click on a marker)

These top 10 recommendations must be interpreted carefully. An individual restaurant may or may not have control over some of the attributes (e.g. outdoor seating, parking), but such attributes are kept in the table because they still give some insight into how valuable these features are. Also, the model reflects some correlations that should not necessarily be taken as "recommendations". For example, restaurants with a higher price range tend to do better (maybe they are in nicer neighborhoods or are fancier in some way), but this does not mean that raising prices would automatically give a higher Yelp rating (the opposite is more likely).

Some restaurants do not have any attribute information (other than the location). For these, the top 10 recommended attribute changes tell us what the model "thinks" is particularly important for that type of restaurant. For Starbucks, McDonald's, Subway, and Taco Bell, the most important attributes are, respectively, RestaurantsPriceRange=$$, RestaurantsPriceRange=$$$, BikeParking=True, and RestaurantsGoodForGroups=True.

Modified March 11th, 2020. Martin Carrington, Fellow at The Data Incubator.