Which factors influence Airbnb pricing in Boston?

Tobias Gorgs, May 5

Boston skyline (photo by Zoltan Kovacs on Unsplash)

Airbnb is the flagship of the sharing economy when it comes to renting rooms, flats or even whole houses for a limited time.

It is a win-win situation: the people who rent out space they don’t need make some additional money, and the people who use that space save money compared with a traditional hotel.

The property owner has to put in a lot of effort to attract potential customers: an accurate description and high-quality pictures are mandatory, and finding the right price is a challenge too.

This article describes which factors influence Airbnb pricing the most.

The object of investigation is Boston, the capital of the U.S. state of Massachusetts.

The following three main business questions shall be answered:

- What is the average price per guest in each neighbourhood?
- Which neighbourhood has the highest rated listings?
- Which factors have the most influence on the price?

The analysis is based on the Boston Airbnb Open Data dataset available on Kaggle.

It covers activity for almost 4000 properties in Boston: descriptions, amenities, pictures, reviews, temporal information and locations are part of the data.

The dataset is split into three different files; for answering the defined questions, a look into the listings data is sufficient.

Question #1: What is the average price per guest in each neighbourhood?

The answer to that question can help property owners to define the price for their listing as well as tenants to find the areas that are suitable for their budget.

To answer this question we have to take a look at the following features of the dataset:

- neighbourhood_group_cleansed: neighbourhood of a particular listing
- guests_included: number of guests included in a particular listing
- price: price of a particular listing

None of the listings had missing values for the three features, so cleaning the data was reduced to converting the price feature into a format that allows basic calculations.
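The price conversion could look roughly like the sketch below. It assumes the price is stored as text with a currency symbol and thousands separators; the article does not state the raw format, and the sample values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the listings file (assumed raw format, invented values).
listings = pd.DataFrame({"price": ["$85.00", "$1,250.00", "$40.00"]})

# Remove the currency symbol and thousands separators, then cast to float.
listings["price"] = (
    listings["price"]
    .str.replace(r"[$,]", "", regex=True)
    .astype(float)
)

print(listings["price"].tolist())
```

Once the column is numeric, means and ratios can be computed on it directly.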

After grouping by the neighbourhood feature the mean for the included guests and the price can be easily calculated.

Afterwards the division of the price by the guests included gives us the direct answer to our question.
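The two steps just described (group means, then a division) can be sketched with pandas as follows. The column names come from the dataset features listed earlier; the four rows and the neighbourhood labels "A" and "B" are invented toy data, not values from the analysis.

```python
import pandas as pd

# Toy listings with an already numeric price column (invented values).
listings = pd.DataFrame({
    "neighbourhood_group_cleansed": ["A", "A", "B", "B"],
    "guests_included": [1, 3, 2, 2],
    "price": [100.0, 140.0, 90.0, 110.0],
})

# Step 1: mean price and mean number of included guests per neighbourhood.
means = (
    listings
    .groupby("neighbourhood_group_cleansed")[["price", "guests_included"]]
    .mean()
)

# Step 2: divide the mean price by the mean number of included guests.
means["price_per_guest"] = means["price"] / means["guests_included"]

result = means["price_per_guest"].sort_values(ascending=False)
print(result)
```

Note that this is a ratio of two means, as described in the text. Averaging a per-listing price per guest would be a different statistic and can give different numbers.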

The following figure shows the average price per guest in each of Boston’s neighbourhoods.

Average price per guest in Boston’s neighbourhoods

Of course the neighbourhood is a key driver for the price.

The average price ranges from around 52 dollars in Northgate to around 105 dollars in Downtown.

It makes a lot of sense that the price in Downtown (the city center) is substantially higher than in an outlying area like Northgate.

Travellers, for example, are willing to pay more to stay near the main tourist attractions.

Question #2: Which neighbourhood has the highest rated listings?

We can assume that the highest rated listings are the ones that best meet the customers’ expectations.

But what does that mean? It means that the overall price-performance ratio is good.

Let’s see if major differences exist for Boston’s neighbourhoods.

The following features help us to answer that question:

- neighbourhood_group_cleansed: neighbourhood of a particular listing
- review_scores_rating: mean score of all reviews for a particular listing

Over 600 listings (around 17%) have missing values for the rating feature.

This is typical for new properties with no reviews so far.

These listings can’t be included in the analysis and first have to be removed from the dataset.

After grouping by the neighbourhood feature, the mean rating can be easily calculated, which gives us the direct answer to our question.
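A minimal sketch of this step, assuming the listings are loaded into a pandas DataFrame. The feature names are those listed above; the rows and the neighbourhood labels are invented toy data.

```python
import pandas as pd

# Toy listings; None marks a listing without any review yet (invented values).
listings = pd.DataFrame({
    "neighbourhood_group_cleansed": ["A", "A", "B", "B"],
    "review_scores_rating": [90.0, None, 95.0, 97.0],
})

# Remove listings that have no rating.
rated = listings.dropna(subset=["review_scores_rating"])

# Mean rating per neighbourhood, lowest first.
avg_rating = (
    rated
    .groupby("neighbourhood_group_cleansed")["review_scores_rating"]
    .mean()
    .sort_values()
)
print(avg_rating)
```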

The following figure shows the average rating in each of Boston’s neighbourhoods.

Average rating in Boston’s neighbourhoods

The average rating ranges from 88.4% in the University District to 96.0% in the Central Area.

Especially the gap from the University District to the other neighbourhoods (where the next lowest neighbourhood Cascade has an average rating of 92.5%) is interesting.

The results show that the likelihood of a potential customer being disappointed in the University District is significantly higher.

To find the reason for that, further analysis of the data would be necessary, e.g. an evaluation of the review texts.

Question #3: Which factors have the most influence on the price?

Similar to question #1 the answer to that question can help property owners to define the price for their listing.

The preparation of the data was considerably more extensive than for questions #1 and #2.

Redundant features as well as host related features weren’t taken into account.

The feature square_feet (self-explanatory) had to be dropped because more than 97% of its values were missing.

Missing float values were imputed by the mean (bathrooms, reviews_per_month), while missing object values were imputed by the mode (property_type).

For missing ratings the calculated average rating from question #2 was used.

Additionally, one-hot encoding was applied to several categorical features (neighbourhood_group_cleansed, property_type, room_type, bed_type, instant_bookable, cancellation_policy).
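The imputation and encoding steps can be sketched with pandas as shown below. This is an illustration on three invented rows and a subset of the features named above, not the preparation code used for the analysis.

```python
import pandas as pd

# Toy listings with gaps in a float column and an object column (invented values).
listings = pd.DataFrame({
    "bathrooms": [1.0, None, 2.0],
    "property_type": ["Apartment", None, "Apartment"],
    "room_type": ["Private room", "Entire home/apt", "Shared room"],
})

# Float feature: fill gaps with the column mean.
listings["bathrooms"] = listings["bathrooms"].fillna(listings["bathrooms"].mean())

# Object feature: fill gaps with the most frequent value (the mode).
most_common_type = listings["property_type"].mode()[0]
listings["property_type"] = listings["property_type"].fillna(most_common_type)

# One-hot encode the categorical features into indicator columns.
encoded = pd.get_dummies(listings, columns=["property_type", "room_type"])
print(encoded.columns.tolist())
```

Each category becomes its own indicator column (for example room_type_Shared room), which is why the later coefficients refer to individual property and room types.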

To determine the weight of the features, the listing data was first split into explanatory (all except price) and response (only price) variables.

Second, the data was split into training and testing data.

Third, a LARS model was fitted to the training data; its coefficients are what answers the defined question.

LARS stands for least-angle regression and is an algorithm for fitting linear regression models to high-dimensional data.
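The three steps can be sketched with scikit-learn as follows. The data is synthetic, and the 70/30 split, the random seeds and the "true" effect of 50 per bathroom are assumptions made only for this illustration; the article does not state the settings that were actually used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lars
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared listings (all values are invented).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "bathrooms": rng.integers(1, 4, size=n).astype(float),
    "reviews_per_month": rng.uniform(0.0, 5.0, size=n),
    "noise": rng.normal(size=n),
})
# Step 1: explanatory variables X, response y (price driven by bathrooms here).
y = 50.0 * X["bathrooms"] + rng.normal(scale=1.0, size=n)

# Step 2: split into training and testing data (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Step 3: fit a least-angle regression model on the training data.
model = Lars()
model.fit(X_train, y_train)

r2_test = model.score(X_test, y_test)
print(dict(zip(X.columns, model.coef_)), r2_test)
```

Scoring the model on the held-out test data gives a rough check that the coefficients generalise before they are interpreted.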

The following table shows the coefficients of the most influential features.

Column est_int shows the feature name, column abs_coefs the feature’s absolute coefficient and therefore the influence on the pricing.

The higher the value the more influence the feature has.
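A table of this shape can be built from a fitted model's coefficients as in the sketch below. The column names est_int and abs_coefs are the ones mentioned above; the coefs column, the feature names chosen and the coefficient values are hypothetical and only illustrate the ranking.

```python
import numpy as np
import pandas as pd

# Hypothetical feature names and coefficients (not results from the analysis).
feature_names = ["bathrooms", "room_type_Shared room", "reviews_per_month"]
coefs = np.array([45.0, -60.0, -3.5])

# Rank features by the absolute size of their coefficient.
coefs_df = (
    pd.DataFrame({
        "est_int": feature_names,
        "coefs": coefs,
        "abs_coefs": np.abs(coefs),
    })
    .sort_values("abs_coefs", ascending=False)
    .reset_index(drop=True)
)
print(coefs_df)
```

Absolute coefficients are only directly comparable when the features are on similar scales, which is worth keeping in mind when indicator columns are ranked next to numeric features.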

Feature influence on price #1

The following figure shows the same data graphically.

Feature influence on price #2

From the values in the table we can directly derive the price drivers.

13 of the 14 most influential features are related to property type (boat, camper/rv, dorm, yurt, other), room type (entire home/apt, shared room) and neighbourhood (as stated already when answering question #1).

The exception is the number of bathrooms, which seems to have a large impact.

The number of bathrooms mostly correlates with the size of the property and can be interpreted as a replacement for the dropped square_feet feature.

Conclusion

In this article, we took a look at Boston’s Airbnb data to find out which features have the most influence on the property price and how each neighbourhood meets customer expectations.

Here are the most important takeaways:

- The property’s neighbourhood is a key driver for the price. The most expensive area has double the average price of the cheapest area. Central areas (Downtown) are in general more expensive than outlying areas.
- The average rating and therefore the price-performance ratio for properties in the University District is substantially lower than in all other neighbourhoods.
- Besides neighbourhood, the most influential features on the price are property type, room type and property size.

In my opinion the results of questions #1 and #3 could have been expected, but what explanations do you have for the comparatively poor average ratings in the University District?
