How I Used Python and Open Data to Build an Interactive Map of Edinburgh’s Beer Gardens

Jessica Walkenhorst
Jul 2

Summertime. Photo by Tomasz Rynkiewicz on Unsplash

With summer finally arriving, I wanted to find out where a good place for enjoying a nice chilled drink (alcoholic or non-alcoholic) outside would be in my hometown Edinburgh.
So I combined an open data set about chair and table permits with some geocoding and created an interactive map of places with outside seating in Edinburgh.
Background and Project Description

During the last few years, UK government bodies have been working on opening up their data, and Edinburgh City Council is no exception.
info, you can find a list of data sets containing information about many aspects of public life (even though some files could admittedly do with some updating).
For example, this page hosts a file containing details about chair and table permits for the year 2014.
Luckily, an up-to-date version can be found here.
Note that while both files have the same structure, their headers differ, so if you want to look at the historical data, you will need to adapt the code below accordingly.
The file contains names and addresses of premises which have permission to put out chairs as well as some additional information.
This file forms the basis of this project, which is divided into four parts:

1. Get and load the permit file.
2. Use the OpenStreetMap API to get the latitude and longitude of each establishment, as well as its premise category.
3. Clean and bin the premise categories.
4. Plot the premises on a map using Folium.

Without further ado, let’s get started.
The full notebook can be found on my GitHub.
Step 0: Setting up

First, we import the libraries.

    import pandas as pd
    import requests
    import wget
    import folium
    from folium.plugins import MarkerCluster

Step 1: Getting Data

We use wget to download the file and read it into a pandas data frame.
Make sure to set the encoding, since the file contains special characters (there are lots of cafés on the list).
    filename = wget.download("….csv")
    df0 = pd.read_csv(filename, encoding="ISO-8859-1")
    df0.head()

Premises with Table and Chair Permits in Edinburgh

A quick look at the data reveals a few duplicates.
They are mainly due to multiple permits with different start and end dates.
A good way of cleaning would be to filter on dates, but frankly, I don’t care that much at this point, so I just keep the premise names and addresses and drop the duplicates.
(Note: The file also contains information about the table area, which I might revisit at some point in the future).
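For reference, filtering on dates could look roughly like the sketch below. It is not what this post does, and the column names `Start Date` and `End Date` as well as the helper name `permits_valid_on` are assumptions for illustration, so check them against the headers of the file you downloaded. Depending on how the dates are written in the file, `pd.to_datetime` may also need a `format` or `dayfirst=True`.

```python
import pandas as pd


def permits_valid_on(df, day, start_col="Start Date", end_col="End Date"):
    """Keep only the permits whose validity period includes `day`.

    The column names are hypothetical defaults, not taken from the real file.
    """
    day = pd.Timestamp(day)
    start = pd.to_datetime(df[start_col])
    end = pd.to_datetime(df[end_col])
    return df[(start <= day) & (day <= end)]


# Made-up example data: premise "A" holds one expired and one current permit.
permits = pd.DataFrame({
    "Premises Name": ["A", "A", "B"],
    "Start Date": ["2018-04-01", "2019-04-01", "2019-06-01"],
    "End Date": ["2018-09-30", "2019-09-30", "2019-08-31"],
})
current = permits_valid_on(permits, "2019-05-15")
print(current)
```

With such a filter in place, each premise with a single current permit would appear only once, without relying on `drop_duplicates()`.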
After dropping the duplicates, we are left with 389 rows with premise names and addresses.
    # dropping duplicate entries
    df1 = df0.loc[:, ['Premises Name', 'Premises Address']]
    df1 = df1.drop_duplicates()

    # in 2012: 280
    print(df1.shape[0])

    389

A remark on the side: In summer 2014, there were only 280 premises with a chair and table permit. Open-air culture is indeed taking off, and this is the data to prove it :)

Step 2: Getting latitudes and longitudes for each premise

If we want to visualize premises on a map, addresses are not enough; we need GPS coordinates.
Several APIs let you query for an address and return its latitude and longitude (a process called geocoding).
One possibility is to use the Google Maps API, but it comes with caveats.
The OpenStreetMap API provides the same functionality but is free to use and the results are decent enough for my purpose.
We use the Pandas map function to obtain the API response for each row.
After querying the API, we drop all rows for which we did not get a response.
Again, I am not too bothered about the few premises (about 20) that I am losing; there are plenty left.
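The geocoding step could be sketched roughly as follows. This is a sketch under assumptions, not the original code of this post: it assumes the public Nominatim search endpoint of OpenStreetMap, keeps only the first hit per address, and sends a made-up User-Agent string. The helper names `geocode` and `add_geocodes` are illustrative.

```python
import time

import pandas as pd
import requests

# Assumed endpoint: the public Nominatim search service of OpenStreetMap.
NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def geocode(address, pause=1.0):
    """Return the first Nominatim match for `address` as a dict, or None."""
    response = requests.get(
        NOMINATIM_URL,
        params={"q": address, "format": "json", "limit": 1},
        headers={"User-Agent": "beer-garden-map-example"},  # placeholder name
        timeout=10,
    )
    response.raise_for_status()
    time.sleep(pause)  # be gentle with the public server
    results = response.json()
    return results[0] if results else None


def add_geocodes(df, geocoder=geocode, address_col="Premises Address"):
    """Add a 'json' column with the API response and drop rows without one."""
    out = df.copy()
    out["json"] = out[address_col].map(geocoder)
    return out[out["json"].notnull()]
```

Calling `df2 = add_geocodes(df1)` would then query the API once per address. Passing the geocoder in as an argument makes it easy to try the function with a stub before sending a few hundred real requests.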
Looking at the JSON fields in the response, we find that in addition to the coordinates, the API also returns a field named ‘type’, which contains the type of premise at this address.
I add this information to the data frame together with the coordinates.
    # extract relevant fields from API response (json format)
    df2['lat'] = df2['json'].map(lambda x: x['lat'])
    df2['lon'] = df2['json'].map(lambda x: x['lon'])
    df2['type'] = df2['json'].map(lambda x: x['type'])

The most frequent premise types are cafes, pubs, restaurants, tertiary and houses:

    df2.type.value_counts()[:5]

    cafe          84
    pub           69
    restaurant    66
    tertiary      33
    house         27
    Name: type, dtype: int64

Step 3: Assigning Premise Categories

I am mostly interested in distinguishing between two types of premises: the ones that sell coffee and are more likely to be open during the day (like coffee shops and bakeries) and the ones that sell beer and are more likely to open in the evenings (like pubs and restaurants).
I therefore want to sort my premises into three categories:

Category 1: day-time places (coffee shops, bakeries, delis, ice-cream)
Category 2: pubs, restaurants, fast-food and bars
Category 3: everything else

To do this I have two sources of information: the premise name and the type returned by OpenStreetMap.
Looking at the data, we find that the type is a good first indicator, but also that many places are labelled incorrectly or not at all.
I therefore apply a two-step approach: i) assign the category based on the OpenStreetMap type, then ii) clean up the data using the premise name, where step ii) overwrites step i).
To clean up the data, I decided to overrule the OpenStreetMap classification if the premise name contains certain key elements (such as ‘cafe’, ‘coffee’ or similar for coffee shops and ‘restaurant’, ‘inn’ or similar for restaurants and pubs).
This misclassifies, for example, Cafe Andaluz as a coffee shop, but works decently well in most cases.
In particular, it mostly classifies places that are likely to be open during the day as coffee shops, so it works for my purpose.
Of course, with fewer than 400 entries, one could manually go through the list and assign the correct category to each and every one of the entries.
However, I am interested in creating a process that can easily be transferred to other places, so manual intervention tailored specifically to Edinburgh’s scene is not suitable.
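The two-step approach could be sketched as follows. The mapping from OpenStreetMap types to categories and the keyword patterns are assumptions for illustration (the post only names a few of the keywords), while the column names `is_coffeeshop`, `is_restaurant` and `category` follow the snippets in steps 3a and 3b. Apart from Cafe Andaluz, the example names are made up.

```python
import pandas as pd

# Assumed mapping from OpenStreetMap 'type' values to categories.
# Anything not listed here ends up in category 3 ("everything else").
TYPE_TO_CATEGORY = {
    "cafe": 1, "bakery": 1, "deli": 1, "ice_cream": 1,
    "pub": 2, "restaurant": 2, "fast_food": 2, "bar": 2,
}

# Assumed keyword patterns; 'inn' is matched as a whole word only,
# so that names like "Dinner Club" are not flagged.
COFFEE_PATTERN = r"cafe|café|coffee"
RESTAURANT_PATTERN = r"\b(?:restaurant|inn)\b"


def assign_categories(df):
    """Step i: category from the OSM type. Step ii: overrule by name."""
    out = df.copy()
    out["category"] = out["type"].map(TYPE_TO_CATEGORY).fillna(3).astype(int)

    names = out["Premises Name"].str.lower()
    out["is_coffeeshop"] = names.str.contains(COFFEE_PATTERN, regex=True, na=False)
    out["is_restaurant"] = names.str.contains(RESTAURANT_PATTERN, regex=True, na=False)

    # The coffee shop flag is applied last, so it wins if both flags are set.
    out.loc[out["is_restaurant"], "category"] = 2
    out.loc[out["is_coffeeshop"], "category"] = 1
    return out


example = pd.DataFrame({
    "Premises Name": ["Cafe Andaluz", "The Old Inn", "Corner Bakery", "Town Hall"],
    "type": ["restaurant", "house", "bakery", "tertiary"],
})
result = assign_categories(example)
print(result[["Premises Name", "type", "category"]])
```

Cafe Andaluz ends up in category 1 here, which reproduces the misclassification mentioned above.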
Step 3a: Assigning Premise Categories According to OpenStreetMap Type

Step 3b: Overwriting Categories According to Premise Name

A quick inspection shows that the reassignment seems reasonable:

    # show some differences between classification by name
    # and by type returned by the API
    df2.loc[(df2.is_coffeeshop) & (df2.type != 'cafe'), ['Premises Name', 'type']].head(10)

I reassign the category for the premises flagged as restaurant or coffee-shop.
Should a premise have been flagged as both, the coffee shop category takes precedence:

    # reset category if flagged as restaurant or coffee-shop through name
    df2.loc[df2.is_restaurant, 'category'] = 2
    df2.loc[df2.is_coffeeshop, 'category'] = 1

Step 4: Visualization

Finally, we use Python’s Folium package to visualize our results as markers on a map.
Adding the individual points to MarkerClusters allows us to summarize the symbols into groups if too many symbols are in the same region.
Creating a separate cluster for each category allows us to use the LayerControl option to toggle each of the categories individually.
We use the ‘fa’ prefix to use the font-awesome (instead of the standard glyphicon) symbols.
Since Folium maps do not natively display on Medium, the figure below shows a static version of the map.
You can look at the interactive map here.
A static version of the beer garden map. Find the dynamic version in the original post here.

Supplementary Step 5: Saving the map to png

Since I was not able to embed the dynamic version of the map here, I at least wanted to embed a static version into this post.
The best way I found (other than taking a screenshot manually) is to save the map in HTML format and then use Selenium to save a screenshot of the HTML.
The following shows how this can be done (credits to this stackoverflow post for the Selenium part).
Note: In order to get this to work, you need to install geckodriver.
Download the file from here and put it into /usr/local/bin (for Linux machines).
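A minimal sketch of the Selenium part could look like this. It assumes that Firefox and geckodriver are installed and that the map has already been saved as an HTML file. The helper names are illustrative, and the five-second wait is a guess at how long the map tiles need to load.

```python
import time
from pathlib import Path


def html_to_file_url(html_path):
    """Turn a local path into a file:// URL that the browser can open."""
    return Path(html_path).resolve().as_uri()


def save_map_screenshot(html_path, png_path, wait_seconds=5):
    """Open a saved map in headless Firefox and save a screenshot as PNG."""
    # Imported here so that the helper above also works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    browser = webdriver.Firefox(options=options)
    try:
        browser.get(html_to_file_url(html_path))
        time.sleep(wait_seconds)  # give the map tiles time to load
        browser.save_screenshot(str(png_path))
    finally:
        browser.quit()
```

With a map saved as `map.html`, calling `save_map_screenshot("map.html", "map.png")` would write the static image (both file names are placeholders).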
Summary

In this post, we downloaded an open data set containing chair and table permits from the Edinburgh Council.
We then used the OpenStreetMap API to obtain the types and GPS positions for the premises based on their address.
After some additional data cleaning based on the premise names, we binned the premises into the three categories “coffee shop”, “pub/restaurant” and “other” and plotted them on an interactive map, which we saved in HTML format.
Conclusion

We now have a working beer garden and open-air coffee shop map of Edinburgh and can enjoy the summer sitting outside with a nice iced-coffee or an ice-cold beer.
I have already made use of it and this is me enjoying a post-work drink at one of the premises on the map – Prost! :)

Originally published at https://walkenho.