First things first, download your payments from Google.
Go to https://takeout.
Once there click “Manage Archives” and then “Create New Archive”.
For our purposes we are only interested in your Google Pay data.
So make sure all options are unticked except for Google Pay.
Click “Next” then click “Create Archive”.
When your archive is ready you will receive an email.
Download the archive and extract the file located at Takeout > Google Pay > My Activity > My Activity from the zip file.
This is the file we will be scraping the information from.
Save it to an empty folder somewhere.
We’ll use this folder for both the code and storing the input and output.
Go ahead and look through the HTML file to get a feel for the layout.
The first thing I wanted to do was strip out the following items from each purchase and save them to a CSV file: amount, date, time, latitude and longitude.
To begin with I inspected the HTML file with Firefox.
I found that each payment entry was surrounded by a div with a class called “mdl-grid”.
From there I was able to work out the div for date and time, price, and latitude and longitude.
Now, on to the Python!

For this project I’m using Python 3.6, but I feel any Python 3 version should work (don’t quote me).
First thing we need to do is install Beautiful Soup.
If you don’t know what it is, Beautiful Soup is a super handy tool for looking through HTML files (online or offline).
It makes it really easy to search for an element based on type, class or ID.
If you want to use a virtual environment go ahead and activate it now.
To install Beautiful Soup do:

pip install bs4

Now make a new file called ‘scrapePayments.py’.

Below is the code I used to scrape the values to a CSV file:

To begin with, this code reads the HTML file into a Beautiful Soup object.
Then from that object we select the items that relate to payment (every object that has the class ‘mdl-grid’).
The way we’ve chosen to select the items gives us an extra element we don’t want, so we just pop it off the list.
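As a rough illustration of that selection step, here is a minimal sketch run against a tiny made-up HTML snippet rather than the real export. It assumes the unwanted extra element is an outer wrapper div that shares the ‘mdl-grid’ class and therefore comes first in the list; that is my guess, so check your own file before popping anything.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for 'My Activity.html'; the real file is much larger.
SAMPLE_HTML = """
<div class="mdl-grid">
  <div class="mdl-grid"><div class="mdl-typography--body-1">first entry</div></div>
  <div class="mdl-grid"><div class="mdl-typography--body-1">second entry</div></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all returns every div carrying the class, in document order.
entries = soup.find_all("div", class_="mdl-grid")

# Assumption: the extra element is the outer wrapper, so it is first.
entries.pop(0)

print(len(entries))
```

With the real file you would pass an open file handle to BeautifulSoup instead of the sample string.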
After that we go through each payment and feed each one through our ‘extract_purchase_details’ function.
This function will go through and extract all the information from the HTML elements.
The date and time, for example, are pulled from the text element with the class ‘mdl-typography--body-1’.
Now there are actually two elements with this class, so the search returns a list with two elements.
We just take the first one.
That gives us text like the following:

Attempted contactless payment<br>8 Jan 2019, 20:47:30 AEDT

To remove the excess we use a string slice, which removes everything except the date and time.
Then we split the string at the comma and save the date to the ‘date’ variable and the time to the ‘time’ variable.
The rest of the values are extracted in a similar way.
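To make the slicing and splitting concrete, here is a small sketch using only plain strings. The helper names split_date_time and parse_amount are hypothetical, the ‘$12.50’ amount format is an assumption about what the HTML contains, and LOCAL_CURRENCY_SYMBOL mirrors the constant used in the script. Note that the time keeps its timezone suffix here; trim it if you don’t want it.

```python
LOCAL_CURRENCY_SYMBOL = "$"  # change to suit your currency


def split_date_time(raw):
    """Return (date, time) from text like 'Label<br>8 Jan 2019, 20:47:30 AEDT'."""
    # Slice away everything up to and including the <br> tag.
    start = raw.index("<br>") + len("<br>")
    # Split once at the comma: date on the left, time on the right.
    date, time = raw[start:].split(", ", 1)
    return date, time


def parse_amount(raw):
    """Return the number that follows the currency symbol (assumed format)."""
    # Raises IndexError if the symbol is missing, e.g. a foreign currency.
    return float(raw.split(LOCAL_CURRENCY_SYMBOL, 1)[1])


print(split_date_time("Attempted contactless payment<br>8 Jan 2019, 20:47:30 AEDT"))
print(parse_amount("$12.50"))
```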
Once we have all the values the function returns the values as a tuple which we store in ‘payment_details’.
Finally we append this tuple to our payments list.
This is all wrapped in a try-except statement.
This is a little bit of a cheat to get rid of the entries that don’t actually have purchases in them (things like promotions).
Because they don’t have the same layout we will cause an exception when trying to access elements of the purchase that don’t exist.
Instead of handling the exception, we’re just ignoring it (just like how I handle all my real life problems!)

Once all values are scraped, we then use Python’s CSV library to write the values to a CSV file.
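The skip-and-write pattern looks roughly like the sketch below. It is not the original script: the entries are plain dictionaries standing in for parsed HTML, extract_purchase_details is a simplified stand-in with the same name, and the output goes to an in-memory buffer instead of a file. I also catch only KeyError and IndexError here rather than ignoring every exception, which is a slightly stricter choice than the one described above.

```python
import csv
import io


def extract_purchase_details(entry):
    # Stand-in for the real function, which reads HTML elements.
    # A missing key raises KeyError, much like a missing element would fail.
    return entry["amount"], entry["date"], entry["time"]


# Made-up entries; the middle one mimics a promotion with no purchase fields.
entries = [
    {"amount": "4.50", "date": "8 Jan 2019", "time": "20:47:30"},
    {"promo": "not a purchase"},
    {"amount": "12.00", "date": "9 Jan 2019", "time": "12:05:10"},
]

payments = []
for entry in entries:
    try:
        payments.append(extract_purchase_details(entry))
    except (KeyError, IndexError):
        # Entry does not have the purchase layout, so skip it.
        continue

# For a real file, use open(<your file name>, "w", newline="") instead.
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["amount", "date", "time"])
writer.writerows(payments)

print(output.getvalue())
```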
To run this, do the following command:

python scrapePayments.py 'My Activity.html'

Now you have all your purchases in a nice CSV file.
Depending on your currency you may need to adjust line 3 where we define the local currency symbol ‘$’:

LOCAL_CURRENCY_SYMBOL = '$'
# Might need to become
LOCAL_CURRENCY_SYMBOL = '£'

Funny story: before I added that line to split on the dollar sign, I found out that last time I went to the casino someone charged me 18 Indonesian Rupiah! (I live in Australia, that’s like $0.0018 AUD at the current exchange rate!)

Alright, time to heat-map!

To be perfectly honest, for this part I followed a guide written by a guy called Mike Cunha over on his website.
The blog post can be found -> here <-.
We don’t need to follow his method exactly as he adds a boundary to his map.
So for our purposes you’ll need to install only the following Python modules:

pip install folium pandas

Pandas is an awesome library for handling data.
It is heavily used in the computer science community.
So, make a new file called ‘mapgen.py’ and input the following code:

Ensure that you change the values for the variable in line 16.
This is where the map will be focused when you open the generated webpage.
Once you’ve finished writing the code, run:

python mapgen. csv

Upon completion you will have a brand new file called “heatmap.
Open up that file and look at all the locations you have thrown your money away!Unsurprisingly, most of my Google Pay purchases are for my lunch.
From the map, can you guess where I have lunch?

Thanks for reading! If you have any tips or thoughts you’d like to share I’d love to hear them.
- Lenny

Edit 11/01/2019: I got some great feedback from people over on Reddit.
So I’ve implemented some changes they suggested.
This includes changing from camelCase to snake_case to be more in line with Python standards, and splitting the code that extracts the payment info into its own function.
Finally I broke each of the chained methods I use to extract payment information onto multiple lines to make it easier to follow.