We explore chart-toppers of 2008 and 2018 with some visualizations.
Bored Analytics, Feb 4

The method:

We figured a good starting point would be the year-end Billboard Hot 100, which ranks the most popular songs by radio airplay, streaming data and sales (shoutout to Billboard, the charts are available here).
We wrote a simple web scraper to get this data.
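The scraper itself is not shown in this post, so here is a rough sketch of only the parsing step, using Python's standard `html.parser`. The markup in `SAMPLE_HTML`, the class names `entry`, `title` and `artist`, and the helper `parse_chart` are all hypothetical; the real chart page will look different.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real chart page structure is not given in this
# post, so the tags, class names and placeholder songs are made up.
SAMPLE_HTML = """
<ol>
  <li class="entry"><span class="title">Song A</span><span class="artist">Artist X</span></li>
  <li class="entry"><span class="title">Song B</span><span class="artist">Artist Y</span></li>
</ol>
"""


class ChartParser(HTMLParser):
    """Collects (rank, title, artist) tuples from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (rank, title, artist) tuples
        self._field = None   # "title" or "artist" while inside such a span
        self._current = {}   # fields of the entry currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "entry":
            self._current = {}
        elif tag == "span" and css_class in ("title", "artist"):
            self._field = css_class

    def handle_data(self, data):
        # Text outside a title/artist span (e.g. whitespace) is ignored.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            # Rank is taken from the order of entries on the page.
            rank = len(self.rows) + 1
            self.rows.append(
                (rank, self._current.get("title"), self._current.get("artist"))
            )
            self._current = {}


def parse_chart(html):
    parser = ChartParser()
    parser.feed(html)
    return parser.rows


chart = parse_chart(SAMPLE_HTML)
```

Fetching the page is left out on purpose; only the step that turns HTML into rows is sketched here.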
Using the top 100 songs from 2018 and 2008, our ultimate goal was to dive a little deeper to unearth large changes in the music industry.
In order to do any meaningful analysis, we needed every song’s lyrics and genre.
The website AZLyrics (shoutout!) does a great job of maintaining a very web-scrapable database of lyrics; but, perhaps out of fear that we were trying to create a competing site with their precious data, they banned us when we attempted to scrape the lyrics of these 200 songs.
However, we had the last laugh when we returned with a slightly smarter (read: harder to detect) scraper and a VPN.
[the entire dataset and scraper are up on Github, for those interested]

The Results:

Some things to note: rather than dealing with multiple sub-genres like trap-rap, EDM-pop etc., we decided to bucketize songs into the broad genres of rap, R&B, pop, country and rock (with some Pandas manipulation).
For example, we classified Indie/Alternative songs as rock.
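The bucketing boils down to a lookup table. Below is a minimal plain-Python sketch; the only mapping the post actually states is Indie/Alternative to rock, so the other entries in `SUBGENRE_TO_BROAD` and the `"unknown"` fallback are assumptions for illustration. In Pandas, the same function could be applied to a genre column with `Series.map`.

```python
# The five broad buckets named in the post.
BROAD_GENRES = {"rap", "r&b", "pop", "country", "rock"}

# Sub-genre -> broad genre. Only the indie/alternative -> rock rule is
# stated in the post; the other two entries are illustrative assumptions.
SUBGENRE_TO_BROAD = {
    "indie": "rock",
    "alternative": "rock",
    "trap-rap": "rap",   # assumption
    "edm-pop": "pop",    # assumption
}


def bucketize(label):
    """Map a raw genre label to one of the broad buckets."""
    key = label.strip().lower()
    if key in BROAD_GENRES:
        return key
    # Labels we have no rule for are flagged rather than guessed.
    return SUBGENRE_TO_BROAD.get(key, "unknown")
```

Flagging unmapped labels as `"unknown"` makes it easy to spot sub-genres that still need a rule.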
The Hot 100:

First off, we have the breakdown of the top 100s by genre. What’s most interesting here, in our opinion, is the emergence of rap and the slow decline of rock.
The pie charts below show each genre’s share of the Billboard Hot 100 in 2018 and 2008.
Rap has almost doubled its share of the top 100, while rock has fallen from 12% in 2008 to a mere 5% in 2018.
In fact, we would argue that rap is now the dominant genre, even over pop (popular music, by definition).
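The shares behind a pie chart like this are just per-genre counts divided by the total. A small sketch, assuming one genre label per charting song; the `sample` list is placeholder data, not the real chart.

```python
from collections import Counter


def genre_shares(genres):
    """Return {genre: percentage of songs}, given one label per song."""
    counts = Counter(genres)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genre: 100 * n / total for genre, n in counts.items()}


# Placeholder labels for illustration only.
sample = ["rap"] * 3 + ["pop"] * 5 + ["rock"] * 2
shares = genre_shares(sample)
```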
However, these charts are best viewed in conjunction with these scatterplots showing the spread of charting numbers across genres.
The scatters give an idea of exactly how popular the genres were in each year.
Looking at 2008’s plot, we can see that only a few rap songs (6, to be precise) cracked the top 40, while pop had 20 hits in that range.
Now after a decade, the number of rap hits in the top 40 has exploded to 20, and the number of pop songs in the top 40 has accordingly decreased to 16.
Rock, the 4th most popular genre in 2008, had 16 songs in the Hot 100; however, it was the least popular genre in 2018, contributing only 5 of the top 100.
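Counts like "rap hits in the top 40" come from filtering on chart position and tallying by genre. A sketch, assuming each song is a `(position, genre)` pair; `sample_chart` is made-up data.

```python
from collections import Counter


def hits_in_top_n(chart, n=40):
    """Count songs per genre whose chart position is n or better."""
    return Counter(genre for position, genre in chart if position <= n)


# Made-up (position, genre) pairs for illustration only.
sample_chart = [(1, "pop"), (12, "rap"), (39, "rap"), (41, "rap"), (77, "rock")]
top40 = hits_in_top_n(sample_chart)
```

A `Counter` returns 0 for genres with no songs in range, so `top40["rock"]` is safe to read here.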
Lyrics:

Now for the interesting part: the lyrics of popular songs.
We made boxplots showing the number of words in a typical song of each genre.
As expected, with their more upbeat rhythm and higher frequency of words, rap songs have the highest median: around 600 words per song in both 2008 and 2018.
Other genres are also consistent between the two years: pop and R&B at around 400 words per song, rock and country at around 300.
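The median word count per genre can be sketched as follows. We assume words are split on whitespace and that each song is a `(genre, lyrics)` pair; the lyrics in `sample_songs` are placeholders.

```python
from collections import defaultdict
from statistics import median


def word_count(lyrics):
    """Number of whitespace-separated words in a lyrics string."""
    return len(lyrics.split())


def median_words_by_genre(songs):
    """songs: iterable of (genre, lyrics) pairs -> {genre: median word count}."""
    counts = defaultdict(list)
    for genre, lyrics in songs:
        counts[genre].append(word_count(lyrics))
    return {genre: median(values) for genre, values in counts.items()}


# Placeholder lyrics for illustration only.
sample_songs = [
    ("pop", "la la la"),
    ("pop", "oh oh oh oh oh"),
    ("rap", "one two three four"),
]
medians = median_words_by_genre(sample_songs)
```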
We thought another cool insight would be to compare the number of distinct words in songs (repetitions ignored).
We knew rap would be the genre to reward repetition, but were amazed to find that, in spite of having the most words by far, it did not average the most unique words per song in either year.
In fact, in 2008, rap songs averaged over 100 unique words per song, but this dropped to around 80 in 2018 (we suspect because of the rise of the highly repetitive mumble rap, thanks Migos!)

Word Clouds:

Finally, we thought it’d be interesting to make word clouds (visualizations of the most frequently occurring words in a genre’s lyrics).
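Both the distinct-word counts and the frequencies behind a word cloud come from the same tokenized lyrics. A minimal sketch, assuming a simple tokenizer that lowercases text and keeps runs of letters and apostrophes; the real tokenization rules may differ, and `sample` is a made-up lyric.

```python
import re
from collections import Counter


def tokenize(lyrics):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", lyrics.lower())


def unique_word_count(lyrics):
    """Number of distinct words in one song (repetitions ignored)."""
    return len(set(tokenize(lyrics)))


def top_words(lyrics_list, k=3):
    """Most frequent words across all songs of a genre."""
    counts = Counter()
    for lyrics in lyrics_list:
        counts.update(tokenize(lyrics))
    return counts.most_common(k)


# Made-up lyric for illustration only.
sample = "go go go now, go now"
```

A real word cloud would probably also filter out common stop words such as "the" and "and", which this sketch does not do.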
Our conclusion: it makes sense that parents want to keep their children away from rap.
Interestingly, 2008’s rap word cloud seems almost Shakespearean compared to 2018’s, which is littered with expletives and derogatory terms.
So it seems rap lyrics are getting worse in terms of expletive content.
Some bonus insights: pop songs are best characterized by words like ‘love’, ‘know’, ‘want’ and ‘feel’, and R&B lyrics are somewhere in between pop and rap.
This analysis is obviously far from perfect, but we had fun and think we pulled some interesting insights.
If you are interested in learning how to web scrape or make some of these graphs, all our code and data are up on Github.
Thank you for reading!

Next up: we use Machine Learning to try and predict a song’s genre from its lyrics (with more data, obviously).