For example, even though Real Madrid vs Barcelona is one of the most important games in world soccer, it’s easy to forget how difficult an opponent Real Sociedad or Athletic de Bilbao can be for Barcelona.
In this post I report the methodology and results of an analysis of all matches from major European leagues across 8 seasons, aimed at finding which opponent caused the most damage to each team.
Aka, find each team’s nemesis.
I describe the data and the methodology at the end of this post in order to dive into the results immediately (please feel free to check that out and then return to the results!).
The main thing you need to know to follow the results is that the metric I use is the percentage of maximum possible points that a team got out of an opponent.
The maximum possible points is 3 times the number of games.
Hence, if, e.g., Chelsea played Liverpool 10 times and got a total of 15 points, then Chelsea got 15 out of 30 possible points, or 50%.
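As a minimal sketch of that metric (the function name `pct_of_max_points` is made up for illustration and is not taken from the notebook), the calculation could look like this:

```python
def pct_of_max_points(points, games, points_per_win=3):
    """Percentage of the maximum possible points taken from one opponent."""
    if games <= 0:
        raise ValueError("games must be a positive number")
    max_points = points_per_win * games  # 3 points available per game
    return 100.0 * points / max_points


# The example from the text: 15 points from 10 games.
print(pct_of_max_points(15, 10))  # 50.0
```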
In order to keep the list kinda short I only report the results for pairs of teams that have (almost) 16 encounters with each other.
That’s the maximum possible number of games, since each pair of teams plays each other twice a season and we have data for 8 seasons.
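A rough sketch of that filter follows. The exact cutoff behind "(almost) 16" is not stated here, so `min_games` is a hypothetical parameter, and the pair counts in the demo are made-up placeholders rather than real data:

```python
SEASONS = 8
MEETINGS_PER_SEASON = 2  # one home game and one away game
MAX_GAMES = SEASONS * MEETINGS_PER_SEASON  # 16


def frequent_pairs(games_per_pair, min_games):
    """Keep only the (team, opponent) pairs with at least `min_games` meetings."""
    return {pair: n for pair, n in games_per_pair.items() if n >= min_games}


# Made-up counts, for illustration only.
games_per_pair = {("Team A", "Team B"): 16, ("Team A", "Team C"): 15, ("Team A", "Team D"): 4}
kept = frequent_pairs(games_per_pair, min_games=15)
print(sorted(kept))
```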

The results
The nemesis for each major team studied.
The lower the number, the worse the team performed against their nemesis.

English Premier League
Arsenal: 25% vs Chelsea
Aston Villa: 14.6% vs Manchester United
Chelsea: 40% vs Liverpool
Everton: 19% vs Arsenal
Liverpool: 29% vs Arsenal
Manchester City: 38% vs Liverpool
Manchester United: 38% vs Chelsea
Stoke City: 17% vs Manchester United
Sunderland: 19% vs Chelsea
Tottenham: 29% vs Manchester United

Spanish La Liga
Athletic Club de Bilbao: 12% vs Barcelona
Atlético Madrid: 17% vs Barcelona
Barcelona: 53% vs Real Sociedad
Getafe: 13% vs Real Madrid
Málaga: 8% vs Barcelona
Espanyol: 6% vs Real Madrid
Real Madrid: 29% vs Barcelona
Sevilla: 13% vs Barcelona
Valencia: 23% vs Barcelona

Italian Serie A
Chievo Verona: 6% vs Milan
Genoa: 20% vs Juventus
Inter: 29% vs Juventus
Juventus: 47% vs Napoli
Milan: 29% vs Juventus
Napoli: 40% vs Roma
Roma: 25% vs Juventus
Udinese: 29% vs Juventus

French Ligue 1
AS Saint-Étienne: 20% vs Paris Saint-Germain
FC Lorient: 21% vs Paris Saint-Germain
Girondins de Bordeaux: 38% vs Olympique de Marseille
LOSC Lille: 27% vs Paris Saint-Germain
OGC Nice: 29% vs Montpellier Herault SC
Olympique Lyonnais: 31% vs Paris Saint-Germain
Olympique de Marseille: 31% vs Paris Saint-Germain
Paris Saint-Germain: 28% vs AS Monaco
Stade Rennais FC: 27% vs Girondins de Bordeaux
Toulouse FC: 29% vs Paris Saint-Germain

Highlights
Ok, that was a lot.
Let’s process some highlights.
Chelsea is the nemesis of Arsenal.
Given the 8 seasons that this dataset is based upon, this probably doesn’t come as a big surprise.
Didier Drogba and his Blue teammates have been terrorizing Arsenal’s defense for some time now.
Liverpool is the nemesis of Chelsea and Manchester City.
But who can forget that slip from Gerrard that allowed Chelsea to score and sank the title hopes of Liverpool?
Arsenal is the nemesis of Liverpool.
Chelsea > Arsenal > Liverpool > Chelsea.
This is an interesting cycle showing that in soccer things are not hierarchical: any team can win over any team!
England’s Premier League is the most egalitarian league.
There are a lot of discussions about La Liga vs Premier League and this blog post can add one more datapoint.
FC Barcelona is the nemesis of 6 out of 9 Spanish teams studied here, whereas in the Premier League no team is dominating as much.
Chelsea and Manchester United are the nemesis of 3 teams each out of 10 studied.
Of course, on the other hand there are 10 teams from the Premier League that were never relegated in the 8 studied seasons but only 9 such teams from La Liga.
So let’s not rush to judge which league is the most competitive one.
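For anyone who wants to double-check those counts, here is a small sketch that tallies how often each club appears as a nemesis. The two dictionaries are simply the La Liga and Premier League lists from the results section typed in by hand; the function name `nemesis_counts` is made up for illustration.

```python
from collections import Counter

# team -> nemesis, copied from the results lists in this post
la_liga = {
    "Athletic Club de Bilbao": "Barcelona", "Atlético Madrid": "Barcelona",
    "Barcelona": "Real Sociedad", "Getafe": "Real Madrid",
    "Málaga": "Barcelona", "Espanyol": "Real Madrid",
    "Real Madrid": "Barcelona", "Sevilla": "Barcelona", "Valencia": "Barcelona",
}
premier_league = {
    "Arsenal": "Chelsea", "Aston Villa": "Manchester United",
    "Chelsea": "Liverpool", "Everton": "Arsenal", "Liverpool": "Arsenal",
    "Manchester City": "Liverpool", "Manchester United": "Chelsea",
    "Stoke City": "Manchester United", "Sunderland": "Chelsea",
    "Tottenham": "Manchester United",
}


def nemesis_counts(nemesis_of):
    """Count how many teams each club is the nemesis of."""
    return Counter(nemesis_of.values())


print(nemesis_counts(la_liga)["Barcelona"], "out of", len(la_liga))
print(nemesis_counts(premier_league)["Chelsea"], "out of", len(premier_league))
```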
Real Sociedad is the nemesis of Barcelona.
This is a great example of a somewhat unexpected result, since most of the media coverage usually goes to El Clásico.
But fans that follow the Blaugrana closely know about the ‘Anoeta curse’ and how difficult an opponent Real Sociedad has traditionally been for Barcelona.
Barcelona is the nemesis of almost all major Spanish teams.
This might not be a big surprise to many people since Barcelona have been so dominant for the past 8 years in their domestic league (they won 6 of the studied 8 seasons).
Real Madrid is the nemesis of two major Spanish teams, Getafe (a team also based in the Madrid metro area) and Espanyol.
Juventus has been dominating the Italian league.
They are the nemesis of 5 out of 8 studied Italian teams.
Napoli is Juventus’s nemesis.
PSG has been dominating the French league.
They are the nemesis of 7 out of 10 French teams, the highest ratio of all studied teams.
AS Monaco is PSG’s nemesis.
Barcelona is the most resilient team studied.
Barcelona managed to get 53% of the maximum possible points versus their nemesis, Real Sociedad.
That’s the highest across all team/nemesis pairs.
Here’s the list with the highest performances against a team’s nemesis.
Barcelona, 53% vs Real Sociedad
Juventus, 47% vs Napoli
Chelsea, 40% vs Liverpool
Napoli, 40% vs Roma
Manchester United, 38% vs Chelsea
Manchester City, 38% vs Liverpool

Conclusion
Can this analysis really help Barcelona fans relax when their team is playing against Real Madrid?
Probably not.
Soccer rivalries are more than just a game and more than 3 points in the domestic league.
Nevertheless, this analysis and others like it can help cut through the noise and human biases and point players and coaches to the opponents they should focus on more.
Is your team not in the above list?
No worries, you can see (and alter) the notebook that produced all the results here or just leave a comment below!
Thanks for reading!

Appendix: Dataset & Methodology
The dataset
I used this Kaggle dataset of European soccer matches.
It’s a wonderful database with matches from a plethora of European domestic leagues as well as player and team attributes.

The methodology
The first thing I did was to query the dataset to get all match results from Spain, England, France and Italy’s domestic leagues.
The dataset includes more leagues, so if you really wanna find your team’s nemesis from some league outside those four feel free to take a look at the code or let me know.
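To give a feel for that first query, here is a sketch against a tiny in-memory SQLite database. The table and column names (`countries`, `matches`, `home_goals`, and so on) are assumptions made up for this example and have not been checked against the actual Kaggle file, so treat this as the shape of the query rather than the code in the notebook.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
QUERY = """
SELECT c.name, m.season, m.home_team, m.away_team, m.home_goals, m.away_goals
FROM matches AS m
JOIN countries AS c ON c.id = m.country_id
WHERE c.name IN ({placeholders})
ORDER BY m.id
"""


def load_matches(conn, countries):
    """Return all match rows played in the given countries."""
    placeholders = ", ".join("?" for _ in countries)
    return conn.execute(QUERY.format(placeholders=placeholders), list(countries)).fetchall()


def demo_connection():
    """Build a toy database with two made-up matches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE matches (
            id INTEGER PRIMARY KEY, country_id INTEGER, season TEXT,
            home_team TEXT, away_team TEXT, home_goals INTEGER, away_goals INTEGER
        );
        INSERT INTO countries VALUES (1, 'Spain');
        INSERT INTO countries VALUES (2, 'Germany');
        INSERT INTO matches VALUES (1, 1, '2008/2009', 'Team A', 'Team B', 2, 1);
        INSERT INTO matches VALUES (2, 2, '2008/2009', 'Team C', 'Team D', 0, 0);
    """)
    return conn


rows = load_matches(demo_connection(), ["Spain", "England", "France", "Italy"])
print(rows)
```

Passing the country names as query parameters keeps the list of leagues easy to change, which is handy if you want to look at a league outside those four.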
Some basic exploration reveals that there are 8 seasons: from 2008/09 up to and including 2015/16.
Each of these four leagues has 20 teams that play each other twice, hence each team should have 38 observations per season, for every season they were in the league.
European leagues operate on a system of relegation and promotion, hence not all teams will be part of the dataset of every season since some will be relegated.
The dataset seems pretty complete, with almost all matches recorded; there is a small number of matches missing (mostly from the Italian league), but since it’s very small I chose to ignore it.
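One way to run that kind of completeness check is sketched below. It assumes each match is available as a `(season, home_team, away_team)` tuple, which is a made-up layout for this example, and the demo uses a toy 3-team league with one match deliberately left out.

```python
from collections import Counter

TEAMS_PER_LEAGUE = 20
EXPECTED_GAMES = 2 * (TEAMS_PER_LEAGUE - 1)  # 19 opponents, home and away = 38


def games_per_team_season(matches):
    """Count the games recorded for each (season, team) pair."""
    counts = Counter()
    for season, home, away in matches:
        counts[(season, home)] += 1
        counts[(season, away)] += 1
    return counts


def incomplete(counts, expected=EXPECTED_GAMES):
    """Return the (season, team) pairs whose game count differs from `expected`."""
    return {key: n for key, n in counts.items() if n != expected}


# Toy league: 3 teams should give 2 * (3 - 1) = 4 games each.
# The match with C at home against B is missing on purpose.
toy = [
    ("2008/2009", "A", "B"), ("2008/2009", "B", "A"),
    ("2008/2009", "A", "C"), ("2008/2009", "C", "A"),
    ("2008/2009", "B", "C"),
]
gaps = incomplete(games_per_team_season(toy), expected=4)
print(gaps)
```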
Once all the raw logs are available, I go through each team’s complete list of games and record the opponent and the number of points they got from the game: 3 if they won, 0 if they lost and 1 if they drew.
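A minimal sketch of this step, assuming each match is a `(home_team, away_team, home_goals, away_goals)` tuple (a layout chosen for this example, not necessarily the one in the notebook). Every match produces two rows, one from each team's point of view:

```python
def points_for(goals_for, goals_against):
    """League points for one game: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


def team_rows(matches):
    """Yield (team, opponent, points) twice per match, once for each side."""
    for home, away, home_goals, away_goals in matches:
        yield (home, away, points_for(home_goals, away_goals))
        yield (away, home, points_for(away_goals, home_goals))


# Made-up results, for illustration only.
demo = [("Team A", "Team B", 2, 1), ("Team B", "Team A", 0, 0)]
for row in team_rows(demo):
    print(row)
```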
This dataset can then be grouped by team and opponent pair to calculate the sum of the points the team got, which is then divided by the maximum number of points possible.
Hence, for each team we can find the opponent against which they got the smallest percentage of possible points.
We call that opponent the nemesis!
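The grouping and the final pick can be sketched in plain Python as follows. This is an illustration under assumptions rather than the notebook's code: the input is a list of `(team, opponent, points)` rows, `min_games` is a hypothetical cutoff, and the demo rows are made up. When two opponents tie, this version keeps whichever one it saw first.

```python
from collections import defaultdict


def find_nemesis(rows, min_games=1):
    """Map each team to (nemesis, percentage of possible points against them)."""
    totals = defaultdict(lambda: [0, 0])  # (team, opponent) -> [points, games]
    for team, opponent, points in rows:
        totals[(team, opponent)][0] += points
        totals[(team, opponent)][1] += 1

    nemesis = {}
    for (team, opponent), (points, games) in totals.items():
        if games < min_games:
            continue  # too few meetings to say anything
        pct = 100.0 * points / (3 * games)
        if team not in nemesis or pct < nemesis[team][1]:
            nemesis[team] = (opponent, pct)
    return nemesis


# Made-up rows, for illustration only.
demo_rows = [
    ("A", "B", 0), ("A", "B", 1), ("A", "C", 3),
    ("B", "A", 3), ("B", "A", 1), ("C", "A", 0),
]
result = find_nemesis(demo_rows)
print(result["A"][0])  # B
```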