There are parts of the game that are incredibly difficult to observe and understand with any certainty, because our observations are limited, our samples skewed, and because we are not objective viewers. For example, if you listen to any set of fans on match day, they will tell you all about the injustices rained down upon them by the referees, and they’ll be sure to let you know that they are the victims of a systematic assault by the corrupt officiating that is out to get their beloved club. Given that every set of fans is convinced they’re the victim, where does the truth lie? Attempting to explain this using qualitative analysis is going to be negatively impacted by our own biases. Though quantitative analysis is not free from bias (algorithms can be prejudiced if the person writing them is a bigot), it is better capable of neutrality. That is why we see the increasing popularity of data analysis in football. It’s a perfect complement to the still useful but inherently flawed eye-test.
This article will be the first in a two-part series taking a look at an increasingly popular tool for measuring performance, an area of football that is sometimes difficult to objectively observe. Expected Goals (xG) measures the quality of scoring chances created in a game, and taken as totals for teams or players, can be used to measure performance and explain or predict outcomes. The intention is to demonstrate the value in xG, its uses, and what it tells us. The focus of this first article will be on xG’s use for predicting outcomes. A considerable amount of the analysis in this first article, and to some extent the second too, is borrowed from the great work done by others. Rather than trying to reinvent the wheel, I’ll rely on the good work that already exists, while shining a light on articles that deserve it.
I won’t go into great depth explaining the intricacies of xG, as this is already well-trodden ground. Tifo Football has this short video explaining the metric in relatively simple terms. There are also some detailed explanations, such as this by Matthias Kullowatz over at American Soccer Analysis, and this by 11tegen11. Finally, there are some fantastic interactive tools, such as this one that compares hypothetical shots and shows their xG scores, developed by Ben Torvaney, and this, which simulates outcomes using expected goals, developed by Danny Page.
In brief, xG measures the quality of scoring chances in a game: it is the probability that a scoring opportunity results in a goal. Every opportunity in a game is given an xG value, and these values can be summed to represent the performance of a team or of individual players. A higher xG indicates a better performance. When a team scores fewer goals than its xG, it has either been a little unlucky or has squandered its chances; when it scores more, that may mean luck or clinical finishing. xG uses several different factors to differentiate the quality of a scoring opportunity. xG models tend to differ a little from one another, but the usual variables that go into the equation are the distance and angle of the shot (with respect to the goal), the type of chance (shot or header, sometimes also which foot), the type of pass that set it up, whether it came in open play or from a set piece, and in some cases the opposition players between the ball and the goal.
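To make the arithmetic concrete, here is a minimal Python sketch of how per-shot xG values add up to a team total, and what they imply about the number of goals a team might score. The shot values are invented for illustration, the function names are my own, and the calculation assumes each shot is independent of the others, which real matches do not guarantee.

```python
def total_xg(shots):
    """Sum the per-shot scoring probabilities into a team xG total."""
    return sum(shots)


def goal_distribution(shots):
    """Probability of scoring exactly k goals from a list of shot xG values.

    Assumes the shots are independent, so the goal count follows a
    Poisson-binomial distribution, built up one shot at a time.
    """
    dist = [1.0]  # before any shot is taken: 0 goals with certainty
    for p in shots:
        nxt = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            nxt[goals] += prob * (1 - p)   # this shot is missed
            nxt[goals + 1] += prob * p     # this shot is scored
        dist = nxt
    return dist


# Hypothetical xG values for one team's shots in one match (made up).
shots = [0.05, 0.10, 0.35, 0.76]

print("Total xG:", round(total_xg(shots), 2))
for goals, prob in enumerate(goal_distribution(shots)):
    print(f"P({goals} goals) = {prob:.3f}")
```

The mean of the resulting distribution equals the summed xG, which is why the total works as a one-number summary. The spread around it is wide, though: with these four made-up chances, the team fails to score at all a little over 13% of the time.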
Instead of focusing too heavily on the method used to calculate xG, I will spend some time showing how it can be used, as I think this is a topic that is typically misunderstood. There’s a lot of interest in xG right now, but with that has come a) some distortion of the measure’s uses and a blurring of the understanding of how xG works, and b) a backlash. There will always be those that will argue that stats are meaningless (usually immediately after stats don’t say what they want them to say), or that maintain the “eye for the game” is better than any statistic. But there are also people that confidently declare what is and isn’t science despite not knowing what science really is, so there are losing battles all around us.
Craig Burley, captain of Totally Informed and Not Remotely Ridiculous FC, perfectly encapsulates the cynicism that people feel regarding xG, in this angry response. Of course, Craig’s position appears born almost entirely out of ignorance, since it isn’t clear that he actually understands what is happening, where he is, or why he’s shouting, but one thing is for sure: expected goals is nonsense and no one can tell him otherwise. Results are all that matters, and attempting to understand results is “nerd nonsense”. Most alarming, though, is that Craig Burley apparently doesn’t get presents at Christmas. No wonder he’s so angry.
The point is, xG isn’t the holy grail of football statistics. But it has become such a popular stat that its importance probably gets overstated. Burley is right, to some extent, that the result is what really matters. But the point he is missing is that xG helps us understand that result.
Expected goals serves as a good measure of performance, and it can be a useful tool in predicting future outcomes. It’s possible for a team to get lucky and win a few games in a row without playing well, but the probability is that the luck will eventually run out, and their results will regress to fit their performances in the near future. If we focus only on the end result (listen up Craig), we won’t be able to identify that this is what is going on, and we lack a full understanding of the situation. Using xG is a method for plugging that gap, and helps us predict outcomes with greater reliability.
There is a rich literature on predicting football, a topic that has been developing in scientific research since Maher (1982) first published on the topic. However, probably the most important contribution was made by Dixon & Coles (1997). Their paper develops a model that can be used for predicting outcomes. The model uses maximum likelihood (a method of estimating the underlying process that generates the data you observe) to estimate the attack and defense parameters of each team, and the expected number of goals scored by both teams is calculated as a function of these parameters. The Dixon-Coles model advances on previous work in a number of ways, but most importantly, it models the probability of the goals scored by either team as dependent on the number of goals scored by their opposition. The number of goals scored by one team is affected by, and affects, the number of goals scored by the other. I used this model to predict a hypothetical Bundesliga season, based on actual results from this season, and xG scores from the same games.
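As a rough illustration of the structure described above, the sketch below computes the probability of a single scoreline under the Dixon-Coles formulation: two Poisson goal rates built from attack, defense, and home-advantage parameters, multiplied by a correction term for low-scoring results that is controlled by a dependence parameter `rho`. The parameter values are hypothetical, not fitted Bundesliga estimates, and the function names are my own.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with this mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def tau(x, y, lam, mu, rho):
    """Dixon-Coles correction, which only touches 0-0, 0-1, 1-0 and 1-1."""
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0


def scoreline_prob(x, y, home, away, home_adv, rho):
    """P(home scores x, away scores y); home and away are (attack, defense).

    Defense multiplies the opponent's scoring rate, so lower is better.
    """
    lam = home[0] * away[1] * home_adv  # expected home goals
    mu = away[0] * home[1]              # expected away goals
    return tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)


# Hypothetical parameters for two teams (illustrative only).
home_team = (1.4, 0.9)
away_team = (1.1, 1.0)

for score in [(0, 0), (1, 1), (2, 1)]:
    p = scoreline_prob(*score, home_team, away_team, home_adv=1.3, rho=-0.1)
    print(score, round(p, 4))
```

Setting `rho` to zero switches the correction off and leaves two independent Poisson counts, which makes it easy to see how much the dependence term shifts the low-scoring results.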
I have used the Dixon-Coles model, applied to data taken from Understat, and adapted Ben Torvaney’s article that predicted the rest of a Premier League season using xG (for anyone that wants to delve deeper into the coding for this model and other good football statistical analysis, I’d highly recommend Ben Torvaney’s work). From this, I have first computed estimates of each Bundesliga team’s attack and defense parameters, based on the Observed Goals and Expected Goals for each fixture from this season. These parameters are displayed in Figure 1.
As you can see, the estimates of each team’s offensive and defensive strength are strongly correlated. In both cases, Bayern are the strongest (or close to the strongest) team. However, in the defensive case, RB Leipzig give them a close run for their money, and even appear to be a slightly better defense according to the Observed Goals. Regarding the offensive strengths, Bayern are way out front, but Dortmund are the second-best team in the league. Interestingly, Dortmund appear to be stronger according to Observed Goals than Expected Goals, which you could either attribute to some good fortune, or (perhaps most likely) offensive efficiency and the team creating a lot of high-quality chances. Where they appear to fall down, though, is defensively, where they fall much further behind Bayern (and Leipzig).
Simulations based on xG
Using these parameters, I have simulated each game 10,000 times, and produced the final league table from these simulations. Table 1 lays out the final table using xG as a weighting against the Observed Goals. The model predicts the outcome from each game using Observed Goals, xG, and a calculation of home team advantage. It is basically showing the outcome of a hypothetical season, simulated 10,000 times, based on the calculated team strengths and match probabilities from this season.
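A stripped-down version of this kind of simulation can be sketched in a few lines of Python. This is not the code behind Table 1: the four teams and their attack and defense multipliers are made up, goals are drawn from independent Poisson distributions (leaving out the Dixon-Coles low-score correction and any xG weighting), and `simulate_season` is a hypothetical helper name.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_season(teams, home_adv, n_sims, seed=0):
    """Average points per team over n_sims simulated home-and-away seasons."""
    rng = random.Random(seed)
    totals = {name: 0 for name in teams}
    for _ in range(n_sims):
        for home, (h_att, h_def) in teams.items():
            for away, (a_att, a_def) in teams.items():
                if home == away:
                    continue
                home_goals = sample_poisson(rng, h_att * a_def * home_adv)
                away_goals = sample_poisson(rng, a_att * h_def)
                if home_goals > away_goals:
                    totals[home] += 3
                elif home_goals < away_goals:
                    totals[away] += 3
                else:
                    totals[home] += 1
                    totals[away] += 1
    return {name: pts / n_sims for name, pts in totals.items()}


# Hypothetical (attack, defense) multipliers; lower defense is better.
teams = {
    "Team A": (1.8, 0.7),
    "Team B": (1.3, 0.9),
    "Team C": (1.0, 1.1),
    "Team D": (0.8, 1.4),
}

table = simulate_season(teams, home_adv=1.3, n_sims=10_000)
for name, pts in sorted(table.items(), key=lambda item: -item[1]):
    print(f"{name}: {pts:.1f}")
```

Averaging over many simulated seasons smooths out the luck in any single one, which is the reason for running 10,000 of them rather than one.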
Bayern are way out in front, averaging just over 15 points more than 2nd-place RB Leipzig. Interestingly, Dortmund drop down to 3rd from their 2nd-place finish this season. However, there are only 2 points between 2nd and 3rd. The bottom three is similar to this season’s dwellers, but Stuttgart are going straight down instead of getting toed in by Union Berlin in the playoff. Another interesting difference is that Schalke move up from 14th to 12th, with 6 points more than they claimed this season. This suggests their down year may have been a little worse than they deserved.
Table 1: xG Simulations (x10000) - Final Table
|Pos|Team|W|D|L|GF|GA|Pts|
|---|---|---|---|---|---|---|---|
|11|1. FSV Mainz 05|11|8|15|47|60|40|
|12|FC Schalke 04|10|9|15|42|54|39|
|16|1. FC Nürnberg|7|8|19|31|62|28|
Following this, I’ve represented the results visually, using ridge plots for each team in Figure 2. This shows the distribution of points for each team in the simulations. There are two separate distributions for each team, one representing xG and the other representing the Observed Goals. For the most part, these line up pretty well. However, in Nürnberg’s case there is quite a large discrepancy between the points distribution from the simulation of Observed Goals and the simulation including xG. This seems to suggest that they may have been a little unlucky this season (supported by the difference between their actual and simulated league position). The distribution of average points also supports the idea that Schalke were unlucky, as their xG distribution is better than their Observed Goals distribution, and better than those of the teams around them.
The difference between Dortmund, Leipzig, Leverkusen, and Hoffenheim is also interesting. The highest points totals appear to be for Hoffenheim, but there appears to be a little more variance, which drags down the mean points total. Dortmund’s Observed Goals distribution is the highest of all, but their xG distribution looks to be pretty similar to those of the other three teams. Perhaps we got a little lucky this season? Or perhaps the model is missing something.
Simulations of a single game
Finally, Figure 3 looks at one game as an example of the distribution of simulated scoreline probabilities. The left-hand side represents the probability of the number of goals scored by the home and away teams, and the figure on the right-hand side represents the probability of the final outcome. As you can see, the most likely outcome was 1-1, while the actual outcome this season was, as I’m sure you remember, 4-2 to Schalke. While 1-1 has the highest probability, there is also a slightly higher probability that Schalke wins the game than Dortmund, suggesting they were the slightly better team.
However, the difference between the probability of a home win and an away win looks to be ~1-2%. This difference is substantively small, and may not even be statistically significant. The real takeaway from Figure 3 is that 1-1 is the single most likely scoreline, and that the two sides’ chances of winning the game are close to even.
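For anyone curious how a picture like Figure 3 is assembled, the sketch below builds a grid of scoreline probabilities from two goal rates and collapses it into home win, draw, and away win probabilities. The rates are invented rather than taken from the fitted model, the goals are treated as independent Poisson counts for simplicity, and the function names are my own.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson variable with this mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def scoreline_matrix(home_rate, away_rate, max_goals=10):
    """matrix[h][a] = P(home scores h, away scores a), truncated at max_goals."""
    return [
        [poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def outcome_probs(matrix):
    """Collapse the scoreline grid into (home win, draw, away win)."""
    home = draw = away = 0.0
    for h, row in enumerate(matrix):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


def most_likely_score(matrix):
    """Return the (home goals, away goals) cell with the highest probability."""
    cells = [(p, h, a) for h, row in enumerate(matrix) for a, p in enumerate(row)]
    _, h, a = max(cells)
    return h, a


# Hypothetical expected goals for one fixture (made up for illustration).
matrix = scoreline_matrix(home_rate=1.30, away_rate=1.35)
home, draw, away = outcome_probs(matrix)

print("Most likely score:", most_likely_score(matrix))
print(f"Home win {home:.3f}, draw {draw:.3f}, away win {away:.3f}")
```

With these made-up rates, 1-1 is the most likely scoreline, yet it accounts for only about 12% of the probability, which shows how the single likeliest score can still be a fairly unlikely event.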
Ultimately, these findings are built around relatively simplistic models, so they’re not totally reliable, but as you can see, xG does do a pretty good job in the prediction of outcomes. In order to produce a more accurate predictive model, there are a number of other variables that should be included. This is something I will touch on in part two, where I will use xG, along with other interesting variables, to explain Bundesliga outcomes.