International breaks drag on, don’t they? I’m sure you’re struggling for something to keep you entertained in the absence of domestic football, so what better than some math? Okay, fine, but maybe you’ll settle for some pretty representations of math. This week’s Westfalenstats takes a look at some of the good work being done using data visualization to tell stories about football, but first, some more resources.
Twitter has a lot to offer
Last week I presented a bit of a summary of some of the best places to look for data and for learning about football analytics. This week, I thought I’d point you in the direction of some of the best people to listen to.
For the most part, I wouldn’t recommend delving too deep into Football Twitter, but Football Analytics Twitter is another story! There are plenty of accounts well worth following, but here’s a short list for your perusal:
- Michael Caley – An early proponent of Expected Goals and an accessible voice for the nerds promoting football analytics
- Mike Goodman – Co-host of the Double Pivot podcast with Michael Caley and former editor for StatsBomb
- Ted Knutson – StatsBomb CEO and the devil that popularized the use of radars to visualize player performance data
- Laurie Shaw – Harvard Data Scientist who also studies sports analytics
- David Sumpter – University of Uppsala Mathematician, author, and the man teaching the Mathematical Modelling of Football course
- Thom Lawrence – StatsBomb’s Chief Technology Officer
- Javier Fernández – Barcelona Data Scientist and a great resource for both academic research on sports analytics and an insight into the real-world application of this research
- Luke Bornn – SFU Statistics professor (formerly Harvard) and Sacramento Kings VP of Strategy and Analytics (formerly Roma)
- American Soccer Analysis – Nerds who focus on US soccer, from the national teams to MLS and NWSL. Even if you’re not a massive fan of US soccer, these guys are doing some of the best work out there
This certainly isn’t an exhaustive list, but these accounts are a good place to start, at least.
Using Data Visualizations to Tell Football Stories
There is some excellent work being done to develop new ways to present football data and to tell interesting stories in engaging and accessible ways. This is a really fun area of football analytics, because it leads to some really cool design work, and when the data visualization is good, it makes complicated stories that much easier to understand. There’s always a fine balance between a visualization that looks good and one that the casual reader can understand, but when that balance is right, it is a joy to behold.
Plenty of work has been done to visualize goals, expected goals, and shots, in part because they are so fundamental to the game and in part because doing so is easier than visualizing other areas of the game (shots all have one intended destination). Visualizing passing is a little harder, because there’s a lot more going on, but it’s still very worthwhile. Benoit Pimpaud looked at passing in two articles, the first visualizing player interactions using pass networks and pass sonars, and the second visualizing expected assists.
I really like the general idea behind pass networks and sonars, but I think there is still a lot of work to be done to make it easier for the reader to understand what they’re looking at. As they stand, they can be a little difficult to read.
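If you’ve never seen how a pass network is put together, the data behind it is simpler than the picture suggests. Here’s a minimal sketch, not Benoit’s actual method: it assumes a hypothetical list of pass events (the position labels, coordinates, and the `build_pass_network` helper are all made up for illustration), places each player at the average spot they passed from, and counts the passes between each pair.

```python
from collections import Counter, defaultdict

# Hypothetical pass events: (passer, receiver, x, y), where x, y is where the
# pass was played from on a 0-100 pitch. All values are invented.
passes = [
    ("CB", "CM", 20, 50),
    ("CB", "CM", 30, 40),
    ("CM", "ST", 50, 50),
    ("CM", "CB", 40, 60),
    ("ST", "CM", 70, 40),
]

def build_pass_network(passes, min_passes=1):
    """Return (nodes, edges) for a simple pass network.

    nodes: player -> average (x, y) of the passes they played
    edges: (passer, receiver) -> number of passes, keeping only pairs
           with at least min_passes passes
    """
    edge_counts = Counter()
    positions = defaultdict(list)
    for passer, receiver, x, y in passes:
        edge_counts[(passer, receiver)] += 1
        positions[passer].append((x, y))

    nodes = {
        player: (
            sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts),
        )
        for player, pts in positions.items()
    }
    edges = {pair: n for pair, n in edge_counts.items() if n >= min_passes}
    return nodes, edges

nodes, edges = build_pass_network(passes)
```

In a plot, `nodes` would give you where to draw each player and `edges` how thick to draw the line between them. Raising `min_passes` is one easy way to thin out a cluttered network, which is a big part of what makes these hard to read.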
With regard to the expected assists visualizations, I think these are doing a better job. They provide a lot of information in a single figure, and for the most part, I think it’s pretty easy to understand what you’re looking at.
Generally speaking, I’m a strong advocate of keeping data visualizations simple. Overthinking things or trying to do something overly complicated can often lead to something that looks very appealing but doesn’t necessarily do a very good job of telling the story.
I really want to like these flower plots, as developed by Eliot McKinley and inspired by similar work done by Todd Whitehead comparing NBA players, because they are beautiful, but I’m not entirely convinced they actually do a great job of telling a story.
The problem with using transparency to encode data is that it’s very difficult for the viewer to quantify what a given level of transparency really means. I can see that, for example, Brian White is doing a lot less than Carlos Vela, but I can’t really gauge the size of the difference between the two players in any category.
I wanted to share these as an example of some excellent design work that seems, to me at least, to have lost sight of the visualization’s purpose. It is a really easy trap to fall into, and one I’ve definitely fallen into on multiple occasions.
Peter McKeever discussed his work for Opta Pro on ball progression, in particular looking at the diamond plot that he uses to identify the players that are high volume carriers and passers.
Peter’s Twitter thread discussing the design choices and the logic behind the diamond plot is a good insight into all the thought that goes into really good data visualization. It also offers a lot of really helpful tips on getting this stuff right.
Finally, some more work from Benoit Pimpaud, this time looking at clusters of assists-shots in the Premier League last season. These are clustered pairs of assists and shots for each team, and the more yellow the cluster, the larger it is (that is, the more assist-shot pairs it contains).
You might need to zoom in to get a good look at each cluster, but I think they’re pretty interesting. You can identify some of the typical ways that teams are getting shots; for example, Liverpool seem to create a lot of goalscoring opportunities from out wide. You can also identify playing style, to some extent, with teams that play more direct creating shots from longer passes. I think this is a good example of making an attractive visualization that also provides a lot of information, and tells the intended story.
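For anyone curious what "clustering assists-shots" might look like under the hood, here’s a rough sketch. I don’t know which algorithm Benoit used, so this swaps in plain k-means as an assumption, and the coordinates are invented. Each assist-shot pair is treated as four numbers (where the pass started and where the shot was taken), and similar pairs get grouped together.

```python
import math

# Hypothetical assist-shot pairs: (pass_start_x, pass_start_y, shot_x, shot_y)
# on a 0-100 pitch. Values are invented purely to illustrate the idea.
pairs = [
    (80, 10, 90, 45),  # balls played in from out wide
    (82, 12, 92, 50),
    (78, 8, 91, 48),
    (60, 50, 85, 50),  # passes played through the middle
    (62, 48, 86, 52),
    (58, 52, 84, 49),
]

def kmeans(points, k, iterations=20):
    """Plain k-means. Centroids are seeded with evenly spaced points from the
    input (an arbitrary choice for this sketch) so runs are repeatable."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each pair to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids

labels, centroids = kmeans(pairs, k=2)

# Cluster sizes: the quantity that would drive the colour of each cluster.
sizes = [labels.count(c) for c in range(2)]
```

Each centroid describes a "typical" assist-shot for that cluster, and `sizes` tells you how often the team creates that kind of chance. How many clusters to ask for (`k`) is a judgment call, and real event data would need far more pairs than this toy example.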
What do you guys think of these visualizations? Have you seen any that you really like (or dislike)? And are there any Twitter analysts that you’d highly recommend?