A Stat for Must-See Matches

A constant conundrum for tennis fans is deciding which match to watch at any particular time. In this post, I look at how we can quantify the ‘must-seeness’ of a match based on its overall quality and competitiveness using player ratings.

As you have been following the Rogers Cup matches this week, there were probably many times when you had to choose one from multiple live matches to watch. Do you know why you made the choices you did? Did you go with whatever was scheduled on the best court? Or with the player with the highest ranking? Or with some gut feeling about which would be the closest match?

It is one of the most FOMO-inducing situations for tennis fans, especially in the early days of an event shared by the men’s and women’s tours. Of course, by toggling among multiple matches we can convince ourselves that we aren’t really having to choose. But I don’t think our attention is really ever fooled.

Maybe it is just a sign of age creeping up on me, but I often find myself wanting some way to simplify my tennis-watching choices. Or, at least, to help me anticipate where the best viewing experience might be at any one time.

My usual practice has been to scan the betting odds of scheduled matches, which tell me which matches the market thinks are going to be close. But I find that my inclination isn’t always for the most competitive match. Overall quality is a key factor as well; basically, I would be willing to accept a bigger gap in the odds if the playing level of both players was still reasonably high.

I’ve chatted about this problem numerous times with Martin Ingram, who shares my obsession with such problems and who consistently finds a clever solution to most of them before I do. While we both agreed that competitiveness and quality aren’t the only factors that set up a great match (style and head-to-head are other big ones, for example), we thought these two factors alone might still do quite well in separating the wheat from the chaff.

So the basic idea is a stat that combines a measure of quality and competitiveness. Put mathematically,

$$ \text{Must-See} = \text{Competitiveness} + \text{Quality} $$

Since player ratings are our main resource for judging the relative skill of players, they are a natural choice for measuring both the overall ability of competitors (quality) and their gap in ability (competitiveness).

For quality, we can simply take the sum of the ratings of players. This is similar to the idea of a ‘bonus’ as described by Klaassen and Magnus. Below is the distribution of quality for tour matches in 2019. While men’s matches have a slightly higher mean, both tours exhibit skew to the right. That tail, the 5000+ group, is where the highest-quality matchups of the season can be found.

Competitiveness, on the other hand, is all about the difference in player ratings. This is the ‘malus’ in the language of Klaassen and Magnus. In fact, the ratings difference is the only random quantity that goes into match predictions in Elo and other paired comparison forecasting systems.
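
To make the link between a ratings gap and a forecast concrete, here is a minimal sketch using the conventional Elo logistic curve. The function name `elo_win_prob` and the 400-point scale are assumptions for illustration; the exact rating system and scale behind the numbers in this post may differ.

```python
def elo_win_prob(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win probability for player A under the conventional Elo curve.

    The 400-point scale is the usual Elo default and is assumed here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Only the difference matters: both pairs below are 100 points apart,
# so they get the same forecast (about 0.64 for the stronger player).
print(elo_win_prob(2100, 2000))
print(elo_win_prob(1600, 1500))

# Equal ratings give an even match.
print(elo_win_prob(1800, 1800))
```

Because the forecast depends only on the gap, two journeymen and two top-10 players with the same gap look equally competitive, which is why a separate quality term is needed.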

The competitiveness distribution above shows us that there is little difference between tours. Also, the massive right-skew means that it is more typical to see a close match than a heavily lopsided one. It would be a much less interesting sport if the distribution didn’t exhibit this kind of skewness.

So on the one hand we have match quality with a vast range and the highest quality being quite rare, while match competitiveness has a narrower range and highly competitive matchups are relatively common. How do we combine these two measures considering these opposing properties?

One idea is to focus on quality but introduce a penalty based on the lopsidedness of the match. We can do this by putting the competitiveness in terms of the expected win probability for the stronger player, and then measuring that expectation against the most competitive match (a 50-50 win chance).

$$ \text{Must-See} = (\text{Quality} - \text{Average}) \times (1 - (\text{Win Prob} - 0.5)) $$

In this way, the quality-over-average is decreased in proportion to the percentage point distance from an even match.
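
The formula is simple enough to write as a small function. The sketch below is one possible reading of it, not the code used for this post; the name `must_see`, its argument names, and the example numbers are all hypothetical.

```python
def must_see(quality: float, average_quality: float, fav_win_prob: float) -> float:
    """Quality-over-average, discounted by distance from a 50-50 match.

    quality          sum of the two players' ratings
    average_quality  average quality across matches (the baseline)
    fav_win_prob     expected win probability of the stronger player, in [0.5, 1]
    """
    if not 0.5 <= fav_win_prob <= 1.0:
        raise ValueError("fav_win_prob must be the favourite's probability, in [0.5, 1]")
    competitiveness = 1.0 - (fav_win_prob - 0.5)  # 1.0 for an even match, 0.5 at worst
    return (quality - average_quality) * competitiveness


# Hypothetical numbers: a match 1000 points above average quality.
even = must_see(quality=4600, average_quality=3600, fav_win_prob=0.50)
lopsided = must_see(quality=4600, average_quality=3600, fav_win_prob=0.62)
print(round(even), round(lopsided))  # the 62-38 match keeps 88% of the even-match score
```

Note that the multiplier never drops below 0.5, so even a near-certain result retains half of its quality-over-average.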

We can get an idea of how this approach works by looking at the top 10 ‘must-see’ scores in 2019 for the men and women. Although the intended purpose is to sort the quality among matches happening in the same week and round, this still gives us some sense of the stat’s performance. The predominance of the Big 3 on the men’s side and Halep, Williams and Barty on the women’s side suggests it is doing something sensible. Looking at the scores retrospectively points to a few matches that didn’t live up to their expected level of interest, the Australian Open men’s final and the Wimbledon women’s final being two cases in point.
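
To see the penalty at work, the sketch below ranks three rows from the men's table at the end of this post. It assumes the Quality column is already the quality-over-average term, which seems to be the case since multiplying it by the Competitiveness column reproduces the listed scores. The tuple layout and variable names are my own.

```python
# (event, matchup, quality, competitiveness), copied from the 2019 men's table
matches = [
    ("French Open", "Nadal v Federer", 1660, 0.76),
    ("Wimbledon", "Nadal v Federer", 1405, 0.95),
    ("Dubai", "Monfils v Cilic", 1028, 1.00),
]

# Sort by must-see score (quality times competitiveness), highest first.
ranked = sorted(matches, key=lambda m: m[2] * m[3], reverse=True)

for event, matchup, quality, competitiveness in ranked:
    score = quality * competitiveness
    print(f"{event:12} {matchup:16} {score:6.0f}")
```

The Wimbledon meeting of Nadal and Federer outranks their French Open meeting despite its lower quality, because the clay-court match was expected to be much more one-sided.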

Undoubtedly, the ‘must-see’ stat presented here won’t meet everyone’s individual definition of a ‘have-to-watch’ match. But I think, by bringing together two key ingredients of a great match into a single number, it can still be a useful stat to consider when scheduling your own viewing or when looking back on which matches did and didn’t live up to their billing.

Top 10 must-see scores in 2019: men

| Event | Player | Opponent | Match Result | Quality | Competitiveness | Must-See Score |
|---|---|---|---|---|---|---|
| Australian Open | Rafael Nadal | Novak Djokovic | 6-3 6-2 6-3 | 1661 | 0.88 | 1462 |
| Rome | Rafael Nadal | Novak Djokovic | 6-0 4-6 6-1 | 1591 | 0.89 | 1416 |
| Wimbledon | Novak Djokovic | Roger Federer | 7-6(5) 1-6 7-6(4) 4-6 13-12(3) | 1469 | 0.93 | 1366 |
| Wimbledon | Rafael Nadal | Roger Federer | 7-6(3) 1-6 6-3 6-4 | 1405 | 0.95 | 1335 |
| Mutua Madrid Open | Novak Djokovic | Dominic Thiem | 7-6(2) 7-6(4) | 1370 | 0.93 | 1274 |
| French Open | Rafael Nadal | Roger Federer | 6-3 6-4 6-2 | 1660 | 0.76 | 1262 |
| French Open | Novak Djokovic | Dominic Thiem | 6-2 3-6 7-5 5-7 7-5 | 1497 | 0.83 | 1243 |
| Mutua Madrid Open | Roger Federer | Dominic Thiem | 3-6 7-6(11) 6-4 | 1360 | 0.90 | 1224 |
| French Open - Paris | Rafael Nadal | Dominic Thiem | 6-3 5-7 6-1 6-1 | 1667 | 0.71 | 1184 |
| Dubai | Gael Monfils | Marin Cilic | 6-3 4-6 6-0 | 1028 | 1.00 | 1028 |

Top 10 must-see scores in 2019: women

| Event | Player | Opponent | Match Result | Quality | Competitiveness | Must-See Score |
|---|---|---|---|---|---|---|
| French Open | Ashleigh Barty | Marketa Vondrousova | 6-1 6-3 | 1222 | 0.98 | 1198 |
| Mutua Madrid Open | Simona Halep | Ashleigh Barty | 7-5 7-5 | 1235 | 0.94 | 1161 |
| Mutua Madrid Open | Simona Halep | Kiki Bertens | 6-4 6-4 | 1293 | 0.89 | 1151 |
| French Open | Marketa Vondrousova | Johanna Konta | 7-5 7-6(2) | 1138 | 0.98 | 1115 |
| French Open | Petra Martic | Marketa Vondrousova | 7-6(1) 7-5 | 1102 | 1.00 | 1102 |
| Wimbledon | Simona Halep | Serena Williams | 6-2 6-2 | 1094 | 1.00 | 1094 |
| Mutua Madrid Open | Petra Kvitova | Kiki Bertens | 6-2 6-3 | 1147 | 0.93 | 1067 |
| Miami Open | Ashleigh Barty | Karolina Pliskova | 7-6(1) 6-3 | 1063 | 0.99 | 1052 |
| Stuttgart | Petra Kvitova | Kiki Bertens | 7-6(3) 3-6 6-1 | 1044 | 0.99 | 1034 |
| Australian Open | Petra Kvitova | Ashleigh Barty | 6-1 6-4 | 1030 | 1.00 | 1030 |

About Stephanie Kovalchik
Blog Founder, Senior Data Scientist at Zelus Analytics