Have Matches Become Harder to Predict in the Time of Covid?

The pandemic has caused the most sustained disruption to the tennis calendar the sport has ever faced. And while tennis has returned in some form over the past six months, it has not been a return to normal. With all of the challenges players and events have faced, many of us are likely wondering whether pro competition has changed in some fundamental way. In this post, I try to shed light on this question by looking at trends in event predictability before and during the pandemic.

It was exactly one year ago that the first major sporting event—the 2020 ATP Masters at Indian Wells—was cancelled due to the coronavirus. In the weeks that followed, the reality of the global health threat began to sink in, and many speculated that no other professional tennis match would be played in 2020.

Fortunately, by the time of the US Open, the USTA had a plan for proceeding under much more restrictive, covid-safe conditions. Since then, many top events, including two other Grand Slams, have operated successfully without any apparent direct harm to staff, players, or the local communities where events have been hosted. It is still unclear what long-term harm may arise from the ongoing financial losses that both events and players have weathered.

As tennis has made its comeback, tournaments and players have both had to adapt in significant ways. The absence of fans is the most obvious sign that pandemic tennis is not a return to normal. We can imagine the many other ways that the unobserved routines of tennis players have changed, from the way they manage travel to the way they maintain their training schedules.

Have they adapted well? Or has the pandemic had an impact on the performance of tennis players in any measurable way?

One way to try to measure a shift in the nature of tennis performance is by looking at the predictive performance of some of the better tennis models we have available. The logic here is that a predictive model that has performed well over a long period in normal times would be expected to continue to do so unless there was some fundamental shock to the sport that the model hadn’t accounted for. The introduction of new racquet technology that gave some players superhuman forehands would be such a shock. Could the pandemic be another?

For my predictive model of choice, I’ve focused on the match predictions from my Elo-based player ratings, which have had a historical accuracy at Grand Slams of 70-75%. The chart below shows the Grand Slam log-loss using these player ratings from 2010 through the 2021 Australian Open; a higher log-loss means the results were harder to predict. The red region marks the pandemic period. We see that the year-to-year variation in log-loss has stayed within a range of 0.1 in normal times. Only the 2020 French Open had a log-loss higher than is consistent with historical year-to-year variation. All other pandemic Slam performances have so far been within the usual range of randomness.
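To make the metric concrete, here is a minimal sketch of how a log-loss like this can be computed from Elo ratings. It assumes the textbook Elo win-probability formula with a 400-point scale, which may differ from the exact variant behind the ratings used in this post, and the ratings in the example are made up purely for illustration.

```python
import math

def elo_win_prob(rating_a, rating_b, scale=400.0):
    """Textbook Elo probability that player A beats player B.

    The 400-point scale is the conventional default, assumed here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))

def mean_log_loss(winner_probs, eps=1e-15):
    """Mean log-loss given the pre-match probability assigned to each actual winner.

    Probabilities are clipped away from 0 and 1 so the logarithm stays finite.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in winner_probs]
    return -sum(math.log(p) for p in clipped) / len(clipped)

# Hypothetical (winner rating, loser rating) pairs, not real match data.
matches = [(2100, 1900), (1850, 1950), (2000, 2000)]
probs = [elo_win_prob(winner, loser) for winner, loser in matches]
loss = mean_log_loss(probs)
print(f"mean log-loss: {loss:.3f}")
```

A useful reference point: a model that gives every match a 50/50 chance has a log-loss of ln 2, or about 0.693. Lower values mean the winners were, on average, given higher pre-match probabilities.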

Figure 1: Mean log-loss for men's and women's singles matches at Grand Slams from 2010 - 2021.

To get a better handle on whether the hard-to-predict 2020 French Open was just an anomaly, I’ve looked at a rolling average of log-loss across the other tournament tiers. On the men’s side, most events during the pandemic have been at the 250 level. Among these, we see a similar pattern: surprisingly high log-loss at clay court events but nothing out of the norm on hard courts.
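For readers who want to try a similar smoothing themselves, a rolling average per tier and surface can be sketched as follows. The window size, the `(tier, surface, log_loss)` event layout, and the sample values are all assumptions for illustration; the post does not state which window was used for the charts.

```python
from collections import defaultdict, deque

def rolling_mean(values, window):
    """Trailing mean over the last `window` values (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is dropped by the append below
        buf.append(v)
        total += v
        out.append(total / len(buf))
    return out

def rolling_by_group(events, window):
    """Rolling mean of event log-loss within each (tier, surface) group.

    `events` must already be in chronological order.
    """
    groups = defaultdict(list)
    for tier, surface, log_loss in events:
        groups[(tier, surface)].append(log_loss)
    return {key: rolling_mean(vals, window) for key, vals in groups.items()}

# Hypothetical per-event log-loss values, not real results.
example_events = [
    ("250", "hard", 0.60),
    ("250", "clay", 0.62),
    ("250", "hard", 0.58),
    ("250", "clay", 0.80),
]
rolling = rolling_by_group(example_events, window=2)
```

A short window reacts quickly to a run of surprising events but is noisy; a longer one is smoother but slower to show a shift.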

Figure 2: Rolling average log-loss for men's singles matches by tournament tier and surface, 2010 - 2021.

On the women’s side, there are fewer data points to go on. The clay court events during the pandemic have been on the higher end of the usual range of log-loss, but nothing as far outside of the historical distribution as we see for the men’s competition.

Figure 3: Rolling average log-loss for women's singles matches by tournament tier and surface, 2010 - 2021.

For many players, the 2020 Roland Garros was the first major event they played after a break of more than six months. The wintry conditions in Paris were an additional obstacle that players struggled with. These factors alone could explain why the clay events at the end of 2020 were more random than other pandemic events have been.

Predictive performance is only one angle for assessing the stability of tennis competition over time. This measure paints a relatively positive picture, as most events have not become notably easier or harder to predict. After six months of adjustment to tennis in the time of covid, there is even more reason to hope that things will only get better from here on.