Small sample sizes are typical of head-to-heads in pro tennis. Seeding and knockout draws mean that many pro players have met no more than a handful of times, and sometimes never at all. Still, I am frequently surprised when I come across a sparse head-to-head between seasoned players. It got me wondering whether that reaction is even reasonable, and how you might quantify how overdue some matchups are.

If you have ever tried to predict outcomes in tennis, you’ve had to think about the ‘head-to-head’. Commentators get worked up about it too, and “Did you factor in their head-to-head?” is likely the first question you will be asked about any model for predicting wins in tennis.

The amount of attention paid to head-to-heads has always seemed to me out of proportion to the information most of them actually contain. Look at players rated 1800 or higher at any point in time and the average head-to-head between them is 0.6 matches. Pick two top players at random and they are quite likely never to have played each other.
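To make that kind of average concrete, here is a minimal sketch of how a mean head-to-head over all possible pairs could be computed. The function name and the toy data are hypothetical, not the data behind the 0.6 figure. The point it illustrates is that pairs who have never met count as zero, which is what drags the average down.

```python
from collections import Counter
from itertools import combinations


def mean_head_to_head(players, matches):
    """Average number of meetings over every possible pair of players.

    Pairs that have never met contribute a zero to the average.
    """
    top = set(players)
    # Count meetings per unordered pair, ignoring matches involving outsiders.
    counts = Counter(
        frozenset(m) for m in matches
        if m[0] in top and m[1] in top and m[0] != m[1]
    )
    pairs = [frozenset(p) for p in combinations(sorted(top), 2)]
    return sum(counts[p] for p in pairs) / len(pairs)


# Hypothetical toy data: four players, three matches.
players = ["A", "B", "C", "D"]
matches = [("A", "B"), ("B", "A"), ("C", "D")]
print(mean_head_to_head(players, matches))  # 3 meetings / 6 pairs = 0.5
```

Even in this toy example, four of the six possible pairs have no history at all.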

I think people look at the head-to-heads of 50+ matches among the Big 3 and somehow conclude they are representative of the sport, when they are exactly the opposite. I was reminded of that this week when Roger Federer returned to tour competition in Doha after a 13-month hiatus. With a narrow win over Dan Evans (a head-to-head of 3 before that match), Federer advanced to play Nikoloz Basilashvili, whom he had played only once before. With 32 years of pro-level play between them, this seemed to me an unusually sparse history.

But is it? How can we even say what a typical head-to-head should be?

That got me thinking about what factors might predict the length of a head-to-head. Players who regularly reach the later rounds of a tournament are more likely to meet, which suggests the overall skill of the two players is a key factor. And more years on tour, with more events played, means more opportunities to add to the tally of any head-to-head.

Do skill and match age explain head-to-head counts? I gathered the year-end cumulative head-to-heads of all male players with ratings of 1800 or more (a reasonable definition of ‘top’ players) from 2003 to the present. Using the combined ratings as a measure of total skill and the combined career matches played as ‘pro age’, the chart below summarizes what I found.

The plot reminds me a bit of a high-jump ramp, with head-to-heads of 10+ being outliers among the large mass of low counts. But when those longer head-to-heads occur, they do appear to be more common among the more experienced and more highly rated players. A head-to-head of zero or one is quite likely for everyone, but less so for stronger players with more years on tour.

Given the non-linear nature of the relationships in Figure 1, I used a generalized additive model (GAM) for the head-to-head count, with the log of combined skill and the log of combined career matches entering as a bivariate smooth. Letting $Y$ be the cumulative head-to-head for two players in a given season, the expected count is

$$\log(E[Y]) = \mu + f(\mathrm{log\_skill}, \mathrm{log\_matches}; \theta)$$

and $Y \sim \mathrm{Poisson}(E[Y])$. Here $f(\cdot)$ is a smooth function that lets us capture non-linear patterns in how skill and match age relate to head-to-head counts.
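For readers who want to see the Poisson, log-link part of this in code, here is a rough sketch. It is not the GAM described above: the smooth $f$ is swapped for two plain linear terms, and the model is fitted by Newton's method in pure Python. The data are hypothetical and noise-free, generated from known coefficients (so the counts are not integers) purely so the fit can be checked. A real GAM would replace the linear terms with a penalized spline basis.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_poisson(rows, y, iters=25):
    """Fit log(E[Y]) = b0 + b1*x1 + b2*x2 by Newton's method.

    A log-linear stand-in for the bivariate smooth, not a GAM.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Start from the intercept-only fit.
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(iters):
        mu = [math.exp(sum(b * v for b, v in zip(beta, x))) for x in X]
        # Gradient of the Poisson log-likelihood and the negative Hessian.
        grad = [sum((yi - mi) * x[j] for x, yi, mi in zip(X, y, mu))
                for j in range(p)]
        hess = [[sum(mi * x[j] * x[k] for x, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical grid of combined ratings and combined career matches.
skills = [3600, 4000, 4400, 4800]
matches = [200, 600, 1200, 2000]
raw = [(math.log(s), math.log(m)) for s in skills for m in matches]
mean_s = sum(r[0] for r in raw) / len(raw)
mean_m = sum(r[1] for r in raw) / len(raw)
# Centre the predictors to keep the Newton steps well conditioned.
rows = [(s - mean_s, m - mean_m) for s, m in raw]

true_beta = [0.0, 4.0, 1.0]  # assumed coefficients, for illustration only
y = [math.exp(true_beta[0] + true_beta[1] * a + true_beta[2] * b)
     for a, b in rows]

beta = fit_poisson(rows, y)
print([round(b, 3) for b in beta])
```

Because the synthetic counts equal their expected values exactly, the fit should land back on the assumed coefficients. With real head-to-head counts you would expect noise, and the curvature that motivated the smooth in the first place.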

The results from that model are summarized in the heatmap below. Here we see the expected head-to-head count by the total skill and total career matches of the competitors. What should immediately stand out is the large swathe of player types expected to have 1 or fewer meetings. Even a head-to-head of 5 would be too small to say much about matchup effects, yet you would need a pair with 1,600 or more combined matches played and a combined rating of 4000+ (both in the top quarter of these attributes among possible matchups) to even expect that much history between players.

When they met in Doha, Federer and Basilashvili had a combined rating over 4400 and just over 1,700 combined career tour-level matches played, which would suggest an expected head-to-head closer to 6 than 1.
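Since the model is Poisson, it is straightforward to put a rough number on how unusual a sparse history is. The sketch below computes the chance of seeing at most $k$ meetings given an expected count. Using exactly 6 for Federer and Basilashvili is my assumption, a round reading of "closer to 6 than 1", and the function name is made up for illustration.

```python
import math


def prob_at_most(k, expected):
    """P(Y <= k) when Y ~ Poisson(expected)."""
    return sum(
        math.exp(-expected) * expected**i / math.factorial(i)
        for i in range(k + 1)
    )


# One prior meeting when roughly 6 were expected (6 is an assumed round value).
print(round(prob_at_most(1, 6.0), 3))  # 0.017

# No prior meetings at all when 4.4 were expected.
print(round(prob_at_most(0, 4.4), 3))  # 0.012
```

Under those assumptions, a history this thin would turn up in fewer than 2 in 100 pairings with the same combined skill and experience. For a zero count the calculation reduces to $e^{-\mu}$, so the larger the expected head-to-head, the more surprising a blank record becomes.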

With a model for expected head-to-head, it is then easy to look for the head-to-heads that fall well below expectation. I used the model to find the most surprising 0 counts among head-to-heads. The following table lists 10 of the most surprisingly sparse. The Big 3 appear in all of them, as the long experience and high ratings of these players mean they should have played just about every good player out there by this point. Yet we’ve so far been denied clashes between Andrey Rublev and Djokovic, as well as Dustin Brown and Roger Federer. Interestingly, Ricardas Berankis has escaped that challenge twice over, having never met Roger Federer or Rafa Nadal so far in his career. A pro player of his talent and experience would have been expected to have played each of them 3 to 4 times by now.

Player Opponent Expected H2H
Novak Djokovic Pablo Cuevas 4.4
Ricardas Berankis Roger Federer 4.2