Today, as part of the final R16 lineup at the French Open, Serena Williams will have the opportunity to bring her near-perfect record over Maria Sharapova to a staggering 20 to 2. Many have wondered how much of Williams’ dominance over the ‘Unstoppable’ author comes from something more than pure ability. In this post, we attempt to understand the inexplicable part of their head-to-head.
Although many were skeptical of their chances of advancing to the fourth round, all eyes will nonetheless be on Maria Sharapova and Serena Williams when they battle for a spot in the 2018 French Open quarterfinals. Still, that attention may say little about the quality we can expect from their match, as much of the interest in the rivalry has more to do with the drama of their polar-opposite personas than with the competitiveness of their matches.
How Lopsided, Really?
In their 21 meetings, Sharapova has had just two wins, both in 2004, when Sharapova skyrocketed to the top of the tour as a teenager. Many have been baffled by Sharapova’s drought since 2004, given that she was one of the best players in the world for many of those meetings. Four of Sharapova’s losses came when she was ranked No. 2 in the world and Serena was World No. 1; six matches in total were when Sharapova and Williams were separated by just 1 spot in the official rankings.
Unsurprisingly, those stats have led to all kinds of theories about the lack of rivalry between Williams and Sharapova. Is it a bad matchup? Is Serena more determined to win against Maria? Does Maria lack the belief to win against Serena?
None of these theories considers the possibility that Serena’s ranking, as seems to be the case at this year’s French Open, has simply failed to capture what she was capable of at the time. If we looked at a more accurate measure of Serena’s ability to win when she has faced Maria, perhaps her wins would be totally unremarkable.
Using surface-weighted Elo ratings, we can get a better picture of how unlikely each of Serena’s 19 wins might have been. The chart below shows that, for most of her head-to-head versus Maria, Serena had win expectations between 60 and 80%, so the matchups were not as close as their rankings might suggest.
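As a rough illustration of how an Elo gap translates into a win expectation, the sketch below uses the standard logistic Elo formula with a 400-point scale. The surface weighting behind the chart is not described in this post, so the function simply takes two ratings that are assumed to be already computed; the function name and the example ratings are hypothetical.

```python
def elo_win_prob(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win probability for player A under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))


# Hypothetical ratings, chosen only to show a 150-point gap.
print(round(elo_win_prob(2150, 2000), 3))
```

Under this formula, a 150-point edge corresponds to a win expectation of roughly 70%, which sits inside the 60 to 80% band described above.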
What does this say about the impressiveness of Serena’s 19 wins?
Using a simulation, we can say how probable a record of 19 wins or better would be given Serena’s win predictions above. The distribution below of the number of wins Serena could have achieved shows that 14 was the most likely outcome, even accounting for Serena’s historical advantage in her meetings with Maria.
The chance of 19 wins or better is just 1.5%. So, given what Serena was expected to do based on her Elo-rated ability, it seems that she has gone well above expectation and there is likely more to her record over Maria than ability alone.
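A simulation of this kind can be sketched as a simple Monte Carlo: treat each match as an independent coin flip weighted by the pre-match win expectation, replay the whole head-to-head many times, and count how often each win total comes up. The per-match probabilities are not listed in this post, so the flat 70% used below is a placeholder and will not reproduce the figures above; the function names and the independence assumption are also ours.

```python
import random
from collections import Counter


def simulate_win_totals(win_probs, n_sims=100_000, seed=0):
    """Replay the head-to-head n_sims times; return a Counter of total wins."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_sims):
        # Each match is an independent draw against its own win expectation.
        wins = sum(rng.random() < p for p in win_probs)
        totals[wins] += 1
    return totals


def prob_at_least(totals, k):
    """Share of simulated head-to-heads with k or more wins."""
    n_sims = sum(totals.values())
    return sum(count for wins, count in totals.items() if wins >= k) / n_sims


# Placeholder: 21 matches at a flat 70% win expectation (NOT the real per-match values).
placeholder_probs = [0.70] * 21
totals = simulate_win_totals(placeholder_probs)
print("most likely win total:", totals.most_common(1)[0][0])
print("P(19 or more wins):", prob_at_least(totals, 19))
```

Swapping the placeholder list for the actual match-by-match Elo expectations would give the distribution and tail probability for the real head-to-head.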
To get a better grasp of what exactly Serena does exceptionally well or Maria does especially poorly when these two players meet, we can do a matched comparison. Essentially, we want to look at what Serena and Maria have done historically in similar situations in terms of opponent difficulty. For Maria, for example, this would mean looking at matches where she has lost and been within 150 Elo points of her opponent, as this has been the ‘typical’ scenario for Maria when facing Serena.
If Serena and Maria played other ‘equal’ opponents in the same way they have against each other, we would expect to see very similar match statistics as we see in their own head-to-head. On the other hand, if there is something about their dynamic (strategic, psychological, or otherwise) that makes the way they play each other fundamentally different than the way they play any other tough opponent, we would expect to see a different match profile when comparing their head-to-head stats to the comparison group.
So what exactly is a reasonable comparison group?
Because Serena tended to be the player with the higher performance rating when she faced Maria, we look at matches she won against opponents within 150 Elo rating points (again, the median difference between her and Maria across their 21 matches). For Maria, we look at the same gap but in matches she lost, the more common scenario for her matches against Serena. This leaves a sample of 29 ‘comparable’ matches (including opponents like Venus Williams and Victoria Azarenka) for Serena and 32 ‘comparable’ matches¹ for Maria (including opponents like Caroline Wozniacki and Na Li).
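The selection rule above amounts to a filter on outcome and rating gap. Here is a minimal sketch of it, assuming each match is stored with both players’ pre-match Elo ratings and the result; the `Match` record, the `comparable_matches` helper, and the sample rows are all hypothetical, and "within 150 points" is read as an absolute gap.

```python
from dataclasses import dataclass


@dataclass
class Match:
    opponent: str
    player_elo: float
    opponent_elo: float
    won: bool


def comparable_matches(matches, won, max_gap=150.0):
    """Keep matches with the given outcome and an Elo gap no larger than max_gap."""
    return [
        m for m in matches
        if m.won == won and abs(m.player_elo - m.opponent_elo) <= max_gap
    ]


# Made-up records for illustration only.
sample = [
    Match("Opponent A", 2100, 2000, True),   # win, 100-point gap: kept
    Match("Opponent B", 2100, 1800, True),   # win, 300-point gap: dropped
    Match("Opponent C", 2100, 2050, False),  # loss: dropped when won=True
]
print([m.opponent for m in comparable_matches(sample, won=True)])
```

Calling the helper with `won=True` mirrors the Serena comparison group, and `won=False` mirrors the one for Maria.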
Considering some basic serve and return stats, is there evidence that Serena and Maria play differently against each other than against other top opponents? The chart below makes this comparison for a set of serve and return stats. The differences are summarised as an ‘effect size’, which is the difference in means for that performance stat (the mean in their H2H minus the mean against comparable opponents) divided by the standard deviation, so we can compare the relative importance of the effects across all of the statistics.
Positive effect sizes tell us when Maria or Serena performed better against each other than other top players; negative effect sizes tell us when they tended to perform worse.
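For readers who want to reproduce this kind of measure, the sketch below computes a standardised mean difference. The post does not say which standard deviation is used, so a pooled standard deviation (as in Cohen’s d) is assumed here. The `higher_is_better` flag is also our assumption: it flips the sign for stats like double fault rate, where a lower value is better, so that positive always means "performed better".

```python
from statistics import mean, variance


def effect_size(h2h, comparison, higher_is_better=True):
    """Standardised mean difference (H2H minus comparison), using a pooled SD."""
    n1, n2 = len(h2h), len(comparison)
    # Pooled sample variance across the two groups (an assumed choice of SD).
    pooled_var = (
        (n1 - 1) * variance(h2h) + (n2 - 1) * variance(comparison)
    ) / (n1 + n2 - 2)
    d = (mean(h2h) - mean(comparison)) / pooled_var ** 0.5
    return d if higher_is_better else -d


# Made-up per-match values, for illustration only.
print(effect_size([2, 4, 6], [1, 3, 5]))
```

Because the result is expressed in standard deviation units, effects for stats measured on different scales can be placed side by side on one chart.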
We found that Maria underperformed in multiple areas, the most notable being on first return points won. First serve in and double fault rates were also notably worse in her matches against Serena compared to what she was able to do in her losses to other top players.
Although we are comparing Serena’s H2H performance to a different set of matches than Maria’s, it is intriguing to see that she has tended to raise her level in the very same areas where Maria falters: return points won and first serve percentage in. The only area where we see negative effects for both players is in the double fault rate, which might be an indication that both of them feel an unusual level of pressure when facing each other.
A lot has changed since Maria and Serena last met at the 2016 Australian Open: Maria’s 15-month drug ban, the publication of Maria’s controversial book, and the birth of Serena’s first child. Just on the basis of results, Maria would seem to have an advantage, but, as the above analysis suggests, that could all fall away if Serena has the same impact on Maria’s game as she has had in the past. Looking at who controls the return game could be a key indicator of which direction the match will go and whether Maria will have any chance of overcoming history.
¹ These are matches that meet the competitiveness criteria and had match statistics available from public data sources.