If you have been following the ATP Tour’s China Open this week, you have probably seen references to Novak Djokovic’s winning streak. He will enter the final against long-time rival Rafael Nadal with a 28-0 record over 6 total appearances at the Beijing 500.
Streaks, whether hot hands or slumps, have fascinated fans throughout sport’s history. (No doubt there are petroglyphs somewhere recording the longest consecutive spearings of woolly mammoths!) Entire books are dedicated to the topic, and many academics have spent years refining calculations of just how improbable the most remarkable streaks were.
There is general agreement that one of the most miraculous displays of streakiness in sport was Joe DiMaggio’s 56-game hitting streak in the 1941 baseball season. In a 2009 article by statistician Don Chance (no, I’m not making this up), DiMaggio’s feat is estimated to have had a 1 in 3,650 probability of happening in his career, accounting for his batting average and hit opportunities.
With all the talk of streakiness in the tennis media of late, I was curious how Djokovic’s achievement compares to that of the Yankee Clipper.
Taking a simple approach that treats Djokovic’s wins at the China Open as the outcomes of 28 independent coin tosses, where the coin lands in his favor with probability $p_m$ on the $m$th match, we can estimate the probability of his streak as the product of his win probabilities at the start of each of the 28 match-ups.
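Written out, and assuming the match outcomes are independent (which the coin-toss framing implies), the streak probability and its natural log are:

$$P(\text{streak}) = \prod_{m=1}^{28} p_m, \qquad \log P(\text{streak}) = \sum_{m=1}^{28} \log p_m.$$

Working with the sum of logs is convenient because the running total after each match traces how the streak becomes less likely over time.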
To estimate $p_m$, I computed Djokovic’s 9-month Pythagorean win expectation (based on the $BP^2$ model) and the same for his opponents. I then took 1 minus the ratio of these win expectations (a Bradley-Terry model) and added a correction factor so that the overall mean of Djokovic’s win probabilities equaled the mean of his $BP^2$ expectation.
Figure 1 shows how unlikely his China Open streak has become with each match. The values are on the natural log scale, so exponentiating each one gives the chance of the streak on the probability scale. For example, with his 2013 win over Rafael Nadal, Djokovic earned his 19th consecutive win at the tournament and raised the surprise factor to a probability of 3%. With his 28th win, his streak now has a probability of 0.6%.
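As a rough illustration of how a curve like this can be built, here is a minimal Python sketch that accumulates log win probabilities match by match and then exponentiates the total. The function name `cumulative_streak_log_prob` is hypothetical, and the inputs are a flat placeholder value of 0.84 for every match rather than the fitted $p_m$ values, so the numbers it produces will not match the figure exactly.

```python
import math
from itertools import accumulate


def cumulative_streak_log_prob(win_probs):
    """Return the running natural-log probability of the streak after each match."""
    for p in win_probs:
        if not 0 < p <= 1:
            raise ValueError("each win probability must lie in (0, 1]")
    # Summing logs is equivalent to multiplying the probabilities together.
    return list(accumulate(math.log(p) for p in win_probs))


# Placeholder inputs (an assumption): the same win probability for all 28 matches.
example_probs = [0.84] * 28
log_streak = cumulative_streak_log_prob(example_probs)

# Exponentiate the final value to get back to the probability scale.
streak_prob = math.exp(log_streak[-1])
print(f"log-probability after 28 wins: {log_streak[-1]:.2f}")
print(f"probability after 28 wins: {streak_prob:.4f}")
```

Replacing `example_probs` with match-specific estimates would give a curve whose steps vary with the strength of each opponent.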
Given an average match win probability of 84% at Beijing, a rough calculation suggests that Djokovic would have to win 18 more consecutive matches (basically, 4 more perfect tournament appearances) at the China Open to beat Joltin' Joe’s record.
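The rough calculation above can be sketched in a few lines of Python: find the smallest number of extra wins $n$ such that the current streak probability times $0.84^n$ falls to DiMaggio's 1 in 3,650. The function name `extra_wins_needed` is hypothetical, and the sketch assumes every future match is won with the same average probability.

```python
import math


def extra_wins_needed(current_prob, target_prob, per_match_win_prob):
    """Smallest n with current_prob * per_match_win_prob**n <= target_prob."""
    if not 0 < per_match_win_prob < 1:
        raise ValueError("per_match_win_prob must lie strictly between 0 and 1")
    if current_prob <= 0 or target_prob <= 0:
        raise ValueError("probabilities must be positive")
    if current_prob <= target_prob:
        return 0  # the streak is already at least as unlikely as the target
    # Solve current * p**n = target for n, then round up to a whole match.
    n = math.log(target_prob / current_prob) / math.log(per_match_win_prob)
    return math.ceil(n)


dimaggio_prob = 1 / 3650   # estimate quoted from the 2009 article
beijing_streak_prob = 0.006  # 0.6% after the 28th win
needed = extra_wins_needed(beijing_streak_prob, dimaggio_prob, 0.84)
print(needed)
```

With these inputs the exact solution is about 17.7, which rounds up to the 18 matches quoted above.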
But there is another record Djokovic is known for that could contend with the 56-game phenomenon. This is Djokovic’s 43-match winning streak, which began with straight wins at the 2010 Davis Cup Final against France and continued through the 2011 season until he lost to Roger Federer in the semifinals of the French Open. Figure 2 shows the streak probability progression using the same approach as described above. (Note that walkovers, like Djokovic’s advance to the semis at Roland Garros due to Fabio Fognini’s withdrawal from the quarterfinal, are excluded.) This shows that beating Richard Gasquet on red clay and earning the 43rd win of the streak brought the chance of his run to 0.05%, a likelihood approaching impossibility but still nearly twice as likely as Joe’s run.
The fact that Djokovic’s streak falls short of DiMaggio’s might seem surprising, considering that Joe only had to get one hit in each game rather than consecutive hits. But we have to remember that hits are much harder to come by in baseball, where a “good hitter” gets as few as 3 hits in every 10 at-bats. In tennis, tournaments stack the deck much more in favor of the higher-ranked players through the way seeds are separated in the draws, making early-round wins a near certainty for the best players in many cases.
Still, to catch Joe, Djokovic would have had to continue his run for only 4 more matches. And the way he has been playing this season, he might just catch him yet.