Initial Thoughts on Developing WAR for Tennis

If, based on the title of this post, you were expecting some kind of jingoistic fantasy of tennis players armed with bazookas, you were mistaken. This post isn’t about that kind of war (as much as ESPN might be interested in such a gladiatorial venture); it’s about the sabermetric kind of WAR.

WAR, or wins above replacement, is a statistic used in Major League Baseball that attempts to measure an individual player’s value. As its name suggests, value here is measured by the estimated wins a player contributes to his team relative to a baseline that is defined by the skill of a typical player in the minor leagues, the so-called ‘replacement player’.

Although the mechanics of calculating WAR can be quite complex, that complexity hasn’t been a barrier to its popularity. In fact, in recent years, it has been the Drizzy of baseball statistics, stirring up controversy while gaining a greater hold on the conversation about player value. Topps baseball cards have even begun to feature WAR, making it the second of only two statistics (the other being OPS) since 1981 to be added to the cards. Now fans have one more reason not to rely on name recognition alone when judging a player’s prospects.

One of the reasons WAR has caught on is that it provides a solution to a basic conundrum in team sports: measuring how many wins an individual player earns for his or her team. Individual sports, like tennis, don’t have this problem. Since tennis players are responsible for 100% of their wins, one could just count up wins to measure a player’s value. Indeed, common record counts like Major titles won or total titles won are versions of this. Ranking points are another variant: a 52-week running tally of wins weighted by an ad hoc measure of match importance.

So, at this point, tennis would seem to know the answer to Edwin Starr’s eternal question: ‘WAR: What is it good for?’ But some recent thoughts have made me wonder whether we should leave it at that. Perhaps some good for tennis could come out of WAR.

This line of thinking started after Novak Djokovic got many writers and fans thinking about best seasons, when, after winning the Shanghai Masters last month, the world number 1 called 2015 the best season of his career. Since it was only four years ago that Djokovic had another superhuman run of wins, his comment immediately set off debate about whether his achievements in 2015 are really better than those of 2011.

One could try to quantify the best by looking at win-loss record. But a major problem with win-loss as a measure of a player’s strength is that it doesn’t account for the strength of the opponents a competitor has faced. Even if Djokovic had finished 2011 and 2015 with identical win-loss records (the actual record was 70-6 in 2011 and 77-5 to date in 2015), 2015 might be less impressive if, for example, more of its wins came against unseeded players than in 2011.

And this got me thinking about WAR. If each win in tennis isn’t equally impressive, perhaps a WAR-like statistic could provide a way to standardize the impressiveness of wins and allow direct comparisons between wins against a different mix of opponents, effectively adjusting wins for opponent strength.

How could this work? Using the same conceptual approach as WAR, one could compare a player’s wins against some baseline. For MLB, the baseline is defined by replacement players. The closest equivalent in tennis would be qualifiers: the players who do not get direct entry into a tournament’s main draw, and the pool from which lucky losers are drawn when a main-draw player withdraws. Using the strength of the average qualifier as the baseline rests on the basic assumption that qualifier strength follows the trends of the field; that is, if one season has a particularly strong cohort of players, this will be reflected in the performance of qualifiers.

Once a definition of a replacement player is determined, the next step is determining the expected wins of the baseline player against the opponents of the matches in question. This could be done with your tennis prediction model of choice. My own preference would be one based on $BP^2$ Pythagorean expectation, as it is a point-level measure of performance that is more stable than win-loss and avoids some of the pitfalls of models based on player ranking.
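To make the two steps concrete, here is a minimal Python sketch of how the bookkeeping might look. It is only an illustration under stated assumptions: the `Match` record, the rating scale, and the `wins_above_replacement` helper are all hypothetical names, and the Elo-style logistic curve is a stand-in for whatever prediction model one prefers (it is not the Pythagorean approach mentioned above). The idea is simply to subtract the wins a replacement-level player would be expected to earn against the same opponents from the wins the player actually earned.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Match:
    """One match played by the player of interest (hypothetical record)."""
    opponent_rating: float  # assumed strength rating of the opponent
    won: bool               # True if the player of interest won


def elo_win_prob(rating: float, opponent_rating: float) -> float:
    """Stand-in prediction model: the standard Elo logistic curve.

    Any model that returns P(player beats opponent) could be swapped in.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def wins_above_replacement(
    matches: Iterable[Match],
    replacement_rating: float,
    win_prob: Callable[[float, float], float] = elo_win_prob,
) -> float:
    """Actual wins minus the wins expected of a replacement-level player
    (e.g. an average qualifier) facing the same set of opponents."""
    matches = list(matches)
    actual_wins = sum(1 for m in matches if m.won)
    expected_replacement_wins = sum(
        win_prob(replacement_rating, m.opponent_rating) for m in matches
    )
    return actual_wins - expected_replacement_wins


if __name__ == "__main__":
    # Purely illustrative ratings, not real data.
    season = [Match(1800.0, True), Match(2000.0, True), Match(2100.0, False)]
    print(wins_above_replacement(season, replacement_rating=1600.0))
```

Under this sketch, the same number of wins earns more credit when it comes against stronger opponents, because the replacement player's expected wins shrink as the opposition gets tougher. That is exactly the opponent-strength adjustment a raw win-loss record lacks.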

Curious what WAR has to say about Djokovic’s best season? Next week I will try to bring the ideas in this post together to find out.