One constant of NHL hockey is that teams that are losing, on average, dominate play. They have the puck more, they take more shots, they score more goals, even though they usually still lose. These effects are broadly called "score effects", and I have spent a great deal of time trying to understand precisely what causes them. This article summarizes my most-recent attempt, and my conclusion is:
We observe that coaches deploy their players differently according to the score. Players who are trusted for scoring play more when their teams are losing, and players imagined to be defensively strong are given minutes when their teams are leading. My first flimsy foray into trying to explain score effects, five years ago, asked the question: is this variation in icetime enough to explain observed score effects on shot rates? Though the methods I used there are clumsy, they suggested that the variation caused by changing player deployment could only explain around a tenth of the observed effect. That effect size is too small to neglect but evidently much remains to be explained.
A much more sophisticated effort the following year examined score effects through the lens of leverage (there called pressure), a compound metric blending the score and the time remaining in the game. That work was difficult to interpret but one tentative finding was that coaches preferred to find "safe" (that is, low-leverage) minutes in which they could deploy their weaker players rather than deliberately seeking to deploy their stronger players in high-leverage minutes.
With both of these first attempts in mind, I have made a model that I think explains some of the key features of score effects, accounting for changing player deployment, the pressure of the clock, and the details of specific score patterns.
In the hopes of isolating the effect of the score itself on the game, I took an approach which accounts for player ability, the time in the game, and the specific details of the score. Specifically, I used a linear regression model where:
Up to this point the model I used closely resembles my "flagship" shot rate model that I use for generating predictions, among other things. The crucial difference for the model used here is my treatment of score states. Instead of using the score difference as a covariate, I include a raft of score terms, one for each of the sixty regulation minutes of a game, for every pattern of previous goals scored, up to five.
For simplicity I restricted this analysis to the most recent three full regular seasons (2016-2019), using all those passages of 5v5 play in which at most five goals had been scored. Such low-scoring situations represent almost exactly two thirds of the 5v5 play from those seasons, making 122,589 minutes of data used here.
The reader is doubtless worried that including such a raft of terms might well lead to an overfit model. To escape this danger, I fit the model with so-called "ridge penalties", that is, by specifying my prior assumptions about what sorts of values I expect the covariates could reasonably take. The penalties are of two forms. Some penalties are so-called "diagonal" penalties; these punish deviation from specific pre-specified values and are suitable for terms that correspond to player or coach ability, encoding our knowledge that NHL players play in the NHL and are therefore, by definition, broadly of NHL quality.
The strength of the fusion penalties is two hundred thousand, much stronger than the diagonal penalties. We are very sure that time is continuous. These penalties effectively reduce the number of terms in the model, reducing the danger of overfitting somewhat. The relative and absolute values of the penalties are largely ad hoc; I have taken the penalties used in Magnus 2 as my starting point and then the rest is a matter of (my) touch and intuition, such as it is.
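To make the two kinds of penalty concrete, here is a toy sketch rather than the model itself. It assumes the simplest possible setting: one term per minute, each observed directly with some noise, so the penalized least-squares problem becomes a tridiagonal system. The function name `fit_fused_ridge`, its arguments, and the sample numbers are all hypothetical; only the fusion strength of two hundred thousand comes from the text above.

```python
def fit_fused_ridge(y, weights, prior, diag_strength, fusion_strength):
    """Minimise, over b,
          sum_i weights[i] * (b[i] - y[i])**2
        + diag_strength   * sum_i (b[i] - prior[i])**2      (diagonal penalty)
        + fusion_strength * sum_i (b[i] - b[i+1])**2        (fusion penalty)
    Setting the gradient to zero gives a tridiagonal linear system,
    solved here with the Thomas algorithm.
    """
    n = len(y)
    off = -float(fusion_strength)  # coupling between neighbouring terms
    diag, rhs = [], []
    for i in range(n):
        neighbours = (i > 0) + (i < n - 1)
        diag.append(weights[i] + diag_strength + fusion_strength * neighbours)
        rhs.append(weights[i] * y[i] + diag_strength * prior[i])
    # Forward elimination.
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    b = [0.0] * n
    b[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        b[i] = (rhs[i] - off * b[i + 1]) / diag[i]
    return b


# Made-up noisy values for four consecutive minutes of one score state.
noisy = [0.0, 1.0, 0.0, 1.0]
unpenalized = fit_fused_ridge(noisy, [1.0] * 4, [0.0] * 4, 0.0, 0.0)
fused = fit_fused_ridge(noisy, [1.0] * 4, [0.0] * 4, 0.0, 200000.0)
```

With no penalties the fit reproduces the noisy values. With a fusion strength this large, the four terms are pulled to nearly the same value, which is the sense in which the fusion penalties reduce the effective number of terms.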
The units of the response and the covariates in this model are maps of shot rates. For example, the term corresponding to play in the second period is the following map:
which, for display purposes, I've coloured using a map which shows areas with larger shot rates than average in red and smaller than average in blue. As you can see, the impact of playing in the second period is, roughly, to increase shots slightly in the "channels" between the points and the net, and decrease shots from the slot and the points themselves. I am very fond of such maps, which I think are filled with interesting information, but for the purposes of this article I'll be mostly using a summary of them which I call "threat", which is obtained by taking the weighted sum of all of the values in the map, according to the league-average shooting percentage at those locations. In this example, the result is +1.1%, which means that this pattern of shot rates would result, if all of the shots were taken by league average shooters and faced by league average goalies, in a rate of goals 1.1% higher than the league average 5v5 rate. This particular effect is quite minor.
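The threat calculation can be sketched in a few lines. This is an illustration under stated assumptions, not the actual implementation: the map is reduced to a handful of named locations, every number is invented for the example, and dividing by a league-average goal rate to get a percentage is my guess at how the normalization works. The names `threat`, `rate_map`, `shooting_pct`, and `league_goal_rate` are hypothetical.

```python
def threat(rate_map, shooting_pct, league_goal_rate):
    """Weighted sum of a shot-rate map, weighted by shooting percentage.

    rate_map:         location -> shot rate relative to league average
                      (positive means more shots than average)
    shooting_pct:     location -> league-average chance a shot there scores
    league_goal_rate: league-average goal rate, in the same time units
                      (assumed normalization, not stated in the article)
    """
    extra_goals = sum(rate_map[loc] * shooting_pct[loc] for loc in rate_map)
    return extra_goals / league_goal_rate


# Invented values: more shots from the channels, fewer from slot and point.
example_map = {"slot": -0.5, "left channel": 1.0, "right channel": 1.0, "point": -1.0}
example_pct = {"slot": 0.12, "left channel": 0.05, "right channel": 0.05, "point": 0.02}
example_threat = threat(example_map, example_pct, league_goal_rate=2.5)
```

A positive result means the pattern of shots would produce more goals than league-average play if taken by average shooters against average goalies. A negative result means fewer.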
The score terms of the model are the primary interest, and there are many of them---thirty-one non-trivial score states with no more than five total goals, and sixty (mostly fused together) terms for each of those. To start, let's consider tied scores.
Each line here corresponds to one given score pattern, shown in the legend. The symbol "h" refers to a goal scored by the team whose shots are being considered, the symbol "a" to a goal scored against them. Home and road teams are aggregated together. Thus, the green line labelled "haah" refers to the shots taken by a team who has scored, then allowed two goals, then scored again to tie the game at two. The two different 1-1 states are shown with dots and the six different 2-2 states are shown solid. No smoothing is done here, apart from the fusion penalties mentioned earlier which effectively smooth the lines. Terms for which there weren't at least a thousand seconds over the three seasons in question aren't shown, so none of these lines (which require at least two goals and sometimes four) begin at minute 0. The long-change term has been added to the terms between 20 and 40 minutes.
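The pattern labels are easy to decode mechanically. The sketch below uses hypothetical helper names of my own; it reads a label such as "haah" into a score and enumerates the tied patterns with at most five goals.

```python
from itertools import product


def score_after(pattern):
    """Return (goals for, goals against) for a label such as "haah"."""
    return pattern.count("h"), pattern.count("a")


def score_state(pattern):
    """Classify a pattern from the point of view of the shooting team."""
    goals_for, goals_against = score_after(pattern)
    if goals_for > goals_against:
        return "leading"
    if goals_for < goals_against:
        return "trailing"
    return "tied"


def all_patterns(max_goals=5):
    """Every ordered sequence of one to max_goals goals."""
    return ["".join(p)
            for n in range(1, max_goals + 1)
            for p in product("ha", repeat=n)]


tied = [p for p in all_patterns() if score_state(p) == "tied"]
one_one = [p for p in tied if len(p) == 2]   # the dotted lines
two_two = [p for p in tied if len(p) == 4]   # the solid lines
```

Enumerating this way recovers the two 1-1 orderings and the six 2-2 orderings plotted in the graph.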
There are many interesting features even just looking at tied scores; most obviously that the second period contains much more shooting than either of the other periods. Also striking is the slow decline in offence throughout the third period, and also the sharp drops for every single tied state coinciding with the second intermission. The only exception to the third-period decline is one specific score-line: when the team in question was down two-nothing but rallied to tie; such teams alone are immune to having their offence steadily dwindle over time, although its steady value is still 10% less threatening than league average play.
This graph is prepared in the same way; for simplicity I've included only 1-0 and 2-1 aggregate scores. Here the same pattern as in the tied scores is repeated: the second period contains more offence, and offence drops sharply as the third starts and continues to drop throughout the third. The drop is most precipitous in games where a team with a two-nothing lead surrenders a goal; such teams by the end of regulation are themselves generating a pattern of shots 30% less threatening than league average play.
Next let us turn to teams that are trailing by a single goal.
The second period continues to contain more offence, and the third period offence shows universal declines. On the whole, the overall effect is for teams to generate more offence---teams who were in a 2-0 hole but then score to begin a comeback, perhaps, generate 10-15% more threatening patterns of shots throughout the second period. Comparing the trends for teams trailing by one (this graph) and their opponents leading by one (the previous graph), we see our first suggestion that leading teams are driving the effects we see. For tied teams, a reduction in offence as the third wears on suits both teams - they will split three points if they can shepherd their game to overtime, because of the gimmick points handed out for winning an overtime or winning a shootout. However, for one-goal deficit games, matching reductions in offence clearly benefit the leading team. This is consistent with leading teams being the primary driver of score effects even if the fraction of total offence shifts towards trailing teams (which it does), since the marginal benefit to the leading team of a goal, in standings points, is so much smaller than the marginal value of a goal to the trailing team.
A shop-worn aphorism of NHL broadcasts is that scoring goals generates (or perhaps merely demonstrates) "momentum" for a team that can be observed by some increase in the fortunes of the scoring team. On its face this seems at odds with score effects at the broadest level --- that trailing teams dominate. However, with our panoply of score terms we can now investigate this idea directly.
The lines in the above graph are labelled by overall score difference. Each line is the weighted average of all of the terms with that score difference where the final goal is one scored minus the weighted average of all the terms with the same score difference where the final goal is one conceded. So, for instance, the "tied" term is the weighted sum of the terms labelled "ah", "ahah", "aahh", and "haah", less the weighted sum of the terms labelled "ha", "haha", "hhaa", and "ahha". In general, each line shows the expected future advantage of being the most recent team to score, given the prevailing score. Surprisingly (to me), it is nearly always beneficial to be the most-recent scorer, although the effect is a modest 5% or so.
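For one minute of one line, the computation just described might look like the following sketch. The weights are an assumption on my part (ice time in each state seems the natural choice, but the weighting is not spelled out above), and the function names and sample numbers are hypothetical.

```python
def goal_difference(pattern):
    """Goals for minus goals against for a label such as "ahah"."""
    return pattern.count("h") - pattern.count("a")


def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def last_goal_advantage(terms, seconds, difference):
    """Average threat of states ending in a goal scored, minus the
    average threat of states ending in a goal conceded, among states
    with the given score difference.

    terms:   pattern -> threat of that state's term at one minute
    seconds: pattern -> ice time in that state (assumed weights)
    """
    matching = [p for p in terms if goal_difference(p) == difference]
    scored = [p for p in matching if p.endswith("h")]
    conceded = [p for p in matching if p.endswith("a")]
    return (
        weighted_average([terms[p] for p in scored], [seconds[p] for p in scored])
        - weighted_average([terms[p] for p in conceded], [seconds[p] for p in conceded])
    )


# Invented values for the two 1-1 states at a single minute.
example_terms = {"ah": 0.02, "ha": -0.03}
example_seconds = {"ah": 100, "ha": 300}
tied_advantage = last_goal_advantage(example_terms, example_seconds, difference=0)
```

A positive value means the team that scored most recently is generating the more threatening pattern of shots at that minute.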
More intriguingly, we can also examine the direct impact of a goal on score effects, both on the team that scored the goal and the team that conceded it.
First, let's look at the impact of a goal scored on the offence of the team that scored it. Here each line is labelled by the change in score difference; for instance, a team that is tied before they score has moved to being up one; that line is labelled here in blue. The values are computed similarly to the most-recent-goal chart: for each minute of the blue line, for instance, we compute the weighted sum of the difference between the "ha" and the "hah" states, the difference between the "haha" and "hahah" states, and so on through all of the states that match the pattern.
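As a rough sketch of that pairing, each state with the right score difference is matched with the state obtained by appending one more goal. Two things here are assumptions of mine: the sign convention (after the goal minus before it, so that a negative value means offence fell) and the weights (ice time in the post-goal state). All names and numbers are hypothetical.

```python
def goal_difference(pattern):
    """Goals for minus goals against for a label such as "hah"."""
    return pattern.count("h") - pattern.count("a")


def goal_impact(terms, seconds, before_difference, goal="h"):
    """Weighted average change in threat when one more goal is appended.

    terms:   pattern -> threat of that state's term at one minute
    seconds: pattern -> ice time in that state (assumed weights)
    goal:    "h" for a goal scored, "a" for a goal conceded
    """
    total = 0.0
    total_weight = 0.0
    for before in terms:
        after = before + goal
        if goal_difference(before) != before_difference or after not in terms:
            continue
        weight = seconds[after]
        total += weight * (terms[after] - terms[before])
        total_weight += weight
    return total / total_weight


# Invented values: tied states and the states reached by scoring next.
example_terms = {"ha": 0.00, "hah": -0.10, "ah": 0.02, "ahh": -0.04}
example_seconds = {"ha": 500, "hah": 100, "ah": 500, "ahh": 300}
tied_to_up_one = goal_impact(example_terms, example_seconds, before_difference=0)
```

Passing `goal="a"` gives the same comparison for the team that conceded.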
Here the striking fact is that nearly all of the lines are negative---that is, most teams, in most situations, respond to scoring a goal by lowering their offensive output in the following minutes. It might be equally true to describe this "the other way around" by saying that teams that concede goals respond by tightening up defensively. However, given what we've already seen above about leading teams driving changes as time evolves within score states, I'm inclined to suspect that this effect is also primarily driven by the scoring team, especially since the sharpest effect is the red line in the third period, where teams who score to tie the score late immediately start generating 10%-15% less threatening shot patterns despite the fact that another goal would be beneficial to them.
Also notable is the one strongly positive effect: that of scoring in the second period to cut a two-goal deficit in half. Perhaps this is the germ of truth behind the (preposterous on its face) aphorism that a two-goal lead is "the worst lead in hockey".
This chart is made in the same way as the previous chart, except it shows the impact on the offence of the team who conceded the goal. Here there is little effect, with nearly all lines confined in the -5% to +5% band near league average. However, there is one obvious exception: teams who concede a third-period go-ahead goal immediately generate a notable bit of offence, suggesting that teams in such circumstances are keeping their powder dry and can "turn it on" when necessary. In general, though, it seems that teams respond offensively to scoring (by generating less offence) and perhaps also defensively to being scored on (by allowing less); but mostly do not respond offensively to being scored on except near the end of games.
It appears as though score effects are driven primarily by leading teams and, while present throughout the game, have quite different characters in each period, and are most obvious in the third period.