Although shots are the basic currency of hockey, shot quality varies substantially. I am interested in measuring what is likely to happen when a player shoots the puck: will their shot be blocked? Will it hit the net? Will they score? In order to answer all of these questions, I have made three interlinked models, one for each question.
Throughout this article, I use the word "shot" to mean any shot, not just a shot on goal. It is conceptually helpful to think of a shot as frozen at the moment when the puck leaves the stick, at the moment when it is "just a shot" and before we know the result, that is, before we know if the shot will be blocked or missed or saved or scored. Shots taken against teams with no goalies on the ice, or against teams with no skaters on the ice (penalty shots, shootout shots) are not considered here.
The quantity that we want to measure is the probability of this or that shot outcome; that is, a number restricted to be larger than zero and less than one. These restrictions are technically inconvenient for fitting models, so we employ for all three of our submodels a technique called logistic regression, named after the logistic function: $$ l(x) = \frac{1}{1 + \exp(-x)}$$
This function smoothly maps all real numbers \(x\) to probabilities, like so:
Using the logistic function in this way means that we can use all real numbers to represent the effects of the various terms in the models; positive terms mean "more likely to succeed" (that is, more likely to be at the net rather than blocked, more likely to hit the net rather than be missed, more likely to be scored rather than saved) and negative terms mean "less likely to succeed". Any number of terms can be added, without restriction, and then the resulting value can be turned into a probability by applying the logistic function. For lack of a better word these inputs to the logistic function are conventionally described as having units of "logits". On the graph above I've marked the value of the constant terms in each regression, which measure the average likelihood of success for each of the submodels. Around 15% of shots are blocked, so the success probability of a shot at the first hurdle is 85%, corresponding to about +1.7 logits. Once we know that a shot is not blocked, the chance of it being on net is just under 75%, corresponding to just over one logit. A shot that is known to be on net becomes a goal around 10% of the time, corresponding to around -2.2 logits. Throughout this article, the units for every term will be given in logits; if you find the units disorienting please refer back to this figure to get an idea of scale.
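For readers who want to convert between the two scales themselves, here is a minimal sketch (the function names are mine, purely illustrative):

```python
import math

def to_probability(logits: float) -> float:
    """The logistic function: map a value in logits to a probability."""
    return 1.0 / (1.0 + math.exp(-logits))

def to_logits(probability: float) -> float:
    """The inverse (log-odds): map a probability back to logits."""
    return math.log(probability / (1.0 - probability))

to_logits(0.85)        # ~ +1.7 logits: an average shot gets past the first hurdle (not blocked)
to_logits(0.75)        # ~ +1.1 logits: an average unblocked shot hits the net
to_probability(-2.2)   # ~ 0.10: an average shot on net is scored
```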
This year I've decided to explain the results of the fitted model first, and defer the technical details about how the models are fit until later. One big improvement in Magnus 7, however, is that the fundamental unit of time is pairs of seasons, because it improves out-of-sample predictions, especially in the estimates of shooter and goaltender talent. The figures displayed in this write-up are the results of fitting on the 2021-2022 and 2022-2023 regular seasons; when you see references to "prior estimates", these are the model outputs from the same model fitted to the 2019-2020 and 2020-2021 seasons, and so forth making a chain back to 2007-2008, the first season for which I have data.
| | PP | PPv3 | SH | EA | EV | 3v3 |
|---|---|---|---|---|---|---|
| At net vs Blocked | +0.03 | +0.26 | +0.29 | -0.23 | -0.01 | +0.37 |
| On Net vs Missed | +0.01 | -0.06 | +0.16 | -0.01 | -0.01 | +0.11 |
| Scored vs Saved | +0.26 | +0.64 | +0.11 | +0.22 | -0.07 | +0.36 |
Through a technical device, these terms are what I call centred for each submodel; which is to say that the sum of these terms, weighted by how many shots are in each category, is zero in each submodel. This means that each term can be interpreted as the impact of that strength state relative to an average shot.
Although it is defensible to include terms for any division of the source data for which we observe different outcomes, it's best when we can perceive (or guess) mechanisms for how and why the terms affect the shooting process that we are interested in. The most obvious mechanism here is the simple number of bodies on the ice; when there are fewer defenders in shooting lanes the shots are more likely to succeed, when there are fewer in passing lanes the shots are more likely to be preceded by the sort of puck movement which causes a goalie to have to move, and so on. If we lived in the most mellifluous of all possible worlds, and all the variegated datums of existence were given to me in a convenient machine-readable format, I would be rid of all proxy terms, using the positions of all the defenders, the recent trajectory of the puck, and so on, directly. These factors are not ignored or missing from our approach, though, simply because they do not appear by name; we craftily attempt to gain exposure to the things we do not have through the things that we do.
| | Out of Zone | Below Goal Line | Wraparound | Wrist/Snap | Backhand | Tip/Deflection | Slapshot (Special Teams) | Slapshot (EV and EA) |
|---|---|---|---|---|---|---|---|---|
| At net vs Blocked | +3.70 | +2.38 | +0.13 | -0.36 | +1.81 | +1.40 | -0.74 | -0.38 |
| On Net vs Missed | +1.28 | -0.04 | +0.20 | +0.07 | +0.20 | -0.77 | -0.26 | -0.14 |
| Scored vs Saved | -3.56 | +0.31 | +0.03 | +0.09 | +0.29 | +1.02 | -0.13 | +0.21 |
The single most important factor in shot success is shot location, as we shall see. Shots taken from below the goal line are blocked very rarely but score slightly more often; these sorts of shots are rare. Shots from outside of the offensive zone are also rare; these nearly always hit the net without being blocked, and only a handful of such shots result in goals each season. The geometric character of the patterns of in-zone shots, though, depends strongly on the type of shot employed, a fact whose importance was made clear to me by Michael Schuckers, for which insight I am still very thankful. Furthermore, slapshots are employed in geometrically different ways for teams at even strength (here also including extra attacker shots) and on special teams, where they primarily manifest as one-timers. The specific geometric details will be discussed in the following section; for now we look at the overall success patterns for such shots. Wrist shots are blocked more often than backhands or tips; tips are unpredictable by nature and backhands, one imagines, are sometimes chosen by shooters specifically to elude a shot blocking attempt. Slapshots are also blocked more, especially on special teams, where they naturally fall into common shooting lanes. Tips miss the net more frequently than the average shot, as one might expect.
These terms are also centred; that is, each row of impacts sums to zero when weighted by the number of shots of each type.
Within each shot type, we can break down the offensive zone to see how location affects shot success. Notice first of all the scale of the impacts, from about -2.5 logits to +2.5 logits, much larger than any other common term in any of the submodels. Furthermore, we see that, in addition to variation between different shot types, there is also substantial variation between the submodels within each shot type. A single-model approach that does not distinguish between different shot outcomes (like the ones I have made until now) has no way to avoid smushing all this geometric variation together.
The fabric of hexagons that I use for wrist and snap shots is the simplest; every shot is simply assigned to the nearest hexagon as indicated. This year I am using hexagons with sides of 88cm or so, so each hexagon covers an area of about two square metres (2.9 ft and 22 square feet), a considerably coarser grid than in years past. This is partly for computational efficiency, partly for simplicity, and partly for what we might call "data honesty", for lack of a better term: in spot checking I've observed that the league's recorded shot locations are fairly reliable to this accuracy, which cannot be said for the much smaller hexagons I used in years past. In addition to this coarseness, the fabric is artificially symmetrized across the split line, so that shots from both sides of the rink are gathered together for model fitting. This means that the fabric is necessarily symmetric about the split line; the improved data support and smoothness is why I did it, but if there are non-trivial left-right disparities at the league level I will not discern them.
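As a rough illustration of what "assigned to the nearest hexagon" means in practice, here is a sketch of standard hexagonal-grid rounding. The grid orientation, its origin, and the coordinate conventions are my assumptions here, not a description of the actual fabric:

```python
import math

HEX_SIDE_FT = 2.9  # hexagon side length, ~88 cm as described above

def nearest_hex(x: float, y: float, size: float = HEX_SIDE_FT) -> tuple[int, int]:
    """Axial (q, r) coordinates of the hexagon whose centre is nearest to (x, y).

    Assumes a pointy-top hexagonal grid with one hexagon centred at the origin.
    Symmetrizing across the split line would amount to reflecting one
    coordinate (e.g. taking abs(x)) before this lookup.
    """
    # Fractional axial coordinates, then standard cube rounding.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2.0 / 3.0 * y) / size
    xc, zc = q, r
    yc = -xc - zc
    rx, ry, rz = round(xc), round(yc), round(zc)
    dx, dy, dz = abs(rx - xc), abs(ry - yc), abs(rz - zc)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return int(rx), int(rz)

def hex_centre(q: int, r: int, size: float = HEX_SIDE_FT) -> tuple[float, float]:
    """Centre of the hexagon with axial coordinates (q, r)."""
    return size * (math.sqrt(3) * q + math.sqrt(3) / 2 * r), size * 1.5 * r
```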
For slapshots, in addition to the hex fabric, I include a special term which I call "seam target". A large fraction of slapshots are one-timers, and, especially on special teams, these one-timers tend to cluster at two spots inside the faceoff circles near the net. Somewhat arbitrarily I've chosen to mark these spots at fifteen feet from the split line and fourteen feet from the goal line; for each shot I compute the exponential of the squared distance of the shot from this point, multiplied by -1/100, to give an estimate of how likely a given slapshot is to have arisen from a one-timer, the bulk of which appear to come from so-called "seam" passes, often from near the top of the opposite circle. Thus, a shot from precisely this spot will have a seam target value of one, and shots from farther away will have smaller values; shots from entirely different parts of the zone have seam target values near zero. The graphic above shows each hex coloured by the sum of its own value added to the seam target value for a shot from the centre of that hex; the value of the seam target term itself is as follows:
| | EV and EA | Special Teams |
|---|---|---|
| At net vs Blocked | -11.18 | -8.17 |
| On Net vs Missed | -0.23 | +0.18 |
| Scored vs Saved | +1.55 | +1.01 |
Slapshots from close to the seam target are often blocked, especially at even-strength, but they are also scored a good bit more often.
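As a concrete sketch of the seam-target value itself (assuming distances measured in feet; the function and variable names are mine):

```python
import math

SEAM_SPOT = (15.0, 14.0)   # (feet from the split line, feet from the goal line)

def seam_target(feet_from_split: float, feet_from_goal_line: float) -> float:
    """exp(-d^2 / 100), where d is the distance (in feet) from the seam spot.

    Using the unsigned distance from the split line handles both mirror-image
    spots at once, matching the symmetrized fabric.
    """
    d2 = ((abs(feet_from_split) - SEAM_SPOT[0]) ** 2
          + (feet_from_goal_line - SEAM_SPOT[1]) ** 2)
    return math.exp(-d2 / 100.0)

seam_target(15.0, 14.0)   # 1.0: a shot from exactly the seam spot
seam_target(30.0, 14.0)   # ~0.11: fifteen feet away
seam_target(0.0, 55.0)    # effectively zero: the top of the zone
```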
For tips and deflections, I use a restricted fabric as shown, from faceoff dot to faceoff dot extended out to the top of the slot. Careful students of the NHL-provided data will have noticed that shots that are tipped are recorded somewhat idiosyncratically. When recorded correctly, the original shot location is not listed, instead the tip location is recorded and the shot type is marked as "tip/deflection". However, it is also common for shot locations and descriptions to be revised after a time (sometimes this process is done in minutes, other times after a delay of a day or two); one common revision is for a shot to be recorded first as a slap or wrist shot from a distant spot and then changed to be listed as a tip/deflection with a different shooter and a different location. However, extensive spot-checking has revealed that many such tips are still listed as having locations corresponding to the original shot; with the actual tip location unrecorded. This particular pattern of mistake is especially common when such tips lead to goals (since such players are understandably keen to ensure they get credit) so leaving such incorrect data untouched is untenable but also difficult to fix. I have settled on an unpleasant compromise: for every shot recorded as a tip with a shot location outside the tip fabric specified above, I impute a plausible actual tip location by moving the location four-fifths of the way from its existing location towards the far post (or, in the very rare case of shots listed as precisely on the split line, four-fifths of the way to the middle of the goal line). This surgery, however unpleasant, seems to match the games I've checked considerably more closely than simply using the mostly-incorrect locations as given.
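A sketch of that relocation rule, under assumed coordinates (x is the signed distance from the split line in feet, y is the distance from the attacked goal line, and the posts sit three feet either side of the split line); the names are illustrative:

```python
POST_HALF_WIDTH_FT = 3.0   # an NHL goal is six feet wide
FRACTION = 4.0 / 5.0

def impute_tip_location(x: float, y: float) -> tuple[float, float]:
    """Slide a suspect tip location four-fifths of the way towards the far post,
    or towards the middle of the goal line for shots exactly on the split line."""
    if x == 0.0:
        target_x, target_y = 0.0, 0.0                    # middle of the goal line
    elif x > 0.0:
        target_x, target_y = -POST_HALF_WIDTH_FT, 0.0    # far post: opposite side of the split line
    else:
        target_x, target_y = POST_HALF_WIDTH_FT, 0.0
    return x + FRACTION * (target_x - x), y + FRACTION * (target_y - y)
```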
Tips from the outside left and right edges of this fabric hit the net more than one might expect; I suspect that many of the tipped shots from these areas which do not hit the net are simply completely unrecorded; there is a natural but regrettable tendency for scorers (who are humans after all) to notice tips more when they are more salient.
Backhand shots are tightly concentrated in the smaller fabric, and shooters who employ them frequently miss the net except from immediately in front of the net. Backhands taken outside of this fabric are assigned a location of "perimeter"; I do not consider them to be serious attempts to score and their success rates are small at every step.
An additional data wrinkle comes from the fact that the league does not record, for blocked shots, the location of the original shot, but instead the location of the block itself. For my purposes, I require the original shot location, and so for model training I've employed an imputation process where I make an educated guess of where a shot could have come from given where it was blocked.
Each one of these fifteen fabrics is centred, so each set of terms sums to zero when suitably weighted.
An important class of proxy terms can be found by looking at what has happened immediately before the shot is taken. If a shot is taken in the immediate aftermath of a hit, or a different shot, that tells us something about where the players on the ice are likely to be when the shot of interest is taken; how likely they are (including the goaltender) to be "in position", how much pressure they do or do not feel.
| | 5s of nothing | Shot (this zone) | Shot (other zone) | Turnover (this zone) | Turnover (other zone) | Hit (this zone) | Hit (other zone) | Faceoff (this zone) | Faceoff (other zone) |
|---|---|---|---|---|---|---|---|---|---|
| At net vs Blocked | +0.00 | -0.34 | -0.02 | +0.19 | +0.50 | +0.07 | +0.46 | +0.18 | +0.17 |
| On Net vs Missed | -0.05 | +0.10 | +0.70 | +0.12 | +0.28 | -0.02 | +0.36 | +0.11 | +0.42 |
| Scored vs Saved | -0.09 | +0.27 | +0.36 | +0.55 | +0.42 | -0.04 | -0.09 | +0.01 | -1.10 |
These terms replace the somewhat clumsier "rush" and "rebound" terms that I've used previously. I've arbitrarily fixed a window of five seconds as "recent context"; of course, for most shots nothing is recorded in the league play-by-play during these five seconds. Shots from "open play" in this sense are slightly less likely to succeed. More interesting are "rebounds", that is, shots immediately after other shots (whether the previous shots were blocked, missed, or saved). When these previous shots are taken by the shooting team (the more common case), the follow-up shots are blocked more often but also scored more often. Shots taken at the other end of the rink also sometimes lead to rush chances; the ensuing shots are even more likely to be scored and are missed less. Even more beneficial to scoring is when the most recent event is a turnover, in either zone, although same-zone turnovers are more common. By "turnover" here I mean an event recorded as either a giveaway or a takeaway; in my experience these events cover most sharply-defined changes in possession, but the distinction between gives and takes does not seem trustworthy to me, hence their combination here.
When shots are preceded by hits (by either team), the effect on goal scoring is small, but when these hits are not close to the net at hand (so that there is a rush between the hit and the shot) they do lead reliably to saves. In-zone faceoff plays that lead directly to shots produce saves also, but not goals to speak of; when the faceoff is in a different zone the effect (not blocked, not missed, and not scored) is even more pronounced. These terms are also centred.
For rebound shots, though, there is more structure to uncover. First of all, it makes a surprising difference to shot success precisely how far apart the two shots are.
| Delay in seconds | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| At net vs Blocked | -0.01 | -0.23 | -0.09 | +0.10 | +0.19 | +0.22 |
| On Net vs Missed | +0.35 | +0.14 | -0.03 | -0.04 | -0.14 | -0.18 |
| Scored vs Saved | +0.84 | -0.17 | +0.35 | +0.07 | -0.39 | -0.39 |
For bang-bang plays, where the second shot follows so closely after the first that both are recorded as having occurred during the same game second, the scoring likelihood is very high. A delay of exactly two seconds is also good, but a delay of precisely one second is bad for shot success, both at the level of blocks and of scoring. I do not fully understand these patterns; I suspect that the slightly longer delays are long enough to allow for a little bit of puck movement but not quite long enough for a corresponding goaltender movement.
These terms are centred.
| Previous Shooter | self | other |
|---|---|---|
| At net vs Blocked | -0.04 | +0.01 |
| On Net vs Missed | +0.02 | -0.01 |
| Scored vs Saved | -0.35 | +0.11 |
A surprisingly simple refinement was suggested to me by Cody Magnusson, namely, to consider if a rebound shot was taken by the same person as the shooter of the original shot, or by a teammate. Here we see that rebounds that are taken by other shooters are considerably more successful. Presumably the goaltender who made the first save is more likely to remain in position to make a second save on the same shooter, but necessarily less likely to be in position to make a save on a different shooter who will nearly never be in the same position as the first shooter was.
These pairs of terms are also centred.
Since it's so commonly repeated among league-adjacent folks that goals conceded in the first or final minutes of periods are uncommonly emotionally unpleasant to suffer, I decided to include terms to that effect:
| Minute of Regulation Period | first | middle | final |
|---|---|---|---|
| At net vs Blocked | +0.13 | +0.00 | -0.10 |
| On Net vs Missed | -0.02 | +0.00 | -0.06 |
| Scored vs Saved | -0.05 | +0.01 | -0.06 |
Even on a per-shot basis, scoring in the first and final minute is slightly less likely. Dividing the "middle" term into eighteen per-minute terms does not improve prediction accuracy, so I have kept them consolidated. These terms are centred.
| Score State of Shooting Team | Leading (1st) | Leading (2nd) | Leading (3rd) | Tied (1st) | Tied (2nd) | Tied (3rd) | Trailing (1st) | Trailing (2nd) | Trailing (3rd) |
|---|---|---|---|---|---|---|---|---|---|
| At net vs Blocked | -0.00 | +0.03 | +0.10 | -0.03 | +0.01 | -0.04 | -0.01 | -0.01 | -0.03 |
| On Net vs Missed | -0.02 | -0.00 | -0.01 | +0.01 | +0.01 | -0.02 | -0.01 | +0.03 | -0.02 |
| Scored vs Saved | -0.06 | +0.06 | +0.08 | -0.06 | +0.03 | -0.06 | -0.01 | +0.04 | -0.04 |
Since the score strongly affects shot rates, it seems plausible to imagine that it might affect per-shot outcomes also. For blocked shots the only notable change is that teams who are leading in the third have a larger proportion of their shots blocked. I expected to see a matching effect for trailing teams in the third, imagining that teams with a lead to defend would block shots more assiduously, but in fact it is not so. Impacts on scoring are fairly small, with a bump for leading teams in the later periods and a small decrease for tied teams in the third, consistent with the cautious approach that the current 2-2-1-0 point system incentivizes. Since we know that leading teams have notably better raw shooting percentages than trailing teams, the general meagreness of these terms suggests to me that the mechanism of why leading teams are more efficient is largely captured in the terms already described.
Although leading in the third is most auspicious, it seems as though there is a broad per-shot goal-likelihood bump for teams playing in the second period compared to the first and third, matching the ~10% increase that we see in shot rates in the second. This is presumably a result of the long change in the second period leading to poorly-positioned defenders.
These terms are centred.
Last but by no means least, individual people can affect shooting success in a purely intrinsic way. Two shots which are otherwise the same (in the sense of the terms already discussed: both from the same location, of the same type, at the same game time, and so on) can have quite different results depending on who shoots them, who attempts to save them, who passes the puck to the shooter, and, indirectly, the coaches who design and enforce the offensive and defensive schemes.
The most obvious individual skill difference that we observe in shooting is that of the shooter themselves. Some players aim better,
shoot harder, have more deceptive releases, and so forth, than others. By modelling shooter impact with a triple of terms in this way
(one for each submodel), I can estimate their average tendency to have their shots blocked or not, missed or not, scored or not.
I do not include teams in the model in any way, but I've listed the players according to (all of) the teams for which they played in the 2021-2023 seasons. The values are for the full two-season period, so players who played for multiple teams will appear with identical values for all of the relevant teams. The bulk of the league falls in the roughly -0.75 to +0.75 logit range (with names suppressed for clarity), but most teams have several players who are considerably better shooters at a per-shot level, for every submodel. Perhaps unsurprisingly, the best shooters at making sure their shots are not blocked are almost all defenders, especially ones frequently described as "mobile" or "offensive".
These terms, collectively for the whole league, are centred.
For shots being missed or not, the overall range is similar to that for blocks. Here neither the very best nor the very worst in the league strike me as sharing any obvious similarities.
These terms are centred.
For shots scored versus shots saved (the measure of most salience) there is a noticeable right-skew, where the best shooters are considerably stronger than the worst shooters are weak. This is consistent with shooting talent being a non-trivial component of the input to whether or not players are even in the league. The best shooters are nearly all forwards, with very few exceptions, even after controlling for (many of) the structural features of why defenders take lower quality shots, as we have done with the terms above (especially the geometric terms).
One change this year is that I no longer include shooter identities for tips or deflections. Long ago I used single shooter terms for all shots taken by a given player; last year I divided each player into two shooting terms (a tipping term and a term for all other shots) but this year I have discovered that the best predictions are obtained by assuming that the person tipping the puck does not, in general, do so according to any particular skill. The improvement to predictions by forgetting in this way suggests that the few players in the league who are well-known for tipping the puck well are exceptional rather than simply at one end of a skill distribution.
These terms are centred.
An important aspect of shot success is the quality of pass which immediately precedes shots. Passing is fundamental to the sport, but many passes are of a specifically threatening type: at the moment of the pass it is evident that the "final" attacking manoeuvre has begun, final in the sense of being definitely intended to bring about an end to the current possession by shooting. Purloining a suitable word from volleyball, whose possession rules make the second-to-last touch of the ball unusually important, I call this skill setting. The most obvious form of setting passes are the ones that precede one-timers, but breakaway "stretch" passes also fit the bill directly, as do the kind of discombobulating passes that put the defenders into confusion and allow the new possessor of the puck the time and space to shoot with less pressure than they might usually experience for a shot of a given description.
The setter of a shot, if any, is not directly recorded by the league except when the shot is scored. Hence, I employ an imputation scheme to infer who might have been the setter, if any, for shots that are not scored. Adding setting terms to the submodels for blocks and misses increases their predictive power on out-of-sample shots by such a small amount that I felt it better not to include them at all, simplifying those submodels by a thousand or so terms.
Here the best setters are a mixture of forwards and defenders, and again we see the right-skew that is consistent with this ability being one of selection importance for the league as a whole. This effect is accentuated, to some extent, by the imputation scheme which assigns credit for shots that are scored more precisely and assigns blame for shots that are saved more diffusely.
These terms are centred.
For goaltenders, it would be silly to include terms in the block-vs-at-net submodel, since goaltenders don't exert any influence
in practice on whether or not the skaters on their team block shots. However, they do influence if opposing shooters hit the net
or not, and, much more obviously, if opposing shooters score or not.
For consistency I've decided to keep the upwards direction on my vertical axis to be "more shot success", so here the stronger goaltenders appear at the bottom of the plot. While this ability of goaltenders to cause (or fail to cause) shooters to miss the net appears to be a "real" skill (since it both increases prediction accuracy and is also mildly repeatable), it does not appear to be particularly correlated with the considerably more important skill of preventing goals. Notice that the range of impacts for goaltenders is a fair bit tighter than that for shooters and setters; we will have more to say about this when we discuss penalties. Presumably the primary mechanism of this impact on misses is due to positioning, a person more familiar with different goalie-specific traits than me might have more to say.
These terms are centred.
For impact on goal probabilities directly, we see again the skew towards stronger outcomes that we saw for setting and shooting impact on goals, suggesting direct selection pressure on this skill. I suspect that some of the apparent "randomness" in the position is an artifact of this selection pressure. Many goaltenders cluster tightly close to or slightly weaker than average, especially those with few minutes.
Throughout this project, we've used logit scale for our impact measurements primarily for technical reasons, to allow the use of typical fitting techniques. However, there is a pleasant conceptual fit also: every term is implicitly more heavily weighted on shots whose success probability is closer to 50%. For goaltenders this amounts to weighting "high-danger" (~20-30%) chances more heavily than the typical ~5-10% fare which makes up the bulk of the work. This implicit weighting fits with my intuition that these high-danger shots allow more latitude to perceive a goaltender's skill (or lack thereof), while routine shots reveal very little that we do not already know, given that the goalie in question was chosen by a professional coach to appear in a professional game.
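One way to see this implicit weighting is through the derivative of the logistic function, which is also the working weight that appears (as \(W_n\)) in the fitting procedure described below: $$ l'(x) = l(x)\bigl(1 - l(x)\bigr) = p(1 - p), $$ which is largest when \(p = 0.5\). A chance with \(p = 0.25\) carries weight \(0.25 \times 0.75 \approx 0.19\), while a routine \(p = 0.075\) shot carries weight \(0.075 \times 0.925 \approx 0.07\), nearly three times less.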
These terms are centred.
Coaching staffs broadly have a peculiarly tricky impact to discern, on every aspect of the game. On the one hand, they don't have any direct control over any on-ice results, since they aren't permitted to be on the ice; but they exert considerable indirect control by devising a variety of offensive and defensive schemes, choosing which combinations of players will play together, and, most importantly, controlling player behaviour by allocating icetime itself to players who follow coaching instructions more or less closely. As a simple first step to measuring this impact on per-shot outcomes, I've introduced new terms in each submodel for each head coach: one for the impact on their team's shots and one for the impact on their opponents' shots. Though they are labelled by the head coach's name, these terms are meant to encapsulate the relevant decisions of the full coaching staffs, answerable as they usually are to the head coach.
Some coaches evidently care about their players making sure their shots make it through towards the net, like Bruce Cassidy, Bob Boughner, and Tony Granato. Dually, some coaches ask their players to block a larger proportion of shots (St. Louis, DeBoer, McLellan), while Granato visibly does not; nor indeed do some "old-fashioned" coaches like Darryl Sutter. The range of impacts on shots taken is narrower than the range of impacts on opponent shots, as seems plausible. Each of these sets of terms is centred.
For misses-vs-shots-on-goal, we see some coaches place strong emphasis on hitting the net (Jared Bednar, Andrew Brunette) while others seemingly prefer a shots-as-indirect-passes style, like Rod Brind'Amour and Rick Bowness. There is a small (positive) correlation between the attacking and defending impacts. Each of these sets of terms is centred.
For impact on goals themselves, the coaching terms are considerably smaller. Each of these sets of terms is centred.
The observation vector \(Y\) is 1 for goals and 0 for saves or misses. The model itself is a generalized linear one: $$ Y \sim l\left(X\beta\right) $$ where \(\beta\) is the vector of covariate values and \(X\) is the design matrix whose columns are the terms described above, and \(l\) is the logistic function $$ x \mapsto l(x) = \frac{1}{1 + \exp(-x)}$$ (extended here to vectors pointwise) after which the regression is named.
The model is fit by maximizing its likelihood; that is, for a given set of covariate values, we form the product of the predicted probabilities of all of the events that did happen (a 90% chance of a save here, times a 15% chance of that goal there, and so on). This product, with one factor for every shot in the data at hand, is called the likelihood, \(L\). Large products are awkward, so instead of maximizing \(L\) directly we solve the equivalent problem of maximizing the logarithm of the likelihood, denoted by \(\mathcal L\).
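In code, the (unpenalized) log-likelihood of one submodel might look like the following sketch, with a 0/1 outcome vector and design matrix as above:

```python
import numpy as np

def log_likelihood(beta: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """Sum of log predicted probabilities of the outcomes that actually happened."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted success probability for each shot
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```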
Before we compute the covariate values \(\beta\) which maximize the log-likelihood \(\mathcal L\), we bias the results with so-called ridge penalty terms. These penalties encode our prior knowledge about the terms of the model before we consider the data at hand. Traditionally, ridge penalties have been constant multiples of the identity matrix, so that every term is biased towards zero. This avoids some pathological results, especially when pairs of terms that routinely appear together might overfit the data by assuming outlandishly large positive or negative values. The passage to generalized ridge regression amounts to realizing that one can instead use any penalty matrix which is positive semi-definite. Such matrices can be characterized as softly enforcing any number of constraints of the form $$ \sum_i c_i k_i = 0$$ where the \(c_i\) are arbitrary constants and the \(k_i\) are model terms. Some of the penalties we use encode "static" information (things that we know about hockey generally) and others encode "dynamic" prior information, that is, things learned about the covariates from previous years. By fitting the model with penalties, we obtain estimates which represent a compromise between the data at hand and our pre-existing understanding of what the covariates in the model mean.
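To spell out the connection: a single soft constraint of this form, with strength \(\lambda\), contributes a rank-one piece to the penalty, $$ \lambda \left( \sum_i c_i \beta_i \right)^2 = \lambda\, \beta^T c\, c^T \beta, $$ and any positive semi-definite penalty matrix can be written as a sum of such pieces (the classical ridge penalty being the special case where each constraint involves only a single term).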
The "static" penalties are the most important ones, where we encode our prior understanding of how much the terms of each type can collectively affect shot success. First, I begin with a "global" penalty of 100, all of the other penalties are multiplied by this.
What is considerably more interesting is the relative strength of the penalties applied to terms describing people directly, that is, the shooting, setting, goaltending, and coaching terms. Higher penalties result in narrower distributions, so here the relatively low value of the shooting penalty compared to the goaltending penalty amounts to constraining the range of goaltending talent to be, roughly, six times narrower than the range of shooting talent; or, to put the same thing another way, encoding a belief that each shot outcome reveals about six times more information about the skill of the shooter than about the skill of the goaltender.
While a thorough exploration of this (very large!) parameter space is not plausible without supercomputers, I have verified that each of the penalty values for the human terms produces better out-of-sample predictions than at least one nearby larger value and one nearby smaller value. While unavoidably technical, these penalties have very substantial impacts on the model measurements; in particular, the relative tightness of goaltending talent compared to shooting and setting talent is a sharp departure from my previous models (where they were the same).
In addition to the diagonal penalties above, we can enforce some smoothing on the geometric terms, that is, on the hexagons in the in-zone fabrics. Without knowing anything about the details of the sport, we expect that shots from nearby locations should have similar results, purely on physical grounds. Suppose that \(p\) and \(q\) are adjacent hexes; we can consider the triangle formed by the centre of \(p\) and the two goalposts, as well as the triangle formed by the centre of \(q\) and the two goalposts. These two triangles will overlap to some extent, and the more they overlap, the more similar we expect the shot results from the two hexes to be. The ratio of this overlap area to the average area of the two triangles gives a suitable measure of similarity; this number \(c_{pq}\) is the strength of a so-called "fusion penalty". Specifically, if we want the covariate \(\beta_p\) corresponding to \(p\) to be similar to \(\beta_q\), that is the same as wanting \(\beta_p - \beta_q\) to be small. It is equivalent to ask that the square of this quantity be small, which is to say, asking that $$ \beta_p^2 - \beta_p\beta_q - \beta_p\beta_q + \beta_q^2$$ should be small. The strength of our desire is \(c_{pq}\), so to accomplish this we can add \(c_{pq}\) to the \((p,p)\) and \((q,q)\) entries of the penalty matrix \(K\) while subtracting \(c_{pq}\) from the \((p,q)\) and \((q,p)\) entries of \(K\).
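In code, fusing a pair of adjacent hexes is just those four updates to the penalty matrix; a sketch (with \(c_{pq}\) computed elsewhere from the triangle overlap described above):

```python
import numpy as np

def add_fusion_penalty(K: np.ndarray, p: int, q: int, c_pq: float) -> None:
    """Penalize (beta_p - beta_q)^2 with strength c_pq by updating K in place."""
    K[p, p] += c_pq
    K[q, q] += c_pq
    K[p, q] -= c_pq
    K[q, p] -= c_pq
```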
It may be helpful to think of attaching elastic bands to each covariate. The diagonal penalties are strong bands which attach each player value to zero, very weak bands that attach each hex value to zero, and medium-strength bands attaching each hex to its neighbours. Then, after the bands are prepared, the data is permitted to flow in, to push the covariate values while the tension in the bands constrains the resulting fit.
The substantial diagonal penalty for shooters and goalies encodes our prior understanding that all of the shooters and goalies are (by definition) NHL players, whose abilities are understood to be not-too-far from NHL average (that is, zero). The hex penalties work together to allow slow variation in the impact of geometry, encoding our prior belief that shots from nearby locations ought to have similar properties for that reason. In particular, fusing these many hex terms to one another in this way effectively lowers the number of covariates in the model, presumably helping to mitigate over-fitting.
In addition to the above penalties, I want to suitably accumulate specific knowledge learned from previous years when I form estimates of the impacts of the same factors in the future; in particular, we imagine that our estimates for players describe athletic ability, which varies slowly. If you were truly interested only in a single season (or any length of games, thought of as a unity for whatever secret purposes of your own), these dynamic penalties would not be relevant. However, I am interested in fitting this model over each season from 2007-2008 until the present, and so I want to preserve the information gained from each season as I pass to the next. As we shall see, fitting this model produces both a point estimate and an uncertainty estimate for each covariate. The point estimates can be gathered into a vector \(\beta_0\) of expected covariate values, and the uncertainties can be gathered into the diagonal of a matrix \(\Lambda\), and then a penalty $$ (\beta - \beta_0)^T \Lambda (\beta - \beta_0) $$ can be subtracted from the overall log-likelihood. For players for whom there is no prior (rookies, for instance, but also all players in 2007-2008, since that is the first season in my dataset), I use a prior of 0 (that is, average) with a very mild diagonal penalty of 0.001.
In the end, then, the task at hand to fit the model is to discover the covariate values \(\beta\) which maximize the penalized log-likelihood, $$ \mathcal L - \beta^T K \beta - (\beta - \beta_0)^T \Lambda (\beta - \beta_0) $$
Simple formulas for the \(\beta\) which maximizes this penalized likelihood do not seem to exist, but we can still find it iteratively, following a very-slightly modified treatment of section 5.2 of these previously mentioned notes. Beginning with \(\beta_0\) as obtained from prior data (or using the zero vector, where necessary), we define a way to iteratively obtain a new estimate \(\beta_{n+1}\) from a previous estimate \(\beta_n\). Define \(W_n\) to be the diagonal matrix whose \(i\)-th entry is \( l'(X_i \beta_n) \), where \(X_i\) is the \(i\)-th row of the design matrix \(X\) and \(l'\) is the derivative of the logistic function \(l\) above. Similarly, define \(Y_n\) to be the vector whose \(i\)-th entry is \(l(X_i \beta_n)\). Then define $$ \beta_{n+1} = ( X^TW_nX + \Lambda + K )^{-1} \left( X^T ( W_n X \beta_n + Y - Y_n ) + \Lambda \beta_0 \right) $$ Repeating this computation until convergence, we obtain estimates of shooter ability and goaltending ability, together with suitable adjustments for shot location and type, as well as for the score and the skater strength.
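A dense-matrix sketch of this iteration (the real design matrix is large and sparse, so this is illustrative rather than practical; the names are mine):

```python
import numpy as np

def fit_penalized_logistic(X, y, K, Lam, beta0, tol=1e-8, max_iter=100):
    """Iterate the update above until the covariate values stop moving.

    X: design matrix; y: 0/1 outcomes; K: static penalty matrix;
    Lam: (diagonal) dynamic prior penalty; beta0: prior covariate estimates.
    """
    beta = beta0.copy()
    for _ in range(max_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))          # Y_n: current fitted probabilities
        w = p * (1.0 - p)                       # diagonal of W_n = l'(X beta_n)
        A = X.T @ (w[:, None] * X) + Lam + K    # X^T W_n X + Lambda + K
        b = X.T @ (w * eta + y - p) + Lam @ beta0
        beta_new = np.linalg.solve(A, b)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```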
Curious readers may wonder if such convergence is guaranteed; it suffices that \(K\) and \(\Lambda\) be positive semi-definite, which they are.
There are two natural tests for including terms in a model of this type: either they have clear intrinsic explanatory plausibility, or they have extrinsic predictive power; when all the terms have both qualities then we have a satisfying "scientific" model where we have both an ability to measure what is likely and a coherent explanation of how those outcomes are produced. Each model iteration, however, contains some number of ideas for terms or term patterns that are discarded, usually for lack of predictive power. This year I investigated many different ideas that did not make it to this final version of the model, some of which I have mentioned along the way and others of which are interesting enough to chronicle here.
I wondered if perhaps shots were more or less successful at different times in players' shifts; perhaps (relatively) tired defending skaters late in shifts would be out of position more, or perhaps teams might pass up low-quality looks early in a shift in favour of waiting for a higher-quality chance, or perhaps shots late in shifts might be saved more often as teams chase faceoffs for their replacements. To this end I considered the average time-in-shift of the shooting team as a term, as well as the average time-in-shift of the defending team, in seconds, in a variety of permutations; in no instance did these terms improve prediction accuracy to speak of.
It's sometimes said that it's unpleasant for a goaltender to go for long stretches without facing any shots, and that when such stretches do occur they augur ill for the chances of saving the next shot which does eventually appear. I could not find any such effect; it's possible that the effect does exist but is already captured by the other terms (the "event in other zone leading to a rush chance" terms spring to mind).
Individual defenders and attackers are well-known to impact the rate of shots that their opponents take in a number of ways; I wondered if, in addition to this, they might have a general impact on per-shot danger. Specific terms of this type (shooting, setting) are obviously useful; I tried including "general" terms for each player that applied to every shot taken while they are on the ice, one for defensive impact and one for offensive impact (over and above the specific terms). Including such general terms makes predictions of future results weaker; in particular, I found no evidence that defending players have a general impact on per-shot success. Instead of thinking about defence as primarily geometric (where might my opponents shoot from, how can I confine them) it might be better to think of it as primarily temporal (my opponents have no good shots available now, I can make sure that situation remains true for a long time). This is consistent with a visible impact in rates, where time is central, but no impact in per-shot quality, where time is immaterial.
Instead of the simple leading/tied/trailing terms that I settled on, I explored using specific score states (1-0, 2-0, and so forth). This improved nothing. I also considered a "garbage time" term, imagining that shots taken (by either team) late in blowouts might be affected by the lack of leverage, but found nothing of value here.
Every year contains more ideas than I have the time or computers to properly explore; one thing that I have deferred until 2024 is the question of detailed save outcomes. Specifically, not every save has the same immediate aftermath. When goalies freeze shots the result is a faceoff, nearly always in their own zone. On the other hand some saves are deflections, some of these go out of play and result in faceoffs, the others create puck battles, some easier to win for the defending team than others, and some of greater danger to the goalie at hand if their team does not win this battle. I've decided to defer this primarily out of lack of time, but also because I mean to explore goaltender impact on rebounds through my shot rate model also in the meantime.
Some shots are designed to score, others are indirect passes, of a sort; still others are primarily designed to generate stoppages. I have a model in preparation with which I mean to estimate skater impact on stoppages in all three zones; when it is ready I expect it will have some useful links to this model that I mean to include next year.
The data I have to hand to measure who is making setting passes is limited, so to include these terms in the model some imputation and imprecision is required. For shots which result in goals, the league records the name of the player who most-recently touched the puck before the shooter (the so-called "primary assist"); where such primary assists are recorded I mark this player as the setter. For shots which do not result in a goal, the setter (if they exist) is not recorded by the league. However, I do know which other players are on the ice, and I can impute a probability that each one was the setter for a given shot.
So, for example, a goal scored by F1 assisted by D2 would be encoded with a value of "1" for `shooter_F1` and a value of "1" for `setter_D2`, with values of 0 for all other shooters and all other setters. On the other hand, a missed shot taken by F1 would be encoded with a value of "1" for `shooter_F1` but with setter values of:

- `setter_F1`: 0
- `setter_F2`: 2/6
- `setter_F3`: 2/6
- `setter_D1`: 1/6
- `setter_D2`: 1/6

Similarly, a missed shot taken by D1 would be encoded with a value of "1" for `shooter_D1` and setter values of:

- `setter_F1`: 3/10
- `setter_F2`: 3/10
- `setter_F3`: 3/10
- `setter_D1`: 0
- `setter_D2`: 1/10

Finally, to account for how approximately 20% of non-goal shots are unassisted (again taken from Corey's data), these imputed probabilities are all multiplied by 80%. If you like, you can think of this encoding of setters as a position-weighted superposition of all the possible setters (including there being no setter), combined together into a single data row.
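A sketch of how the setter columns for a single shot row might be filled in, taking the position-based probabilities as given (the names, and the exact shape of the data, are illustrative):

```python
ASSIST_RATE = 0.8   # roughly 80% of non-goal shots are assumed to have had a setter

def setter_columns(shooter, setter_probs, scored, primary_assist=None):
    """Return {column name: value} for the setter terms of one shot.

    setter_probs: imputed probability that each on-ice teammate was the setter,
    before the assist-rate discount (e.g. the 2/6 and 1/6 values above).
    """
    if scored and primary_assist is not None:
        return {f"setter_{primary_assist}": 1.0}   # goals: the setter is recorded
    return {f"setter_{name}": ASSIST_RATE * prob
            for name, prob in setter_probs.items()
            if name != shooter}

# The missed-shot-by-F1 example above:
setter_columns("F1", {"F2": 2/6, "F3": 2/6, "D1": 1/6, "D2": 1/6}, scored=False)
```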
The league records the location of shot blocks, but I would like to know where the original shot was taken, before it was blocked. I guess this original location as follows: if a shot is blocked at location \((x,y)\), I consider all of the hexes whose centre \((a,b)\) has second coordinate closer to centre-ice than \(y\). For each such hex I compute a score \( \exp(-s^2/30 )\exp( -d^2/100) \), where \(s\) is the distance between the centre of the goal and the intersection of the goal line with the line joining \((a,b)\) to \((x,y)\), and \(d\) is the distance between \((a,b)\) and \((x,y)\). I assign each hexagon as possibly being the original shot location with probability equal to its score divided by the sum of all of the scores. The chance for a given hex is thus both weakly proportional to how close it is to the block location, since most shot blocks are caused by proximity pressure, and strongly proportional to how "on line" a shot from such a location would be.
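A sketch of that scoring scheme, under assumed coordinates (first coordinate: signed feet from the split line; second coordinate: feet from the attacked goal line, so the goal centre sits at the origin and larger values are closer to centre ice):

```python
import math

def origin_probabilities(block, hex_centres):
    """For a recorded block location, return (hex centre, probability) pairs
    describing where the original shot may have come from."""
    x, y = block
    scores = []
    for (a, b) in hex_centres:
        if b <= y:
            continue                      # only hexes farther from the net than the block
        d = math.hypot(a - x, b - y)      # hex centre to block location
        t = b / (b - y)                   # extend the line (a,b) -> (x,y) to the goal line
        s = abs(a + t * (x - a))          # crossing point, measured from the goal centre
        scores.append(((a, b), math.exp(-s * s / 30.0) * math.exp(-d * d / 100.0)))
    total = sum(score for _, score in scores)
    return [(centre, score / total) for centre, score in scores]
```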
There are many sets of quantities which we know ahead of time have no net total effect, even though the individual terms may differ. For example, the full set of shooters who appear in the league in a given stretch of time have, by definition, an average total impact on shot success, even though some shooters are more skilled than others. In order to enforce this group behaviour, we can use so-called "centering penalties" (I am the person who calls them this; if you know of existing names in the literature of penalized regression for this or similar things, I would appreciate being told). Specifically, let \(S\) be a set of terms whose net impact is known to sum to zero, and let \(c_s\) be the prevalence of an element \(s\) of \(S\) in the data; so, for instance, if \(s\) were a shooter then \(c_s\) would be the number of shots taken by that shooter. Then we would like $$\sum_{s \in S} \beta_s c_s = 0$$ to hold; one way to do this is to insist that $$\left(\sum_{s \in S} \beta_s c_s\right)^2 = \sum_{s,t \in S} \beta_s \beta_t c_s c_t$$ be close to zero, which we can accomplish by adding a very large multiple of \(c_s c_t\) to the entry of the penalty matrix \(K\) corresponding to the \(s\)-th row and \(t\)-th column. I used \(10^{18}\) for this purpose.
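A sketch of how one such centering penalty might be added to the penalty matrix (with the \(10^{18}\) strength described above; the function and argument names are mine):

```python
import numpy as np

def add_centering_penalty(K, idx, counts, strength=1e18):
    """Softly enforce sum_s beta_s * c_s = 0 over one group of terms.

    idx: column indices of the group's terms; counts: their prevalences c_s
    (for shooters, the number of shots each one took).
    """
    c = np.zeros(K.shape[0])
    c[idx] = counts
    # (sum_s beta_s c_s)^2 = beta^T (c c^T) beta, so add a large multiple
    # of the outer product to the penalty matrix.
    K += strength * np.outer(c, c)
```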