HV Article

Estimating Shot Difficulty

This is the updated shooting and goaltending model, which is closely related to the model I introduced in the summer of 2021. Much of the exposition here is copied from that write-up.

Although shots are the basic currency of hockey, it is obvious that not all shots are equally difficult to create, nor equally difficult to score, nor equally difficult to save. The model described here estimates the goal likelihood of different patterns of shots, our main interest. Along the way we will also obtain estimates of shooting and goaltending impact.

Vital Theoretical Preliminaries

My animating assumption is that all of the skaters (between six and eleven, in total, usually) are working together both to generate shots for their team and to suppress the generation of shots by the other team. In principle, I consider all of the skaters equally able to affect both processes, in the long run, even if a given skater (for either team) might be only minimally involved with a given shot. All of play in all three zones leading up to and including the decision by a given player to shoot the puck, I understand to be the product of the (combined) ability of all of the skaters, and I call the goal likelihood of shots generated, at the moment the decision to shoot is made, the "expected goal probability", or xG for short. Then, in addition to the xG of the pattern to which a given shot conforms, the shooter themself can, in principle, affect the goal likelihood of the shot, by shooting the puck skilfully, or perhaps (as I might) by woefully butchering whatever chances they and their teammates (and opponents) have conspired to generate. This ability I call "finishing impact" or "shooting talent". Finally, a goaltender can, in principle, affect the goal likelihood of a shot after it is taken, by sneakily interposing their body between the puck and the goal, or, (again as I might do) contriving to fail to do so.

This three-fold division—all of the skaters collectively produce the shot, the shooter shoots, and the goalie endeavours to save—is the animating idea behind the model I describe here. Even at this very basic level these choices have important consequences. For instance, a player who is skilled at receiving passes in (relatively) less-dangerous shooting positions and then skating untrammelled with the puck to more-dangerous locations will score more often for that reason, and that skill will appear in my accounting in their impact on xG (which will increase, since their team is taking more dangerous shots) and not in their finishing impact (which will presumably decrease, since they are shooting from locations where goals are easier to come by). Similarly, including goaltender effects only on shots already taken prevents us from making any estimate of goaltenders' impact on xG, conceded or generated, from, say, their tendency to handle the puck.

Throughout this article, when I say "shot" I will mean "unblocked shot", that is, goals, saves, and misses (including shots that hit the post or the crossbar). All shots in all strength situations that are taken against a team with a goaltender on the ice are considered.

Odds vs Probability

The machinery here uses the goal odds of a shot throughout, that is, that ratio of the probability of a shot going in the net to the probability of the shot not going in the net. A shot with a goal probability of 50% (that is, of 0.5) has odds of 1, a shot with a goal probability of 20% has odds of 0.20/0.80 = 0.25, and so on. In particular we do not use the word "odds" as if it were a synonym for probability, as if we were craven headline writers.

Method

Once a shot is being taken by a given player from a certain spot against a specific goaltender, I estimate the probability that such a shot will be a goal. This process is modelled with a linear model, and fit with a generalized ridge logistic regression. For a detailed exposition about how such models can be fit, please see section five of Wessel van Wieringen's excellent lecture notes. Briefly: I use a design matrix $X$ for which every row encodes a shot with the following columns:

Two indicators for the shooter; (in 2021-2022 there were 978 shooters)

One for tip/deflections
Another for all other shot types

An indicator for the goaltender; (in 2021-2022 there were 119)
An indicator for the setter, that is, the person who passed the puck to the person who shot the puck. (This set of terms is new this year.)
A set of geometric terms and shot types, described below;
An indicator for "rush shots", that is, in-zone shots for which the previous recorded play-by-play event is in a different zone and no more than four seconds prior;
An indicator for "rebound shots", that is, shots for which the previous recorded play-by-play event is another shot taken by the same team no more than three seconds prior;
An indicator for teams which are leading and another for teams which are trailing; to be interpreted as representing change in configurations surrounding shots compared to when teams are tied;
Four indicators for different skater strength situations:
- SH for short-handed shots, that is, ones taken by a team with fewer skaters;
- PPv3 for shots taken by a team with more than three skaters against a team with exactly three skaters;
- PP for all other power-play shots, that is, ones taken by a team with more skaters;
- 3v3 for play where both teams have exactly three skaters (mostly in overtime)
These skater strengh indicators divide all the shots into five non-overlapping groups; the four indicated strengths and the (unmarked) "even strength" (4v4 and 5v5) shots. The measured values of these indicators should be understood as the change in goal odds compared to a similar shot at even-strength.
An interaction term for shots which are slapshots and also on the power-play of any kind, this term is meant to proxy for one-timers.

Setting

The new aspect in this year's model is the notion of setting, which is an attempt to measure how well (or poorly) some players can make a specific type of pass. There are many kinds of passing in the sport, and many kinds of play-making, but the kind which interests me here is the kind of pass which is made to a player who the passer imagines is going to shoot the puck as their next disposal of the puck. The most obvious such passes are seam passes, backdoor passes, and breakaway passes, but most shots are prefaced by such a "setting" pass, not all of them particularly glamourous. Some players are unusually able (or unable) to perceive and execute such passes, and the aim of including these setting terms in the xG model is to measure these abilities, as well as to measure the other aspects that much more closely.

The term "setting" is taken from volleyball, a much more structured sport than hockey, where one of the touches is usually performed by a player (the setter) who is understood to be uniquely skilled at passing the ball in a way which specifically enables a different member of their team to try to score. Despite hockey being much more fluid than volleyball, both positionally and territorially, the term "setting" still closely matches the aspect that I would like to measure: a pass for the purpose of a shot by the player receiving the puck.

In practice, though, the data I have to hand to measure who is making these passes is limited, so to include these terms in the model some imputation and imprecision is required. For shots which result in goals, the league records the name of the player who most-recently touched the puck before the shooter (the so-called "primary assist"); where such primary assists are recorded I mark this player as the setter. For shots which do not result in a goal, the setter (if they exist) is not recorded by the league. However, I do know which other players are on the ice, and I can impute a probability that each one was the setter for a given shot.

For a shot taken by a forward, I consider it twice as likely that one of the other forwards on the ice was the setter as one of the defenders.
For a shot taken by a defender, I consider it three times as likely that one of the forwards on the ice was the setter as opposed to the (usually unique) other defender.

These proportions are measured from historical shot assist data that I obtained from Corey Sznajder's excellent hand-tracked data. A truly thorough approach would use the specific identities of the setters for each shot documented there, where they exist, but this undertaking is not within me at the moment. As it stands this measurement of setting ability tracks primary assist rates fairly closely; what is somewhat surprising to me is that it nevertheless appears to be a fairly stable skill, largely independent of other skills.

So, for example, a goal scored by F1 assisted by D2 would be encoded with a value of "1" for shooter_F1 and a value of "1" for setter_D2 with 0 values for all other shooters and all other setters. On the other hand, a missed shot taken by F1 would be encoded with a value of "1" for shooter_F1 but with setter values of:

setter_F1: 0
setter_F2: 2/6
setter_F3: 2/6
setter_D1: 1/6
setter_D2: 1/6

and for a final example, a missed shot taken by D1 would be encoded with a value of "1" for shooter_D1 and setter values of:

setter_F1: 3/10
setter_F2: 3/10
setter_F3: 3/10
setter_D1: 0
setter_D2: 1/10

Here we exploit that the usual convention for indicator variables (that a value of "1" indicates "truth/yes" and a value of "0" indicates "false/no") can be interpreted as the probability that a given player took on a given role.

Finally, to account for how approximately 20% of non-goal shots are unassisted (again taken from Corey's data), these imputed probabilities are all multiplied by 80%. If you like, you can think of this encoding of setters as a position-weighted superposition of all the possible setters (including there being no setter), combined together into a single data row.

The Geometric Terms

The various shot types (wrist, slap, backhand, tip, and wraparound) are all qualitatively different in execution and, we observe, arise in different situations. In particular, the pattern of shot locations is quite different for each shot type. Consequently, I choose to model the geometry of each shot type separately. To do so flexibly, consider a hexagonal grid of cells, centred on the split line, covering the offensive zone. To each of the four main shot types, I define a "fabric", that is, a subset of these cells which contain at least 95% of all of the shots of that type since 2007-2008.

Wrist/Snap shots (388 hexes)
Slap shots (682 hexes)
Backhand shots (126 hexes)
Tips and Deflections (126 hexes)

Each hex in each fabric corresponds to a model covariate, in addition to five indicator covariates for each of the five shot types. The purpose, of course, of making such fabrics is to account for how the goal likelihood of a given shot depends on where the shot is taken. For wraparounds, I found that shot locations clustered very tightly in the two obvious locations (near the goalposts) and that the spatial variation in goal likelihood was extremely small. Thus, I chose to use a simple indicator for wraparounds, without encoding additional geometric detail for those shots.

I especially appreciate Michael Schuckers, who brought the importance of attacking geometric variation separately by shot type to my attention.

Every shot is assigned a single location indicator. If a shot's location does not fall into a hex for the associated fabric, I mark it either as:

"below the goal line",
"out of zone" (for shots from beyond the blue line), or
"perimeter", for in-zone shots above the goal line but far from the goal.

New in this version of the model is an additional distance term, only for slapshots. Distance from the centre of the net (0,89) is divided by 89ft, so that a distance of 1 is at the intersection of the split line with the blue line. I added this term because in every year, but especially 2019-2020, the variation due to location in goal odds for slapshots is larger than the location-variation for other shot types. Using an extra term distance term here permits me to keep the hex-to-hex fusion penalties (described below) consistent from fabric to fabric and also consistent through all three axes of the fabric.

Fitting

This section is quite technical and explains the mathematical intricacies of how I fit the above model with NHL data to obtain covariate estimates. If you like you can skip to the results section below. If you read the above section with impatience, you should skip forward; if you read about all those more-than-a-thousand-different hex coordinates and thought "Micah you poor sap this is gonna be overfit to hell and back" you should read this section first.

The observation vector $Y$ is 1 for goals and 0 for saves or misses. The model itself is a generalized linear one: $$ Y \sim l\left(X\beta\right) $$ where $\beta$ is the vector of covariate values and $X$ is the design matrix described above, and $l$ is the logistic function $$ x \mapsto l(x) = \frac{1}{1 + \exp(-x)}$$ (extended here to vectors pointwise) after which the regression is named.

Logistic regression coefficient values (the entries of $\beta$ which we are trying to find) can be difficult to interpret, but negative values always mean "less likely to become a goal" and positive values mean "more likely to become a goal". To compute the goal probability of a shot with a given description (encoded by a particular row $X_i$ of $X$): form the sum $X_i\beta$ of the model covariates to obtain a number, and then apply the logistic function to it to obtain the goal probability.

The logistic function is very convenient for modelling probabilities, since it monotonically takes the midpoint of the number line (that is, zero) to 50% while taking large negative numbers to positive numbers close to zero and very large positive numbers to positive numbers close to one.

The model is fit by maximizing the likelihood of the model, that is, for a given model, form the product of the predicted probabilities for all of the events that did happen (90% chance of a save here times 15% of that goal there, etc.). This product, with one factor for every shot in the data at hand, is called the likelihood, $L$. Large products are awkward, so instead of maximising $L$ directly we instead solve the equivalent problem of maximizing the logarithm of the likelihood, denoted by $\mathcal L$.

Before we compute the covariate values $\beta$ which minimize the log-likelihood $\mathcal L$, we bias the results with two so-called ridge penalty terms. These penalties encode our prior knowledge about the terms of the model before we consider the data at hand. The first ridge penalty encodes "static" information (things that we know about hockey generally) and the second one encodes "dynamic" prior information, that is, things learned about the covariates from previous years. By fitting the model with penalties, we obtain estimates which represent a compromise between the data at hand and our pre-existing understanding of what the covariates in the model mean.

Static penalties

The static penalties are encoded as a matrix $K$ and a term of the form $\beta^TK\beta$ is subtracted from the overall model likelihood. The entries of $K$ correspond to pairs of covariates, and we have at our disposal two different types of penalties. We can use diagonal entries $(i,i)$ of $K$ to penalize large a large covariate value in the $i$-th entry, and we can use so-called "fusion" penalties to penalize differences between two specified covariates. In our case we use a diagonal penalty of 100 for each goalie and shooter term, a diagonal penalty of 0.1 for each fabric hex, and a fusion penalty of 5 between each two adjacent hexes. The other terms (the strength indicators, the leading/trailing/rush/rebound indicators, the neutral zone/below goal line/shot type indicators) are unpenalized, so at these terms the model behaves like one fit with ordinary least squares.

It may be helpful to think of attaching elastic bands to each covariate. The diagonal penalties are very strong bands which attach each player value to zero, very weak bands that attach each hex value to zero, and medium-strength bands attaching each hex to its neighbours. Then, after the bands are prepared, the data is permitted to flow in, to push the covariate values while the tension in the bands constrains the resulting fit.

The substantial diagonal penalty for shooters and goalies encodes our prior understanding that all of the shooters and goalies are (by definition) NHL players, whose abilities are understood to be not-too-far from NHL average (that is, zero). The hex penalties work together to allow slow variation in the impact of geometry, encoding our prior belief that shots from nearby locations ought to have similar properties for that reason. In particular, fusing these many hex terms to one another in this way effectively lowers the number of covariates in the model, presumably helping to mitigate over-fitting.

Dynamic Penalties

In addition to the above penalties, I want to suitably accumulate specific knowledge learned from previous years when I form estimates of the impacts of the same factors in the future&emdash;in particular we imagine that our estimates for players describe athletic ability, which varies slowly. If you were truly interested only in a single season (or any length of games, thought of as a unity for whatever secret purposes of your own), these dynamic penalties would not be relevant. However, I am interested in fitting this model over each season from 2007-2008 until the present, and so I want to preserve the information gained from each season as I pass to the next. As we shall see, fitting this model produces both a point estimate and an uncertainty estimate for each covariate. The point estimates can be gathered into a vector $\beta_0$ of expected covariate values, and the uncertainties can be gathered into the diagonal of a matrix $\Lambda$, and then a penalty $$ (\beta - \beta_0)^T \Lambda (\beta - \beta_0) $$ can be subtracted from the overall likelihood. For players for whom there is no prior (rookies, for instance, but also all players in 2007-2008, since that is the first season in my dataset), I use a prior of 0 (that is, average) with a very mild diagonal penalty of 0.001.

Computational delicacies

In the end, then, the task at hand to fit the model is to discover the covariate values $\beta$ which maximize the penalized log-likelihood, $$ \mathcal L - \beta^T K \beta - (\beta - \beta_0)^T \Lambda (\beta - \beta_0) $$

Simple formulas for the $\beta$ which maximixes this penalized likelihood do not seem to exist, but we can still find it iteratively, following a very-slightly modified treatment of section 5.2 of these previously mentioned notes: Beginning with $\beta_0$ as obtained from prior data (or using the zero vector, where necessary), we will define a way to iteratively obtain a new estimate $\beta_{n+1}$ from a previous estimate $\beta_n$. Define $W_n$ to be the diagonal matrix whose $i$-th entry is $ l'(X_i \beta_n) $, where $X_i$ is the $i$-th row of the design matrix $X$ and $l'$ is the derivative of the logistic function $l$ above. Similarly, define $Y_n$ to be the vector whose $i$-th entry is $l(X_i \beta_n)$. Then define $$ \beta_{n+1} = ( X^TW_nX + \Lambda + K )^{-1} X^T ( W_n X \beta_n + Y - Y_n ) + \Lambda ( X^TW_nX + \Lambda + K )^{-1} \beta_n $$ Repeating this computation until convergence, we obtain estimates of shooter ability, goaltending ability, with suitable modifications for shot location and type, as well as the score and the skater strength.

Curious readers may wonder if such convergence is guaranteed; it suffices that $K$ and $\Lambda$ be positive-definite, which they are.

Results

I have fitted this model to each regular season from 2007-2008 through to 2021-2022; using the estimates from each summer as the priors to feed into the following year's regression, as explained above. I explain the results of the 2021-2022 regression in detail here; historical results can be found via the following links:

2007-2008
2008-2009
2009-2010
2010-2011
2011-2012
2012-2013
2013-2014
2014-2015
2015-2016
2016-2017
2017-2018
2018-2019
2019-2020
2020-2021
2021-2022

Geometry Results

First, the geometric terms, converted to probabilities for display. Each hex is shown with the constant term for that shot type added in already, so that the fabrics can be compared to one another.

	Wrist/Snap shots. Highest results are in tight as expected and drop off sharply outside the dots.
	Slap shots. Highest in the slot again as expected, but without the sharp drop off side to side and instead falling off roughly with total distance to the net.
	Backhand shots. High-quality areas are even more tightly concentrated than with wrist shots.
	Tips and Deflections. Here the sides of the net are quite high.

The base probabilities for:

shots from below the goal line: %;
shots from the neutral zone: %;
wraparounds: %
perimeter shots: %

Strength, Score, Rush, and Rebound Terms

Since precisely one of the geometric terms above applies to every shot, we can consider them as providing starting odds for us. Then, we can interpret the other covariates in terms of how they modify those odds, so we quote the remaining terms as odds ratios.

Covariate	Impact on Goal Odds
SH	`%`
PP	`%`
PPv3	`%`
PP-Slapshot	`%`

Rush	`%`
Rebound	`%`

Leading	`%`
Trailing	`%`

Positive values indicate factors that improve the goal odds of a shot, relative to an unadorned even-strength shot. All of the strength terms are considerably positive. Unsurprisingly, we expect that power-play shots, and especially power-play shots facing only three skaters, should have plenty of room in which to fabricate those extra details which we can't measure directly at this time which we nevertheless know make goals more likely; like more time and space to aim and shoot without immediate pressure, more pre-shot movement, and so on. More interestingly, even short-handed shots (rare as they are) have better goal odds than similar even-strength shots.

Rushes and rebounds have associated odds ratios of around double, which accords with our intuition that they make goal odds substantially better. More interestingly, leading teams shoot better than tied teams, but trailing teams show a much smaller (also positive) change for that reason.

Tip Location Adjustment

Careful students of the NHL-provided data will have noticed that shots that are tipped are recorded somewhat idiosyncratically. When recorded correctly, the original shot location is not listed, instead the tip location is recorded and the shot type is marked as "tip/deflection". However, it is also common for shot locations and descriptions to be revised after a time (sometimes this process is done in minutes, other times after a delay of a day or two); one common revision is for a shot to be recorded first as a slap or wrist shot from a distant spot and then changed to be listed as a tip/deflection with a different shooter and a different location. However, extensive spot-checking has revealed that many such tips are still listed as having locations corresponding to the original shot; with the actual tip location unrecorded. This particular pattern of mistake is especially common when such tips lead to goals (since such players are understandably keen to ensure they get credit) so leaving such incorrect data untouched is untenable but also difficult to fix. I have settled on an unpleasant compromise: for every shot recorded as a tip with a shot location outside the tip fabric specified above, I impute a plausible actual tip location by moving the location four-fifths of the way from its existing location towards the far post (or, in the very rare case of shots listed as precisely on the split line, four-fifths of the way to the middle of the goal line). This surgery, however unpleasant, seems to match the games I've checked considerably more closely than simply using the mostly-incorrect locations as given.

Out-of-sample Predictions

For the first time in a number of years I've decided to compare this year's version of the xG model (version 6, with the setting terms) to the one I made last year (version 5, without setters). In both cases, we can fit the model on a given year and then use it to predict the goal probability for all of the goals in the following year, and thus compute a log loss for each model. For the 1,378,081 unblocked shots in 2008-2022, version 5 has an out-of-sample log loss of 0.2113035026 per shot, and version 6 an out-of-sample log loss of 0.2111640278, an improvement of 0.000146 per shot. This is, admittedly, very small; but still encouraging. As far as I am aware there aren't any statistical tools for interpreting how significant this improvement is that are applicable to models fit in this way.

Player Results

The model can already be used without specifying shooters or goaltenders, however, this is perhaps a little boring. Below are the values for all the goaltenders who faced at least one shot in the 2021-2022 regular season. I've inverted the scale so that the better performances are at the top. For example, Ilya Sorokin has a value near -12%, this means that, given an unblocked shot with a goal probability of, say, 9%, corresponding to odds of 0.0989, we estimate that Sorokin's impact will lower those odds by around twelve per cent, to approximately 0.0883, which corresponds to a probability of 8%. By contrast, Kevin Lankinen has an estimate of around +12%, his impact on the same shot increases its odds from 0.0989 by twelve per cent, to 0.111 which corresponds to a probability of 10%. These per-shot changes are very small, but against a season's worth of shot volume amount to a great deal.

I've placed the shooting measurements for forwards and defenders and the setting measurements for forwards and defenders on separate pages for performance reasons.

Minimum Shots Faced:

Magnus 6: xG, Shooting, and Goalie-ing

July 12, 2022, Micah Blake McCurdy, @IneffectiveMath