13 September, 2022, Micah Blake McCurdy, @IneffectiveMath

Some players shoot the puck more than others. I have made a simple Elo-style model so that, given a set of players on the ice, I can estimate how likely each of those players is to be the shooter of a shot that they generate together. The method I used is taken from this article, with some minor tweaks.

I assign to every skater in the league a "shootiness", that is, a number which encodes the likelihood that the skater at hand will take a team's shots. A high shootiness rating means that a player is more likely to take a team's shots. The slightly silly name, "shootiness", is my attempt to condense "tendency to shoot" into a single word. In other sports (especially basketball) where shooting more definitively ends team possessions, the common term is "usage", stemming from the idea that the shooter "uses up" the possession.

The key function that we will need is the logistic function: $$ x \mapsto l(x) = \frac{1}{1+\exp(-x/d)} $$ where \(d = 100\) is an empirically determined constant. Elo rating systems can be thought of as streamlined versions of logistic regression. Players begin with a rating of \( r_0 = -d\ln(n/p-1) \), where \(n\) is the number of players of that position usually on the ice (2 for defenders, 3 for forwards) and \(p\) is the historically observed probability of players of that position being the shooter (1/3 for defenders, 2/3 for forwards). This starting value is chosen so that \(l(r_0) = p/n\), as the curious reader may readily verify.

To compute shootiness values for every player in the league, I iterate through every regular season shot taken since 2007-2008. For a given set \(A\) of players on the ice for a given shot, I compute the "expected number of shots taken" for each player \(a \in A\) as: $$ e_a = \frac{2}{|A|(|A|-1)} \sum_{p \in A-\{a\} } l(r_a - r_p) $$ In this way, players who have shot the puck more in the past are expected to continue to do so. The sum \( \sum_{a \in A} e_a = 1 \), since \(l(x) + l(-x) = 1\) means that each of the \(|A|(|A|-1)/2\) pairs of players contributes exactly one to the double sum.

Then, after observing that player \(q\) was, in fact, the shooter of the shot, we update the ratings for each player by $$ r_p \mapsto r_p + k(|A|-1)( \delta_{pq} - e_p )$$ where \(k = 1\) is an empirically determined constant and \(\delta\) is 1 when its arguments are the same and zero otherwise.
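As a rough illustration, the rating machinery described above could be sketched in Python as follows. This is not the code behind the model, only a minimal sketch of the formulas as written. The function names (`logistic`, `initial_rating`, `expected_shares`, `update`) and the five-skater unit with labels like `"F1"` and `"D1"` are hypothetical and chosen only for the example.

```python
import math

D = 100.0  # logistic scale d, as given in the text
K = 1.0    # update step k, as given in the text


def logistic(x, d=D):
    """l(x) = 1 / (1 + exp(-x / d))."""
    return 1.0 / (1.0 + math.exp(-x / d))


def initial_rating(n, p, d=D):
    """r_0 = -d * ln(n/p - 1), chosen so that logistic(r_0) equals p / n."""
    return -d * math.log(n / p - 1.0)


def expected_shares(ratings):
    """Return {player: e_a} for the on-ice set A.

    `ratings` is a dict {player: rating}; it needs at least two players.
    """
    size = len(ratings)
    scale = 2.0 / (size * (size - 1))
    return {
        a: scale * sum(logistic(r_a - r_p)
                       for p, r_p in ratings.items() if p != a)
        for a, r_a in ratings.items()
    }


def update(ratings, shooter, k=K):
    """Return new ratings after observing that `shooter` took the shot."""
    shares = expected_shares(ratings)
    size = len(ratings)
    return {
        p: r + k * (size - 1) * ((1.0 if p == shooter else 0.0) - shares[p])
        for p, r in ratings.items()
    }


# Hypothetical unit: three forwards and two defenders, all at their
# positional starting ratings.
r0_f = initial_rating(3, 2 / 3)  # forwards
r0_d = initial_rating(2, 1 / 3)  # defenders
on_ice = {"F1": r0_f, "F2": r0_f, "F3": r0_f, "D1": r0_d, "D2": r0_d}

shares = expected_shares(on_ice)      # expected share of the shot per player
after = update(on_ice, shooter="D1")  # ratings after a defender shoots
```

In this sketch the shares sum to one, and because the update adds \(k(|A|-1)(\delta_{pq} - e_p)\) to each rating, the total of the ratings on the ice is unchanged by a shot: whatever the shooter gains, the other players lose between them.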
The shooter's rating always goes up and the non-shooters' ratings always go down, but the changes are larger when the shooter is more surprising, given the histories of the players on the ice.

I use shootiness ratings for two purposes. First, when simulating games, I form the sets \(A\) of on-ice players and then the computed \(e_a\) values can be used as sampling probabilities for choosing simulated shooters. Second, I can compute the probability of a given player being the shooter of a shot when placed on the ice with a synthetic set of teammates. For defenders, the synthetic teammates are one defender and three forwards; for forwards, the synthetic teammates are two defenders and two forwards. When the synthetic teammates are given the initial rating \(r_0\), I can compute the shot probability \(e_p\) for the skater \(p\) of interest; the difference between this probability and the initial rating \(r_0\) is the shootiness value displayed on the summary cards. This gives an interpretable probability, rather than the somewhat opaque "underlying" ratings themselves.
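The synthetic-teammate calculation could be sketched along the following lines. Again, this is only an illustration under stated assumptions: the function name `synthetic_shot_probability` and the position labels `"F"` and `"D"` are hypothetical, each synthetic teammate is assumed to carry the starting rating \(r_0\) of its own position, and the sketch stops at the probability \(e_p\) without attempting the final adjustment used for the summary cards.

```python
import math

D = 100.0  # logistic scale d, as given in the text


def logistic(x, d=D):
    """l(x) = 1 / (1 + exp(-x / d))."""
    return 1.0 / (1.0 + math.exp(-x / d))


def initial_rating(n, p, d=D):
    """r_0 = -d * ln(n/p - 1)."""
    return -d * math.log(n / p - 1.0)


R0_FORWARD = initial_rating(3, 2 / 3)
R0_DEFENDER = initial_rating(2, 1 / 3)


def synthetic_shot_probability(rating, position):
    """e_p for one skater placed with four synthetic teammates.

    Assumption: every synthetic teammate has the starting rating of its
    own position. `position` is "F" or "D" (hypothetical labels).
    """
    if position == "D":
        # One defender and three forwards alongside a defender.
        teammates = [R0_DEFENDER] + [R0_FORWARD] * 3
    elif position == "F":
        # Two defenders and two forwards alongside a forward.
        teammates = [R0_DEFENDER] * 2 + [R0_FORWARD] * 2
    else:
        raise ValueError("position must be 'F' or 'D'")
    size = len(teammates) + 1
    scale = 2.0 / (size * (size - 1))
    return scale * sum(logistic(rating - r) for r in teammates)


# Probabilities for skaters who are exactly at their starting ratings,
# and for a hypothetical forward rated 200 points above the start.
baseline_f = synthetic_shot_probability(R0_FORWARD, "F")
baseline_d = synthetic_shot_probability(R0_DEFENDER, "D")
heavy_shooter = synthetic_shot_probability(R0_FORWARD + 200.0, "F")
```

One property of the formula worth noting: since each logistic term is below one, \(e_p\) is always below \(2/|A|\), so in a five-skater unit no rating, however high, produces a probability of 0.4 or more.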

This graph shows the shootiness for every skater who took a shot in the 2007-2023 regular seasons, with forwards in black and defenders in red. Some players, like David Pastrňák (the shootiest player in the league, as of writing), play a lot of minutes with shoot-first teammates; others, like Alexander Ovechkin, have their on-ice shot fractions boosted by playing with pass-first teammates. Individual skater values as of the end of specific seasons can be found at the top of the summary cards for those players.