When to Pull the Goalie: Running the Numbers on NHL Goalie Pulls

For hockey fans, it’s a familiar story. As the clock runs down in the final (3rd) period, teams losing by a goal or two will look to pull their goalie and send out an extra skater in their place. This usually results in a 6-on-5 skater situation for the trailing team, which creates offensive pressure and a late-game push.

This move can be effective, but it dramatically increases the chance of the opposition scoring, since they get to shoot on an empty net. Usually it’s just a matter of time until this happens, at which point it’s pretty much game over. But this is a smart risk to take, given that the trailing team is likely to lose anyway if the game is played out at even strength.

The question is not whether to pull the goalie, but when. Pull too early and there’s a big chance of being scored on, and you give up some 5-on-5 opportunities to score. Pull too late and you won’t get the most out of your 6-on-5 advantage.

When should you pull the goalie?

I start by discussing some previous work done on this problem.

Then I explain how my training dataset was created, and I’ll walk through some technical details of the models (including some Python code).

Lastly, I discuss the findings.

TL;DR

As discussed in the results section of this post, I found that it’s optimal to pull an NHL goalie when there’s 3:00 left in the period. In this case, you would have 1 in 4 odds of scoring.

Source Code

https://github.com/agalea91/nhl-goalie-pull-optimization

Previous works lack source data and visual aids

For example, the Sportsnet article reports:

During the 2015–16 NHL regular season …
Pull between 1:30–5:00 remaining: 16% chance of success
Pull < 1:30 remaining: 10% chance of success

It would be nice to know the error on each statistic. Assuming N=700 goalie pulls in a season (where 600 of those are in the last 1:30) I can add binomial error estimates:

16 ± 3% chance of success with 1:30–5:00 remaining
10 ± 1% chance of success with < 1:30 remaining

This suggests good confidence that it’s better to pull the goalie before the 1:30 mark.
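As a rough check, here is a minimal sketch of how binomial error bars of this kind can be computed. It assumes the textbook standard error sqrt(p(1-p)/n) and the split stated above (100 pulls with 1:30–5:00 remaining, 600 pulls in the last 1:30); whether the quoted estimates were derived in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def binomial_std_error(p, n):
    """Standard error of a proportion p observed over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


# Assumed sample sizes: 700 pulls in total, 600 of them in the last 1:30.
early = binomial_std_error(0.16, 100)  # pulls with 1:30-5:00 remaining
late = binomial_std_error(0.10, 600)   # pulls with < 1:30 remaining

print(f"early: 16% +/- {early:.1%}")  # early: 16% +/- 3.7%
print(f"late:  10% +/- {late:.1%}")   # late:  10% +/- 1.2%
```

Under these assumptions the two intervals do not overlap, which is consistent with the conclusion that pulling before the 1:30 mark is better.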

Asness and Brown [2018] have published a model that suggests 6:10 is the optimal goalie pull time for a one-goal deficit.

Included in their paper is a literature review that’s reproduced below:

Source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3132563

Overall it seems that previous works are lacking in interpretability through visual aids. Charts can also help us identify trends.

Additionally, there was very little access to datasets the studies relied upon. Since I couldn’t find a good training dataset for my model, I went in search of goalie pull times.

The next section describes how I generated the Goalie Pull Dataset. For results, skip down to Pulling Earlier and More Often.

Goalie Pull Dataset

In this section, I’ll explain how I created this dataset and you can see some of the assumptions that went into my algorithm.

To obtain a suitable data source, I parsed goalie pull information directly from the play-by-play game sheets on NHL.com.

Here’s a sample of the result (full file at link above):

From 2003–2007 this was recorded as a timestamped row with description: <#> <NAME>, Goalie Pulled, where <#> and <NAME> label the goalie that was pulled [example].

After finding a row like this, we scan the remainder of the table looking for a goal.

If a goal is found, then we cross-reference the players on the ice to make sure the pulled goalie is not present. This step is important because there’s no data on when goalies return to the net, which happens quite commonly, e.g. in the case of a defensive zone face-off.

Also, I noticed that goalie pulls are recorded when penalties are called, where the goalie goes to the bench for what is usually just a few seconds. To minimize these false positives, I only searched for goalie pulls in the last 5 minutes of game time.

From 2007 onwards, goalie pulls were no longer recorded explicitly on the game sheet [example]. For these games, each row is labeled with the players on ice, so I used this to infer the estimated pull time.

Here’s the simplified algorithm:

# `seasons` is the collection of parsed seasons, each holding its games
pulls = []
for season in seasons:
    for game in season.games:
        goal_scan = False
        for row in game.game_sheet_rows:
            # Look for a goalie pull
            if row.is_goalie_pull:
                goal_scan = True
                pulled_goalie = row.pulled_goalie
                pulled_time = row.time
            # There has been a pull, so scan the following rows for a goal
            if goal_scan and row.is_goal:
                # Only count the goal if the pulled goalie is still off the ice
                if pulled_goalie not in row.players_on_ice:
                    # We have found an empty net goal
                    pulls.append({
                        "season_game": [season, game],
                        "pulled_time": pulled_time,
                        "goal_time": row.time,
                    })

Additionally, I record which team scored the goal (for / against) and track goalie pulls that result in no goal.
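For the game sheets from 2007 onwards, where the pull has to be inferred from the players on the ice, the idea can be sketched as follows. This is only an illustration: the row layout (a dict with "time" in minutes elapsed and "on_ice" mapping each team to a list of positions), the function name, and the 55-minute window are assumptions made for this example, not the structures used in the project's source code.

```python
def infer_pull_time(rows, team, window_start=55.0):
    """Estimate when `team` pulled its goalie.

    rows: game-sheet rows in chronological order; each is a dict with
          "time" (minutes elapsed in the game) and "on_ice", a mapping
          from team name to the list of positions on the ice.
    Returns the time of the first row at or after `window_start` where
    the team has no goalie ("G") on the ice, or None if there is none.
    """
    for row in rows:
        if row["time"] < window_start:
            continue  # ignore anything before the last 5 minutes
        if "G" not in row["on_ice"][team]:
            return row["time"]
    return None


# Toy game sheet (made-up data): the away team swaps its goalie for an
# extra skater some time before the row recorded at 58.5 minutes.
skaters = ["C", "L", "R", "D", "D"]
rows = [
    {"time": 54.0, "on_ice": {"home": skaters + ["G"], "away": skaters + ["G"]}},
    {"time": 57.1, "on_ice": {"home": skaters + ["G"], "away": skaters + ["G"]}},
    {"time": 58.5, "on_ice": {"home": skaters + ["G"], "away": skaters + ["C"]}},
    {"time": 59.4, "on_ice": {"home": skaters + ["G"], "away": skaters + ["C"]}},
]

print(infer_pull_time(rows, "away"))  # 58.5
print(infer_pull_time(rows, "home"))  # None
```

Because rows only exist when an event is recorded, a time inferred this way is an estimate: the goalie actually left the ice somewhere between the previous row and the returned one.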

Pulling Earlier and More Often

The full source code for this analysis is available in a Jupyter Notebook here: https://nbviewer.jupyter.org/github/agalea91/nhl-goalie-pull-optimization/blob/master/notebooks/src/3_exploratory_analysis_2003-2019.ipynb

Goalie Pulls Trending Up

We see a gradual increase in total number of goalie pulls over this time. Expansion teams entering the league would naturally push the total counts up, but we also see the average number of pulls per game increasing:

Marginal Gains on Positive Outcomes

The increase in goalie pulls over the years has only resulted in slight increases of goals for, with most of the additional pulls resulting in a goal against.

You might have noticed the blue outlier point in the chart above this one for the 2015/2016 season. Here we see that it is mostly due to an unusually high number of goals against (red), as opposed to goals for (blue) or no goal outcomes (yellow). Perhaps this poor return on good outcomes explains why the next year we see average pulls per game return to the trend.

Interestingly, we see a downwards trend in goalie pulls where no goal is scored (i.e. the game ends).

Emerging Trend to Pull Goalies Earlier

This is illustrated in the following box plot of average goalie pull times:

The average time remaining for each season is marked with a solid black line through the middle of the bar, while the upper whiskers give a sense of the variation in each segment.

In recent years we can see an increasingly large contribution of relatively early goalie pulls (e.g. above 3 minutes remaining). Historically, these points were just outliers.

Goalie Pulls are Left Skewed

Labelling by outcome, we see that late game pulls tend to have no-goal outcomes (yellow). Because the histograms are not normalized, we can see directly how much more likely a goal against (red) is than a goal for (green).

Note the sparsity of data below ~17.5 mins and above ~19.5 mins. This will end up leading to huge uncertainty in the likelihood calculations below.

“Goals for” Lead “Goals Against”

Goals tend to occur very late in the game, with goals against (red) slightly lagging goals for (green). This is logical given that teams intentionally pull the goalie when they are in a strong offensive position and usually get a scoring opportunity before the opposition does.

Overall, exploratory analysis reveals that we have a highly noisy dataset where statistically significant optimization will be difficult, especially due to a lack of data for early pulls (prior to ~2.5 min remaining).

Despite this, I feel that our dataset showed some interesting trends and yielded valuable insights. It is also open source and much larger than the datasets used in other studies.

I encourage others to study and help validate the dataset, which is available on GitHub.

New seasons can be added to the analysis by forking the project and expanding the source code.
⭐ Pull requests are welcome
⚠️ Please respect the API quotas when using this code

Building a Bayesian Model

The full source code for this modeling is available in a Jupyter Notebook here: https://nbviewer.jupyter.org/github/agalea91/nhl-goalie-pull-optimization/blob/master/notebooks/src/4_bayes_gamma.ipynb

Since I am interested in the optimal pull time, I’ll first fit the outcome (goal for, goal against, no goal) distributions above. The Gamma distribution is a suitable choice for modeling the data:

Gamma distribution

where t is the time elapsed in the 3rd period, alpha and beta are parameters to be determined using Bayes rule, and P will be the posterior probability of an outcome.
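For reference, the Gamma density in its shape/rate form is written out below. This is the parameterization PyMC3's Gamma distribution accepts through alpha and beta; that it matches the form in the figure above term for term is an assumption.

```latex
P(t;\,\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, t^{\alpha-1} e^{-\beta t},
\qquad t > 0,\ \alpha > 0,\ \beta > 0
```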

Using our full 2003–2019 goalie pull dataset X, I’ll solve for the probability of the outcome y, i.e. P(y|X; t). This is done computationally using PyMC3’s Markov Chain Monte Carlo (MCMC) algorithm. The outcomes of interest are y={goal for, goal against, no goal}.

I set up uniform priors on the Gamma parameters alpha and beta, and solve for these using MCMC and our observations on Gamma. With PyMC3 handling the heavy lifting, the code for this is deceptively simple. For more details about the calculation, you can check out the source code.
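To make the idea concrete without depending on PyMC3, here is a minimal hand-rolled Metropolis sampler for the same kind of model: Gamma-distributed observations with uniform priors on alpha and beta. It is a sketch for illustration, not the code used in the notebook. The function names, the prior upper bound of 50, the proposal step sizes, and the synthetic data (drawn with alpha=5, beta=2, which are not goalie pull times) are all assumptions.

```python
import math
import random


def log_posterior(alpha, beta, n, sum_x, sum_log_x, upper=50.0):
    """Log posterior (up to a constant) for Gamma(alpha, beta) data
    with Uniform(0, upper) priors on both parameters."""
    if not (0.0 < alpha < upper and 0.0 < beta < upper):
        return -math.inf  # outside the support of the uniform priors
    return (n * (alpha * math.log(beta) - math.lgamma(alpha))
            + (alpha - 1.0) * sum_log_x - beta * sum_x)


def metropolis_gamma(data, n_steps=6000, burn_in=2000, seed=0):
    """Random-walk Metropolis over (alpha, beta); returns post burn-in samples."""
    rng = random.Random(seed)
    n = len(data)
    sum_x = sum(data)
    sum_log_x = sum(math.log(x) for x in data)
    # Start from method-of-moments estimates so little burn-in is wasted.
    mean = sum_x / n
    var = sum((x - mean) ** 2 for x in data) / n
    alpha, beta = mean ** 2 / var, mean / var
    current = log_posterior(alpha, beta, n, sum_x, sum_log_x)
    samples = []
    for step in range(n_steps):
        # Symmetric Gaussian proposal around the current point.
        a_new = alpha + rng.gauss(0.0, 0.2)
        b_new = beta + rng.gauss(0.0, 0.1)
        proposed = log_posterior(a_new, b_new, n, sum_x, sum_log_x)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            alpha, beta, current = a_new, b_new, proposed
        if step >= burn_in:
            samples.append((alpha, beta))
    return samples


# Synthetic observations; random.gammavariate takes (shape, scale),
# so a rate of beta=2 corresponds to a scale of 0.5.
data_rng = random.Random(1)
data = [data_rng.gammavariate(5.0, 0.5) for _ in range(500)]

samples = metropolis_gamma(data)
alpha_hat = sum(a for a, _ in samples) / len(samples)
beta_hat = sum(b for _, b in samples) / len(samples)
print(f"posterior means: alpha={alpha_hat:.2f}, beta={beta_hat:.2f}")
```

A library such as PyMC3 replaces the hand-written proposal and acceptance loop with tuned samplers and convergence diagnostics, which is why the real model code can stay so short.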

MCMC Samples

When performing this calculation, PyMC3 also samples P(y|X; t) for us. Below I plot those samples along with the theoretical distributions (i.e. using values I calculated for alpha and beta).

Normalizing these as per population sizes in the training data, we see the following charts:

Early Pulls Yield More Goals

+--------------+----------+--------------+---------+
| | Goal For | Goal Against | No Goal |
+--------------+----------+--------------+---------+
| Time Elapsed | 18.6 | 18.7 | 19.3 |
+--------------+----------+--------------+---------+
| Game Clock | 01:24 | 01:19 | 00:41 |
+--------------+----------+--------------+---------+

Successful Outcomes are Unlikely

On the right hand side of the chart, we see that no goal outcomes are about twice as likely as goal against outcomes, which in turn are about twice as likely as a goal for (the success case). This is summarized as follows:

+------------------+----------+--------------+---------+
| | Goal For | Goal Against | No Goal |
+------------------+----------+--------------+---------+
| Mean Probability | 0.13 | 0.33 | 0.53 |
+------------------+----------+--------------+---------+

Odds of Success are 20% if Pulled Early

Mathematically this is done by multiplying P(y|X; t) with a function c(t), as defined by:

Re-normalization function

The result is as follows. Keep in mind that the x-axis corresponds to the time when the goalie is pulled. For example, if pulling the goalie at t=19 min (01:00 game clock) there’s a 30% chance of a goal against outcome.

This chart leads to several interesting observations:

  • The odds of a goal for are ~20% up until the 02:00 mark (peaking at 03:00). Then they approach zero gradually through 02:00–01:00 remaining, and more rapidly in the final minute.
  • Odds of a goal against drop off linearly up to the 02:00 mark, from a high of ~60% to ~40%. From 02:00 onwards they follow the same trend as goals for.
  • Odds of no goal start low and increase exponentially as the game clock ticks down.
  • If pulling the goalie with 30 seconds left, the odds are 5% goal for, 15% goal against and 80% no goal.

Outcomes are Uncertain for Very Early Pulls

Error propagation with partial derivatives
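As a reference, the standard first-order propagation formula for a function f of independent quantities x_i with uncertainties sigma_{x_i} is given below. Which quantities play the role of x_i in this analysis (presumably the fitted alpha and beta for each outcome) is an assumption.

```latex
\sigma_f^2 \approx \sum_i \left( \frac{\partial f}{\partial x_i} \right)^{2} \sigma_{x_i}^{2}
```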

This results in the following error band estimates:

As expected, uncertainty plays a large factor for early pull times, and odds for times earlier than 03:45 cannot be accurately distinguished. Note that the singular points are a result of error propagation with partial derivatives and should not be interpreted literally.

Look to Pull ASAP after the game clock hits 03:00

The maximum likelihood is 26% ± 4% at the 03:00 mark on the game clock. In other words, pulling the goalie with 3 mins left in the 3rd period has historically yielded a 1/4 chance of success.

Following the line over to the right, we see the odds of success drop to zero as the game clock winds down. Like the chart above, we have very little statistical confidence in our model for earlier goalie pulls, due to a lack of training data.

Conclusion

This work supports this view through use of visual aids and models of goalie pull results that vary as a function of time left in the game.

The dataset and statistical method used for this work are open source, and I hope they can influence future research on the subject.

Thanks for reading 🏒
- Alex
alexgalea.ca

Special thanks to Willem Klumpenhouwer @wklumpen for reviewing this work and offering very helpful advice.

Please direct technical questions, comments or concerns through GitHub’s issue tracker.

Python Data Engineer, MSc. Physics
