When to Pull the Goalie: Running the Numbers on NHL Goalie Pulls

When should you pull the goalie?

TL;DR

As discussed in the results section of this post, I found that the optimal time to pull an NHL goalie is with 3:00 left in the third period. Pulling at that point gives you a 1 in 4 chance of scoring.

Source Code

https://github.com/agalea91/nhl-goalie-pull-optimization

Previous works lack source data and visual aids

Source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3132563

Goalie Pull Dataset

pulls = []
for season in seasons:
    for game in season.games:
        goal_scan = False
        for row in game.game_sheet_rows:
            # Look for goalie pull
            if row.is_goalie_pull:
                goal_scan = True
                pulled_goalie = row.pulled_goalie
                pulled_time = row.time
            if goal_scan:
                # There has been a pull, scanning for a goal
                if row.is_goal:
                    if pulled_goalie not in row.players_on_ice:
                        # We have found an empty net goal
                        pulls.append({
                            "season_game": [season, game],
                            "pulled_time": pulled_time,
                            "goal_time": row.time  # clock time of the goal row
                        })
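
The loop above relies on parsed game-sheet objects from the project's scraper. As a minimal, self-contained sketch of the same scan, the `Row` class, the `find_pulls` helper and the sample events below are hypothetical stand-ins (not the repository's real data structures). They only show how a pull gets paired with a goal scored while the pulled goalie is off the ice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Row:
    """Hypothetical stand-in for one parsed game-sheet event."""
    time: str                              # game clock, e.g. "18:35"
    is_goalie_pull: bool = False
    is_goal: bool = False
    pulled_goalie: Optional[str] = None    # set only on pull events
    players_on_ice: List[str] = field(default_factory=list)


def find_pulls(rows):
    """Pair a goalie pull with later goals scored while that goalie is off the ice."""
    pulls = []
    goal_scan = False
    pulled_goalie = pulled_time = None
    for row in rows:
        if row.is_goalie_pull:
            goal_scan = True
            pulled_goalie = row.pulled_goalie
            pulled_time = row.time
        if goal_scan and row.is_goal and pulled_goalie not in row.players_on_ice:
            pulls.append({"pulled_time": pulled_time, "goal_time": row.time})
    return pulls


# Made-up events for illustration only (not real game data).
rows = [
    Row(time="10:02", is_goal=True, players_on_ice=["G1", "G2"]),
    Row(time="18:35", is_goalie_pull=True, pulled_goalie="G1"),
    Row(time="19:10", is_goal=True, players_on_ice=["G2"]),
]
print(find_pulls(rows))
```

The first goal is ignored because no pull has happened yet; only the goal at 19:10, scored with `G1` off the ice, is recorded.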

Pulling Earlier and More Often

Goalie Pulls Trending Up

Marginal Gains on Positive Outcomes

Emerging Trend to Pull Goalies Earlier

Goalie Pulls are Left Skewed

“Goals for” Lead “Goals Against”

Building a Bayesian Model

Gamma distribution

MCMC Samples

Early Pulls Yield More Goals

+--------------------+----------+--------------+---------+
|                    | Goal For | Goal Against | No Goal |
+--------------------+----------+--------------+---------+
| Time Elapsed (min) | 18.6     | 18.7         | 19.3    |
+--------------------+----------+--------------+---------+
| Game Clock (mm:ss) | 01:24    | 01:19        | 00:41   |
+--------------------+----------+--------------+---------+
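
The two rows of the table describe the same moments in two ways: time elapsed in the period and time left on the game clock. As a small sketch, assuming a 20-minute period and elapsed time measured in minutes (the helper name `game_clock` is chosen here, not taken from the source code), the conversion looks like this:

```python
PERIOD_MINUTES = 20  # assumption: regulation NHL period length


def game_clock(minutes_elapsed, period_minutes=PERIOD_MINUTES):
    """Convert minutes elapsed in the period to an MM:SS time-remaining string."""
    seconds_left = round((period_minutes - minutes_elapsed) * 60)
    minutes, seconds = divmod(seconds_left, 60)
    return f"{minutes:02d}:{seconds:02d}"


print(game_clock(18.6))  # 01:24
```

Feeding in the rounded values 18.7 and 19.3 gives 01:18 and 00:42, one second off the table, presumably because the table was computed from unrounded means.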

Successful Outcomes are Unlikely

+------------------+----------+--------------+---------+
|                  | Goal For | Goal Against | No Goal |
+------------------+----------+--------------+---------+
| Mean Probability | 0.13     | 0.33         | 0.53    |
+------------------+----------+--------------+---------+

Odds of Success are 20% if Pulled Early

Re-normalization function
  • The odds of a goal for are ~20% up until the 02:00 mark (peaking at 03:00). Then they approach zero gradually through 02:00–01:00 remaining, and more rapidly in the final minute.
  • Odds of a goal against drop off linearly up to the 02:00 mark, falling from a high of ~60% to ~40%. From 02:00 onwards they follow the same trend as goals for.
  • Odds of no goal start low and increase exponentially as the game clock ticks down.
  • If pulling the goalie with 30 seconds left, the odds are 5% goal for, 15% goal against and 80% no goal.
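
For the three outcomes to be read as odds at a given pull time, their values have to sum to 1, as the 5% / 15% / 80% split does. The sketch below shows only that final scaling step on a single set of numbers. It is a simplified illustration, not the post's re-normalization function, and `renormalize` is a name chosen here. The inputs are the rounded mean probabilities from the earlier table, which sum to 0.99.

```python
def renormalize(weights):
    """Scale non-negative outcome weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {outcome: value / total for outcome, value in weights.items()}


# Rounded mean probabilities from the table (they sum to 0.99).
means = {"goal_for": 0.13, "goal_against": 0.33, "no_goal": 0.53}
odds = renormalize(means)
for outcome, p in odds.items():
    print(f"{outcome}: {p:.3f}")
```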

Outcomes are Uncertain for Very Early Pulls

Error propagation with partial derivatives
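
As a generic illustration of first-order error propagation (not necessarily the exact expression used in this post), take a normalized probability f = a / (a + b + c) with independent uncertainties on a, b and c. The variance of f is approximated by summing (∂f/∂x · σ_x)² over the three inputs. The function name and the sample numbers below are made up for the example.

```python
import math


def normalized_with_error(a, b, c, sigma_a, sigma_b, sigma_c):
    """Return f = a / (a + b + c) and its first-order propagated uncertainty.

    Assumes the three inputs are independent.
    """
    total = a + b + c
    f = a / total
    # Partial derivatives of f with respect to each input.
    df_da = (b + c) / total**2
    df_db = -a / total**2
    df_dc = -a / total**2
    sigma_f = math.sqrt(
        (df_da * sigma_a) ** 2 + (df_db * sigma_b) ** 2 + (df_dc * sigma_c) ** 2
    )
    return f, sigma_f


# Illustrative inputs only, not values from the analysis.
f, sigma_f = normalized_with_error(2.0, 5.0, 3.0, 0.4, 0.6, 0.5)
print(f"f = {f:.3f} +/- {sigma_f:.3f}")
```

When the inputs are small and their uncertainties are comparable to the values themselves, as happens where few pulls are observed, the propagated error becomes large relative to f.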

Look to Pull ASAP after the game clock hits 03:00

Conclusion

Alex Galea

Python Data Engineer, MSc. Physics
