Battle of Esoterica – Hockey Rate Stats Poll
Let me start out by saying that I know this is a freaking novel. But it’s the middle of the offseason, and what else have we got to do? And I think the issue is important. Even if you just skim this fanpost, I’d like to know what you think, so please vote in the poll and give comments.
I believe hockey is entering a statistical renaissance. We have more tools to describe a player’s performance now than we ever have before. But even the most basic of rate stats in hockey are not yet uniform. We have statistics that are measured by the game; by the minute; by sixty minutes; by twenty minutes; by the shift. It’s a mess.
We have an opportunity now to choose how we will describe the game into the future. And I think we should choose the most intuitive statistics. The ones that just make the most sense when you first run into them. The ones that are the easiest to explain to other people. In this fanpost, I propose that hockey adopt "per shift" as the standard stat.
To see what I’m getting at, it’s useful first to take a digression into the most stat-focused of the major sports…
Baseball Stats
Batting Average should be the best stat in sports. It is one of the oldest stats that isn’t just a simple counting number, like Hits or Home Runs are. It’s also one of the most intuitive. If a guy hits .250, then you can expect him to get a hit in one out of every four times up*. Even little kids can get a handle on batting average.
Batting Average should be the best stat in sports. But it gets a major asterisk. (See the asterisk up there? Ford Frick would be proud.) Baseball got it wrong – they made the denominator for Batting Average "at bats."
I believe most of us have had the experience of explaining to someone why "at bats" isn’t what you’d think it is. We say "No, what you’re thinking of is ‘plate appearances,’ not ‘at bats,’" and they look at us like we just grew a second head. And they’re right to be confused. Because Batting Average is broken. The mistake even propagated to Slugging Percentage, before they finally got it right with On Base Percentage. The basic event in baseball is the Plate Appearance, and that’s what they should have used in the first place.
By limiting the definition of an "at bat" to just hits and outs, they ruined a lot of the intuitive value of Batting Average. If a guy hits .250, then you might expect him to get a hit in one out of every four times up. And you’d be wrong. It’s more like one every five times up, after you account for walks and such.
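To put rough numbers on that gap, here is a quick sketch in Python. The season line below (600 plate appearances, 500 at bats, 125 hits) is made up purely for illustration; it is not any real player's stat line.

```python
# Hypothetical season line, chosen only to illustrate the gap between
# "per at bat" and "per plate appearance" (not a real player's numbers).
plate_appearances = 600
at_bats = 500   # plate appearances minus walks, sacrifices, and such
hits = 125

batting_average = hits / at_bats          # the official stat
hits_per_pa = hits / plate_appearances    # what "times up" really means

print(f"Batting average:            {batting_average:.3f}")
print(f"Hits per plate appearance:  {hits_per_pa:.3f}")
print(f"One hit every {1 / hits_per_pa:.1f} times up")
```

With these assumed numbers, a .250 hitter gets a hit closer to once every five times up than once every four.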
But the problem is much worse with ERA, which is one of the worst stats in sports. I’m not talking about its predictive powers here. I’m talking about the fact that it is not intuitive at all. "Earned" runs per nine innings… Who ever pitches nine innings these days? Fewer than 3% of the games started in 2008 were complete games. And the stat is at its worst for relievers. Why would I want to know how many runs Joe Nathan would give up in nine innings? He’s never pitched nine consecutive innings in his career! The "per nine" of ERA may have made sense in the early days of baseball, but it’s an anachronism now.
[The "earned" thing is a whole ‘nother problem because it introduces the subjective judgment of one person – the official scorekeeper – to replace what we all see with our own eyes]
Pitching stats could easily have been based on a "per batter" (i.e., plate appearance) or "per inning" basis. In fact, many pitcher stats are shifting in that direction these days. People are calculating the batting average, slugging percentage, and OBP against a pitcher. Those are (roughly) per batter/plate appearance. And WHIP (Walks plus Hits per Inning Pitched) is another stat that has gained prominence over the last few years. These are much more intuitive than ERA. When a pitcher starts an inning, if I look at his WHIP it gives me a sense of what to expect for that inning. I look at his batting average against and (if batting average had been correctly designed) I get a sense for what the next batter is likely to do.
The lesson from baseball is that the basic "event" for a rate stat matters. By "event" I mean something that you can use for the "per" in the rate stat. The denominator. The best stats are the ones that tell you what to expect in a well defined, bite-sized bit of playing time.
Back To Hockey – Per X Minute Stats
Until about ten years ago, hockey stats never gave us the opportunity to make useful rate stats. The only "events" that were logged for players were games. And per game stats like goals per game (or games per goal) make for a pretty gross measure of someone’s talent.
But lately, we have much more opportunity. That’s because folks have started to publish Time On Ice – the number of minutes played for each player in different situations. And even more recently, folks are starting to publish shift data. We suddenly have two new "events" that we can use for the denominator of our rate stats: (1) the minute of ice time, and (2) the shift.
We’re starting to see rate stats get published. But many of them are measured in "per 60 minutes" rates. We’ve seen "per 60" stats used quite heavily on this site, for example in the Blueliners article two days ago. I fear that "per 60" is becoming a standard, and I’m not a fan.
In my opinion, "per 60" is all wrong. It has exactly the same problem as ERA, but worse. At least it’s possible for pitchers to throw 9 straight innings, even if it hardly ever happens anymore. But we never watch a player play 60 minutes all at once (goalies excepted).
For example, Alex Semin scored 1.76 goals per 60 last year. It’s impossible to know on an intuitive level what Semin’s 1.76 goals per 60 minutes really means. If you tell a typical ardent hockey fan that Semin scored 1.76 goals per 60 minutes of ice time, they won’t be able to tell you if that’s good or not. And that’s a shame because it was the best in the league.
I guess 60 minutes of ice time is about 4 games (or more) for a forward like Semin. So we’re saying Semin scores 1.76 goals every 4 or 5 games. But 60 minutes is only about 3 games (or less) for a defenseman. So already it’s hard intuitively to compare what 1.76 goals per 60 means between players at different positions.
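Here is that mental conversion written out as a small Python sketch. The ice times (15 minutes a night for a forward, 20 for a defenseman) are round-number assumptions for illustration, not anyone's actual averages, and the function names are just ones I made up.

```python
def games_per_60(toi_per_game_minutes: float) -> float:
    """How many games it takes a player to log 60 minutes of ice time."""
    return 60.0 / toi_per_game_minutes


def rate_per_game(rate_per_60: float, toi_per_game_minutes: float) -> float:
    """Convert a per-60 rate into the equivalent per-game rate."""
    return rate_per_60 * toi_per_game_minutes / 60.0


# Assumed ice times per game, for illustration only.
forward_toi = 15.0
defenseman_toi = 20.0

for label, toi in [("forward", forward_toi), ("defenseman", defenseman_toi)]:
    print(
        f"{label}: 60 minutes is {games_per_60(toi):.1f} games; "
        f"1.76 per 60 is {rate_per_game(1.76, toi):.2f} per game"
    )
```

The same 1.76 per 60 works out to a different per-game number depending on how much the player skates, which is exactly the translation step a fan has to do in their head.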
This site ranks things per 20 minutes. That’s a little better, because 20 minutes of ice time is roughly what a good-not-great defenseman gets in one game. I feel like I have an intuitive sense for 20 minutes. But not really. Riley Cote doesn’t play 20 minutes in three games. And Mike Green plays 20 minutes – and then half that again – every game. So if I look at something like "ON-Goals per 20" and translate in my mind "that’s about one game," I’ll undervalue some players and overvalue others.
If we must use "per X minutes" stats, I’d recommend just shifting to "per minute." One minute of ice time is a nice, digestible bit of time. And we can all weigh it a little differently to account for the fact that defensemen play more than forwards and stars play more than grinders. Best of all, one minute is roughly equivalent to a real hockey event – one shift – so we all have a good intuitive sense of what it means. But that brings me to my preference…
My Preference – Per Shift Stats
Hockey has a basic event, equivalent to the "plate appearance" in baseball. It’s the "shift."
As hockey fans and players, we all intuitively know what a shift is.* And the advantage "per shift" stats have is that, as soon as that player comes over the boards, the stats give us a sense of what might happen during that shift. Watching the second hand cross sixty to figure out when the next "minute" will start is a little abstract. But if you watch a line change, and you know what the "per shift" stats are for the players who just hit the ice, you have some sense for what might happen while they’re on the ice. It’s an intuitive "event" to use.
Here are a couple of examples, pulled from JP’s posts from yesterday and the day before on forwards and defensemen.
Forwards:
| Line | Shifts | Points | Points % |
| Ovechkin-Backstrom-Kozlov | 2099 | 44 | 2.096% |
| Ovechkin-Backstrom-Semin | 1302 | 47 | 3.610% |
| Laich-Fedorov-Semin | 442 | 10 | 2.262% |
| Fleischmann-Fedorov-Semin | 423 | 6 | 1.418% |
| Ovechkin-Backstrom-Nylander | 333 | 3 | 0.901% |
| Fleischmann-Fedorov-Fehr | 298 | 11 | 3.691% |
| Fleischmann-Nylander-Fehr | 277 | 6 | 2.166% |
| Laich-Nylander-Semin | 258 | 2 | 0.775% |
| Fleischmann-Nylander-Kozlov | 235 | 1 | 0.426% |
| Ovechkin-Fedorov-Semin | 233 | 11 | 4.721% |
Defensemen:
| Pairing | Shifts | GA | GA % |
| Green-Poti | 168 | 0 | 0.000% |
| Erskine-Jurcina | 1315 | 6 | 0.456% |
| Erskine-Poti | 498 | 3 | 0.602% |
| Jurcina-Schultz | 1356 | 8 | 0.590% |
| Green-Schultz | 1521 | 12 | 0.789% |
| Erskine-Schultz | 358 | 3 | 0.838% |
| Erskine-Green | 868 | 7 | 0.806% |
| Green-Morrisonn | 2387 | 21 | 0.880% |
| Jurcina-Morrisonn | 196 | 2 | 1.020% |
| Alzner-Jurcina | 1569 | 17 | 1.083% |
| Morrisonn-Poti | 892 | 9 | 1.009% |
| Poti-Sloan | 746 | 9 | 1.206% |
| Poti-Schultz | 598 | 8 | 1.338% |
| Jurcina-Poti | 263 | 4 | 1.521% |
| Schultz-Sloan | 199 | 3 | 1.508% |
| Morrisonn-Schultz | 256 | 5 | 1.953% |
The percentage is the percent of shifts in which something happened. In the first table, it’s the percent of shifts in which the line scored a point; in the second, it’s the percent of shifts in which a goal was allowed. You can think of it as the odds of that thing happening. When Green and Mo took the ice together at even strength last year, there was a 0.880% chance that the Caps would allow a goal. I think this is easy to understand, but I’m curious to know what you all think.
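For anyone who wants to check the tables or extend them, the calculation is just events divided by shifts. Here is a minimal Python sketch using three rows copied from the tables above; the function name is my own.

```python
def per_shift_pct(events: int, shifts: int) -> float:
    """Events per shift, expressed as a percentage."""
    return 100.0 * events / shifts


# (name, shifts, events) copied from the tables above.
rows = [
    ("Ovechkin-Backstrom-Kozlov", 2099, 44),  # points
    ("Ovechkin-Backstrom-Semin", 1302, 47),   # points
    ("Green-Morrisonn", 2387, 21),            # goals against
]

for name, shifts, events in rows:
    print(f"{name}: {per_shift_pct(events, shifts):.3f}%")
```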
Here’s another advantage of shift data: we can compare it across all players (goalies excepted again). We don’t really need to convert it in our heads. A shift is a shift. Defensemen and stars have more of them per game, but shifts are about the same length for every player (or they should be, but that’s a discussion for another day). So "per shift" stats equalize for the quality of a player while he’s on the ice, as opposed to the quantity of his playing time. Riley Cote’s per shift stats are pretty much on the same scale as Mike Green’s.
No doubt you’ve seen that second pesky asterisk up there. We all know what a real shift is, but to make the stats work, folks have had to redefine it to create the "statistical shift." As far as I can tell, "statistical shifts" start and end when (1) a player enters or leaves the ice; or (2) anyone enters or leaves the penalty box; or (3) the whistle blows. So if someone’s out there for six seconds and there’s an icing, and they stay out there for the faceoff, that’s now another statistical shift. If Ovechkin is out there when a power play ends, one "power play shift" ends and an "even strength shift" begins, even though he just stayed on the ice. This is why JP discussed "occurrences" instead of "shifts" in yesterday’s post about forward combinations. The shift data that is currently available is useful, but it doesn’t reflect real shifts – not the way we all normally think of them.
This may not be the traditional definition of a shift, but it is necessary to allow us to compare different types of gameplay (shorthanded shifts versus even strength, for example) and to let the goal stats work correctly. I think the advantages of using shifts as the basic event outweigh any problems caused by the difference between "statistical shifts" and real shifts. So I think we should just use the data that is being compiled, call it "shifts," and just remember that the "statistical shift" is a slightly different beast than the real shift.
I realize that after preaching about the "intuitive" value of other stats, here I am pushing an artificial creation – the "statistical shift." Maybe it needs a better name, but I think this really is the best base to use for hockey rate stats. Every time a forward line or defensive pairing comes over the boards, that's one battle. Much like a hitter for one plate appearance, those players have that limited opportunity to do something before they go back and sit on the bench again. I think a player's per-shift stats are the right measure of his effectiveness.
Comments
I didn’t vote, but I have to admit that I’m in favor of any of the “per [x] minute” statistics because comparing shifts just isn’t apples to apples. Shaone Morrisonn’s average shift last season was 43 seconds; Mike Green’s was 57, about 33% more. Even if you assume half of that is due to incomplete shifts, i.e. non-“statistical shifts”, Green would still be skating shifts that were 15%+ longer than Morrisonn’s, which is a significant enough difference to make comparing per shift stats difficult.
Perhaps an option would be to look at statistics in per minute (or per sixty minutes) terms but throw out the meaningless shifts like your example of the six seconds on the ice after coming out of the box (assuming, of course, that the guy goes off the ice after the whistle).
I think the variability of shift lengths across the league is very hard for us as Caps fans to judge because the Caps are such an outlier on shift length. In a 30-team league, four of the top 20 in average shift length last year were Capitals (the young guns). (I’m using the NHL.com data, which I think uses “real” shifts, not “statistical shifts.”)
The next player on the Caps in average shift length after the young guns is Fehr at 49 seconds, and everyone else on the team was in the 40s. (I’m omitting guys who won’t play a significant role next year — Nylander, Kozlov, Aucoin, Lepisto, Brashear, and Fedorov). Shift length should be in the 40s for everyone in the league, and the vast majority of players are in the 40s or low 50s. I checked the Flyers to look at Knuble, and not one Philly player was in the 50s. They were all 49 or below.
If Bob Woods does only one thing this coming year, I hope it is to get those shifts shortened.
by Gould Old Days on Jul 23, 2009 9:10 AM EDT
Here, this should give a rough sense for shift length distribution across the league:
- Ilya Kovalchuk, 66 seconds (Ovechkin was 2 at 64 seconds)
- Nick Backstrom 57 seconds
- Francois Beauchemin 56 seconds
- Brian Campbell 54 seconds
- Alex Goligoski 54 seconds
- Philippe Boucher 53 seconds
- Marc Savard 51 seconds
- Dan Hamhuis 50 seconds
- Patrick Kane 48 seconds
- John Erskine 47 seconds
- Milan Jurcina 45 seconds (two Caps in a row — what are the odds?)
- Marc Methot 44 seconds
- Ryan Hollweg 43 seconds
- Jussi Jokinen 41 seconds
- Tim Conboy 39 seconds
- Riley Cote 34 seconds
Looks to me like the distribution has a high peak at the left, then a loooong plateau followed by a trailing off. For example, #1 and #100 are separated by 16 seconds; #100 and #700 are separated by 9 seconds. Other than the young guns, the players we most care about around the league are between 55 and 45 seconds per shift.
It seems from looking at the list like the guys who play more than 55 seconds per shift are mostly stars who play a lot on the power play. This suggests that power play time is the main contributor to extending average shift lengths — along with lack of discipline. But Philly had a great power play and kept the shifts short.
by Gould Old Days on Jul 23, 2009 9:27 AM EDT
Wow. SB fail. It completely wiped out the numbers before the names. Here’s that list again.
#1. Ilya Kovalchuk, 66 seconds (Ovechkin was 2 at 64 seconds)
#10. Nick Backstrom 57 seconds
#20. Francois Beauchemin 56 seconds
#30. Brian Campbell 54 seconds
#40. Alex Goligoski 54 seconds
#50. Philippe Boucher 53 seconds
#75. Marc Savard 51 seconds
#100. Dan Hamhuis 50 seconds
#200. Patrick Kane 48 seconds
#300. John Erskine 47 seconds
#400. Milan Jurcina 45 seconds (two Caps in a row — what are the odds?)
#500. Marc Methot 44 seconds
#600. Ryan Hollweg 43 seconds
#700. Jussi Jokinen 41 seconds
#800. Tim Conboy 39 seconds
#875. Riley Cote 34 seconds (and last)
by Gould Old Days on Jul 23, 2009 9:31 AM EDT
I like what you did with the per shift thing, but why not do it per time on ice per game? Seems like that would get you the most narrowed down numbers…Especially since like you said D men play longer shifts than forwards and Ovie plays longer shifts than Steckel.
"You will remember the night you were struck by the sight of [18] thousand fists in the air" -Disturbed
This comment has been eating at me for the last two days, and I finally figured out why:
To figure out average time on ice, you divide time on ice by games played:
total minutes / games played
To calculate Goals per sixty, you first divide goals by minutes and then you multiply that by sixty:
(Goals / total minutes) * 60
So if you replace the 60 with time on ice per game, you get this:
(Goals / total minutes) * (total minutes / games played)
The total minutes cancel each other out, and you just end up with Goals per game. Scott in one of the comments down below suggests that simple “per game” stats are the best way to go, so there’s definitely some appeal to this…
by Gould Old Days on Jul 25, 2009 12:27 PM EDT
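The cancellation in the comment above is easy to check numerically. A minimal Python sketch, using a made-up season (30 goals in 1,200 minutes over 80 games) and function names of my own choosing:

```python
import math


def avg_toi_per_game(total_minutes: float, games_played: int) -> float:
    """Average time on ice per game: total minutes / games played."""
    return total_minutes / games_played


def goals_per_n_minutes(goals: int, total_minutes: float, n: float) -> float:
    """Goals per n minutes: (goals / total minutes) * n."""
    return goals / total_minutes * n


# Hypothetical season totals, for illustration only.
goals, total_minutes, games_played = 30, 1200.0, 80

# Replace the "60" in goals-per-60 with the player's own TOI per game...
toi = avg_toi_per_game(total_minutes, games_played)
per_own_toi = goals_per_n_minutes(goals, total_minutes, toi)

# ...and the total minutes cancel, leaving plain goals per game.
per_game = goals / games_played

print(per_own_toi, per_game)
print(math.isclose(per_own_toi, per_game))
```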
Obligatory Mark Twain reference
Batting Average should be the best stat in sports. But it gets a major asterisk. (See the asterisk up there? Ford Frick would be proud.) Baseball got it wrong – they made the denominator for Batting Average “at bats.”
I believe most of us have had the experience of explaining to someone why “at bats” isn’t what you’d think it is. We say “No, what you’re thinking of is ‘plate appearances,’ not ‘at bats,’” and they look at us like we just grew a second head. And they’re right to be confused. Because Batting Average is broken. The mistake even propagated to Slugging Percentage, before they finally got it right with On Base Percentage. The basic event in baseball is the Plate Appearance, and that’s what they should have used in the first place.
By limiting the definition of an “at bat” to just hits and outs, they ruined a lot of the intuitive value of Batting Average. If a guy hits .250, then you might expect him to get a hit in one out of every four times up. And you’d be wrong. It’s more like one every five times up, after you account for walks and such.
But the problem is much worse with ERA, which is one of the worst stats in sports. I’m not talking about its predictive powers here. I’m talking about the fact that it is not intuitive at all. “Earned” runs per nine innings… Who ever pitches nine innings these days? Less than 3% of the games started in 2008 were complete games. And the state is at its worst for relievers. Why would I want to know how many runs Joe Nathan would give up in nine innings? He’s never pitched nine consecutive innings in his career! The “per nine” of ERA may have made sense in the early days of baseball, but it’s an anachronism now.
[The “earned” thing is a whole ‘nother problem because it introduces the subjective judgment of one person – the official scorekeeper – to replace what we all see with our own eyes]
1. Lies
2. Damn lies
3. Statistics
/obligatory Mark Twain reference
Carry on
natty, this is the funniest thing I have EVER seen on the internet. Is this a first on the Rink or has this graphic surfaced before (Rickrolled seems like a pet name of some sort)? I work in a budget office, so the temptation to pie chart the world (PCTW?) is ever-present here. Oh dear lord, you’ve just made my whole day…
Back to the subject at hand, MATH NERDS UNITE! Great stuff Gouldie! I’ve got to read this one more time after coffee to try and digest this stuff. Always been interested in statistics but never jumped in with both feet past Stats 101. Peerless posts PhD-level stuff like this on his blog sometimes, too. : ]
by war_capitals on Jul 24, 2009 10:27 AM EDT
awesome awesome fanpost, GOD (i voted 20 minutes because, though flawed, i do like that it somewhat embodies a single game played).
warcaps, i came across the rick astley chart at this link. more great ones there. and if you’re interested, the origin of rickrolled.
by Natty Bumppo on Jul 24, 2009 10:47 AM EDT
thanks for explaining the inside joke to the new guy, i’m still chuckling… : )
by war_capitals on Jul 24, 2009 11:36 AM EDT
There was a thread many months ago that quickly devolved into people posting their favorite charts and graphs from this site.
by Scott in Shaw on Jul 24, 2009 4:01 PM EDT
That’s what happens when ideology takes precedence over the desire to objectively analyze a situation.
your comment brings to mind another gould (stephen jay).
by Natty Bumppo on Jul 23, 2009 11:24 AM EDT
Per minute and per shift measures are the differences between “efficiency” (the former) and “effectiveness” (the latter). The use of “per shift” measures is intriguing, although there might be some odd effects of “matching” (when a player goes on, then comes off seconds later due to matchups)
If you've read this far...seek help.
I think this is a great way to think about it. And perhaps the poll is a bit unfair — these measures do different things and perhaps they all have their uses.
by Gould Old Days on Jul 24, 2009 11:58 AM EDT
I like the idea of “per shift” stats being an intuitive sense of what might happen during a given shift. However, the offensive version of this stat when looking at lines should definitely be goals for per shift, not points per shift. Here’s why:
Let’s look at Ovie, Backstrom, and Kozlov as a unit. They scored a point on 2.096% of their shifts on average, according to the table. Suppose the line scores a goal and all three players get a point. Using points per shift, this results in 3 separate instances, making it appear that a single event (scoring) occurred 3 times. In reality, many of those points were scored together on the same goal, hence during the same shift. So in reality, OBK did not have a 2.096% chance of scoring a point on any given shift, but a much lower number. 2.096% is just an average, and it doesn’t help us understand the likelihood of an event during a given shift. It’s really no better than the “per X minutes” you don’t like.
Now if we look at GF per shift, that will tell us the likelihood that a given line will score during a given shift without double or triple counting points which occur on the same shift.
by LSF76 on Jul 24, 2009 11:32 AM EDT
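The double counting described in the comment above can be shown with a toy example. Everything in this Python sketch is hypothetical: a line that skated 1,000 shifts and scored 10 goals, with one to three linemates picking up a point on each goal.

```python
# Hypothetical line: 1,000 shifts, 10 goals. Each list entry is the number
# of linemates who were credited with a point on that goal.
shifts = 1000
points_on_each_goal = [3, 3, 2, 3, 1, 2, 3, 3, 2, 3]

goals = len(points_on_each_goal)
total_points = sum(points_on_each_goal)

points_per_shift_pct = 100.0 * total_points / shifts
goals_per_shift_pct = 100.0 * goals / shifts

print(f"Points per shift: {points_per_shift_pct:.1f}%")
print(f"Goals per shift:  {goals_per_shift_pct:.1f}%")
```

In this made-up case the points figure is two and a half times the goals figure, even though the line only scored on about 1% of its shifts. Goals per shift still assumes a line rarely scores twice on one shift, so treat it as an approximation of the odds rather than an exact probability.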
This is an excellent point, and it’s true no matter what the base rate is — you can’t use points to evaluate linemates because of the redundancy issue.
by Scott in Shaw on Jul 24, 2009 4:02 PM EDT
None of the Above
I vote for “per game.” That’s the most intuitive stat for a casual fan. Is Ovie going to score tonight? What are the chances a goal is scored while Mike Green is on the ice in this game?
I agree, I would go with per/game for most stats but I voted for per/60. Shifts are too hard to standardize. How do you compare AO’s per/shift numbers to a guy that skates 40 second shifts? It’s just another layer to adjust for. I don’t think it’s that hard to figure out per/60. Most knowledgeable fans know that the elite forwards skate about 22-24 minutes. Average forwards skate about 15-20. D start at about 18-20 per night and end up around 27-28. If you know what kind of player you are dealing with then you can figure out how the per/60 relates to per/game.
An aside, the measurements for PP are completely ridiculous and someone with a brain at the NHL needs to change it. PP numbers should in some way reflect how long you were on the PP. If you go SH and then the other team takes a PIM 5 seconds in you get a 5 second PP at the end and in all likelihood this gets reflected as a failed PP. Scoring 5 seconds into a PP gets the same credit as scoring 1:59 into the PP. I think PP should be either measured in terms of “goals per 2 minutes of PP time” or “how many minutes it takes to score a PPG.” That would give fans a much better sense of how deadly each PP really is.
I’ll return the favor and say that I agree with your proposal of PP stats per 2 min. of PP time.
by Scott in Shaw on Jul 27, 2009 12:09 PM EDT



































