On the randomness of plus/minus
I am not a fan of using a player's plus/minus as a barometer of their play on the ice. However, many coaches, players and fans disagree -- and that is OK, but the metric is far from perfect.
I decided to set up a simulation to show just how inconsistent the metric can be. I created a player who has absolutely no impact on the game. For both his team and his opponent, there is an 8 percent chance of scoring on any even-strength shot. I am assuming that this player is on the ice for 11 even-strength minutes per game, during which he sees six shots for and six against*. I ran the simulation for 1,000 82-game seasons, and as you can see, the metric is full of statistical noise.
If the plus-minus statistic were truly fair there would be nothing but zeroes on the chart because like I said, this player has absolutely no statistical impact on the game. Instead, this player is just as likely to be plus-10 or better in a season as he is minus-10 or worse.
Sure, there is noise in any statistic, but this chart shows that the amount of noise in plus-minus is unacceptable. Imagine the narratives that would be written about this same player: how he was too slow, not scoring enough, etc. Some may even feel that the player should be a healthy scratch, when in reality it is simply randomness creating the numbers.
* All bits and bytes appearing in this work are fictitious. Any resemblance to real persons, living or dead, that may or may not rhyme with Pike Panoobell is purely intentional.
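For readers who want to try this at home, the setup described in the post can be sketched roughly as follows. This is not the author's original code, just a minimal reconstruction from the stated parameters (82 games, six shots for and six against per game, 8 percent shooting on both sides, 1,000 seasons). The function names and the seed are made up for illustration.

```python
import random
import statistics

def simulate_season(games=82, shots_for=6, shots_against=6,
                    pct_for=0.08, pct_against=0.08, rng=random):
    """Return the even-strength plus/minus for one simulated season."""
    plus_minus = 0
    for _ in range(games):
        # Each shot is an independent coin flip that scores with the given chance.
        goals_for = sum(rng.random() < pct_for for _ in range(shots_for))
        goals_against = sum(rng.random() < pct_against for _ in range(shots_against))
        plus_minus += goals_for - goals_against
    return plus_minus

def simulate_seasons(n_seasons=1000, seed=2012, **kwargs):
    """Simulate many seasons with a fixed seed (hypothetical value) for repeatability."""
    rng = random.Random(seed)
    return [simulate_season(rng=rng, **kwargs) for _ in range(n_seasons)]

if __name__ == "__main__":
    seasons = simulate_seasons()
    print("mean plus/minus:", statistics.mean(seasons))
    print("standard deviation:", statistics.pstdev(seasons))
    print("share at +10 or better:", sum(s >= 10 for s in seasons) / len(seasons))
    print("share at -10 or worse:", sum(s <= -10 for s in seasons) / len(seasons))
```

Changing the seed or the number of seasons gives a feel for how much the season-to-season totals bounce around even though the player's inputs never change.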
Comments
Great work
Maybe
Jibblescribbits: C'mon over and waste some time
by Jibblescribbits on Feb 11, 2012 2:19 PM EST
On the scratching of Knuble, in many ways, it’s age discrimination at work. He’s pushing 40, and when a relatively old player is in a slump (or having a statistically poor season due to linemates in a slump), the cause is attributed to aging and he’s presumed to be “washed up,” as they say.
Rocking the Red for teams on the banks of the Potomac and at the Gateway Arch and Singing the Blues about Hockey.
When it is for over half a season and he has played with many linemates, it isn’t just a slump or his linemates being in a slump. While plus-minus may not be an exact indicator, when you are worse than everyone else on your team in that category you are either the unluckiest guy on the team or one of the weaker players on the team. This isn’t about age: Schultz had a few games where he wasn’t playing well and then rode the press box for a long time too, and that isn’t ageism. Or when TV was having a rough time they still played him and he worked his way out of it. Knuble has had plenty of time to work his way out of it and he hasn’t, plain and simple. He has 1 goal and 4 assists in the 44 games since November started and is a -16.
I like Mike as much as any of the Caps, but if any other player was having a season like that and still getting in the lineup, everyone would be talking about how he is the new coach’s pet and would be demanding he be sent to the press box.
His stats have certainly been bad this year for a variety of reasons. And, yes, the fact that his stats are bad would prompt Coach Hunter (or any other coach) to bench him.
But, given his age, there is a greater chance that he would be considered “washed up.”
Rocking the Red for teams on the banks of the Potomac and at the Gateway Arch and Singing the Blues about Hockey.
On the scratching of Knuble, in many ways, it’s age discrimination at work.
No, it’s not.
Japers' Rink: Hockey blogging from the most powerful city in the world
by J.P. on Feb 11, 2012 7:38 PM EST, 6 recs
Gordie Howe had 800 goals — the fact that nobody will give him a contract today must be discrimination and collusion!
Hey, atta dinnin stick a who!
by Gould Old Days on Feb 12, 2012 4:51 AM EST, 4 recs
Does anybody really use +/- as an evaluation tool anymore? I can’t remember the last time I heard any coach, player, analyst, etc. make an evaluation of a player using +/- as anything more than a throw-in piece of evidence.
Release the Mackan!
Would you not say that comparing +/- of players on the same team is of at least some value, even if it just shows who’s having a luckier year than others?
It isn’t even anger-inducing. It does not seem to be worth that kind of emotional investment. It might not even be disappointing any more. It is expected.
-Peerless 5.6.2011
+/- can provide some value at the margins, but it requires a lot of context (for example, quality of competition and linemates, etc.). I think the stat has more value looking at it over more than just one year. For example, when you have a guy like Eric Staal who is consistently a minus player it’s probably not the case that he just keeps getting unlucky year after year. Or a guy like Lidstrom isn’t getting lucky to always be a plus player every year.
But even in those instances +/- doesn’t tell us more than what other stats can tell us, which is why no one has really used it as a reliable stat for several years now.
Release the Mackan!
by Killer_Carlson on Feb 11, 2012 4:33 PM EST, 1 rec
Joe Sakic was career +30, and had 8 seasons of “-”
On the plus side, it means nothing.
Maybe
Jibblescribbits: C'mon over and waste some time
by Jibblescribbits on Feb 11, 2012 6:19 PM EST
Well when I talk about looking at +/- over time I don’t mean the aggregate totals. I’m talking more about Staal being a minus in 6 of 8 seasons, and Lidstrom being a plus in 19 of 20 seasons. It’s likely not luck if someone is a minus every season, or a plus every season.
Release the Mackan!
by Killer_Carlson on Feb 11, 2012 7:11 PM EST
But for Staal, three of those minuses were -6 or better (i.e. closer to zero). One’s a minus-two. To me, there’s absolutely no reason to view a minus-two any differently than a plus-two on its own. Once you get into double-digits or so, I start wondering what else is going on. So the aggregate numbers or numbers of seasons don’t do much for me (though Lidstrom having 11 seasons above +20 is not nothing).
Japers' Rink: Hockey blogging from the most powerful city in the world
by J.P. on Feb 11, 2012 7:27 PM EST via iPhone app
Sure, but if +/- is as random as the graph (which is based on a player with no impact on the game) suggests, wouldn’t you expect Staal – a player who clearly contributes a whole lot of offense for his team – to be on the + side of the ledger more often, even if he is only slightly below zero most seasons?
Also, how did defending Eric Staal feel?
Release the Mackan!
by Killer_Carlson on Feb 11, 2012 10:51 PM EST
But anyway, I think we are pretty much in agreement regarding the usefulness of +/-, even if we disagree on the use of Staal as an example.
Release the Mackan!
by Killer_Carlson on Feb 11, 2012 10:55 PM EST
Also, how did defending Eric Staal feel?
Icky. But I held my nose and told myself I was really just taking issue with a stat, independent of the player.
Japers' Rink: Hockey blogging from the most powerful city in the world
I actually don’t hate +/- as much as most people do, probably because I’m a Jeff Schultz fan.
Seriously, though, it’s just another stat and I certainly take it with much larger grains of salt precisely because of the randomness and how much I believe it reflects (or, more accurately, doesn’t reflect) actual ability. But taken with a bunch of other stats, it does add some context (as most do), especially at the extremes.
Plus-minus doesn’t tell you anything about the player’s offensive or defensive ability. Is a good +/- driven by exceptional offensive and mediocre defensive luck, vice versa, 50/50 or something else? Is a bad +/- driven by crappy offensive luck and average goaltending or crappy defense and average offense, etc.? It doesn’t tell you any of that. It just tells you who’s been lucky or unlucky… not entirely unlike PDO (which a lot of folks refer to as the single most valuable metric in hockey), no?
Japers' Rink: Hockey blogging from the most powerful city in the world
Does anybody really use +/- as an evaluation tool anymore?
I’m assuming Neil’s post today was a result of this quote:
Knuble … described how coach Dale Hunter told him he was going to be a healthy scratch.
"He told me I wasn’t going to play and just kind of said, ‘You haven’t scored in a while and bad plus-minus.’"
"You can want to get to April but when you get to April you may not like the answers you get, so you might as well enjoy the ride while it's going on." - Brian McNally on JRR, 8/29/2011
I can neither confirm nor deny.
"Shots aren't the important thing. Scoring chances are way more important than shots." - Bruce Boudreau
See my work in the Washington Post and on ESPN Insider.
Follow me on Twitter @ngreenberg
by NGreenberg on Feb 11, 2012 5:10 PM EST, 1 rec
Good Lord, Hunter stinks.
Follow @chasew12
Writing for Driving Play - The Blog with Three First Lines and The Copper & Blue.
In fairness to Hunter, this is second-hand from an understandably disappointed player. Maybe that’s all Hunter said. Maybe that’s Knuble’s synopsis of what he remembers being said to him, or maybe the points that stood out the most to a similarly old school player (think Knuble countered with his on-ice scoring chance conversion rate being unsustainably low?). Who knows.
Knuble has stunk by most metrics, traditional, advanced and otherwise. There’s every reason to believe his staff is well aware of all of these. Yes, I’d probably have gotten a stiffy if he’d said, “Dale came to me and said, ‘your Corsi Rel Team sucks,’” but Knuble relating why he was scratched and mentioning plus-minus doesn’t get me too worked up.
Japers' Rink: Hockey blogging from the most powerful city in the world
by J.P. on Feb 11, 2012 6:47 PM EST via iPhone app, 2 recs
Yeah, I don’t exactly see that quote as evidence that +/- is still a heavily valued stat on its own. Knuble is actually a case where the poor +/- is in fact a symptom of poor play. I’d be shocked if Hunter didn’t have more substantive criticisms of Knuble’s play, but in the end they easily boil down to “when you are on the ice we are giving up more goals than we are scoring”.
Release the Mackan!
by Killer_Carlson on Feb 11, 2012 7:16 PM EST
@coreypronman: Talked to a GM today who said, “Prospect X is good defensively, look at his plus-minus”. Seriously.
Blueshirt Banter - Where Rangers' Fans Matter
Tracking the Rangers - Numbers don't lie. They just don't agree with you.
Twitter: RangerSmurf
"Oh, that sensible and sober* Rangers fan guy who is cool, actually" - Dominik, Lighthouse Hockey
*Statement has not been verified nor regressed
by George E. Ays on Feb 11, 2012 10:13 PM EST
Corey followed it up by noting he was talking about one of his own prospects, so maybe he was just pumping tires. And you wouldn’t expect a GM to tip his hand as to how he evaluates talent, would you?
But yeah, I bet you’d be surprised.
Japers' Rink: Hockey blogging from the most powerful city in the world
by J.P. on Feb 12, 2012 11:51 AM EST via iPhone app
Here’s something to keep in mind: there’s a lot of randomness in an individual player’s plus/minus, but overall, “on average”, this stat is legit. (Although perhaps a plus/minus relative to one’s teammates is better, e.g. a player’s plus/minus with team’s average plus/minus subtracted. I think behindthenet has something like that.)
The point is, a player’s high positive plus/minus is not sufficient to qualify him as a very good player, but it is supporting evidence that he might be one. Just like a negative plus/minus is a red flag. The higher (and the more consistent year-to-year) a player’s deviation from “average”, the more likely he is to be different from average.
Outliers (like Schultz’s +55 season) don’t invalidate statistics, they just mean you have to be careful with conclusions and keep looking for better and more varied indicators.
My own feeling on the Plus/Minus statistic is that it’s a more reliable barometer of performance in the long run than in the short run. In the short run, i.e. a game or two, the measurement is subject to luck.
While Schultz’s Plus/Minus was an outlier and way better than normal, the fact that it was a better Plus/Minus than anyone else on the team was a sign that he was having a good year. Yes, we do have to be careful with conclusions based on that stat and need other indicators for the context.
Rocking the Red for teams on the banks of the Potomac and at the Gateway Arch and Singing the Blues about Hockey.
Simply, every player on the ice has a job. If you’re not doing your job, you’re hurting your team. +/- is actually a really good metric of that. To say that the test player has 0 statistical effect on the game diminishes the actual effect that every player has on every game, no matter their effect on the score sheet.
There is a 1% possibility that a player with no points AND a bad +/- should be on the ice (if you’re not a center who wins 70-80% of dzone/pk draws, drop that to 0%). That is exactly what Knubes said Hunter said and the coach is 100% right.
Similarly, I want a player with low points and HIGH +/- on the ice. That means the guy is allowing his teammates to do their job better, most likely making those good breakout passes or smart plays to keep pucks in the zone. Even if you handed out a point for every player that touches the puck on a goal, you are still ignoring the guy who takes a hit to allow another teammate to pick up the puck and skate out of the zone with it. How do you measure that with anything BUT +/-?
In the great debate of the last few years, just about everybody would be on board with the following statement:
Sidney Crosby is a better defensive player than Ovechkin…
+/- happens under very clearly defined conditions. You were on the ice when your team scored/was scored upon. Simply, at 5v5, Sid’s line was scored upon more than Ovie’s line. There are only a few explanations for this:
a. More of Sid’s points came on the PP than Ovie’s (Stretch)
b. Sid was on the ice for more disadvantageous situations than Ovie (he’s their best FO% center, so not a stretch)
c. Ovie did HIS defensive job better than Sid did HIS defensive job (different positions, different jobs)
It would take me more time than I have now to prove out the above, but I bet we would come to a combination of all 3. Quality of competition be damned, if you’re on the ice, and you do your job right, you’re going to have a good +/- relative to your point totals. End of story.
What if you are doing your job but a guy on your line routinely isn’t doing his?
Please, call me F&B.
So if a combo of players is not working, you adjust your lines and follow the +/- impact relative to production.
Examples: Prior to Nick’s concussion, Perreault was an even player who had done just about nothing since October. With his new linemates and increased responsibility he’s improved his production AND +/-. If Perreault was the liability, the lack of production should have followed.
Like Perreault, Knuble did most of his points-damage in October. Unlike Perreault, new lines for Knuble have not yielded any change in production (none) or +/- (bad). Knuble sure does sound like the liability to me.
Now, some comments on the actual exercise… I think your assumption that you should get 0’s across the board is flawed, not the statistic. You have a bunch of fixed variables: Games per season, Shooting %, time on ice, and number of shots. You then have a “random” roll of the dice for 6 shots/game. This is where your random noise is coming from.
A question I have for you is, in the results of your simulation, did the team this guy played for finish every season of the 1000 with 41 wins? I bet there’s a correlation between W/L in the simulation with the good/bad +/- values.
Another question… Over the 1000 seasons, is the player close to Even +/-? By capping the games per season, you are taking small snapshots of what is actually 82,000 games…over the course of the whole simulation is where I would expect the +/- to be Even.
For what little it is worth, I think +/- is a data point and like any other one point doesn’t show the whole picture. So, you can’t make too many decisions with it in isolation, nor can the fact that it doesn’t stand alone be taken as a reason to toss it.
From a purely ‘nerd who loves sifting through the meaning of numbers’ (read: Physicist) point of view, I wonder if what this shows is that if you remove the ‘signal’ by setting the +/- to 0, you see all noise. While admittedly, the wide range here doesn’t imply good things, does a clearer pattern emerge when you put in a player that DOES drive play, perhaps to an extreme to either the positive or negative?
Occasionally wrong, never uncertain.
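One way to poke at the signal-versus-noise question above is to run the same kind of simulation for two players side by side: the no-impact player from the post, and a player who tilts the ice. The sketch below does that under a loud assumption: the "play driver" is given seven shots for and five against per game at the same 8 percent shooting, which is a made-up split for illustration and not something from the post. Function names and the seed are also hypothetical.

```python
import random
import statistics

def season_plus_minus(rng, shots_for, shots_against, pct=0.08, games=82):
    """Simulate one season and return the player's even-strength plus/minus."""
    goals_for = sum(rng.random() < pct for _ in range(games * shots_for))
    goals_against = sum(rng.random() < pct for _ in range(games * shots_against))
    return goals_for - goals_against

def summarize(label, seasons):
    """Print the mean, spread, and share of minus seasons for one player."""
    mean = statistics.mean(seasons)
    spread = statistics.pstdev(seasons)
    minus_share = sum(s < 0 for s in seasons) / len(seasons)
    print(f"{label}: mean {mean:+.1f}, std dev {spread:.1f}, "
          f"minus seasons {minus_share:.0%}")

def run(n_seasons=1000, seed=421):
    rng = random.Random(seed)
    # No-impact player from the post: six shots for and six against per game.
    neutral = [season_plus_minus(rng, 6, 6) for _ in range(n_seasons)]
    # Hypothetical play driver (assumption): seven shots for, five against.
    driver = [season_plus_minus(rng, 7, 5) for _ in range(n_seasons)]
    return neutral, driver

if __name__ == "__main__":
    neutral, driver = run()
    summarize("no impact", neutral)
    summarize("drives play", driver)
```

Under these assumptions the play driver's expected plus/minus is 82 × (7 − 5) × 0.08 ≈ +13.1, while the spread stays the same as the no-impact player's because both see twelve shots a game in total. So the signal shifts the whole curve, but the noise around it does not shrink.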
I’m not a big believer in the plus-minus stat except when looking at the outliers across an entire team. For example, suppose we have a team that outscores its opponents by 40 non-PP goals over the course of a season. That means that the entire team is going to be about +200 (5 skaters, 40 goals to the good). Divide that among 25 guys who might skate and the average guy is going to be somewhere about +8. Most guys won’t have any meaningful data there, but if there’s a guy on the roster who is +20 or higher…well, that says something. Ditto for a guy who might be -15.
It has to be looked at in context, and it really applies only to outliers.
Occasionally reporting from Section 421 of the Verizon Center...
Next victims
As others said, Knuble has been quite bad by every stat you can get your hands on. I took Neil’s point to be that the report that Hunter is basing the decision on plus-minus is what is upsetting.
I don’t think he’s objecting to scratching Knuble, I think he’s objecting to 1) scratching him based on +/- and 2) scratching him for Jay Beagle.
For what it’s worth, if Hunter is using +/- then the next two worst guys on the team are Ovie and Laich so I guess he’ll scratch them next.
Sometimes you win, sometimes you lose, sometimes it rains.
This post has bothered me since it went up. I’ve been back and forth on this and I think I understand what is bothering me now. Your simulation is basically a(n evenly weighted) trinomial random walk, so of course you are going to expect a mean of zero. This is what you want (“a player with no statistical impact”) and just by eyeballing the graph, it looks like you got what you wanted.
The problem is that I don’t think you can draw any useful conclusions from that data, especially as presented. Particularly, this conclusion is quite flawed:
If the plus-minus statistic were truly fair there would be nothing but zeroes on the chart because like I said, this player has absolutely no statistical impact on the game. Instead, this player is just as likely to be plus-10 or better in a season as he is minus-10 or worse.
Many values other than zero are expected on the chart. I mean, many realizations of the distribution are going to be non-zero. It is, however, expected that there will be a distribution clustered around zero (because the probabilities chosen are symmetric). In fact, because a season’s plus/minus is the sum of many independent shot outcomes, the central limit theorem says the seasonal distribution should be close to Gaussian. This is in fact what you get, though it is more evident if you look at >1000 runs (empirically, 100,000 is plenty to get a pretty bell curve). Of course the Gaussian nature is obscured because the tails are grouped together, making the plot look closer to uniform than anything else, which is incredibly misleading. The last sentence of the block quote is precisely a consequence of the zero-mean Gaussian. You could say the same about +5 and -5, or +0.2 and -0.2.
What is really evident from this is that for players who don’t increase the shooting percentage for or reduce the shooting percentage against (but, really, who in the NHL is in that category), the variance, not noise, in accumulated +/- is too high for it to say anything. Empirically, from my own simulations I come up with a standard deviation of around 8.5 for this configuration (though an exact result is almost surely [hah!] known for the distribution in question), which implies that (very roughly) 70% of seasons for this player will be between -8.5 and +8.5. Despite this, I’m not sure a tighter distribution would tell you much about the utility of plus/minus (stupid SBN formatting).
I'll have the milksteak, over hard
by renstar on Feb 13, 2012 8:09 PM EST, 8 recs
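The "around 8.5" figure in the comment above can also be checked without simulating anything. Under the post's setup, a season is 492 independent shots for and 492 against (82 games × 6), each scoring 8 percent of the time, so plus/minus is the difference of two binomial counts. The sketch below, with illustrative function names of my own choosing, computes the exact standard deviation and the exact share of seasons that land inside a given band.

```python
from math import comb, sqrt

GAMES, SHOTS_PER_GAME, PCT = 82, 6, 0.08
N = GAMES * SHOTS_PER_GAME  # 492 shots for, and 492 against, per season

def binomial_pmf(n, p):
    """Probability of k goals on n independent shots, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def exact_std_dev(n=N, p=PCT):
    """Standard deviation of season plus/minus for the no-impact player."""
    # Goals for and goals against are independent, so their variances add.
    return sqrt(2 * n * p * (1 - p))

def share_within(limit, n=N, p=PCT):
    """Exact probability that season plus/minus lies in [-limit, +limit]."""
    pmf = binomial_pmf(n, p)
    total = 0.0
    for goals_for, p_for in enumerate(pmf):
        # Goals against must be within `limit` of goals for.
        lo, hi = max(0, goals_for - limit), min(n, goals_for + limit)
        total += p_for * sum(pmf[lo:hi + 1])
    return total

if __name__ == "__main__":
    print(f"exact standard deviation: {exact_std_dev():.2f}")
    print(f"share of seasons from -8 to +8: {share_within(8):.1%}")
```

The variance works out to 2 × 492 × 0.08 × 0.92 ≈ 72.4, so the standard deviation is about 8.51, in line with the simulated value. Because plus/minus is a whole number, the band from -8.5 to +8.5 is the same as -8 through +8.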
A different problem has been nagging me. I’m left with the following question — which hockey stats really survive an approach like this one? How many really have “acceptable” variance or noise?
Hey, atta dinnin stick a who!
by Gould Old Days on Feb 15, 2012 12:29 AM EST, 1 rec
How do you even define “acceptable” and is it a hard line or is it a sliding scale? Personally, I just temper conclusions based on how much “noise” (or imprecision in the model, as I’d put it) I think there is. I don’t think it’s a binary “acceptable or not acceptable” situation.
Please, call me F&B.
Uncertainty is acceptable if it doesn’t interfere with your ability to draw conclusions. Alternatively, it is acceptable if it doesn’t obscure the effects of the model. Both of these are intentionally vague because even these definitions vary depending on the task at hand.
This is sports, so I’m fine with more uncertainty. It makes for interesting arguments, or rather conversations, on blogs. Also, I’m not spending millions of dollars based on the results, so it impacts me less.
Your second point is really important though. It is critical to separate which part of the uncertainty is due to an imperfect model and which is due to imperfect data. It is all too easy to lump it all into one or the other, depending on your preconceived notions on the matter at hand.
The stats community seems to do a better job at isolating the latter (by normalizing for game situation, using windowed averages, etc.) than the former. But I see this fanpost as an attempt (if incomplete) at finding the source of the ineffectiveness of the +/- model, and this is a good thing.
I'll have the milksteak, over hard
by renstar on Feb 15, 2012 9:39 AM EST, 2 recs
