Setting and Measuring Expectations: The Leafs Coaching Staff

In search of a clean slate for the X's and O's

For Leafs fans, the upcoming season will be an important one. Though it is (once again) extremely unlikely that the Leafs could win the big silver beer stein on offer at the end of the postseason tournament, fans of the team will be watching very closely for signs that any of the existing questions about the team might be answered. We’ll dig through the statistics like the oracles of old pawed through goat entrails, looking for evidence that augurs well for a brighter future ahead. It is pretty safe to assume that Brian Burke and his staff will be engaging in a similar process.

Many of those questions concern individual players: what, for example, can we realistically expect from players like Jonas Gustavsson, Luke Schenn, Tyler Bozak and Nikolai Kulemin, all of whom should be approaching their likely athletic peak in the next few years? Other questions concern more collective issues: what improvement can we expect from the Leafs’ power-play and penalty-killing units?

All of those questions merit discussion, but they all relate to issues about the players; with Ron Wilson entering his third season as Maple Leafs head coach, and keeping in mind that last season in particular represented a disappointing step backwards, it’s safe to say that questions must also remain about the suitability of the current staff for the task ahead.

One of the things I like most about the hockey blogosphere is its very strong tendency to try to quantify and measure these sorts of issues, to make them concrete and expressible. When we speak of “issues” and “questions” about the coaching staff, the reality is that there must be some set of performance metrics against which it is reasonable to measure the observed outcome of this season, in an effort to dispassionately judge whether the coaches are making a discernible difference in the team’s play (and whether that difference represents an improvement).

Statistical analysis isn’t my strong suit, and I don’t pretend to have the facility with numbers that many other hockey bloggers have ably demonstrated, but I thought I’d try my hand at cobbling together an answer to this last question: what types of numbers should we look for when attempting to grade Messrs. Wilson, Hunter and Acton at the end of this season? Please accept this analysis for what I hope it is: a starting point for the discussion, and a jumping-off point for others with the statistical chops that are absent from my toolkit. Criticisms, comments and refinements are welcome – put ’em in the comments below!

I wish I could figure out a way to embed the tables I compiled directly into this post, but two hours of futzing about with Google, Google docs, WordPress, Excel and Numbers have failed to surrender any such secrets, assuming they exist. Unfortunately, therefore, I have to just insert a link to the table I compiled. All data are sourced from hockey-reference.com.

I thought the most logical place to start in assessing the performance of the coaches would be year-over-year changes in goals for and goals against. I compiled the goals for and goals against data for all 30 teams in each season since the lockout, and calculated the percentage change in each from the previous year. I then tried to normalize the percentage change data by calculating, for each year, the average change and the standard deviation of the data. Finally, I picked out those results that lie between one and two standard deviations away from the mean (classified as “moderately exceptional”), and those results that lie two standard deviations or more away from the mean (classified as “significant”).
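For anyone who would rather script this than wrestle with a spreadsheet, here is a rough Python sketch of the steps just described. It is my illustration of the idea, not the actual spreadsheet formulas: the function names are made up, and I have assumed the population standard deviation (`pstdev`), which may not be what the spreadsheet used.

```python
from statistics import mean, pstdev


def pct_change(previous, current):
    """Percentage change from last season's total to this season's."""
    return (current - previous) / previous * 100.0


def classify(changes):
    """Label each team's year-over-year change by its distance from the mean.

    `changes` maps a team name to its percentage change for one season.
    Assumption: population standard deviation across all teams that season.
    """
    values = list(changes.values())
    mu = mean(values)
    sigma = pstdev(values)
    labels = {}
    for team, change in changes.items():
        # Distance from the league-average change, in standard deviations.
        z = abs(change - mu) / sigma if sigma else 0.0
        if z >= 2:
            labels[team] = "significant"
        elif z >= 1:
            labels[team] = "moderately exceptional"
        else:
            labels[team] = "unremarkable"
    return labels
```

To use it, you would build one dictionary per season (team name mapped to the output of `pct_change`) and pass it to `classify`, once for goals for and once for goals against. Whether a flagged change counts as "desirable" depends on which ledger you are looking at, so the colour coding would be layered on top.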

Link to Google docs spreadsheet re: YOY data: change in GF and GA

Assuming that the year-to-year changes are normally distributed, if I remember my statistics class correctly, the results that are interesting are those that fall more than one or two standard deviations from the mean. Those are the results I mentioned above, with the moderate desirable increments marked in light green, the significant desirable increments marked in dark green, the moderate undesirable increments marked in pink, and the significant undesirable increments marked in red.

If I’m reading all of the data correctly, it would appear that the standard deviation of the Goals Against data is typically between about 9 and 12 per cent. Thus, an increase or decrease of anything less than 9 to 12 per cent, statistically speaking, represents the mushy random middle: the roughly 68% of results that cluster around the mean in a normal distribution. If I am applying the theory correctly, it would be unwise to conclude from data of this nature that the team’s performance had either improved or deteriorated. To make even a weak judgment about a real difference in performance, we would need to observe an increase (or reduction) of somewhere between 9-12% and 18-24% (these would be the results between one and two standard deviations from the mean). Changes of more than 18 to 24% from last year’s data could confidently be said to represent a clear indication of differential performance.
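As a sanity check on my memory of that statistics class, here is a small Python sketch of where the 68% figure comes from and how the bands fall out of a given standard deviation. It assumes the changes really are normally distributed, and the function names are my own invention.

```python
from math import erf, sqrt


def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))


def judgment_bands(sigma_pct):
    """Turn a standard deviation (in per cent) into the three bands used above."""
    return {
        "random middle": (0.0, sigma_pct),          # less than one SD
        "weak evidence": (sigma_pct, 2 * sigma_pct),  # between one and two SDs
        "clear evidence from": 2 * sigma_pct,        # two SDs or more
    }
```

`share_within(1)` comes out at roughly 0.68 and `share_within(2)` at roughly 0.95, which is the familiar rule of thumb. Plugging the low and high ends of the observed range (9 and 12) into `judgment_bands` gives the 9-12% and 18-24% cut-offs quoted above.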

Two thoughts come to mind. First, it’s worth remembering the (perhaps obvious) point that increases or decreases in a team’s goals for or goals against are not solely attributable to coaching. In fact, it’s probably a live question whether coaching can be said to have a demonstrable effect upon the results at all. Certainly, the old saw is that “you can’t teach scoring,” though it is generally believed that coaches and their systems can and do have a more pronounced effect upon the defensive side of the game (and, by extension, the goals against ledger). If anyone has any thoughts on how to examine the evidence in that regard, I’d love to hear about it.

Second, the numbers involved are fairly large. I think the data seem to be telling us that wide variances in the numbers may be expected from year to year for purely random (or at least statistically uninformative) reasons.

If that last conclusion is correct, then unless there is an enormous change in the Leafs’ goals against total this year (more than +/- 20%, which in practice would translate into about a 54-goal change either way), it seems that we ought not to make any judgements about the performance of the coaching staff based upon these numbers.
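For the curious, the conversion from a percentage threshold to a goal count is simple enough to sketch in Python. The baseline of 270 goals against used in the note below is not a sourced total; it is just the figure implied by my own numbers (54 goals divided by 20%), so swap in the real total from hockey-reference.com before relying on it.

```python
def goals_for_pct_change(baseline_goals, pct):
    """Number of goals that a given percentage change represents.

    `baseline_goals` is last season's total; `pct` is the change in per cent.
    """
    return baseline_goals * pct / 100.0


def is_clear_change(previous_goals, current_goals, threshold_pct=20.0):
    """True if the year-over-year swing exceeds the threshold in either direction."""
    change_pct = (current_goals - previous_goals) / previous_goals * 100.0
    return abs(change_pct) > threshold_pct
```

With the assumed baseline, `goals_for_pct_change(270, 20)` works out to about 54 goals, matching the figure above. The 20% default in `is_clear_change` is the rough two-standard-deviation cut-off from this post, not a universal constant.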