In the 2018-19 season, both Michelle Karvinen and Marie-Philip Poulin played 26 games. Karvinen averaged 2.15 points per game, while Poulin averaged 1.92 points per game.
At first glance, Karvinen seems to have had the better offensive season. The difference is that Poulin’s 23 goals and 27 assists came in the CWHL, while Karvinen’s 25 goals and 31 assists came in the SDHL.
As hockey writers, fans, and analysts, we know intuitively that scoring a goal is more difficult in some leagues than in others. We can say confidently that the competition in NCAA Division I is tougher than the competition in NCAA Division III, so a player with 20 goals for Boston College had a better season than a player with 20 goals for Amherst College.
It is not always so simple to draw comparisons between players in different leagues, however. How do we compare Poulin’s season in the CWHL with Karvinen’s in the SDHL? The NCAA and U Sports? Australia’s AWIHL and England’s WNIHL?
Networked Women’s Hockey Equivalency (N-WHKYe) is a metric that standardizes offensive production across more than 40 women’s hockey leagues around the world.
The N-WHKYe model was built as a project for the 2021 Women’s Hockey Analytics Conference by data analyst Mikael Nahabedian and was inspired by CJ Turtoro’s NNHLe project.
Equivalency gives us a way to determine what a point in any one of more than 40 women’s hockey leagues worldwide is worth.
Assigning value to points starts with selecting a “target league,” which serves as the benchmark for comparison. In men’s hockey, the NHL makes for a convenient target league: it has many direct and indirect relationships to developmental leagues across the world and is generally regarded as the most competitive men’s hockey league in the world.
“Unfortunately, in women’s hockey, a league like the NHL in that regard does not exist (yet!). In order to fix this issue for our modeling process, we decided to use international tournaments as our target ‘leagues’, in different divisions and age groups,” said Nahabedian.
The international tournaments include the Women’s World Championships, the Olympic Games, and the U18 World Championships.
After selecting the target leagues, it came time to build the model. The process can be explained using two Swiss women’s hockey leagues, SWHL A and SWHL B, with A being the stronger of the two. In any given season, players are called up to one league and sent down to the other to meet the roster needs of various teams. That movement makes it possible to sum the total Games Played (GP) and Points (Pts) for players who played in both leagues in the same season between 2010 and 2019. To link the two leagues, the model uses the ratio of their Pts/GP figures, building a simple equivalency that goes from A to B. The same calculation can be inverted to build an equivalency that goes from B to A.
The ratios obtained from the calculations, and this is the part that even those of us who weren’t paying attention in math class will be able to understand, show that a player who puts up 1 point per game in the SWHL A would average 2.36 points per game in the SWHL B.
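The two-league calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not Nahabedian’s actual code: the `equivalency` function, the player names, and every stat line below are hypothetical and invented for the example, so the result is 2.4 rather than the real 2.36.

```python
from collections import defaultdict

# Hypothetical stat lines: (player, league, games_played, points).
# These numbers are invented for illustration and are NOT real SWHL data.
rows = [
    ("player_1", "SWHL A", 10, 5),
    ("player_1", "SWHL B", 10, 12),
    ("player_2", "SWHL A", 20, 10),
    ("player_2", "SWHL B", 5, 6),
    ("player_3", "SWHL A", 18, 9),  # appears in one league only, so ignored
]


def equivalency(rows, source, target):
    """Return pooled Pts/GP in `target` divided by pooled Pts/GP in `source`.

    Only players who appear in both leagues are counted.
    """
    by_player = defaultdict(dict)
    for player, league, gp, pts in rows:
        prev_gp, prev_pts = by_player[player].get(league, (0, 0))
        by_player[player][league] = (prev_gp + gp, prev_pts + pts)

    # Running [GP, Pts] totals per league over the shared players.
    totals = {source: [0, 0], target: [0, 0]}
    for leagues in by_player.values():
        if source in leagues and target in leagues:
            for league in (source, target):
                totals[league][0] += leagues[league][0]
                totals[league][1] += leagues[league][1]

    if totals[source][0] == 0 or totals[source][1] == 0 or totals[target][0] == 0:
        raise ValueError("not enough shared data to link these leagues")

    source_rate = totals[source][1] / totals[source][0]
    target_rate = totals[target][1] / totals[target][0]
    return target_rate / source_rate


a_to_b = equivalency(rows, "SWHL A", "SWHL B")  # about 2.4 with this toy data
b_to_a = equivalency(rows, "SWHL B", "SWHL A")  # the inverse, about 0.417
print(round(a_to_b, 3), round(b_to_a, 3))
```

Pooling GP and Pts before dividing, rather than averaging each player’s individual ratio, keeps a player with only a handful of games from swinging the result.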
This network approach can be used to build hypothetical paths between more and more leagues. The goal is to build a path from any one of the 40+ leagues in the model to the target league, multiplying the ratios along the way to find the WHKYe of any path or combination of paths. Where several paths exist, the model combines them using a weighted average.
“To put it simply, the model gives more importance to the robust paths and gives less importance in the calculation of the weighted average to the weakest paths,” said Nahabedian.
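As a rough sketch of how multiplying ratios along a path and then averaging several paths might look, consider the Python below. The article does not spell out the weighting scheme, so the rule used here is purely an assumption made for illustration: a path is weighted by the sample size of its thinnest link. The league names, factors, and sample sizes are all hypothetical.

```python
from math import prod

# Hypothetical network: (source, target) -> (equivalency factor, sample size).
# League names, factors and sample sizes are invented for illustration.
edges = {
    ("League C", "League B"): (0.50, 400),
    ("League B", "Target"): (0.60, 900),
    ("League C", "Target"): (0.25, 100),
}


def path_factor(path):
    """Multiply the edge factors along a path, e.g. C -> B -> Target."""
    return prod(edges[(a, b)][0] for a, b in zip(path, path[1:]))


def path_weight(path):
    """ASSUMPTION (not from the article): a path is only as robust as its
    thinnest link, so its weight is the smallest sample size along the way."""
    return min(edges[(a, b)][1] for a, b in zip(path, path[1:]))


def combined_factor(paths):
    """Weighted average of the path factors, favouring the more robust paths."""
    weights = [path_weight(p) for p in paths]
    factors = [path_factor(p) for p in paths]
    return sum(w * f for w, f in zip(weights, factors)) / sum(weights)


paths = [
    ["League C", "League B", "Target"],  # factor 0.50 * 0.60, weight 400
    ["League C", "Target"],              # factor 0.25, weight 100
]
result = combined_factor(paths)
print(round(result, 3))
```

With these toy numbers the indirect path dominates because it rests on more shared data, which matches the behavior Nahabedian describes, even if the real weights are computed differently.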
Finally, the model takes into account that the level of play in a given league fluctuates over time as players move from one league to another, so its estimates evolve from year to year to capture this variation.
“For a given year N, half of the data comes from the year in question, while the remaining half pertains to the following year to smooth the results using this moving average approach,” said Nahabedian.
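One simple reading of that smoothing step is an equal-weight average of each year’s value with the following year’s. The sketch below assumes that reading. It also assumes that the final year, which has no following year, keeps its raw value; the article does not say how that case is handled. The yearly factors are hypothetical.

```python
# Hypothetical raw yearly factors for one league; values are invented.
raw = {2016: 0.40, 2017: 0.44, 2018: 0.50}


def smooth(raw_by_year):
    """Blend each year's value half and half with the following year's."""
    smoothed = {}
    for year, value in raw_by_year.items():
        following = raw_by_year.get(year + 1)
        # ASSUMPTION: with no following year available, keep the raw value.
        if following is None:
            smoothed[year] = value
        else:
            smoothed[year] = 0.5 * value + 0.5 * following
    return smoothed


smoothed = smooth(raw)
print({year: round(value, 3) for year, value in smoothed.items()})
```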
Interpreting the Data
Using the results of the model, data visualization developer John Bouchard created several dashboards to help interpret the data in simple, easy-to-understand terms.
Ranking the leagues by year, the data shows that the CWHL was the most competitive league offensively in the 2018-19 season, followed by the NWHL, the NCAA, and the SDHL.
Returning to the original question of Poulin’s 2018-19 season vs. Karvinen’s 2018-19 season, the CWHL, where Poulin played, outranks the SDHL.
Whose season was more impressive? Even without the data, it’s possible to conclude that Poulin had the better season, since the CWHL was generally agreed to have been one of the most competitive leagues in 2018-19. But without a metric, there is no evidence to back up that conclusion.
With this metric, we can also examine individual player rankings, both over the course of a season and over the course of a career. In 2018-19, although Karvinen had the higher Pts/GP, her average WHKYe was 0.86, while Poulin’s was 1.11. This metric takes some of the guesswork out of cross-league player comparison.
The model and the dashboards do not begin and end with national team players. There are thousands of players with available data to look at. With N-WHKYe, it’s possible to compare not only the most elite women’s hockey players but also U18 players, non-national team players, and just about every player in the world with available data.
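To get a feel for what those figures imply, the short sketch below backs out the discount each league’s scoring would need, under the simplifying assumption that a player’s WHKYe is just Pts/GP multiplied by a single league factor. The published values are averages and may combine several target tournaments, so treat the implied factors as a back-of-the-envelope reading rather than the model’s actual coefficients.

```python
# Figures quoted in the article for the 2018-19 season.
players = {
    "Karvinen (SDHL)": {"pts_per_gp": 2.15, "whkye": 0.86},
    "Poulin (CWHL)": {"pts_per_gp": 1.92, "whkye": 1.11},
}

# ASSUMPTION: whkye = pts_per_gp * league_factor, with one factor per league.
implied = {
    name: stats["whkye"] / stats["pts_per_gp"]
    for name, stats in players.items()
}

for name, factor in implied.items():
    print(f"{name}: implied factor of about {factor:.2f}")
```

Under that assumption, a CWHL point was worth roughly 0.58 of a target-level point in 2018-19 and an SDHL point roughly 0.40, which is why Poulin comes out ahead despite the lower raw scoring rate.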