Hockey Babies: Malcolm Gladwell vs Big Data
A word of advice to all the wannabe hockey daddies out there: forget about Valentine's Day

In his famous book ‘Outliers’, Malcolm Gladwell exposes a rather disturbing fact about the world of professional hockey: according to him, babies born in the earliest months of the year (i.e. January, February or March) are more likely to reach the top divisions than those born later in the year. However convincing the author’s subsequent account of the reasons behind this surprising finding, the cold, hard data offered to the reader is too limited for anyone to feel confident in endorsing his claims. Indeed, Gladwell only presents birthday data for a single team, the Medicine Hat Tigers, during a single year: 2007.
Because I have always valued Gladwell's work, I wanted to verify whether his assertion would withstand the test of big data. I thus went ahead and collected from the web - specifically from the website hockey-reference.com - 7411 observations, each one the birth date of a hockey player from the Canadian league. I then built a data frame linking each of the 366 days of the year to the number of hockey babies born on that day. Once I had gathered and formatted all that data, I was able to produce the following visualizations:
The results are quite stunning: the number of hockey players born each day decreases dramatically as we progress through the year. January and December are respectively the months with the highest and lowest number of births: 765 babies were scheduled along with the new year while only 488 babies were greedily delivered by Santa Claus.
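For readers who want to try the tallying step themselves, here is a minimal sketch in Python. It is not the code used for the figures above: the function names and the tiny sample are hypothetical, and it assumes the birth dates have already been collected into a list of `date` objects.

```python
from collections import Counter
from datetime import date, timedelta


def births_per_day(birth_dates):
    """Return 366 rows of ((month, day), count), one per calendar day."""
    counts = Counter((d.month, d.day) for d in birth_dates)
    # Enumerate a leap year (2000) so that February 29 gets its own row.
    start = date(2000, 1, 1)
    days = [start + timedelta(days=i) for i in range(366)]
    return [((d.month, d.day), counts.get((d.month, d.day), 0)) for d in days]


def births_per_month(birth_dates):
    """Return a list of 12 counts, index 0 = January, index 11 = December."""
    counts = Counter(d.month for d in birth_dates)
    return [counts.get(month, 0) for month in range(1, 13)]


# Hypothetical sample, only to show the shape of the output (not real data).
sample = [
    date(1985, 1, 3),
    date(1990, 1, 3),
    date(1988, 2, 29),
    date(1979, 12, 25),
]
per_day = births_per_day(sample)
per_month = births_per_month(sample)
```

The year of birth is deliberately ignored: only the month and day matter for this kind of count, which is why every date is mapped onto the same 366-day calendar.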
Now, you may be tempted to draw a more general conclusion from these observations: might it be that every sport is dominated in numbers by individuals born early in the year? The answer is somewhat unsatisfying: it depends. The good news is that it depends on something anyone can look up: the month during which coaches select their teams. As Gladwell explains in the same chapter, in a sport like baseball where selections happen in August, young players born around that time of the year - and preferably slightly after it - will have had the most time on earth to develop physically compared with the rest of their age group. Trainers will then tend to favor them over their younger peers and put them in more intensive programs, which progressively widens the performance gap.
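The selection-month reasoning can be made concrete by re-indexing the calendar so that it starts at the selection month instead of January. The sketch below is only an illustration: the helper names are hypothetical, and treating hockey as having a January cutoff is an assumption based on the pattern described earlier rather than something the data itself states.

```python
def months_after_cutoff(birth_month, cutoff_month):
    """Months between the cutoff month and the birth month (both 1-12).

    0 means the player was born in the cutoff month (oldest in the group),
    11 means born in the month just before it (youngest in the group).
    """
    return (birth_month - cutoff_month) % 12


def reorder_by_cutoff(monthly_counts, cutoff_month):
    """Rotate 12 monthly counts (index 0 = January) to start at the cutoff."""
    return [monthly_counts[(cutoff_month - 1 + i) % 12] for i in range(12)]


# Assumed cutoffs for illustration: January for hockey, August for baseball.
HOCKEY_CUTOFF = 1
BASEBALL_CUTOFF = 8

# A September baby is among the oldest in an August-cutoff sport,
# but sits two thirds of the way down the list in a January-cutoff one.
september_in_baseball = months_after_cutoff(9, BASEBALL_CUTOFF)
september_in_hockey = months_after_cutoff(9, HOCKEY_CUTOFF)

# Placeholder counts 1..12 stand in for real monthly totals.
rotated = reorder_by_cutoff(list(range(1, 13)), BASEBALL_CUTOFF)
```

If the relative-age explanation holds, monthly counts rotated this way should be highest near the start of the list and tail off toward the end, whatever the sport.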
I decided to once again verify the claim for myself, and the results once again clearly supported the author: across the 18,000 birth dates I collected from the website baseball-reference.com, there is a clear birth peak around August-September-November.
It thus seems that Malcolm Gladwell had good instincts on the issue at hand: if you happen to be Canadian and want your baby to go on to become a great hockey player, you are probably better off skipping Valentine's Day celebrations and reserving your passions for Easter. And if you are a proud American baseball fan, Christmas is the time to make the gift of life.