The Distribution of Talent Between Teams
Wednesday, October 20th, 2010
Four years ago Tango had a very interesting post on how talent is distributed between teams in different sports leagues. I want to revisit and expand upon some of the points that came up in that discussion.
First, let's look at some empirical data. I scraped end-of-season records from the last ten years for the NFL, NBA and MLB from ShrpSports (I decided to omit the NHL from this analysis due to the prevalence of ties). The data is available here (click through) as a tab-delimited text file. I used R to analyze the data. If you don’t have R, you can download it for free (if you use Windows, I recommend using it in conjunction with Tinn-R, which is great for editing and interactively running R scripts). Here is the R code I used:
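# Read the tab-delimited season records (columns used: league, win_pct)
records = read.delim(file = "records.txt")
# One row per league: number of teams and games per season
lgs = data.frame(league=c("NFL","NBA","MLB"),teams=c(32,30,30),games=c(16,82,162))
# Observed variance of winning percentage in each league
lgs$var.obs[lgs$league == "NFL"] = var(records$win_pct[records$league == "NFL"])
lgs$var.obs[lgs$league == "NBA"] = var(records$win_pct[records$league == "NBA"])
lgs$var.obs[lgs$league == "MLB"] = var(records$win_pct[records$league == "MLB"])
# Variance expected from luck alone if every game were a coin flip
lgs$var.rand.est = .5*(1-.5)/lgs$games
# Estimated variance of true talent: observed minus random
lgs$var.true.est = lgs$var.obs - lgs$var.rand.est
# Games needed before a record is regressed halfway to the mean
lgs$regress.halfway.games = lgs$games*lgs$var.rand.est/lgs$var.true.est
lgs$regress.halfway.pct.season = lgs$regress.halfway.games/lgs$games
# Noll-Scully measure: observed SD relative to the coin-flip SD
lgs$noll.scully = sqrt(lgs$var.obs)/sqrt(lgs$var.rand.est)
# Chance that the better of two teams finishes with the better record
lgs$better.team.better.record.pct = 0.5 + atan(sqrt(lgs$var.obs - lgs$var.rand.est)/sqrt(lgs$var.rand.est))/pi
lgs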
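The script computes several quantities without spelling out the formulas behind them, so here is a rough sketch of the underlying math. The symbols are my own shorthand, not names from the code: n is games per season, g is a number of games played, and the sigma terms correspond to var.obs, var.rand.est and var.true.est. The sketch assumes that talent and luck are independent, that luck is roughly binomial around a .500 team, and, for the last line, that both are approximately normal.

```latex
% Observed variance splits into true-talent variance plus luck
\sigma^2_{\mathrm{obs}} = \sigma^2_{\mathrm{true}} + \sigma^2_{\mathrm{rand}},
\qquad
\sigma^2_{\mathrm{rand}} \approx \frac{0.5\,(1-0.5)}{n} = \frac{0.25}{n}

% Example: for a 16-game NFL season, \sigma^2_{\mathrm{rand}} = 0.25/16 = 0.015625

% After g games the luck variance is 0.25/g; a record is regressed
% halfway to the mean when luck variance equals talent variance
\frac{0.25}{g} = \sigma^2_{\mathrm{true}}
\;\Longrightarrow\;
g_{\mathrm{halfway}} = \frac{0.25}{\sigma^2_{\mathrm{true}}}
 = n\,\frac{\sigma^2_{\mathrm{rand}}}{\sigma^2_{\mathrm{true}}}

% Noll-Scully measure
\mathrm{NS} = \frac{\sigma_{\mathrm{obs}}}{\sigma_{\mathrm{rand}}}

% Probability that the better of two teams has the better record
% (normal approximation, with \rho = \sigma_{\mathrm{true}} / \sigma_{\mathrm{obs}})
P = \frac{1}{2} + \frac{\arcsin \rho}{\pi}
  = \frac{1}{2} + \frac{1}{\pi}\arctan\!\left(\frac{\sigma_{\mathrm{true}}}{\sigma_{\mathrm{rand}}}\right)
```

The expression for the halfway point matches lgs$regress.halfway.games, since n times 0.25/n is just 0.25, and the last line matches lgs$better.team.better.record.pct.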
Here is the resulting table: