The Distribution of Talent Between Teams

Wednesday, October 20th, 2010

Four years ago Tango had a very interesting post on how talent is distributed between teams in different sports leagues. I want to revisit and expand upon some of the points that came up in that discussion.

First, let’s look at some empirical data. I scraped end-of-season records from the last ten years for the NFL, NBA and MLB from ShrpSports (I omitted the NHL from this analysis due to the prevalence of ties). The data is available here (click through) as a tab-delimited text file. I used R to analyze the data. If you don’t have R, you can download it for free (if you use Windows, I recommend using it in conjunction with Tinn-R, which is great for editing and interactively running R scripts). Here is the R code I used:

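# Load the scraped end-of-season records (tab-delimited; uses the league and win_pct columns)
records = read.delim(file = "records.txt")
# One row per league: number of teams and games per season
lgs = data.frame(league=c("NFL","NBA","MLB"),teams=c(32,30,30),games=c(16,82,162))
# Observed variance of team winning percentage within each league
lgs$var.obs[lgs$league == "NFL"] = var(records$win_pct[records$league == "NFL"])
lgs$var.obs[lgs$league == "NBA"] = var(records$win_pct[records$league == "NBA"])
lgs$var.obs[lgs$league == "MLB"] = var(records$win_pct[records$league == "MLB"])
# Variance expected from luck alone if every team were a true .500 team (binomial)
lgs$var.rand.est = .5*(1-.5)/lgs$games
# Estimated variance of true talent: observed variance minus random variance
lgs$var.true.est = lgs$var.obs - lgs$var.rand.est
# Number of games at which random variance equals true-talent variance,
# in games and as a fraction of a season
lgs$regress.halfway.games = lgs$games*lgs$var.rand.est/lgs$var.true.est
lgs$regress.halfway.pct.season = lgs$regress.halfway.games/lgs$games
# Noll-Scully ratio: observed standard deviation over random standard deviation
lgs$noll.scully = sqrt(lgs$var.obs)/sqrt(lgs$var.rand.est)
# Estimated probability that the better of two teams finishes with the better record
lgs$better.team.better.record.pct = 0.5 + atan(sqrt(lgs$var.obs - lgs$var.rand.est)/sqrt(lgs$var.rand.est))/pi
# Print the summary table
lgs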
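
In equation form, the script appears to compute the quantities below. The symbols are notation I am introducing here for illustration and do not appear in the code: G is games per season, and the sigma-squared terms correspond to var.obs, var.rand.est and var.true.est. The reading of the last line as a probability rests on an assumption the script does not state, namely that true talent and luck are independent and roughly normally distributed.

```latex
% Sketch only: notation is assumed, mapped from the variable names in the script.
\begin{align*}
\sigma^2_{\mathrm{rand}} &= \frac{0.5\,(1-0.5)}{G} = \frac{0.25}{G}
  && \text{(var.rand.est)} \\
\sigma^2_{\mathrm{true}} &= \sigma^2_{\mathrm{obs}} - \sigma^2_{\mathrm{rand}}
  && \text{(var.true.est)} \\
G_{1/2} &= G\,\frac{\sigma^2_{\mathrm{rand}}}{\sigma^2_{\mathrm{true}}}
         = \frac{0.25}{\sigma^2_{\mathrm{true}}}
  && \text{(regress.halfway.games)} \\
\frac{G_{1/2}}{G} &= \frac{\sigma^2_{\mathrm{rand}}}{\sigma^2_{\mathrm{true}}}
  && \text{(regress.halfway.pct.season)} \\
\text{Noll-Scully} &= \frac{\sigma_{\mathrm{obs}}}{\sigma_{\mathrm{rand}}}
  && \text{(noll.scully)} \\
P(\text{better team, better record}) &= \frac{1}{2}
  + \frac{1}{\pi}\arctan\!\left(\frac{\sigma_{\mathrm{true}}}{\sigma_{\mathrm{rand}}}\right)
  && \text{(better.team.better.record.pct)}
\end{align*}
```

With the season lengths used in the script, the random variance alone works out to 0.25/16 ≈ 0.0156 for the NFL, 0.25/82 ≈ 0.0030 for the NBA and 0.25/162 ≈ 0.0015 for MLB, so a shorter season leaves much more room for luck in the final standings.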

Here is the resulting table:

(more…)