Consider the following “season” involving a total of 8 games among 5 different teams, as described in the paper.
Game 1: Team A defeated Team C
Game 2: Team A defeated Team E
Game 3: Team B defeated Team A
Game 4: Team B defeated Team E
Game 5: Team C defeated Team D
Game 6: Team C defeated Team E
Game 7: Team D defeated Team E
Game 8: Team D defeated Team E
We will assign numeric ids to the five teams: A=1, B=2, C=3, D=4 and E=5.
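If the games are recorded by team name rather than by number, the ids can be derived instead of typed by hand. The following is only a minimal sketch of one way to do that: the names games_csv, games, team_names and ids_from_names are hypothetical and do not appear in the original code, and the alphabetical ordering is an assumption that happens to reproduce A=1 through E=5.

```r
# Hypothetical game log keyed by team letter (the same 8 games listed above)
games_csv <- "winner,loser
A,C
A,E
B,A
B,E
C,D
C,E
D,E
D,E"
games <- read.csv(text = games_csv, stringsAsFactors = FALSE)

# Sequential ids starting at 1, assigned in alphabetical order: A=1, ..., E=5
team_names <- sort(unique(c(games$winner, games$loser)))

# Two columns, one row per game: winner id, then loser id
ids_from_names <- cbind(match(games$winner, team_names),
                        match(games$loser, team_names))
print(ids_from_names)
```

The resulting two-column matrix has the same layout as the input matrix used in the ranking code: one row per game, winner id first, loser id second.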
The R code below will rank the five teams based on the season above. Simply paste all of the code into R to run.
#David Mease 2006
#http://www.davemease.com/football
#the code below runs my simple example
#to run for different data, simply edit only the variables number_of_teams, number_of_games and input_matrix
number_of_teams<-5
#note: if you have an extra collect-all team for all non-Division 1A teams as I do for real data
#then this single team *should* count as 1 toward the total number of teams
#and it should be treated as a regular team throughout
number_of_games<-8
input_matrix<-matrix(0,number_of_games,2)
#the input matrix has two columns and one row for each game
#the first column is the team id for the winner of the game and the second column is the team id for the loser
#the team id's should begin with 1 and be sequential
#for those of you familiar with R, you could instead create a data file containing
#this information and read it in using read.table or read.csv
input_matrix[1,]<-c(1,3)
input_matrix[2,]<-c(1,5)
input_matrix[3,]<-c(2,1)
input_matrix[4,]<-c(2,5)
input_matrix[5,]<-c(3,4)
input_matrix[6,]<-c(3,5)
input_matrix[7,]<-c(4,5)
input_matrix[8,]<-c(4,5)
#the data matrix has one column for each team and one row for each game
#the column for the winner of each game has a 1 and the column for the loser has a negative 1
#the data matrix is created from the input matrix automatically by this code so
#you only need to modify the input matrix
data_matrix<-matrix(0,(number_of_games+2*number_of_teams),number_of_teams)
for (i in 1:number_of_games) {
  data_matrix[i,input_matrix[i,1]]<-1
  data_matrix[i,input_matrix[i,2]]<-(-1)
}
for (i in 1:(number_of_teams)) {
  #every team beats the virtual team exactly once
  data_matrix[(number_of_games+i),i]<-1
  #every team loses to the virtual team exactly once
  data_matrix[(number_of_games+number_of_teams+i),i]<-(-1)
}
model<-glm(rep(1,(number_of_games+2*number_of_teams))~data_matrix-1,family=binomial(link=probit))
summary(model)
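Read directly from the code (not taken from the paper), the glm call appears to fit the following model. The notation here is an assumption introduced for illustration: \theta_i is the rating of team i, w_g and l_g are the winner and loser ids of game g, G is number_of_games, n is number_of_teams, and \Phi is the standard normal distribution function given by the probit link.

```latex
\[
P(\text{team } i \text{ beats team } j) = \Phi(\theta_i - \theta_j)
\]
\[
L(\theta) \;=\; \prod_{g=1}^{G} \Phi\!\left(\theta_{w_g} - \theta_{l_g}\right)
\;\times\; \prod_{i=1}^{n} \Phi(\theta_i)\,\Phi(-\theta_i)
\]
```

Because every row of the data matrix has response 1, each real game contributes one factor for the observed winner. The second product comes from the extra rows in which each team beats and loses to the virtual team once, and the virtual team behaves as if its rating were fixed at 0.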
The essential part of the output from R is:
             Estimate Std. Error z value Pr(>|z|)
data_matrix1   0.2615     0.6489   0.403    0.687
data_matrix2   0.5813     0.7468   0.778    0.436
data_matrix3   0.1519     0.6423   0.237    0.813
data_matrix4  -0.0411     0.6634  -0.062    0.951
data_matrix5  -0.9732     0.6697  -1.453    0.146
The five values in the “Estimate” column are used to rank the five teams:
Team   Theta    Rank
A       .2615   2nd
B       .5813   1st (Best)
C       .1519   3rd
D      -.0411   4th
E      -.9732   5th (Worst)
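The ranking can also be produced in R rather than read off by eye. The sketch below hard-codes the five estimates from the table so that it runs on its own. The names theta, ranking and win_prob are hypothetical, and the win-probability helper assumes the probit reading of the model, in which the chance that one team beats another depends on the difference of their estimates.

```r
# Estimates copied from the "Estimate" column of the output
theta <- c(A = 0.2615, B = 0.5813, C = 0.1519, D = -0.0411, E = -0.9732)

# Rank 1 goes to the largest estimate
ranking <- rank(-theta)
print(ranking[order(ranking)])

# Assumed helper: probit-implied probability that `first` beats `second`
win_prob <- function(first, second) {
  unname(pnorm(theta[first] - theta[second]))
}
print(win_prob("B", "A"))
```

After running the ranking code, coef(model) should supply the same five values without retyping them, although its names will be data_matrix1 through data_matrix5 rather than the team letters.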