To cite the hyper2 package in publications, please use Hankin (2017).
Here I analyse a karate
dataset taken from
wikipedia.
Short story: the dataset has too many degrees of freedom to be easily
analysed with hyper2 and Zermelo’s iteration scheme as implemented
by the BradleyTerry package exploits the special pairwise structure
of the likelihood function very effectively.
karate_table <- read.table("karate.txt",header=TRUE)
head(karate_table)
## karateka1 wins1 karateka2 wins2 group
## 1 figueira 5 dacosta 6 finals
## 2 velozo 1 figueira 3 semifinal
## 3 dacosta 2 derafshipour 0 semifinal
## 4 rosiello 2 sabiecki 0 repechage
## 5 rosiello 3 tadissi 3 repechage2
## 6 ferreras 0 zakaria 3 repechage2
Converting it to a likelihood function is straightforward:
karate <- hyper2()
for(i in seq_len(nrow(karate_table))){
p1 <- karate_table$karateka1[i]
p2 <- karate_table$karateka2[i]
w1 <- karate_table$wins1[i]
w2 <- karate_table$wins2[i]
karate[p1] %<>% inc(w1)
karate[p2] %<>% inc(w2)
karate[c(p1,p2)] %<>% dec(w1+w2)
}
summary(karate)
## A hyper2 object of size 82.
## pnames: akondzo alhadharim alkhathami almasatfa avila balcius bodrovs bouakel catro cenaj cheung dacosta daraghma delgado derafshipour dimitrov diouf elsawy ferreras figueira fillies galvez gidakos gill gordillo gorenc gudmundsson gutierrez hasanov hodzic hollowood homola jihwan joksic karkkalainen kavanagh khani krautsou kuhn limam lindelauf loannides maresca matthiasen mendez mertel mora muratov murtazaliev narkiniemi ngamphuengphit nievas noriega nyoni oneil pak pataridze petkov pokorny poshen povrzenic rodionov rodriguiez rolle rosiello sabiecki sawadogo shafei sharma shinohara sijercic suleimani tadissi tawfik teodorescu thomas toli uygur uzakov velozo zakaria zhiwei
## Number of brackets: 147
## Sum of powers: 0
##
## Table of bracket lengths:
## 1 2
## 61 86
##
## Table of powers:
## -13 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 1 2 3 4 5 6 7 8
## 2 1 1 6 8 1 4 11 17 10 8 17 8 11 9 4 6 2 4 2
## 9 10 12 16 17 18 19 20 31
## 3 3 1 2 2 1 1 1 1
In the summary above (in the summary of bracket lengths), we see how the likelihood function comprises only singletons [which appear on the numerator] and pairs [on the denominator].
karate_maxp <- maxp(karate)
dotchart(log10(karate_maxp))
We will define a matrix of win-loss statistics:
pk <- pnames(karate)
M <- matrix(0,length(pk),length(pk))
rownames(M) <- pk
colnames(M) <- pk
o <- karate_table # saves typing
for(i in seq_len(nrow(o))){
M[o[i,1],o[i,3]] %<>% `+`(o[i,2])
M[o[i,3],o[i,1]] %<>% `+`(o[i,4])
}
pM <- pairwise(M)
pM == karate # consistency check; should be TRUE
## [1] TRUE
Now use Zermelo’s iteration method and compare with hyper2’s maxp():
karate_zermelo <- zermelo(M)
loglik(karate_zermelo,karate)
## [1] -160.23
loglik(karate_maxp,karate)
## [1] -188.42
Above we see that Zermelo gives a much more likely point in parameter
space than maxp().
par(pty='s')
plot(log10(karate_zermelo),log10(karate_maxp),asp=1,xlim=c(-4,0),ylim=c(-4,0))
abline(0,1)
grid()
Following lines create karate.rda, residing in the data/ directory
of the package.
save(karate_table,karate,karate_maxp,karate_zermelo,file="karate.rda")
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.