rego datasetTo cite the hyper2 package in publications, please use
Hankin (2017). Here we consider the rego dataset drawn from the
European Union, as analysed in the rSRD package:
jj <- as.matrix(read.table("rego.txt",header=TRUE))[1:16,]
jj[1:4, 1:6]
## Botenga Bompard Ernst Chahim Pereira Buzek
## Budget 12 20 17 0 7 17
## Climate 137 293 232 43 44 136
## Economy 48 79 46 14 14 35
## Energy 121 244 169 48 68 204
[Row 17 refers to summary statistics defined by Sziklai et al, not used here]. Above each row is an issue and each column an author. Entries are the number of times each issue is mentioned by each author.
rego_table <- as.ordertable(apply(jj, 2, rank, ties.method="first"))
rego_table[1:4,1:6]
## An ordertable:
## Botenga Bompard Ernst Chahim Pereira Buzek
## Budget 1 1 1 1 1 1
## Climate 4 4 4 3 3 3
## Economy 2 2 2 2 2 2
## Energy 3 3 3 4 4 4
rego <- suppfun(rego_table)
rego_maxp <- maxp(rego)
rego_maxp
## Budget Climate Economy Energy Enterprise Future
## 0.3468696 0.0032471 0.0506116 0.0041656 0.0124536 0.0640485
## Geography Health Industry Infrastructure Labour Mining
## 0.0343817 0.0213625 0.0136680 0.0538886 0.0483303 0.0189826
## Mode Science Security Social
## 0.1058095 0.0993712 0.0832971 0.0395126
plot(rego_maxp)
plot(log(rego_maxp))
plot(sort(log(rego_maxp)))
pie(rego_maxp)
dotchart(rego_maxp, pch=16)
dotchart(log(rego_maxp), pch=16)
OK now, conditional on rego_maxp, I will calculate the probability
[actually, log-likelihood] of observing each of the judges’ order
statistics:
getcol <- function(i){
ordervec2supp(setNames(c(rego_table[,i]),rownames(rego_table)))
}
getcol(1)
## log( Budget * (Budget + Climate + Economy + Energy + Enterprise + Future + Geography +
## Health + Industry + Infrastructure + Labour + Mining + Mode + Science + Security +
## Social)^-1 * (Climate + Economy + Energy + Enterprise + Future + Geography + Health +
## Industry + Infrastructure + Labour + Mining + Mode + Science + Security + Social)^-1 *
## (Climate + Economy + Energy + Enterprise + Future + Geography + Health + Industry +
## Labour + Mining + Mode + Science + Security + Social)^-1 * (Climate + Economy + Energy +
## Enterprise + Geography + Health + Industry + Labour + Mining + Mode + Science + Security
## + Social)^-1 * (Climate + Economy + Energy + Enterprise + Geography + Health + Industry +
## Labour + Mining + Mode + Science + Social)^-1 * (Climate + Economy + Energy + Enterprise
## + Geography + Health + Industry + Labour + Mode + Science + Social)^-1 * (Climate +
## Economy + Energy + Enterprise + Geography + Health + Industry + Labour + Science +
## Social)^-1 * (Climate + Economy + Energy + Enterprise + Geography + Health + Industry +
## Labour + Social)^-1 * (Climate + Energy + Enterprise + Geography + Health + Industry +
## Labour + Social)^-1 * (Climate + Energy + Enterprise + Health + Industry)^-1 * (Climate +
## Energy + Enterprise + Health + Industry + Labour)^-1 * (Climate + Energy + Enterprise +
## Health + Industry + Labour + Social)^-1 * (Climate + Energy + Health)^-1 * (Climate +
## Energy + Health + Industry)^-1 * (Climate + Health)^-1 * Economy * Energy * Enterprise *
## Future * Geography * Health * Industry * Infrastructure * Labour * Mining * Mode *
## Science * Security * Social)
loglik(rego_maxp, getcol(1))
## [1] -23.297
loglik(rego_maxp, getcol(2))
## [1] -21.095
loglik(rego_maxp, getcol(3))
## [1] -22.037
We can see that the probability [actually, log-likelihood] of obtaining the observations of Botenga, Bompard, and Ernst are slightly different from one another. We can calculate this for all the politicians in one vectorized expression:
L <- sapply(seq_len(ncol(rego_table)),
function(i){loglik(rego_maxp, getcol(i))}
)
And plot them:
L <- L - max(L)
plot(L, pch=16)
grid()
n <- ncol(rego_table)
mypos <- rep(2,n)
mypos[1] <- 4
text(seq_len(n), L, colnames(rego_table), pos=mypos, col="gray")
Above we see that Kaljurand is a bit of an outlier. Conditional on the evaluate, his is (by far) the least likely order statistic.
We may go slightly further and use the other politicians:
fothers <- function(i){ # column i
maxp_others <- maxp(suppfun(rego_table[,-i]))
return(loglik(maxp_others, getcol(i)))
}
Lothers <- sapply(seq_len(ncol(rego_table)), fothers)
plot(Lothers)
Following lines create rego.rda, residing in the data/ directory
of the package.
save(rego, rego_maxp, rego_table, file="rego.rda")
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.