The rego dataset

To cite the hyper2 package in publications, please use Hankin (2017). Here we consider the rego dataset drawn from the European Union, as analysed in the rSRD package:

jj  <- as.matrix(read.table("rego.txt",header=TRUE))[1:16,]
jj[1:4, 1:6]

##         Botenga Bompard Ernst Chahim Pereira Buzek
## Budget       12      20    17      0       7    17
## Climate     137     293   232     43      44   136
## Economy      48      79    46     14      14    35
## Energy      121     244   169     48      68   204

[Row 17 refers to summary statistics defined by Sziklai et al, not used here]. Above each row is an issue and each column an author. Entries are the number of times each issue is mentioned by each author.

rego_table <- as.ordertable(apply(jj, 2, rank, ties.method="first"))
rego_table[1:4,1:6]

## An ordertable:
##         Botenga Bompard Ernst Chahim Pereira Buzek
## Budget        1       1     1      1       1     1
## Climate       4       4     4      3       3     3
## Economy       2       2     2      2       2     2
## Energy        3       3     3      4       4     4

rego <- suppfun(rego_table)
rego_maxp <- maxp(rego)

rego_maxp

##         Budget        Climate        Economy         Energy     Enterprise         Future 
##      0.3468696      0.0032471      0.0506116      0.0041656      0.0124536      0.0640485 
##      Geography         Health       Industry Infrastructure         Labour         Mining 
##      0.0343817      0.0213625      0.0136680      0.0538886      0.0483303      0.0189826 
##           Mode        Science       Security         Social 
##      0.1058095      0.0993712      0.0832971      0.0395126

plot(rego_maxp)

plot(log(rego_maxp))

plot(sort(log(rego_maxp)))

pie(rego_maxp)

dotchart(rego_maxp, pch=16)

dotchart(log(rego_maxp), pch=16)

OK now, conditional on rego_maxp, I will calculate the probability [actually, log-likelihood] of observing each of the judges’ order statistics:

getcol <- function(i){
  ordervec2supp(setNames(c(rego_table[,i]),rownames(rego_table)))
}
getcol(1)

## log( Budget * (Budget + Climate + Economy + Energy + Enterprise + Future + Geography +
## Health + Industry + Infrastructure + Labour + Mining + Mode + Science + Security +
## Social)^-1 * (Climate + Economy + Energy + Enterprise + Future + Geography + Health +
## Industry + Infrastructure + Labour + Mining + Mode + Science + Security + Social)^-1 *
## (Climate + Economy + Energy + Enterprise + Future + Geography + Health + Industry +
## Labour + Mining + Mode + Science + Security + Social)^-1 * (Climate + Economy + Energy +
## Enterprise + Geography + Health + Industry + Labour + Mining + Mode + Science + Security
## + Social)^-1 * (Climate + Economy + Energy + Enterprise + Geography + Health + Industry +
## Labour + Mining + Mode + Science + Social)^-1 * (Climate + Economy + Energy + Enterprise
## + Geography + Health + Industry + Labour + Mode + Science + Social)^-1 * (Climate +
## Economy + Energy + Enterprise + Geography + Health + Industry + Labour + Science +
## Social)^-1 * (Climate + Economy + Energy + Enterprise + Geography + Health + Industry +
## Labour + Social)^-1 * (Climate + Energy + Enterprise + Geography + Health + Industry +
## Labour + Social)^-1 * (Climate + Energy + Enterprise + Health + Industry)^-1 * (Climate +
## Energy + Enterprise + Health + Industry + Labour)^-1 * (Climate + Energy + Enterprise +
## Health + Industry + Labour + Social)^-1 * (Climate + Energy + Health)^-1 * (Climate +
## Energy + Health + Industry)^-1 * (Climate + Health)^-1 * Economy * Energy * Enterprise *
## Future * Geography * Health * Industry * Infrastructure * Labour * Mining * Mode *
## Science * Security * Social)

loglik(rego_maxp, getcol(1))

## [1] -23.297

loglik(rego_maxp, getcol(2))

## [1] -21.095

loglik(rego_maxp, getcol(3))

## [1] -22.037

We can see that the probability [actually, log-likelihood] of obtaining the observations of Botenga, Bompard, and Ernst are slightly different from one another. We can calculate this for all the politicians in one vectorized expression:

L <- sapply(seq_len(ncol(rego_table)),
    function(i){loglik(rego_maxp, getcol(i))}
    )

And plot them:

L <- L - max(L)
plot(L, pch=16)
grid()
n <- ncol(rego_table)
mypos <- rep(2,n)
mypos[1] <- 4
text(seq_len(n), L, colnames(rego_table), pos=mypos, col="gray")

Above we see that Kaljurand is a bit of an outlier. Conditional on the evaluate, his is (by far) the least likely order statistic.

We may go slightly further and use the other politicians:

fothers <- function(i){ # column i
  maxp_others <- maxp(suppfun(rego_table[,-i]))
  return(loglik(maxp_others, getcol(i)))
}

Lothers <- sapply(seq_len(ncol(rego_table)), fothers)

plot(Lothers)

Package dataset

Following lines create rego.rda, residing in the data/ directory of the package.

save(rego, rego_maxp, rego_table, file="rego.rda")

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.

The `rego` dataset

R. K. S. Hankin

Package dataset

References