(takes about twenty minutes to run without cache)
To cite the hyper2 package in publications, please use
Hankin (2017). The file eurodance2008.txt, used below, is copied
from “Eurovision Dance Contest 2008”, Wikipedia, accessed March 31,
2021. First we specify the matrix as appearing in the Wikipedia page:
eurodance_table <- as.matrix(read.table("eurodance2008.txt"))
eurodance_table
## Total jury Sweden Austria Denmark Azerbaijan Ireland Finland
## Sweden 38 4 NA 3 10 0 0 7
## Austria 29 0 0 NA 3 2 1 3
## Denmark 102 48 8 7 NA 1 3 8
## Azerbaijan 106 28 5 8 0 NA 7 1
## Ireland 40 0 0 0 4 6 NA 2
## Finland 44 12 12 0 6 5 0 NA
## Netherlands 1 0 0 1 0 0 0 0
## Lithuania 110 32 7 0 7 4 10 6
## UK 47 8 1 4 5 3 8 0
## Russia 121 24 6 6 2 8 4 12
## Greece 72 40 4 2 0 0 2 5
## Portugal 61 0 3 5 1 7 6 0
## Poland 154 20 10 12 12 10 12 10
## Ukraine 119 16 2 10 8 12 5 4
## Netherlands Lithuania UK Russia Greece Portugal Poland Ukraine
## Sweden 0 1 2 2 2 3 4 0
## Austria 0 4 5 4 5 1 0 1
## Denmark 2 6 4 0 1 7 2 5
## Azerbaijan 4 12 1 10 6 4 12 8
## Ireland 0 5 8 7 0 0 6 2
## Finland 1 3 0 0 0 2 3 0
## Netherlands NA 0 0 0 0 0 0 0
## Lithuania 5 NA 10 5 4 5 8 7
## UK 10 0 NA 1 3 0 1 3
## Russia 8 10 0 NA 12 10 7 12
## Greece 3 0 3 3 NA 6 0 4
## Portugal 6 2 7 6 7 NA 5 6
## Poland 12 8 12 8 10 8 NA 10
## Ukraine 7 7 6 12 8 12 10 NA
o <- eurodance_table # saves typing
Each row corresponds to a contestant and each column to a judge. The points system used was:
points_voters <- c(12,10,8,7,6,5,4,3,2,1)
points_jury <- 4*points_voters
ov <- o[,-(1:2)] # ='o' but just the voters: exclude total and jury
pref <- ov*0 # retains NA
for(i in seq_along(points_voters)){pref[ov == points_voters[i]] <- i}
pref
## Sweden Austria Denmark Azerbaijan Ireland Finland Netherlands
## Sweden NA 8 2 0 0 4 0
## Austria 0 NA 8 9 10 8 0
## Denmark 3 4 NA 10 8 3 9
## Azerbaijan 6 3 0 NA 4 10 7
## Ireland 0 0 7 5 NA 9 0
## Finland 1 0 5 6 0 NA 10
## Netherlands 0 10 0 0 0 0 NA
## Lithuania 4 0 4 7 2 5 6
## UK 10 7 6 8 3 0 2
## Russia 5 5 9 3 7 1 3
## Greece 7 9 0 0 9 6 8
## Portugal 8 6 10 4 5 0 5
## Poland 2 1 1 2 1 2 1
## Ukraine 9 2 3 1 6 7 4
## Lithuania UK Russia Greece Portugal Poland Ukraine
## Sweden 10 9 9 9 8 7 0
## Austria 7 6 7 6 10 0 10
## Denmark 5 7 0 10 4 9 6
## Azerbaijan 1 10 2 5 7 1 3
## Ireland 6 3 4 0 0 5 9
## Finland 8 0 0 0 9 8 0
## Netherlands 0 0 0 0 0 0 0
## Lithuania NA 2 6 7 6 3 4
## UK 0 NA 10 8 0 10 8
## Russia 2 0 NA 1 2 4 1
## Greece 0 8 8 NA 5 0 7
## Portugal 9 4 5 4 NA 6 5
## Poland 3 1 3 2 3 NA 2
## Ukraine 4 5 1 3 1 2 NA
Now we treat each column as an order statistic:
eurodance <- hyper2()
for(i in seq_len(ncol(pref))){
x <- pref[,i]
eurodance %<>% `+`(suppfun(x[!is.na(x)]))
} # i loop closes
eurodance_maxp <- maxp(eurodance)
eurodance_maxp
## Austria Azerbaijan Denmark Finland Greece Ireland
## 0.017616 0.058974 0.030323 0.011992 0.018569 0.017431
## Lithuania Netherlands Poland Portugal Russia Sweden
## 0.059340 0.001335 0.481731 0.047283 0.090384 0.015828
## UK Ukraine
## 0.019494 0.129700
pie(eurodance_maxp)
Note the dominance of Poland. We verify consistency:
consistency(eurodance)
Now we consider the jury’s vote:
(jj <- o[,2])
## Sweden Austria Denmark Azerbaijan Ireland Finland
## 4 0 48 28 0 12
## Netherlands Lithuania UK Russia Greece Portugal
## 0 32 8 24 40 0
## Poland Ukraine
## 20 16
jj <- sort(jj,decreasing=TRUE)
jj[jj>0] <- seq_len(sum(jj>0))
jury <- jj
jury
## Denmark Greece Lithuania Azerbaijan Russia Poland
## 1 2 3 4 5 6
## Ukraine Finland UK Sweden Austria Ireland
## 7 8 9 10 0 0
## Netherlands Portugal
## 0 0
Now the question is whether the partial rank conferred by the jury is
consistent with that of the voters. The permutation test technique
used in cook.Rmd is not appropriate here as there are \(14!/4!\simeq
3.6\times 10^9\) permutations to consider; we will use a randomised
sampling method. Below, we use a resampling technique: 4000
randomised jury observations are created, and the likelihood [under
the Plackett-Luce likelihood function for the order statistics of the
nationals’ voters] is calculated. The \(p\)-value is then the
probability of observing the actual observation or an observation more
unlikely under this likelihood function; it is operationally identical
to a simulated Fisher’s exact test.
n <- 4000
LL <- rep(0,n)
for(i in seq_len(n)){
jjstar <- sample(jury)
names(jjstar) <- names(jury)
LL[i] <- loglik(indep(eurodance_maxp),suppfun(jjstar),log=FALSE)
}
plot(sort(LL),log="y")
p <- loglik(indep(eurodance_maxp),suppfun(jury),log=FALSE)
abline(h=p)
sum(LL[LL<p])/sum(LL)
## [1] 0.025063
The \(p\)-value of about 2.4% shows that there is strong evidence to suggest that the expert jury’s ordering differs from that of the national popular vote.
Following lines create eurodance.rda, residing in the data/ directory of the package.
save(eurodance_table,eurodance_maxp,eurodance,file="eurodance.rda")
###R References {-}
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.