Eurovision Song Contest: semifinal 1, 2009

(takes about 30 minutes to run without cache, most of the time taken processing the maximi chunk)

To cite the hyper2 package in publications, please use Hankin (2017). The file eurovision.txt, used below, is copied from “Eurovision Song Contest 2009,” Wikipedia, accessed May 13, 2018. It refers to semi-final 1. More documentation is given in eurovision.Rd [type help(euro2009) at the R prompt].

First we specify the matrix as appearing in the Wikipedia page:

eurovision_table <- as.matrix(read.table("eurovision.txt"))
eurovision_table

##    ME CZ BE BY SW AM AD CH TR IL BG IS MK RO FI PT MT BA DE UK
## ME NA  0  0  3  0  5  1  2  5  1  0  0  8  0  0  1  6 10  2  0
## CZ  0 NA  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## BE  0  0 NA  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## BY  2  1  0 NA  1  4  0  0  0  4  1  1  6  0  4  0  1  0  0  0
## SW  0  6  4  7 NA  8  7  4  4  7  0 10  3  4 10  8  8  4  4  7
## AM  4 12 10 10  5 NA  0  1 10 10  8  2  2  8  1  0  0  1 10  5
## AD  0  0  0  0  0  0 NA  0  1  0  0  0  0  0  0  4  3  0  0  0
## CH  0  0  0  2  2  0  2 NA  0  0  0  0  0  0  5  2  0  2  0  0
## TR  8  5 12  6  7 10  5 12 NA  6 12  7 12 12  7  5 10 12 12 12
## IL  5  4  3  4  6  7  8  5  3 NA  4  6  1  3  6  0  4  0  5  1
## BG  0  0  0  0  0  0  0  0  2  0 NA  0  5  0  0  0  0  0  0  0
## IS  7 10  7 12 12 12 10  7  8 12  6 NA  4 10 12 12 12  7  6  8
## MK 10  3  0  0  0  0  0  6  6  0 10  0 NA  2  0  0  0  8  0  0
## RO  6  0  2  1  0  2  4  0  7  8  5  4  7 NA  0 10  2  6  1  2
## FI  3  0  1  0 10  0  3  0  0  0  0 12  0  1 NA  3  5  0  0  4
## PT  0  2  6  0  3  0 12 10  0  2  2  8  0  7  2 NA  0  3  7  6
## MT  1  7  8  8  4  3  6  3  0  5  3  5  0  6  3  6 NA  5  3 10
## BA 12  8  5  5  8  6  0  8 12  3  7  3 10  5  8  7  7 NA  8  3

Each row corresponds to a contestant and each column to a judge. Note that the matrix is not square: the last two columns correspond to Germany and the UK, who were judges but not competitors. I have removed the first column from the Wikipedia table (which gave the total of the points), replaced self-voting entries with NA, and replaced blanks with 0. Thus the first row corresponds to votes cast for ME, Montenegro. We see that BY (Belarus) awarded them three points, AM (Armenia) five points, and so on. The first column corresponds to the points awarded by Monetenegro. We see that they placed gave Belarus (BY) two points, Armenia (AM) four points, and so on. Their favourite was Bosnia and Herzegovina (BA) to whom they gave twelve points. The points system used was:

points <- c(12,10,8,7,6,5,4,3,2,1)

Variable points gives the number of points awarded, under Eurovision rules, to voters’ first, second, third, etc choice. Points for any competitor are added and the winner is the competitor with the most points. However, in the hyper2 model, the numerical values themselves do not affect the likelihood function; only the order of the voters’ preferences matters. The following R idiom translates wiki_matrix into a form suitable for analysis with hyper2:

preference <- eurovision_table*0  
for(i in seq_along(points)){ preference[eurovision_table == points[i]] <- i }

countries <- data.frame(
    fullname =
        c("Montenegro", "Czech rep", "Belgium", "Belarus", "Sweden",
          "Armenia", "Andorra", "Switzerland", "Turkey", "Israel",
          "Bulgaria", "Iceland", "Macedonia", "Romania", "Finland",
          "Portugal", "Malta", "Bosnia Herz", "Germany", "UK"),
    abbreviation = c("ME","CZ","BE", "BY", "SW", "AM", "AD", "CH",
                     "TR", "IL", "BG", "IS", "MK", "RO", "FI", "PT",
                     "MT", "BA", "DE", "UK")
)

if(abbreviated){ 
    jj <- countries$abbreviation
} else { 
    jj <- countries$fullname
}

competitors <- as.character(jj[1:18])
colnames(preference) <- jj      # voters; 20 countries (18 + DE + UK)
rownames(preference) <- competitors  

rownames(eurovision_table) <- jj[1:18]
colnames(eurovision_table) <- jj

In the above, matrix preference records voters’ first, second, third, etc choice. A zero entry means no points (nul punkte!) and NA means that voter was forbidden from voting for that player (countries cannot vote for themselves). The competitors were the first 18 countries (the last two countries, Germany and the UK, voted but did not compete).

preference

##    ME CZ BE BY SW AM AD CH TR IL BG IS MK RO FI PT MT BA DE UK
## ME NA  0  0  8  0  6 10  9  6 10  0  0  3  0  0 10  5  2  9  0
## CZ  0 NA  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## BE  0  0 NA  0  0 10  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## BY  9 10  0 NA 10  7  0  0  0  7 10 10  5  0  7  0 10  0  0  0
## SW  0  5  7  4 NA  3  4  7  7  4  0  2  8  7  2  3  3  7  7  4
## AM  7  1  2  2  6 NA  0 10  2  2  3  9  9  3 10  0  0 10  2  6
## AD  0  0  0  0  0  0 NA  0 10  0  0  0  0  0  0  7  8  0  0  0
## CH  0  0  0  9  9  0  9 NA  0  0  0  0  0  0  6  9  0  9  0  0
## TR  3  6  1  5  4  2  6  1 NA  5  1  4  1  1  4  6  2  1  1  1
## IL  6  7  8  7  5  4  3  6  8 NA  7  5 10  8  5  0  7  0  6 10
## BG  0  0  0  0  0  0  0  0  9  0 NA  0  6  0  0  0  0  0  0  0
## IS  4  2  4  1  1  1  2  4  3  1  5 NA  7  2  1  1  1  4  5  3
## MK  2  8  0  0  0  0  0  5  5  0  2  0 NA  9  0  0  0  3  0  0
## RO  5  0  9 10  0  9  7  0  4  3  6  7  4 NA  0  2  9  5 10  9
## FI  8  0 10  0  2  0  8  0  0  0  0  1  0 10 NA  8  6  0  0  7
## PT  0  9  5  0  8  0  1  2  0  9  9  3  0  4  9 NA  0  8  4  5
## MT 10  4  3  3  7  8  5  8  0  6  8  6  0  5  8  5 NA  6  8  2
## BA  1  3  6  6  3  5  0  3  1  8  4  8  2  6  3  4  4 NA  3  8

Now, take the first column of preference. This represents the order of preferences of Montenegro (ME). Their favourite was [last row] Bosnia & Herzegovina (BA), who they gave rank 1 to (that is, 12 points). Their second favourite was Macedonia (MK), their third was Turkey (TR), and so on. They were not allowed to vote for themselves, which is why the first row is NA. So the order was: BA (first), MK (second) , TR (third), IS (fourth), RO (fifth), IL (sixth), AM (seventh), FI (eigthth), BY (ninth), MT (tenth). The other countries (CZ, BE, SW, CH, BG, PT) came joint last and did not receive any points from Montenegro. Note the final two rows corresponding to votes cast by Germany and the UK, who did not compete and therefore cast votes for all competitors.

We need to convert matrix preference into a likelihood function, here eurovision:

eurovision <- hyper2()

for(i in seq_len(ncol(preference))){   # cycle through the rows; each row is a voter
    d <- preference[,i,drop=TRUE]
    eurovision <- eurovision + suppfun(d[!is.na(d)])
} # i loop closes

eurovision_maxp <- maxp(eurovision)
eurovision_maxp

##         AD         AM         BA         BE         BG         BY         CH 
## 4.1652e-03 4.9610e-02 1.0126e-01 1.3499e-03 2.7491e-03 1.5750e-02 8.8204e-03 
##         CZ         FI         IL         IS         ME         MK         MT 
## 1.0028e-06 1.5648e-02 5.0215e-02 2.7924e-01 1.9261e-02 1.2034e-02 5.7317e-02 
##         PT         RO         SW         TR 
## 3.0281e-02 3.3604e-02 7.7127e-02 2.4156e-01

pie(eurovision_maxp)

Figure 1: Maximum likelihood strengths of competitors in the 2009 Eurovision song contest, semifinal 1

Observe the small estimated strength for Czechoslovakia (CZ) at about 1e-06. This is consistent with the second row of preference which shows that noone gave them any points.

consistency(eurovision)

consistency check: maximum likelihood obtained two different ways. Exact agreement corresponds to points lying on the diagonal line. Note the small estimated strength of Czechoslovakia

Figure 2: consistency check: maximum likelihood obtained two different ways. Exact agreement corresponds to points lying on the diagonal line. Note the small estimated strength of Czechoslovakia

Now we can check null of equal strengths:

equalp.test(eurovision)

## 
##  Constrained support maximization
## 
## data:  eurovision
## null hypothesis: AD = AM = BA = BE = BG = BY = CH = CZ = FI = IL = IS = ME = MK = MT = PT = RO = SW = TR
## null estimate:
##       AD       AM       BA       BE       BG       BY       CH       CZ 
## 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 
##       FI       IL       IS       ME       MK       MT       PT       RO 
## 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 0.055556 
##       SW       TR 
## 0.055556 0.055556 
## (argmax, constrained optimization)
## Support for null:  -501.22 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##         AD         AM         BA         BE         BG         BY         CH 
## 4.1652e-03 4.9610e-02 1.0126e-01 1.3499e-03 2.7491e-03 1.5750e-02 8.8204e-03 
##         CZ         FI         IL         IS         ME         MK         MT 
## 1.0028e-06 1.5648e-02 5.0215e-02 2.7924e-01 1.9261e-02 1.2034e-02 5.7317e-02 
##         PT         RO         SW         TR 
## 3.0281e-02 3.3604e-02 7.7127e-02 2.4156e-01 
## (argmax, free optimization)
## Support for alternative:  -378.92 + K
## 
## degrees of freedom: 17
## support difference = 122.3
## p-value: 2.6467e-42

Thus the difference is about 122 units of support, surely significant.

points_scored <- rowSums(eurovision_table,na.rm=TRUE)  # points
ox <- rank(-points_scored)
oy <- rank(-eurovision_maxp)   # m = MLE
oyp <- ordertrans(oy,names(ox))

par(pty='s') # square plot
plot(ox,oyp,asp=1,pty='s',xlim=c(0,18),ylim=c(0,18),pch=16,
xlab="official order",ylab="my order",main="Eurovision 2009")

par(xpd=TRUE) # allow country names to appear outside plotting region
for(i in seq_along(ox)){text(ox[i],oyp[i],names(ox)[i],pos=4,col='gray') }
par(xpd=FALSE) # stop diagonal line from protruding beyond plotting region
abline(0,1)

1 Testing nonindependence using `hyper3` idiom

First we use hyper2:

a <- read.table("euro_ranks.txt",header=TRUE)
jj <- table(a$country)
wanted_countries <- names(jj[jj >= 10])
a <- a[a$country %in% wanted_countries,]
H <- hyper2()
for(y in unique(a$year)){
   H <- H + race((a$country)[a$year == y])
}
summary(H)

## A hyper2 object of size 27.
## pnames:  Albania Armenia Azerbaijan BosHerz Cyprus Denmark Estonia Finland France Germany Greece Hungary Iceland Israel Italy Lithuania Malta Moldova Netherlands Norway Romania Russia Serbia Spain Sweden UK Ukraine 
## Number of brackets: 357 
## Sum of powers: 0 
## 
## Table of bracket lengths:
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 
## 27 13 17 19 19 19 19 19 19 19 19 19 18 18 18 18 17 15 12  5  4  3  1 
## 
## Table of powers:
##  -4  -3  -2  -1   9  10  11  12  13  14  15  16  17 
##   1   1   3 325   2   5   6   2   3   3   1   1   4

Find the evaluate:

mH <- maxp(H)

dotchart(mH)

Now use hyper3 and follow RB_BB_LF() in plackett_luce_monster.Rmd. The blocs follow those of Price 2020.

scandinavia <- c("Denmark","Finland","Iceland","Norway","Sweden")

balkans  <- c("Italy","Serbia","Albania","BosHerz") # red
eastern  <- c("Russia","Ukraine","Moldova","Armenia","Azerbaijan", "Malta","Israel") # purple
black    <- c("Hungary","Romania","Greece","Cyprus") # orange
iberia   <- c("Spain","Germany") # blue
northern <- c("Denmark", "Estonia", "Finland", "France", "Iceland", "Lithuania", "Netherlands", "Norway", "Sweden", "UK") # green

l <- list(balkans=balkans,eastern=eastern,black=black,iberia=iberia,northern=northern)

ab <- c(balkans,eastern,black,iberia,northern)
stopifnot(all(ab %in% wanted_countries))
stopifnot(all(wanted_countries %in% ab))


list2ec <- function(l){
  jj <- stack(setNames(l,seq_along(l)))
  out <- as.integer(c(jj$ind))
  names(out) <- jj$values
  return(out)
}

bloc <- function(lam){
  out <- hyper3()
  e <- list2ec(l)
  for(y in unique(a$year)){
     v <- a$country[a$year==y]
     ec <- e[names(e) %in% v]
     out <- out + cheering3(v=v, e=ec,h=rep(lam,length(unique(e))))
   } 
  return(out)
}

o <- function(lam){  # maximize function o()
   H <- bloc(lam)
   return(maxp(H,give=TRUE,n=1)$likes)
}

lam <- seq(from=0.6,to=2,len=20)
L <- sapply(lam,o)

plot(lam,L-max(L),type="b",pch=16)
abline(h=c(0,-2))
abline(v=1)

Package dataset

Following lines create eurovision.rda, residing in the data/ directory of the package.

save(eurovision_table,eurovision_maxp,eurovision,file="eurovision.rda")

References

A. Price 2020. “Identifying voting blocs in the Eurovision Song Contest”. Towards Data Science, https://medium.com/towards-data-science/identifying-voting-blocs-in-the-eurovision-song-contest-4792065fc337

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.

Eurovision Song Contest: semifinal 1, 2009

R. K. S. Hankin

1 Testing nonindependence using hyper3 idiom

Package dataset

References

1 Testing nonindependence using `hyper3` idiom