soling

To cite the hyper2 package in publications, please use Hankin (2017). Soling is a type of boat raced in the 2000 Summer Olympics. Here I analyse race data taken from wikipedia using a Plackett-Luce extension of the Bradley-Terry model.

soling_table <- ordertable(read.table("soling.txt"))
soling_table

## An ordertable:
##              I II III IV  V VI
## norway       5  2   1  3  5 16
## nz           3  3  10  2  7  3
## netherlands  1  5  13 11  1  1
## USA          4 14   2 16  2  2
## russia       2  8   4  8  6  9
## australia    8 11   6  1  9 14
## UK          13  7   8  4  4 13
## ukraine     12  1   7  6 15 12
## france       9 13  11  9  3  7
## germany     14 10   3  7 10 11
## sweden       7 15   5 10 16  5
## denmark      6 16  12 12 11  4
## canada      10  9   9 13 12  8
## italy       16  6  14  5 14 10
## finland     11  4  15 14 13 15
## spain       15 12  16 15  8  6

Thus we see, reading column 1, that in race 1, norway came fifth, nz third, netherlands first, and so on. We can present this information in a rank table format:

as.ranktable(soling_table)

## A ranktable:
##     c1          c2     c3      c4      c5          c6        c7      c8        c9        c10    
## I   netherlands russia nz      USA     norway      denmark   sweden  australia france    canada 
## II  ukraine     norway nz      finland netherlands italy     UK      russia    canada    germany
## III norway      USA    germany russia  sweden      australia ukraine UK        canada    nz     
## IV  australia   nz     norway  UK      italy       ukraine   germany russia    france    sweden 
## V   netherlands USA    france  UK      norway      russia    nz      spain     australia germany
## VI  netherlands USA    nz      denmark sweden      spain     france  canada    russia    italy  
##     c11         c12     c13         c14       c15     c16    
## I   finland     ukraine UK          germany   spain   italy  
## II  australia   spain   france      USA       sweden  denmark
## III france      denmark netherlands italy     finland spain  
## IV  netherlands denmark canada      finland   spain   USA    
## V   denmark     canada  finland     italy     ukraine sweden 
## VI  germany     ukraine UK          australia finland norway

and also convert it to a likelihood function:

soling <- suppfun(soling_table)
head(soling)

## Warning in print.hyper2(x): powers have nonzero sum

## log(UK^6 * (UK + USA + australia + canada + denmark + finland + france + germany + italy
## + netherlands + norway + nz + russia + spain + sweden)^-1 * (UK + USA + australia +
## canada + denmark + finland + france + germany + italy + netherlands + norway + nz +
## russia + spain + sweden + ukraine)^-6 * (UK + USA + australia + canada + denmark +
## finland + france + germany + italy + netherlands + nz + russia + spain + sweden)^-1 * (UK
## + USA + australia + canada + denmark + finland + france + germany + italy + netherlands +
## nz + russia + spain + sweden + ukraine)^-1 * (UK + USA + australia + canada + denmark +
## finland + france + germany + italy + netherlands + russia + spain + sweden)^-1)

We can also consider observations from the round robins and quarter finals:

soling_rr1 <- read.table("soling_rr1.txt")
soling_rr2 <- read.table("soling_rr2.txt")
soling_qf  <- read.table("soling_qf.txt" )
soling_rr1

##         germany sweden denmark france ukraine    UK
## germany      NA   TRUE   FALSE   TRUE    TRUE  TRUE
## sweden    FALSE     NA    TRUE   TRUE    TRUE  TRUE
## denmark    TRUE  FALSE      NA   TRUE    TRUE FALSE
## france    FALSE  FALSE   FALSE     NA    TRUE  TRUE
## ukraine   FALSE  FALSE   FALSE  FALSE      NA  TRUE
## UK        FALSE  FALSE    TRUE  FALSE   FALSE    NA

Thus, for example, taking soling_rr1[1,2] == TRUE we see that Germany beat Sweden in round robin 1. To incorporate this information we need a little bespoke function:

rr <- function(H,M){
    for(i in seq_len(nrow(M)-1)){
        for(j in (i+1):ncol(M)){
            if(M[i,j]){
                winner <- rownames(M)[i]
                loser  <- rownames(M)[j]
            } else {
                winner <- rownames(M)[j]
                loser <-  rownames(M)[i]
            }
            H[winner] %<>% inc
            H[c(winner,loser)] %<>% dec
        }
    }
    return(H)
}

We can see the effect on the evaluate of incorporating the additional observations into the Plackett-Luce likelihoods:

H <- soling  # before new information
soling_maxp <- maxp(H) 
H %<>% rr(soling_rr1)
H %<>% rr(soling_rr2)
H %<>% rr(soling_qf)
soling_after <- H # after new information
soling_after_maxp <- maxp(soling_after)
par(pty="s")
plot(soling_maxp,soling_after_maxp,asp=1,pch=16,xlim=c(0,0.3),ylim=c(0,0.3))
abline(0,1)

Figure 1: Evaluate before and after incorporation round robin and quarter final information

Figure 1 shows a modest change. We might ask whether the additional observations are consistent with the original Placket-Luce likelihood:

(soling_diff <- soling_after - soling)

## log( UK * (UK + denmark)^-1 * (UK + france)^-1 * (UK + germany)^-1 * (UK + sweden)^-1 *
## (UK + ukraine)^-1 * USA^2 * (USA + australia)^-1 * (USA + denmark)^-2 * (USA +
## germany)^-2 * (USA + netherlands)^-1 * (USA + norway)^-1 * (USA + nz)^-1 * (USA +
## russia)^-1 * (USA + sweden)^-1 * australia * (australia + denmark)^-1 * (australia +
## germany)^-1 * (australia + russia)^-1 * (australia + sweden)^-1 * denmark^11 * (denmark +
## france)^-1 * (denmark + germany)^-3 * (denmark + netherlands)^-1 * (denmark + norway)^-1
## * (denmark + nz)^-1 * (denmark + russia)^-1 * (denmark + sweden)^-2 * (denmark +
## ukraine)^-1 * france^2 * (france + germany)^-1 * (france + sweden)^-1 * (france +
## ukraine)^-1 * germany^11 * (germany + netherlands)^-1 * (germany + norway)^-1 * (germany
## + nz)^-1 * (germany + russia)^-1 * (germany + sweden)^-2 * (germany + ukraine)^-1 *
## netherlands^2 * (netherlands + norway)^-1 * (netherlands + nz)^-1 * norway^2 * (norway +
## nz)^-1 * nz^2 * russia^3 * (russia + sweden)^-1 * sweden^7 * (sweden + ukraine)^-1 *
## ukraine)

So soling_diff is the difference between H— which is soling plus the observations from the roundrobins and quarter finals—and soling without the extra information. In other words, the information in the roundrobins and quarter finals.

equalp.test(soling_diff)

## 
##  Constrained support maximization
## 
## data:  soling_diff
## null hypothesis: australia = canada = denmark = finland = france = germany = italy = netherlands = norway = nz = russia = spain = sweden = UK = ukraine = USA
## null estimate:
##   australia      canada     denmark     finland      france     germany       italy netherlands 
##      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625 
##      norway          nz      russia       spain      sweden          UK     ukraine         USA 
##      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625      0.0625 
## (argmax, constrained optimization)
## Support for null:  -31.192 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##   australia      canada     denmark     finland      france     germany       italy netherlands 
##    0.016034    0.081594    0.138581    0.081594    0.030264    0.138582    0.081594    0.027242 
##      norway          nz      russia       spain      sweden          UK     ukraine         USA 
##    0.027242    0.027242    0.098921    0.081594    0.133623    0.011283    0.011283    0.013330 
## (argmax, free optimization)
## Support for alternative:  -22.405 + K
## 
## degrees of freedom: 15
## support difference = 8.7868
## p-value: 0.28575

Package dataset

Following lines create soling.rda, residing in the data/ directory of the package.

save(soling,soling_after, soling_maxp,soling_after_maxp,soling_table,
     soling_rr1, soling_rr2, soling_qf,
  file="soling.rda")

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.

soling

R. K. S. Hankin

Package dataset

References