To cite the hyper2 package in publications, please use Hankin (2017). Here I modify standard Plackett-Luce likelihood functions to include an element of non-independence among the competitors. There are several distinct statistical models for such non-independence:

1 Start simple: three runners

The simplest nontrivial case is three runners, \(\left\lbrace 1,2,3\right\rbrace\), where runner 1 helps runner 2 with multiplier \(\lambda\) [and that is the only non-PL interaction]. Note that this is not a symmetric interaction: 1 helps 2, but 2 does not help 1. The model is that 1 helps 2 if and only if 1 has finished and 2 has not.

123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)

213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_3}\)

231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{p_1+p_3}\)

312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)

321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)

(The six probabilities above clearly sum to 1: the two orders sharing each winner combine to \(p_i/(p_1+p_2+p_3)\).) Next simplest: 1 and 2 mutually support each other, each with multiplier \(\lambda\):

123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)

213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{\lambda p_1}{\lambda p_1+p_3}\)

231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_1+p_3}\)

312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)

321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)

Now we retain the mutual support between 1 and 2 but allow an asymmetry: 1 helps 2 with \(\lambda\) as before, but now in addition, 2 helps 1 with \(\mu\):

123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)

213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{\mu p_1}{\mu p_1+p_3}\)

231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{\mu p_1+p_3}\)

312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)

321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)

Now suppose 1 must beat 2. He does not care whether he comes first or second, only that he beats 2. Player 1 has sufficient energy to win if he wants, but is subject to some sort of resource constraint (money, perhaps) and wishes to expend only enough to beat 2. So the conventional P-L likelihoods are modified for 1, but only while 2 is still running.

123: \(\frac{\lambda p_1}{\lambda p_1+p_2+p_3}\cdot\frac{p_2}{p_2+p_3}\)

132: \(\frac{\lambda p_1}{\lambda p_1+p_2+p_3}\cdot\frac{p_3}{p_2+p_3}\)

213: \(\frac{p_2}{\lambda p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_3}\)

231: \(\frac{p_2}{\lambda p_1+p_2+p_3}\cdot\frac{p_3}{p_1+p_3}\)

312: \(\frac{p_3}{\lambda p_1+p_2+p_3}\cdot\frac{\lambda p_1}{\lambda p_1+p_2}\)

321: \(\frac{p_3}{\lambda p_1+p_2+p_3}\cdot\frac{p_2}{\lambda p_1+p_2}\)
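The six probabilities in this model again sum to 1 for any \(\lambda\): the \(\lambda p_1\) terms cancel pairwise. A minimal base-R sketch of this check (the function name check_beat is mine, not package idiom):

```r
## Normalization check for the "1 must beat 2" model: 1's strength is
## lambda*p1 while 2 is still running, and reverts to p1 once 2 has
## finished.  The six order probabilities should sum to 1.
check_beat <- function(p1, p2, p3, lambda){
  S <- lambda*p1 + p2 + p3
  lambda*p1/S * p2/(p2 + p3)        +   # 123
  lambda*p1/S * p3/(p2 + p3)        +   # 132
  p2/S * p1/(p1 + p3)               +   # 213
  p2/S * p3/(p1 + p3)               +   # 231
  p3/S * lambda*p1/(lambda*p1 + p2) +   # 312
  p3/S * p2/(lambda*p1 + p2)            # 321
}
check_beat(0.5, 0.3, 0.2, 1.888)   # 1
```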

1.1 Four runners

Now we have four runners \(\left\lbrace 1,2,3,4\right\rbrace\), again with the only interaction being runner 1 helping runner 2. Thus, in a situation where runner 1 has finished and runner 2 is still running, we have \(p_2\longrightarrow\lambda p_2\) [although the \(\lambda\) cancels if 2 is the only competitor still running, that is, if all the remaining competitors are helped].

1234: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\)

1243: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\)

1324: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\)

1342: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\)

1423: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

1432: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3}\)

2xxx: P-L [no help given by 2 to 1]

3124: \(\frac{p_3}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\)

3142: \(\frac{p_3}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\)

32xx: P-L

3412: P-L [1 helps 2 but when 1 crosses the finishing line, only 2 is still running; there are no non-helped competitors]

4123: \(\frac{p_4}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

4132: \(\frac{p_4}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)

42xx: P-L

43xx: P-L
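The four-runner expressions above can be verified by direct enumeration: summing the likelihoods of all 24 finishing orders under the rule that 2's strength becomes \(\lambda p_2\) whenever 1 has finished and 2 has not gives exactly 1. A base-R sketch (prob_order is my own name, not package idiom):

```r
## Likelihood of one finishing order under the "1 helps 2" rule:
## 2's strength is multiplied by lambda whenever 1 has finished
## and 2 has not.
prob_order <- function(ord, p, lambda){
  out <- 1
  finished <- rep(FALSE, length(p))
  for(i in ord){
    w <- p
    if(finished[1] && !finished[2]){ w[2] <- lambda * p[2] }
    out <- out * w[i] / sum(w[!finished])
    finished[i] <- TRUE
  }
  out
}
## all 24 permutations of four runners:
perms <- expand.grid(1:4, 1:4, 1:4, 1:4)
perms <- perms[apply(perms, 1, function(r){ length(unique(r)) == 4 }), ]
p <- c(0.4, 0.3, 0.2, 0.1)
sum(apply(perms, 1, prob_order, p = p, lambda = 1.7))  # 1
```

Note that prob_order(c(3,4,1,2), p, lambda) is free of \(\lambda\), consistent with the observation above that 3412 is plain P-L.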

Now 1 and 2 mutually support one another with \(\lambda\). Here I include the last term explicitly, to show that the \(\lambda\) remains in both numerator and denominator.

1234: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\cdot\frac{p_4}{p_4}\)

1243: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\cdot\frac{p_3}{p_3}\)

1324: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\cdot\frac{p_4}{p_4}\)

1342: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\cdot\frac{\lambda p_2}{\lambda p_2}\)

1423: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\cdot\frac{p_3}{p_3}\)

1432: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2}\)

2134: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\cdot\frac{p_4}{p_4}\)

2143: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\cdot\frac{p_3}{p_3}\)

2314: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_1+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_4}\cdot\frac{p_4}{p_4}\)

2341: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_1+p_3+p_4}\cdot\frac{p_4}{\lambda p_1+p_4}\cdot\frac{\lambda p_1}{\lambda p_1}\)

We will calculate a Plackett-Luce likelihood function for four runners \(\left\lbrace 1,2,3,4\right\rbrace\) but with a reified Bradley-Terry monster that corresponds to having subsets of mutual friends who support one another. For example, we might have runners 1 and 2 being friends, and runners 3 and 4 being friends: we would represent this as \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3,4\right\rbrace\right\rbrace\).

The runner metaphor is not the only one. We can apply the resulting likelihood functions to other cases including the red bus-blue bus problem (see inst/red_bus_blue_bus.Rmd for more discussion).

First do three runners, and suppose the equivalence classes are \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3\right\rbrace\right\rbrace\).

123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_2+M}{p_2+p_3+M}\qquad\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)

132: \(\frac{p_1}{p_1+p_3+p_2}\cdot\frac{p_3}{p_3+p_2+M}\qquad\frac{p_1}{p_1+p_3+p_2}\cdot\frac{p_3}{p_3+\lambda p_2}\)

213: \(\frac{p_2}{p_2+p_1+p_3}\cdot\frac{p_1+M}{p_1+p_3+M}\qquad\frac{p_2}{p_2+p_1+p_3}\cdot\frac{\lambda p_1}{\lambda p_1+p_3}\)

231: \(\frac{p_2}{p_2+p_3+p_1}\cdot\frac{p_3}{p_3+p_1+M}\qquad\frac{p_2}{p_2+p_1+p_3}\cdot\frac{p_3}{p_3+\lambda p_1}\)

312: \(\frac{p_3}{p_3+p_1+p_2}\cdot\frac{p_1}{p_2+p_1}\qquad\frac{p_3}{p_3+p_1+p_2}\cdot\frac{p_1}{p_1+p_2}\)

321: \(\frac{p_3}{p_3+p_2+p_1}\cdot\frac{p_2}{p_2+p_1}\qquad\frac{p_3}{p_3+p_2+p_1}\cdot\frac{p_2}{p_2+p_1}\)
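Both formulations above normalize: in the additive-\(M\) column the two orders sharing each winner again combine to \(p_i/(p_1+p_2+p_3)\). A quick base-R check (check_M is my own name, not package idiom):

```r
## Normalization check for the additive formulation with equivalence
## classes {{1,2},{3}}: a finished competitor adds M to the strength
## of any classmate still running.
check_M <- function(p1, p2, p3, M){
  S <- p1 + p2 + p3
  p1/S * (p2+M)/(p2+p3+M) +   # 123
  p1/S * p3/(p2+p3+M)     +   # 132
  p2/S * (p1+M)/(p1+p3+M) +   # 213
  p2/S * p3/(p1+p3+M)     +   # 231
  p3/S * p1/(p1+p2)       +   # 312
  p3/S * p2/(p1+p2)           # 321
}
check_M(0.5, 0.3, 0.2, 0.7)   # 1
```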

Now four runners with classes \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3,4\right\rbrace\right\rbrace\), writing \(i\) as shorthand for \(p_i\).

1234: \(\frac{1}{1+2+3+4}\cdot\frac{2+M}{2+3+4+M}\cdot\frac{3}{3+4}\)

1243: \(\frac{1}{1+2+4+3}\cdot\frac{2+M}{2+4+3+M}\cdot\frac{4}{4+3}\)

1324: \(\frac{1}{1+3+2+4}\cdot\frac{3}{3+2+4+M}\cdot\frac{2}{2+4+M}\)

1342: \(\frac{1}{1+3+4+2}\cdot\frac{3}{3+4+2+M}\cdot\frac{4+M}{2+4+M}\)

1423: \(\frac{1}{1+4+2+3}\cdot\frac{4}{4+2+3+M}\cdot\frac{2}{2+3}\)

1432: \(\frac{1}{1+4+3+2}\cdot\frac{4}{4+3+2+M}\cdot\frac{3}{2+3}\)

2134: \(\frac{2}{2+1+3+4}\cdot\frac{1+M}{1+3+4+M}\cdot\frac{3}{3+4}\)

2143: \(\frac{2}{2+1+4+3}\cdot\frac{1+M}{1+4+3+M}\cdot\frac{4}{4+3}\)

2314: \(\frac{2}{2+3+1+4}\cdot\frac{3}{3+1+4+M}\cdot\frac{1}{1+4+M}\)

2341: \(\frac{2}{2+3+4+1}\cdot\frac{3}{3+4+1+M}\cdot\frac{4+M}{1+4+M}\)

2413: \(\frac{2}{2+4+1+3}\cdot\frac{4}{4+1+3+M}\cdot\frac{1}{1+3}\)

2431: \(\frac{2}{2+4+3+1}\cdot\frac{4}{4+3+1+M}\cdot\frac{3}{1+3}\)

etc.

Now 123 tethered, 45 tethered:

12345: \(\frac{1}{1+2+3+4+5}\cdot\frac{2+M}{2+3+4+5+M}\cdot\frac{3+M}{3+4+5+M}\cdot\frac{4}{4+5}\)

12354: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{3+M}{3+5+4+M}\cdot\frac{5}{5+4}\)

12435: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{4}{3+5+4+M}\cdot\frac{3}{3+5+M}\)

12453: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{4}{3+5+4+M}\cdot\frac{5+M}{5+3+M}\)

12534: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+5+3+4+M}\cdot\frac{5}{3+5+4+M}\cdot\frac{3}{3+4+M}\)

12543: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+5+4+3+M}\cdot\frac{5}{3+5+4+M}\cdot\frac{4+M}{4+3+M}\)

It might be better to consider the \({5\choose 2\,3}=10\) arrangements \(aaabb,\ldots,bbaaa\). FOLLOWING IS NOT RIGHT: THE PROBABILITY MODEL IS NOT PRECISELY DEFINED.

aaabb: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{a+M}{a+b+b+M}\cdot\frac{b}{b+b+M}\)

aabab: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{b}{a+b+b+M}\cdot\frac{a}{a+b+M}\)

aabba: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{b}{a+b+b+M}\cdot\frac{b+M}{b+a+M}\)

abaab: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{a}{a+a+b+M}\cdot\frac{a+M}{a+b+M}\)

ababa: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{a}{a+a+b+M}\cdot\frac{b}{b+a+M}\)

abbaa: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{b+M}{b+a+a+M}\cdot\frac{a}{a+a+M}\)

baaab: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{a+M}{a+a+b+M}\cdot\frac{a+M}{a+b+M}\)

baaba: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{a+M}{a+a+b+M}\cdot\frac{b}{a+b+M}\)

babaa: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{b}{b+a+a+M}\cdot\frac{a}{a+a}\)

bbaaa: \(\frac{b}{a+a+a+b+b}\cdot\frac{b+M}{a+a+a+b+M}\cdot\frac{a}{a+a+a}\cdot\frac{a}{a+a}\)

f <- function(a, b, M){
  ## Sum of the ten aaabb-pattern probabilities above; the factor
  ## factorial(3)*factorial(2) counts the distinguishable finishing
  ## orders within each pattern (three a's and two b's).
  return(
  factorial(3)*factorial(2)*(
   a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(a+M)/(a+b+b+M)*(b  )/(b+b+M) +  # aaabb
   a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(b  )/(a+b+b+M)*(a  )/(a+b+M) +  # aabab
   a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(b  )/(a+b+b+M)*(b+M)/(b+a+M) +  # aabba
   a/(a+a+a+b+b)*(b  )/(a+a+b+b+M)*(a  )/(a+a+b+M)*(a+M)/(a+b+M) +  # abaab
   a/(a+a+a+b+b)*(b  )/(a+a+b+b+M)*(a  )/(a+a+b+M)*(b  )/(b+a+M) +  # ababa
   a/(a+a+a+b+b)*(b  )/(a+a+b+b+M)*(b+M)/(b+a+a+M)*(a  )/(a+a+M) +  # abbaa
   b/(a+a+a+b+b)*(a  )/(a+a+a+b+M)*(a+M)/(a+a+b+M)*(a+M)/(a+b+M) +  # baaab
   b/(a+a+a+b+b)*(a  )/(a+a+a+b+M)*(a+M)/(a+a+b+M)*(b  )/(a+b+M) +  # baaba
   b/(a+a+a+b+b)*(a  )/(a+a+a+b+M)*(b  )/(b+a+a  )*(a  )/(a+a  ) +  # babaa
   b/(a+a+a+b+b)*(b+M)/(a+a+a+b+M)*(a  )/(a+a+a  )*(a  )/(a+a  )    # bbaaa
  )
  )
}
f(4.4,3.3, 0.2)*12
## [1] 12.09
f(4.4,3.3, 0.0)*12
## [1] 12

With \(M=0\) the ten pattern probabilities sum to exactly 1 (so the display gives 12), but with \(M>0\) they do not: further evidence that this additive model is not precisely defined.

Figure 1.1: Partial probability tree for five competitors \(a-e\) with \(a\) supporting \(b\), hyper3 approach


Figure 1.2: Partial probability tree for five competitors \(a-e\) with mutually supporting group \(abc\), hyper3 approach. If any competitor in the set {\(a,b,c\)} finishes, they lend their strength to the other competitors in the set who are still running


Figure 1.3: Partial probability tree for five competitors \(a\)-\(e\) with mutually supporting subsets \(\left\lbrace a,b,c\right\rbrace\) [with support term \(\lambda\)] and \((de)\) [with support term \(\mu\)], hyper3 approach

2 Package idiom

We can investigate the red bus-blue bus phenomenon (as discussed, in a slightly different context, in inst/red_bus_blue_bus.Rmd). Here, we consider a person who is given the choice of five transport methods: car (C), train (T), red bus (RB), blue bus (BB), and walking (W).

Now, he does not really care what colour the bus is. If we ask him to rank his options, it is highly probable that he will put RB and BB consecutively (because they are essentially indistinguishable). Can we quantify the strength of this effect? To do this, we define a bespoke function RB_BB_LF() which returns a hyper3 log-likelihood function corresponding to repeated observations of our commuter’s reported ranks for the five options:

`RB_BB_LF` <- function(s){
    ec <- c(C=1,T=2,RB=3,BB=3,W=4) # equivalence classes
    h <- c(1,1,s,1)                # strength of support
    (
        cheering3(v=c("RB","BB","C" ,"T","W"),e=ec,h=h)*3 + 
        cheering3(v=c("BB","RB","T" ,"C","W"),e=ec,h=h)*2 + 
        cheering3(v=c("T" ,"BB","RB","C","W"),e=ec,h=h)*2 + 
        cheering3(v=c("W" ,"BB","RB","T","C"),e=ec,h=h)*4 + 
        cheering3(v=c("C" ,"RB","BB","W","T"),e=ec,h=h)*4 + 
        cheering3(v=c("BB","C" ,"RB","T","W"),e=ec,h=h)*3
    )
}

Above, we see from the function body that he reported RB,BB,C,T,W three times [first row], BB,RB,T,C,W twice [second row], and so on; perhaps his ranking depends on the weather or how tired he is on any given day. Observe that in almost every case he ranks RB and BB consecutively. Function RB_BB_LF() takes argument s that quantifies the perceived similarity between RB and BB. For example:

(H <- RB_BB_LF(s=1.8888))
## log( (BB=1)^11 * (BB=1, C=1, RB=1, T=1)^-4 * (BB=1, C=1, RB=1, T=1, W=1)^-18 * (BB=1, C=1, RB=1, W=1)^-2 *
## (BB=1, RB=1, T=1, W=1)^-4 * (BB=1.8888)^7 * (BB=1.8888, C=1, T=1, W=1)^-3 * (BB=1.8888, T=1, W=1)^-4 *
## (C=1)^14 * (C=1, RB=1.8888, T=1)^-4 * (C=1, RB=1.8888, T=1, W=1)^-5 * (C=1, RB=1.8888, W=1)^-2 * (C=1,
## T=1)^-4 * (C=1, T=1, W=1)^-5 * (C=1, W=1)^-4 * (RB=1)^7 * (RB=1.8888)^11 * (RB=1.8888, T=1, W=1)^-3 *
## (T=1)^14 * (T=1, W=1)^-10 * (W=1)^8)
(mH <- maxp(H,n=1))
##       BB        C       RB        T        W 
## 0.426153 0.147463 0.281683 0.091811 0.052890

Now to find a profile likelihood function for s:

o <- function(s){maxp(RB_BB_LF(s),give=TRUE,n=1)$likes} # optimand
s <- exp(seq(from=log(1.3),to=log(47),len=17)) # putative similarity measures
L <- sapply(s,o)
L <- L-max(L)

We can plot these:

plot(s,L,type="b")
abline(h=c(0,-2))
abline(v=1)

plot(log(s),L,type="b")
abline(h=c(0,-2))
abline(v=0)

And formally maximize the likelihood:

(osup <- optimize(o,c(6,10),maximum=TRUE))
## $maximum
## [1] 7.0406
## 
## $objective
## [1] -66.406

So a likelihood ratio test of the null that \(s=1\) would be:

(suppdiff <- o(osup$maximum) - o(1))
## [1] 3.7514

This easily satisfies Edwards's two-units-of-support criterion; Wilks's theorem gives an asymptotic \(p\)-value:

pchisq(suppdiff*2,df=1,lower.tail=FALSE)
## [1] 0.00616

Now evaluate the maximum likelihood estimate of the strengths at the optimal value of \(s\):

maxHmax <- maxp(RB_BB_LF(s = osup$maximum))
maxHmax
##       BB        C       RB        T        W 
## 0.393591 0.195070 0.215299 0.124513 0.071527

3 University ranking analysis

Here we use a dataset of university rankings taken from https://github.com/arnaudbenard/university-ranking/blob/master/timesData.csv, saved locally as timesData.txt.

a <- read.table("timesData.txt",sep=",", header=TRUE)
wanted <- c("California Institute of Technology", "Harvard University", 
"Massachusetts Institute of Technology", "Princeton University", 
"Stanford University", "University of Cambridge", "University of Oxford")
names(wanted) <- c("cal","harv","mass","prin","stan","cam","ox")

a <- a[a$university_name %in% wanted,]
a <- cbind(a,"top7rank"=0)
for(y in unique(a$year)){
    a[a$year==y, "top7rank"] <- order(
        as.numeric(a[a$year==y, "world_rank"]) +
        a[a$year==y, "research"]/1e6,
        decreasing=TRUE)
}
a <- a[,c("top7rank","university_name","year")]
a <- reshape(a,idvar="university_name",timevar="year",direction="wide")
for(i in seq_len(nrow(a))){
   a$university_name[i] <- names(which(wanted == a$university_name[i]))
}
rownames(a) <- a$university_name
a <- a[,-1]
colnames(a) <- paste("Y",2011:2016,sep="")
a
##      Y2011 Y2012 Y2013 Y2014 Y2015 Y2016
## harv     6     6     4     6     6     2
## cal      7     7     7     7     7     7
## mass     5     1     2     2     2     3
## stan     4     5     6     4     4     5
## prin     3     3     3     3     1     1
## cam      2     2     1     1     3     4
## ox       1     4     5     5     5     6
H <- ordertable2supp(a)
equalp.test(H)
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: cal = cam = harv = mass = ox = prin = stan
## null estimate:
##     cal     cam    harv    mass      ox    prin    stan 
## 0.14286 0.14286 0.14286 0.14286 0.14286 0.14286 0.14286 
## (argmax, constrained optimization)
## Support for null:  -51.151 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##      cal      cam     harv     mass       ox     prin     stan 
## 0.000001 0.348942 0.015828 0.265173 0.038985 0.287311 0.043760 
## (argmax, free optimization)
## Support for alternative:  -29.415 + K
## 
## degrees of freedom: 6
## support difference = 21.736
## p-value: 9.4034e-08
samep.test(H,c("ox","cam"))
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: ox = cam
## null estimate:
##        cal        cam       harv       mass         ox       prin       stan 
## 1.0002e-06 1.3346e-01 2.9711e-02 2.8104e-01 1.3346e-01 3.4667e-01 7.5669e-02 
## (argmax, constrained optimization)
## Support for null:  -33.153 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##      cal      cam     harv     mass       ox     prin     stan 
## 0.000001 0.348942 0.015828 0.265173 0.038985 0.287311 0.043760 
## (argmax, free optimization)
## Support for alternative:  -29.415 + K
## 
## degrees of freedom: 1
## support difference = 3.7385
## p-value: 0.0062492

Start to use hyper3 idiom:

H3 <- function(oxcam){
  out <- hyper3()
  for(i in seq_len(ncol(a))){
    jj <- rep("",nrow(a))
    jj[a[,i]] <- rownames(a)
    out <- out + cheering3(v=jj,e=c(ox=1,cam=1,prin=2, stan=3, mass=4, harv=5, cal=6), help=c(oxcam,1,1,1,1,1))
  }
  return(out)
}
o <- function(oxcam){maxp(H3(oxcam),give=TRUE,n=1)$likes}
oc <- exp(seq(from=log(0.5),to=log(5),len=15))
L <- sapply(oc,o)
L <- L - max(L)
plot(log(oc),L,type="b")
abline(v=0)

4 Five nations championship

The five nations rugby championship was held from 1910 to 1999 and file five_nations.txt shows the order statistic for England (E), Scotland (S), Ireland (I), Wales (W), and France (F).

https://en.wikipedia.org/wiki/Six_Nations_Championship

Here is a hyper2 analysis:

a <- as.matrix(read.table("five_nations.txt",header=FALSE))
head(a)
##      V1     V2  V3  V4  V5  V6 
## [1,] "1910" "E" "W" "S" "I" "F"
## [2,] "1911" "W" "I" "E" "F" "S"
## [3,] "1912" "E" "I" "S" "W" "F"
## [4,] "1913" "E" "W" "S" "I" "F"
## [5,] "1914" "E" "W" "I" "S" "F"
## [6,] "1920" "W" "S" "E" "F" "I"
H <- hyper2()
for(i in seq_len(nrow(a))){
  H <- H + race(a[i,-1])
} 
mH <- maxp(H)
pie(mH)

equalp.test(H)
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: E = F = I = S = W
## null estimate:
##   E   F   I   S   W 
## 0.2 0.2 0.2 0.2 0.2 
## (argmax, constrained optimization)
## Support for null:  -335.12 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##       E       F       I       S       W 
## 0.22044 0.21524 0.15607 0.15988 0.24837 
## (argmax, free optimization)
## Support for alternative:  -331.03 + K
## 
## degrees of freedom: 4
## support difference = 4.0955
## p-value: 0.08483

Now use hyper3 to see whether teams do better following a win:

rugby <- function(lambda){
  H3a <- hyper3()
  for(i in seq(from=2,to=nrow(a))){
        last_year_winner <- a[i-1,2] # e.g. "W" or "E"
        H3a <- H3a + ordervec2supp3a(a[i,-1],nonfinishers=NULL,helped=last_year_winner,lambda=lambda)
  }
  return(H3a)
}
rugby(1.888)
## log( (E=1)^38 * (E=1, F=1)^-7 * (E=1, F=1, I=1)^-6 * (E=1, F=1, I=1, S=1)^-23 * (E=1, F=1, I=1, S=1,
## W=1)^-69 * (E=1, F=1, I=1, W=1)^-6 * (E=1, F=1, S=1)^-4 * (E=1, F=1, S=1, W=1)^-6 * (E=1, F=1, W=1)^-4 *
## (E=1, I=1)^-7 * (E=1, I=1, S=1)^-11 * (E=1, I=1, S=1, W=1)^-16 * (E=1, I=1, W=1)^-4 * (E=1, S=1)^-9 * (E=1,
## S=1, W=1)^-7 * (E=1, W=1)^-2 * (E=1.888)^18 * (F=1)^42 * (F=1, I=1)^-4 * (F=1, I=1, S=1)^-12 * (F=1, I=1,
## S=1, W=1)^-18 * (F=1, I=1, W=1)^-5 * (F=1, S=1)^-10 * (F=1, S=1, W=1)^-5 * (F=1, W=1)^-5 * (F=1.888)^16 *
## (I=1)^45 * (I=1, S=1)^-11 * (I=1, S=1, W=1)^-11 * (I=1, W=1)^-10 * (I=1.888)^6 * (S=1)^47 * (S=1, W=1)^-4 *
## (S=1.888)^6 * (W=1)^35 * (W=1.888)^23)
rugby(1.111111)
## log( (E=1)^38 * (E=1, F=1)^-7 * (E=1, F=1, I=1)^-6 * (E=1, F=1, I=1, S=1)^-23 * (E=1, F=1, I=1, S=1,
## W=1)^-69 * (E=1, F=1, I=1, W=1)^-6 * (E=1, F=1, S=1)^-4 * (E=1, F=1, S=1, W=1)^-6 * (E=1, F=1, W=1)^-4 *
## (E=1, I=1)^-7 * (E=1, I=1, S=1)^-11 * (E=1, I=1, S=1, W=1)^-16 * (E=1, I=1, W=1)^-4 * (E=1, S=1)^-9 * (E=1,
## S=1, W=1)^-7 * (E=1, W=1)^-2 * (E=1.1111)^18 * (F=1)^42 * (F=1, I=1)^-4 * (F=1, I=1, S=1)^-12 * (F=1, I=1,
## S=1, W=1)^-18 * (F=1, I=1, W=1)^-5 * (F=1, S=1)^-10 * (F=1, S=1, W=1)^-5 * (F=1, W=1)^-5 * (F=1.1111)^16 *
## (I=1)^45 * (I=1, S=1)^-11 * (I=1, S=1, W=1)^-11 * (I=1, W=1)^-10 * (I=1.1111)^6 * (S=1)^47 * (S=1, W=1)^-4
## * (S=1.1111)^6 * (W=1)^35 * (W=1.1111)^23)
maxp(rugby(1.8),n=1,give=TRUE)
## $par
##       E       F       I       S 
## 0.21677 0.22212 0.15621 0.15881 
## 
## $value
## [1] -285.72
## 
## $counts
## function gradient 
##       35        8 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
## 
## $outer.iterations
## [1] 2
## 
## $barrier.value
## [1] 0.00015929
## 
## $likes
## [1] -285.72
## 
## $evaluate
##       E       F       I       S       W 
## 0.21677 0.22212 0.15621 0.15881 0.24610
maxp(rugby(1.9),n=1,give=TRUE)
## $par
##       E       F       I       S 
## 0.21677 0.22212 0.15621 0.15881 
## 
## $value
## [1] -281.98
## 
## $counts
## function gradient 
##       35        8 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
## 
## $outer.iterations
## [1] 2
## 
## $barrier.value
## [1] 0.00015929
## 
## $likes
## [1] -281.98
## 
## $evaluate
##       E       F       I       S       W 
## 0.21677 0.22212 0.15621 0.15881 0.24610

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.