To cite the hyper2 package in publications, please use Hankin (2017).
Here I modify standard Plackett-Luce likelihood functions to include
an element of non-independence among the competitors. There seem to
be a few distinct statistical models for nonindependence:
Runner \(i\) might wish to support \(j\). If runner \(i\) finishes, once his rank is established he might be able to help \(j\) in some way. He might cheer his still-running friend, spurring him to run faster and augmenting his competitive strength.
In general, we can consider \(n\) ordered pairs \((i_r,j_r)\), \(1\leq r\leq n\), with runner \(i_1\) helping runner \(j_1\), runner \(i_2\) helping \(j_2\), and so on. One simple case would be equivalence classes of \([n]=\{1,2,\ldots,n\}\) with each equivalence class comprising mutually supporting runners. In this model, with a pair of mutually supporting runners only the higher-placed competitor can affect the other. Each equivalence class might have a different supportive proclivity.
Runner \(i\) might hate runner \(j\). If \(i\) finishes, he might expend his spare energy hindering \(j\) (by throwing stones or booing, for example). Note that equivalence classes of mutual hatred might exist.
Also, the help or hindrance might have a finite lifetime. Under one model, if \(i\) finishes then he can stay helping (or hindering) \(j\) wherever \(j\) places. Alternatively, suppose \(i\) places \(n\)th. Then perhaps he can affect \(j\) only place \(n+1\), and after a competitor finishes in place \(n+1\), then \(i\) can exert no further influence.
The help or hindrance discussed above might be contingent on placing: perhaps only the winnner (that is, first-placing competitor) would be able to influence other competitors. Or maybe competitors with any placing could influence other competitors.
Runners \(i\) and \(j\) might elect to cross the finishing line holding hands, with the apparent tie broken randomly. This would be a symmetrical relationship. In general, we could consider hand-holding equivalence classes.
We might have a resource allocation problem, with runners having a certain [possibly unknown] amount of “strength” but also the ability to expend this strength on objectives other than ranking in the race. A runner might be able to expend just enough energy to remain slightly ahead of a rival, reserving energy for other purposes. If runner \(i\) wishes to beat runner \(j\), and has sufficient resources to do so, then runners \(i\) and \(j\) will finish consecutively with \(i\) ahead of \(j\).
The simplest nontrivial case is three runners, \(\left\lbrace 1,2,3\right\rbrace\) where runner 1 helps runner 2 with multiplier \(\lambda\) [and that is the only non PL interaction]. Note that this is not a symmetric interaction: 1 helps 2, but 2 does not help 1. The model is that 1 helps 2 if and only if 1 has finished and 2 has not.
123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)
213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_3}\)
231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{p_1+p_3}\)
312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)
321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)
(Above, the probabilities clearly sum to 1). Next simplest: 1 and 2 mutually support each other with \(\lambda\):
123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)
213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{\lambda p_1}{p_1+p_3}\)
231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_1+p_3}\)
312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)
321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)
Now we retain the mutual support between 1 and 2 but allow an asymmetry: 1 helps 2 with \(\lambda\) as before, but now in addition, 2 helps 1 with \(\mu\):
123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
132: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)
213: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{\mu p_1}{\mu p_1+p_3}\)
231: \(\frac{p_2}{p_1+p_2+p_3}\cdot\frac{p_3}{\mu p_1+p_3}\)
312: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_2}\)
321: \(\frac{p_3}{p_1+p_2+p_3}\cdot\frac{p_2}{p_1+p_2}\)
Now how about 1 must beat 2. He does not care whether he comes first or second, only that he must beat 2. Player 1 has sufficient energy to win if he wants, but is subject to some sort of resource constraint (money, perhaps) and wishes to expend only enough money to beat 2. So conventional P-L likelihoods are modified for 1, but only if 2 is still running.
123: \(\frac{\lambda p_1}{\lambda p_1+p_2+p_3}\cdot\frac{p_2}{p_2+p_3}\)
132: \(\frac{\lambda p_1}{\lambda p_1+p_2+p_3}\cdot\frac{p_3}{p_2+p_3}\)
213: \(\frac{p_2}{\lambda p_1+p_2+p_3}\cdot\frac{p_1}{p_1+p_3}\)
231: \(\frac{p_2}{\lambda p_1+p_2+p_3}\cdot\frac{p_3}{p_1+p_3}\)
312: \(\frac{p_3}{\lambda p_1+p_2+p_3}\cdot\frac{\lambda p_1}{\lambda p_1+p_2}\)
321: \(\frac{p_3}{\lambda p_1+p_2+p_3}\cdot\frac{p_2}{\lambda p_1+p_2}\)
Now we have four runners \(\left\lbrace 1,2,3,4\right\rbrace\), again with the only interaction being runner 1 helping runner 2. Thus, in a situation where \(p_1\) has finished and \(p_2\) is still running, we have \(p_2\longrightarrow\lambda p_2\) [although the \(\lambda\) cancels if 2 is the only competitor still running, that is, if all the remaining competitors are helped]
1234: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\)
1243: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\)
1324: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\)
1342: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\)
1324: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\)
1423: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{p_1+\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
1432: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{p_1+\lambda p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3}\)
2xxx: P-L [no help given by 2 to 1]
3124: \(\frac{p_3}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\)
3142: \(\frac{p_3}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\)
32xx: P-L
3412: P-L [1 helps 2 but when 1 crosses the finishing line, only 2 is still running; there are no non-helped competitors]
4123: \(\frac{p_4}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
4132: \(\frac{p_4}{p_1+p_2+p_3+p_4}\cdot\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_3}{\lambda p_2+p_3}\)
42xx: P-L
43xx: P-L
Now 1 and 2 mutually support one another by \(\lambda\). Here I will put the last term in, to show explicitly that the \(\lambda\) term remains in the expressions for numerator and denominator.
1234: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\cdot\frac{p_4}{p_4}\)
1243: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\cdot\frac{p_3}{p_3}\)
1324: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_4}\cdot\frac{p_4}{p_4}\)
1342: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_4}\cdot\frac{\lambda p_2}{\lambda p_2}\)
1423: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\cdot\frac{p_3}{p_3}\)
1432: \(\frac{p_1}{p_1+p_2+p_3+p_4}\cdot\frac{p_4}{\lambda p_2+p_3+p_4}\cdot\frac{p_3}{\lambda p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2}\)
2134: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_3+p_4}\cdot\frac{p_3}{p_3+p_4}\cdot\frac{p_4}{p_4}\)
2143: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_3+p_4}\cdot\frac{p_4}{p_3+p_4}\cdot\frac{p_3}{p_3}\)
2314: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_3}{\lambda p_1+p_3+p_4}\cdot\frac{\lambda p_1}{\lambda p_1+p_4}\cdot\frac{p_4}{p_4}\)
2341: \(\frac{p_2}{p_1+p_2+p_3+p_4}\cdot\frac{\lambda p_3}{\lambda p_1+p_3+p_4}\cdot\frac{p_4}{\lambda p_1+p_4}\cdot\frac{\lambda p_1}{\lambda p_1}\)
We will calculate a Plackett-Luce likelihood function for four runners \(\left\lbrace 1,2,3,4\right\rbrace\) but with a reified Bradley-Terry monster that corresponds to having subsets of mutual friends who support one another. For example, we might have runners 1 and 2 being friends, and runners 3 and 4 being friends: we would represent this as \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3,4\right\rbrace\right\rbrace\).
The runner metaphor is not the only one. We can apply the resulting
likelihood functions to other cases including the red bus-blue bus
problem (see inst/red_bus_blue_bus.Rmd for more discussion).
First do three runners, and suppose the equivalence classes are \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3\right\rbrace\right\rbrace\).
123: \(\frac{p_1}{p_1+p_2+p_3}\cdot\frac{p_2+M}{p_2+p_3+M}\qquad\frac{p_1}{p_1+p_2+p_3}\cdot\frac{\lambda p_2}{\lambda p_2+p_3}\)
132: \(\frac{p_1}{p_1+p_3+p_2}\cdot\frac{p_3}{p_3+p_2+M}\qquad\frac{p_1}{p_1+p_3+p_2}\cdot\frac{p_3}{p_3+\lambda p_2}\)
213: \(\frac{p_2}{p_2+p_1+p_3}\cdot\frac{p_1+M}{p_1+p_3+M}\qquad\frac{p_2}{p_2+p_1+p_3}\cdot\frac{\lambda p_1}{\lambda p_1+p_3}\)
231: \(\frac{p_2}{p_2+p_3+p_1}\cdot\frac{p_3}{p_3+p_1+M}\qquad\frac{p_2}{p_2+p_1+p_3}\cdot\frac{\lambda p_1}{p_3+\lambda p_1}\)
312: \(\frac{p_3}{p_3+p_1+p_2}\cdot\frac{p_1}{p_2+p_1}\qquad\frac{p_3}{p_3+p_1+p_2}\cdot\frac{\lambda p_1}{p_3+p_1}\)
321: \(\frac{p_3}{p_3+p_2+p_1}\cdot\frac{p_2}{p_2+p_1}\qquad\frac{p_3}{p_3+p_2+p_1}\cdot\frac{p_2}{p_2+p_1}\)
Now four runners with classes \(\left\lbrace\left\lbrace 1,2\right\rbrace,\left\lbrace 3,4\right\rbrace\right\rbrace\).
1234: \(\frac{1}{1+2+3+4}\cdot\frac{2+M}{2+3+4+M}\cdot\frac{3}{3+4}\)
1243: \(\frac{1}{1+2+4+3}\cdot\frac{2+M}{2+4+3+M}\cdot\frac{4}{4+3}\)
1324: \(\frac{1}{1+3+2+4}\cdot\frac{3}{3+2+4+M}\cdot\frac{2}{2+4+M}\)
1342: \(\frac{1}{1+3+4+2}\cdot\frac{3}{3+4+2+M}\cdot\frac{4+M}{2+4+M}\)
1423: \(\frac{1}{1+4+2+3}\cdot\frac{4}{4+2+3+M}\cdot\frac{2}{2+3}\)
1432: \(\frac{1}{1+4+3+2}\cdot\frac{4}{4+3+2+M}\cdot\frac{3}{2+3}\)
2134: \(\frac{2}{2+1+3+4}\cdot\frac{1+M}{1+3+4+M}\cdot\frac{3}{3+4}\)
2143: \(\frac{2}{2+1+4+3}\cdot\frac{1+M}{1+4+3+M}\cdot\frac{4}{4+3}\)
2314: \(\frac{2}{2+3+1+4}\cdot\frac{3}{3+1+4+M}\cdot\frac{1}{1+4+M}\)
2341: \(\frac{2}{2+3+4+1}\cdot\frac{3}{3+4+1+M}\cdot\frac{4+M}{1+4+M}\)
2413: \(\frac{2}{2+4+1+3}\cdot\frac{4}{4+1+3+M}\cdot\frac{1}{1+3}\)
2431: \(\frac{2}{2+4+3+1}\cdot\frac{4}{4+3+1+M}\cdot\frac{3}{1+3}\)
etc.
Now 123 tethered, 45 tethered:
12345: \(\frac{1}{1+2+3+4+5}\cdot\frac{2+M}{2+3+4+5+M}\cdot\frac{3+M}{3+4+5+M}\cdot\frac{4}{4+5}\)
12354: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{3+M}{3+5+4+M}\cdot\frac{5}{5+4}\)
12435: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{4}{3+5+4+M}\cdot\frac{3}{3+5+M}\)
12453: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+3+5+4+M}\cdot\frac{4}{3+5+4+M}\cdot\frac{5+M}{5+3+M}\)
12534: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+5+3+4+M}\cdot\frac{5}{3+5+4+M}\cdot\frac{3}{3+4+M}\)
12543: \(\frac{1}{1+2+3+5+4}\cdot\frac{2+M}{2+5+4+3+M}\cdot\frac{5}{3+5+4+M}\cdot\frac{4+M}{4+3+M}\)
Might be better to consider \({5\choose 2\,3}\) arrangements of \(aaabb\ldots bbaaa\). FOLLOWING IS NOT RIGHT: THE PROBABILITY MODEL IS NOT PRECISELY DEFINED.
aaabb: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{a+M}{a+b+b+M}\cdot\frac{b}{b+b+M}\)
aabab: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{b}{a+b+b+M}\cdot\frac{a}{a+b+M}\)
aabba: \(\frac{a}{a+a+a+b+b}\cdot\frac{a+M}{a+a+b+b+M}\cdot\frac{b}{a+b+b+M}\cdot\frac{b+M}{b+a+M}\)
abaab: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{a}{a+a+b+M}\cdot\frac{a+M}{a+b+M}\)
ababa: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{a}{a+a+b+M}\cdot\frac{b}{b+a+M}\)
abbaa: \(\frac{a}{a+a+a+b+b}\cdot\frac{b}{a+a+b+b+M}\cdot\frac{b+M}{b+a+a+M}\cdot\frac{a}{a+a+M}\)
baaab: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{a+M}{a+a+b+M}\cdot\frac{a+M}{a+b+M}\)
baaba: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{a+M}{a+a+b+M}\cdot\frac{b}{a+b+M}\)
babaa: \(\frac{b}{a+a+a+b+b}\cdot\frac{a}{a+a+a+b+M}\cdot\frac{b}{b+a+a+M}\cdot\frac{a}{a+a}\)
bbaaa: \(\frac{b}{a+a+a+b+b}\cdot\frac{b+M}{a+a+a+b+M}\cdot\frac{a}{a+a+a}\cdot\frac{a}{a+a}\)
f <- function (a,b,M){
return(
factorial(3)*factorial(2)*(
a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(a+M)/(a+b+b+M)*(b )/(b+b+M) +
a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(b )/(a+b+b+M)*(a )/(a+b+M) +
a/(a+a+a+b+b)*(a+M)/(a+a+b+b+M)*(b )/(a+b+b+M)*(b+M)/(b+a+M) +
a/(a+a+a+b+b)*(b )/(a+a+b+b+M)*(a )/(a+a+b+M)*(a+M)/(a+b+M) +
a/(a+a+a+b+b)*(b )/(a+a+b+b+M)*(a )/(a+a+b+M)*(b )/(b+a+M) +
a/(a+a+a+b+b)*(b )/(a+a+b+b+M)*(b+M)/(b+a+a+M)*(a )/(a+a+M) +
b/(a+a+a+b+b)*(a )/(a+a+a+b+M)*(a+M)/(a+a+b+M)*(a+M)/(a+b+M) +
b/(a+a+a+b+b)*(a )/(a+a+a+b+M)*(a+M)/(a+a+b+M)*(b )/(a+b+M) +
b/(a+a+a+b+b)*(a )/(a+a+a+b+M)*(b )/(b+a+a )*(a )/(a+a ) +
b/(a+a+a+b+b)*(b+M)/(a+a+a+b+M)*(a )/(a+a+a )*(a )/(a+a )
)
)
}
f(4.4,3.3, 0.2)*12
## [1] 12.09
f(4.4,3.3, 0.0)*12
## [1] 12
Figure 1.1: Partial probability tree for five competitors \(a-e\) with \(a\) supporting \(b\), hyper3 approach
Figure 1.2: Partial probability tree for five competitors \(a-e\) with mutually supporting group \(abc\), hyper3 approach. If any competitor in the set {\(a,b,c\)} finishes, they lend their strength to the other competitors in the set who are still running
Figure 1.3: Partial probability tree for five competitors \(a\)-\(e\) with mutually supporting subsets \(\left\lbrace a,b,c\right\rbrace\) [with support term \(\lambda\)] and \((de)\) [with support term \(\mu\)], hyper3 approach
We can investigate red bus-blue bus phenomenon (as discussed, in a
slightly different context, in inst/red_bus_blue_bus.Rmd). Here, we
consider a person who is given the choice of five transport methods:
C, carT, trainRB a red busBB a blue busW walkingNow, he does not really care what colour the bus is. If we ask him to
rank his options, it is highly probable that he will put RB and BB
consecutively (because they are essentially indistinguishable). Can
we quantify the strength of this effect? To do this, we define a
bespoke function RB_BB_LF() which returns a hyper3 log-likelihood
function corresponding to repeated observations of our commuter’s
reported ranks for the five options:
`RB_BB_LF` <- function(s){
ec <- c(C=1,T=2,RB=3,BB=3,W=4) # equivalence classes
h <- c(1,1,s,1) # strength of support
(
cheering3(v=c("RB","BB","C" ,"T","W"),e=ec,h=h)*3 +
cheering3(v=c("BB","RB","T" ,"C","W"),e=ec,h=h)*2 +
cheering3(v=c("T" ,"BB","RB","C","W"),e=ec,h=h)*2 +
cheering3(v=c("W" ,"BB","RB","T","C"),e=ec,h=h)*4 +
cheering3(v=c("C" ,"RB","BB","W","T"),e=ec,h=h)*4 +
cheering3(v=c("BB","C" ,"RB","T","W"),e=ec,h=h)*3
)
}
Above, we see from the function body that he reported RB,BB,C,T,W
three times [first row], BB,RB,T,C,W twice [second row], and so on;
perhaps his ranking depends on the weather or how tired he is on any
given day. Observe that in almost every case he ranks RB and BB
consecutively. Function RB_BB_LF() takes argument s that
quantifies the perceived similarity between RB and BB. For
example:
(H <- RB_BB_LF(s=1.8888))
## log( (BB=1)^11 * (BB=1, C=1, RB=1, T=1)^-4 * (BB=1, C=1, RB=1, T=1, W=1)^-18 * (BB=1, C=1, RB=1, W=1)^-2 *
## (BB=1, RB=1, T=1, W=1)^-4 * (BB=1.8888)^7 * (BB=1.8888, C=1, T=1, W=1)^-3 * (BB=1.8888, T=1, W=1)^-4 *
## (C=1)^14 * (C=1, RB=1.8888, T=1)^-4 * (C=1, RB=1.8888, T=1, W=1)^-5 * (C=1, RB=1.8888, W=1)^-2 * (C=1,
## T=1)^-4 * (C=1, T=1, W=1)^-5 * (C=1, W=1)^-4 * (RB=1)^7 * (RB=1.8888)^11 * (RB=1.8888, T=1, W=1)^-3 *
## (T=1)^14 * (T=1, W=1)^-10 * (W=1)^8)
(mH <- maxp(H,n=1))
## BB C RB T W
## 0.426153 0.147463 0.281683 0.091811 0.052890
Now to find a profile likelihood function for s:
o <- function(s){maxp(RB_BB_LF(s),give=TRUE,n=1)$likes} # optimand
s <- exp(seq(from=log(1.3),to=log(47),len=17)) # putative similarity measures
L <- sapply(s,o)
L <- L-max(L)
We can plot these:
plot(s,L,type="b")
abline(h=c(0,-2))
abline(v=1)
plot(log(s),L,type="b")
abline(h=c(0,-2))
abline(v=0)
And formally maximize the likelihood:
(osup <- optimize(o,c(6,10),maximum=TRUE))
## $maximum
## [1] 7.0406
##
## $objective
## [1] -66.406
So a likelihood ratio test of the null that \(S=1\) would be:
(suppdiff <- o(osup$maximum) - o(1))
## [1] 3.7514
Easily satisfying Edwards’s two-units-of-support criterion; Wilks gives us an asymptotic \(p\)-value:
pchisq(suppdiff*2,df=1,lower.tail=FALSE)
## [1] 0.00616
Now use the evaluate for the likelihood function:
maxHmax <- maxp(RB_BB_LF(s = osup$maximum))
maxHmax
## BB C RB T W
## 0.393591 0.195070 0.215299 0.124513 0.071527
Here we use a dataset of university rankings, timesData.csv, taken
from
https://github.com/arnaudbenard/university-ranking/blob/master/timesData.csv.
a <- read.table("timesData.txt",sep=",", header=TRUE)
wanted <- c("California Institute of Technology", "Harvard University",
"Massachusetts Institute of Technology", "Princeton University",
"Stanford University", "University of Cambridge", "University of Oxford")
names(wanted) <- c("cal","harv","mass","prin","stan","cam","ox")
a <- a[a$university_name %in% wanted,]
a <- cbind(a,"top7rank"=0)
for(y in unique(a$year)){
a[a$year==y,"top7rank"] <- order(
as.numeric(a[a$year==y,"world_rank"]) +
a[a$year==y,"research"]/1e6,
decreasing=TRUE)}
a <- a[,c("top7rank","university_name","year")]
a <- reshape(a,idvar="university_name",timevar="year",direction="wide")
for(i in seq_len(nrow(a))){
a$university_name[i] <- names(which(wanted == a$university_name[i]))
}
rownames(a) <- a$university_name
a <- a[,-1]
colnames(a) <- paste("Y",2011:2016,sep="")
a
## Y2011 Y2012 Y2013 Y2014 Y2015 Y2016
## harv 6 6 4 6 6 2
## cal 7 7 7 7 7 7
## mass 5 1 2 2 2 3
## stan 4 5 6 4 4 5
## prin 3 3 3 3 1 1
## cam 2 2 1 1 3 4
## ox 1 4 5 5 5 6
H <- ordertable2supp(a)
equalp.test(H)
##
## Constrained support maximization
##
## data: H
## null hypothesis: cal = cam = harv = mass = ox = prin = stan
## null estimate:
## cal cam harv mass ox prin stan
## 0.14286 0.14286 0.14286 0.14286 0.14286 0.14286 0.14286
## (argmax, constrained optimization)
## Support for null: -51.151 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## cal cam harv mass ox prin stan
## 0.000001 0.348942 0.015828 0.265173 0.038985 0.287311 0.043760
## (argmax, free optimization)
## Support for alternative: -29.415 + K
##
## degrees of freedom: 6
## support difference = 21.736
## p-value: 9.4034e-08
samep.test(H,c("ox","cam"))
##
## Constrained support maximization
##
## data: H
## null hypothesis: ox = cam
## null estimate:
## cal cam harv mass ox prin stan
## 1.0002e-06 1.3346e-01 2.9711e-02 2.8104e-01 1.3346e-01 3.4667e-01 7.5669e-02
## (argmax, constrained optimization)
## Support for null: -33.153 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## cal cam harv mass ox prin stan
## 0.000001 0.348942 0.015828 0.265173 0.038985 0.287311 0.043760
## (argmax, free optimization)
## Support for alternative: -29.415 + K
##
## degrees of freedom: 1
## support difference = 3.7385
## p-value: 0.0062492
Start to use hyper3 idiom:
H3 <- function(oxcam){
out <- hyper3()
for(i in seq_len(ncol(a))){
jj <- rep("",nrow(a))
jj[a[,i]] <- rownames(a)
out <- out + cheering3(v=jj,e=c(ox=1,cam=1,prin=2, stan=3, mass=4, harv=5, cal=6), help=c(oxcam,1,1,1,1,1))
}
return(out)
}
o <- function(oxcam){maxp(H3(oxcam),give=TRUE,n=1)$likes}
oc <- exp(seq(from=log(0.5),to=log(5),len=15))
L <- sapply(oc,o)
L <- L - max(L)
plot(log(oc),L,type="b")
abline(v=0)
The five nations rugby championship was held from 1910 to 1999 and
file five_nations.txt shows the order statistic for England (E),
Scotland (S), Ireland (I), Wales (W), and France (F).
https://en.wikipedia.org/wiki/Six_Nations_Championship
Here is hyper2 analysis:
a <- as.matrix(read.table("five_nations.txt",header=FALSE))
head(a)
## V1 V2 V3 V4 V5 V6
## [1,] "1910" "E" "W" "S" "I" "F"
## [2,] "1911" "W" "I" "E" "F" "S"
## [3,] "1912" "E" "I" "S" "W" "F"
## [4,] "1913" "E" "W" "S" "I" "F"
## [5,] "1914" "E" "W" "I" "S" "F"
## [6,] "1920" "W" "S" "E" "F" "I"
H <- hyper2()
for(i in seq_len(nrow(a))){
H <- H + race(a[i,-1])
}
mH <- maxp(H)
pie(mH)
equalp.test(H)
##
## Constrained support maximization
##
## data: H
## null hypothesis: E = F = I = S = W
## null estimate:
## E F I S W
## 0.2 0.2 0.2 0.2 0.2
## (argmax, constrained optimization)
## Support for null: -335.12 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## E F I S W
## 0.22044 0.21524 0.15607 0.15988 0.24837
## (argmax, free optimization)
## Support for alternative: -331.03 + K
##
## degrees of freedom: 4
## support difference = 4.0955
## p-value: 0.08483
Now use hyper3 to see whether teams do better following a win:
rugby <- function(lambda){
H3a <- hyper3()
for(i in seq(from=2,to=nrow(a))){
last_year_winner <- a[i-1,2] # e.g. "W" or "E"
H3a <- H3a + ordervec2supp3a(a[i,-1],nonfinishers=NULL,helped=last_year_winner,lambda=lambda)
}
return(H3a)
}
rugby(1.888)
## log( (E=1)^38 * (E=1, F=1)^-7 * (E=1, F=1, I=1)^-6 * (E=1, F=1, I=1, S=1)^-23 * (E=1, F=1, I=1, S=1,
## W=1)^-69 * (E=1, F=1, I=1, W=1)^-6 * (E=1, F=1, S=1)^-4 * (E=1, F=1, S=1, W=1)^-6 * (E=1, F=1, W=1)^-4 *
## (E=1, I=1)^-7 * (E=1, I=1, S=1)^-11 * (E=1, I=1, S=1, W=1)^-16 * (E=1, I=1, W=1)^-4 * (E=1, S=1)^-9 * (E=1,
## S=1, W=1)^-7 * (E=1, W=1)^-2 * (E=1.888)^18 * (F=1)^42 * (F=1, I=1)^-4 * (F=1, I=1, S=1)^-12 * (F=1, I=1,
## S=1, W=1)^-18 * (F=1, I=1, W=1)^-5 * (F=1, S=1)^-10 * (F=1, S=1, W=1)^-5 * (F=1, W=1)^-5 * (F=1.888)^16 *
## (I=1)^45 * (I=1, S=1)^-11 * (I=1, S=1, W=1)^-11 * (I=1, W=1)^-10 * (I=1.888)^6 * (S=1)^47 * (S=1, W=1)^-4 *
## (S=1.888)^6 * (W=1)^35 * (W=1.888)^23)
rugby(1.111111)
## log( (E=1)^38 * (E=1, F=1)^-7 * (E=1, F=1, I=1)^-6 * (E=1, F=1, I=1, S=1)^-23 * (E=1, F=1, I=1, S=1,
## W=1)^-69 * (E=1, F=1, I=1, W=1)^-6 * (E=1, F=1, S=1)^-4 * (E=1, F=1, S=1, W=1)^-6 * (E=1, F=1, W=1)^-4 *
## (E=1, I=1)^-7 * (E=1, I=1, S=1)^-11 * (E=1, I=1, S=1, W=1)^-16 * (E=1, I=1, W=1)^-4 * (E=1, S=1)^-9 * (E=1,
## S=1, W=1)^-7 * (E=1, W=1)^-2 * (E=1.1111)^18 * (F=1)^42 * (F=1, I=1)^-4 * (F=1, I=1, S=1)^-12 * (F=1, I=1,
## S=1, W=1)^-18 * (F=1, I=1, W=1)^-5 * (F=1, S=1)^-10 * (F=1, S=1, W=1)^-5 * (F=1, W=1)^-5 * (F=1.1111)^16 *
## (I=1)^45 * (I=1, S=1)^-11 * (I=1, S=1, W=1)^-11 * (I=1, W=1)^-10 * (I=1.1111)^6 * (S=1)^47 * (S=1, W=1)^-4
## * (S=1.1111)^6 * (W=1)^35 * (W=1.1111)^23)
maxp(rugby(1.8),n=1,give=TRUE)
## $par
## E F I S
## 0.21677 0.22212 0.15621 0.15881
##
## $value
## [1] -285.72
##
## $counts
## function gradient
## 35 8
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $outer.iterations
## [1] 2
##
## $barrier.value
## [1] 0.00015929
##
## $likes
## [1] -285.72
##
## $evaluate
## E F I S W
## 0.21677 0.22212 0.15621 0.15881 0.24610
maxp(rugby(1.9),n=1,give=TRUE)
## $par
## E F I S
## 0.21677 0.22212 0.15621 0.15881
##
## $value
## [1] -281.98
##
## $counts
## function gradient
## 35 8
##
## $convergence
## [1] 0
##
## $message
## NULL
##
## $outer.iterations
## [1] 2
##
## $barrier.value
## [1] 0.00015929
##
## $likes
## [1] -281.98
##
## $evaluate
## E F I S W
## 0.21677 0.22212 0.15621 0.15881 0.24610
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.