To cite the hyper2 package in publications, please use
Hankin (2017). This vignette shows some features of the
noninformative argument of functions ordertable2supp() and
ordertable2supp3().
(o <- data.frame(x=c("DSQ",1:3), row.names=letters[1:4]))
## x
## a DSQ
## b 1
## c 2
## d 3
Above we see a perfectly reasonable order table, in which competitor
a has a DSQ [“disqualified”] result. This means that a
effectively came last and the order statistic would be \(b\succ c\succ
d\succ a\). The Plackett-Luce likelihood function would be
\[ \frac{b}{a+b+c+d}\cdot \frac{c}{a+ c+d}\cdot \frac{d}{a+ d}\cdot \frac{a}{a } \]
In R:
ordertable2supp(o)
## log((a + b + c + d)^-1 * (a + c + d)^-1 * (a + d)^-1 * b * c * d)
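The correspondence between the formula and the R output can be checked by hand. The following is a minimal base-R sketch, not the package's implementation: the helper pl_loglik() and the strength values are hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not part of hyper2): Plackett-Luce log-likelihood
# for a complete finishing order, given a named vector of strengths.
pl_loglik <- function(finish_order, strengths) {
  p <- strengths[finish_order]        # strengths arranged in finishing order
  remaining <- rev(cumsum(rev(p)))    # total strength still in the field at each step
  sum(log(p) - log(remaining))
}

# Assumed strengths, for illustration only.
s <- c(a = 0.1, b = 0.4, c = 0.3, d = 0.2)

# DSQ treated as "came last": b > c > d > a
ll_dsq <- pl_loglik(c("b", "c", "d", "a"), s)

# Term-by-term evaluation of the displayed formula, for comparison
direct <- log(0.4 / 1.0) + log(0.3 / 0.6) + log(0.2 / 0.3) + log(0.1 / 0.1)
isTRUE(all.equal(ll_dsq, direct))
```

The final term, a/a, contributes nothing, which is why it does not appear in the output of ordertable2supp().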
Now we imagine that competitor a did not compete for some reason
unrelated to his Bradley-Terry strength. He might be ill; he might
have accumulated enough points to be certain of victory and declined
to compete [fearing injury]; or decided (or was forced) not to compete
for some other reason in which we are not interested. We indicate
this using the noninformative argument of ordertable2supp(), which,
when left as NULL, is taken to be c("dead", "won"):
(o <- data.frame(x=c("won",1:3), row.names=letters[1:4]))
## x
## a won
## b 1
## c 2
## d 3
The PL likelihood function would then be:
\[ \frac{b}{b+c+d}\cdot \frac{c}{ c+d}\cdot \frac{d}{ d} \]
Above, see how the likelihood function is uninformative about a: we
are making no inferences about his strength from the fact that he did
not compete. We might combine these two phenomena in a single
dataset. Suppose a had already won, and c was disqualified:
(o <- data.frame(x=c("won",1,"DSQ",2), row.names=letters[1:4]))
## x
## a won
## b 1
## c DSQ
## d 2
The PL likelihood function would then be:
\[ \frac{b}{b+c+d}\cdot \frac{d}{ c+d}\cdot \frac{c}{ c } \]
See how c appears in the denominator of the first two terms, but a is
absent.
ordertable2supp(o)
## log( b * (b + c + d)^-1 * (c + d)^-1 * d)
We can have more than one competitor in each class:
jj <- 1:7
jj[2:3] <- "dead"
jj[6:7] <- "DSQ"
(o <- data.frame(x=jj, row.names=letters[1:7]))
## x
## a 1
## b dead
## c dead
## d 4
## e 5
## f DSQ
## g DSQ
The PL likelihood function would be
\[ \frac{a}{a+d+e+f+g}\cdot \frac{d}{ d+e+f+g}\cdot \frac{e}{ e+f+g} \]
ordertable2supp(o)
## log( a * (a + d + e + f + g)^-1 * d * (d + e + f + g)^-1 * e * (e + f +
## g)^-1)
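The three classes of entry seen so far (ranked, no-score such as DSQ, and noninformative) can be summarised in a short base-R sketch. The function column_loglik() below is a hypothetical illustration and is not how ordertable2supp() is implemented; the equal strengths are assumed values.

```r
# Hypothetical helper (not part of hyper2): log-likelihood for one column
# of an order table. Entries listed in `noninformative` are dropped from
# the field entirely; entries listed in `noscore` stay in every
# denominator but never contribute a numerator.
column_loglik <- function(x, strengths,
                          noninformative = c("dead", "won"),
                          noscore = "DSQ") {
  keep   <- !(x %in% noninformative)
  ranked <- keep & !(x %in% noscore)
  field  <- names(x)[keep]                                   # competitors in play
  finishers <- names(x)[ranked][order(as.numeric(x[ranked]))]
  ll <- 0
  for (w in finishers) {
    ll <- ll + log(strengths[[w]]) - log(sum(strengths[field]))
    field <- setdiff(field, w)                               # finisher leaves the field
  }
  ll
}

x <- c(a = "1", b = "dead", c = "dead", d = "4", e = "5", f = "DSQ", g = "DSQ")
s <- setNames(rep(1 / 7, 7), letters[1:7])   # equal strengths, illustration only
ll <- column_loglik(x, s)

# Changing the strengths of the noninformative competitors has no effect.
s2 <- s
s2[c("b", "c")] <- c(0.5, 0.9)
ll2 <- column_loglik(x, s2)
```

With equal strengths the three terms are 1/5, 1/4 and 1/3, so ll equals log(1/60). Because b and c are dropped from the field, ll2 is identical to ll, which is what noninformative means here.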
The curling dataset, documented at inst/curling.Rmd and
curling.Rd, gives a nice use-case for this functionality.
hyper3 support functions
Function ordertable2supp3() includes a noninformative option. We
can illustrate its use with a simple case.
obs <- data.frame(foo = c("a","a","b","a"), place=c("won",2,3,4), points=4:1)
obs
##   foo place points
## 1   a   won      4
## 2   a     2      3
## 3   b     3      2
## 4   a     4      1
If “won” means “came last” we would have \(a\succ b\succ a\succ a\), with PL likelihood function
\[\begin{equation} \propto \frac{a}{3a+b}\cdot \frac{b}{2a+b}\cdot \frac{a}{2a}\cdot \frac{a}{a}\propto \frac{a}{3a+b}\cdot \frac{b}{2a+b} \end{equation}\]
S1 <- ordertable2supp3(obs)
S1
## log( (a=1)^2 * (a=2)^-1 * (a=2, b=1)^-1 * (a=3, b=1)^-1 * (b=1)^1)
Above we see that ordertable2supp3() has returned a more complicated
function than necessary because it has not cancelled out one of the
constant terms [here it has cancelled out \(\frac{a}{a}\) but not
\(\frac{a}{2a}\)].
Now if we define “won” to be noninformative the order statistic is simply \(a \succ b\succ a\), with PL likelihood function
\[\begin{equation} \propto \frac{a}{2a+b}\cdot \frac{b}{a+b} \end{equation}\]
and indeed we see:
S2 <- ordertable2supp3(obs, noninformative = "won")
S2
## log( (a=1)^1 * (a=1, b=1)^-1 * (a=2, b=1)^-1 * (b=1)^1)
maxp(S1)
## a b
## 0.28992 0.71008
maxp(S2)
## a b
## 0.41421 0.58579
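As a cross-check that does not depend on the package, the two likelihoods displayed above can be maximised directly with base R's optimize(), substituting b = 1 - a. This is only a sketch; the function names lik_informative and lik_noninformative are made up for illustration. Setting the derivative to zero gives closed-form maximisers a = (sqrt(6) - 1)/5 for S1 and a = sqrt(2) - 1 for S2, which agree with the maxp() output above to about four decimal places.

```r
# Likelihoods from the two displayed formulas, up to a constant, with b = 1 - a.
# "won" interpreted as "came last":  a/(3a+b) * b/(2a+b)
lik_informative <- function(a) {
  b <- 1 - a
  a * b / ((3 * a + b) * (2 * a + b))
}
# "won" treated as noninformative:   a/(2a+b) * b/(a+b)
lik_noninformative <- function(a) {
  b <- 1 - a
  a * b / ((2 * a + b) * (a + b))
}

# One-dimensional maximisation over the unit interval.
a1 <- optimize(lik_informative,    c(0, 1), maximum = TRUE, tol = 1e-10)$maximum
a2 <- optimize(lik_noninformative, c(0, 1), maximum = TRUE, tol = 1e-10)$maximum

# Compare the numerical maximisers with the closed forms.
rbind(numerical = c(a1, a2),
      closed    = c((sqrt(6) - 1) / 5, sqrt(2) - 1))
```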
The best use-case would be the 2025 MotoGP World Championship. Marc Marquez declined to compete in the last four races of the season due to his having accumulated sufficient points to guarantee victory. We can make no inferences about his BT strength from these final four races.
First we consider a simplified table:
o
##   constructor THA GBR ARG  AUS  MAL points
## 1      ducati   1   6   1  won  won     33
## 2      ducati   2   5   2    4    1      2
## 3     aprilia   6   4  NA    3   11      1
## 4         KTM  19   3   8    5    2      3
## 5      ducati   3   2   4 <NA> <NA>     44
## 6      ducati  10   1   5    2    6      4
x1 <- ordertable2supp3(o)
x1
## log( (KTM=1)^4 * (KTM=1, aprilia=1)^-1 * (KTM=1, aprilia=1,
## ducati=1)^-2 * (KTM=1, aprilia=1, ducati=2)^-3 * (KTM=1, aprilia=1,
## ducati=3)^-5 * (KTM=1, aprilia=1, ducati=4)^-5 * (KTM=1, ducati=1)^-1 *
## (KTM=1, ducati=2)^-1 * (KTM=1, ducati=3)^-1 * (aprilia=1)^4 *
## (aprilia=1, ducati=2)^-2 * (aprilia=1, ducati=3)^-1 * (ducati=1)^17 *
## (ducati=2)^-3)
x2 <- ordertable2supp3(o, noninformative = "won")
x2
## log( (KTM=1)^4 * (KTM=1, aprilia=1)^-1 * (KTM=1, aprilia=1,
## ducati=1)^-2 * (KTM=1, aprilia=1, ducati=2)^-5 * (KTM=1, aprilia=1,
## ducati=3)^-5 * (KTM=1, aprilia=1, ducati=4)^-3 * (KTM=1, ducati=1)^-2 *
## (KTM=1, ducati=2)^-1 * (aprilia=1)^4 * (aprilia=1, ducati=1)^-1 *
## (aprilia=1, ducati=2)^-2 * (ducati=1)^15 * (ducati=2)^-1)
maxp(x1)
## aprilia ducati KTM
## 0.30167 0.40874 0.28958
maxp(x2)
## aprilia ducati KTM
## 0.28084 0.44915 0.27001
Above, see how the estimate for ducati increases when we specify
that "won" is noninformative: if "won" is informative, it is
interpreted as “came last” which decreases estimated performance.
I will put up a full constructor table for MotoGP when I get round to
it [currently, file motoGP_2019.txt does not include the
constructor].
Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.