dirichlet()
function in the
hyper2
packagevignettes/dirichlet.Rmd
dirichlet.Rmd
dirichlet
function (powers, alpha)
{
if (!xor(missing(powers), missing(alpha))) {
stop("supply exactly one of powers, alpha")
}
if (missing(powers)) {
powers <- alpha - 1
}
np <- names(powers)
if (is.null(np)) {
stop("supply a named vector")
}
hyper2(as.list(np), d = powers) - hyper2(list(np), d = sum(powers))
}
To cite the hyper2
package in publications, please use
Hankin (2017). We say that non-negative
satisfying
follow a Dirichlet distribution, and write
,
if their density function is
for some parameters . It is interesting in the current context because it is conjugate to the multinomial distribution; specifically, given a Dirichlet prior and likelihood function then the posterior is .
In the context of the hyper2
package, function
dirichlet()
presents some peculiarities, discussed
here.
Consider a three-way trial between players a
,
b
, and c
with eponymous Bradley-Terry
strengths. Suppose that, out of
trials, a
wins 4, b
wins 5, and c
wins 3. Then an appropriate likelihood function would be:
## log(a^4 * (a + b + c)^-12 * b^5 * c^3)
Note the (a+b+c)^-11
term, which is not strictly
necessary according the Dirichlet density function above which has no
such term. However, we see that the returned likelihood function is
${\propto}
p_{a\vphantom{abc}}^4p_{b\vphantom{abc}}^5p_{c\vphantom{abc}}^3$
(because
).
## log(a^4 * (a + b + c)^-12 * b^5 * c^3)
Now suppose we observe three-way trials between b
,
c
, and d
:
## log( b * (b + c + d)^-10 * c * d^8)
The overall likelihood function would be :
L1+L2
## log(a^4 * (a + b + c)^-12 * b^6 * (b + c + d)^-10 * c^4 * d^8)
maxp(L1+L2)
## a b c d
## 0.09090991 0.10909114 0.07272658 0.72727237
Observe the dominance of competitor d
, reasonable on the
grounds that d
won 8 of the 10 trials against
b
and c
; and a
, b
,
c
are more or less evenly matched [chi square test,
].
Observe further that we can include four-way observations easily:
## log(a^4 * (a + b + c + d)^-15 * b^3 * c^2 * d^6)
L1+L2+L3
## log(a^8 * (a + b + c)^-12 * (a + b + c + d)^-15 * b^9 * (b + c + d)^-10
## * c^6 * d^14)
Package idiom L1+L2+L3
operates as expected because
dirichlet()
includes the denominator. Sometimes we see
situations in which a competitor does not win any trials. Consider the
following:
## log(a^5 * (a + b + c + d)^-11 * b^3 * d^3)
Above, we see that c
won no trials and is not present in
the numerator of the expression. However, L4
is informative
about competitor c
:
maxp(L4)
## a b c d
## 4.546526e-01 2.727224e-01 1.020077e-06 2.726240e-01
We see that the maximum likelihood estimate for c
is
zero (to within numerical tolerance). Further, we may reject the
hypothesis that
[which might be a reasonable consequence of the assumption that all four
competitors have the same strength]:
specificp.test(L4,'c',0.25)
##
## Constrained support maximization
##
## data: H
## null hypothesis: sum p_i=1, c = 0.25
## null estimate:
## a b c d
## 0.3122093 0.2162741 0.2500000 0.2215167
## (argmax, constrained optimization)
## Support for null: -14.93581 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## a b c d
## 4.546526e-01 2.727224e-01 1.020077e-06 2.726240e-01
## (argmax, free optimization)
## Support for alternative: -11.738 + K
##
## degrees of freedom: 1
## support difference = 3.197811
## p-value: 0.01144022 (two sided)
However, observe that we cannot reject the equality hypothesis, that is, :
equalp.test(L4)
##
## Constrained support maximization
##
## data: L4
## null hypothesis: a = b = c = d
## null estimate:
## a b c d
## 0.25 0.25 0.25 0.25
## (argmax, constrained optimization)
## Support for null: -15.24924 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## a b c d
## 4.546526e-01 2.727224e-01 1.020077e-06 2.726240e-01
## (argmax, free optimization)
## Support for alternative: -11.738 + K
##
## degrees of freedom: 3
## support difference = 3.511242
## p-value: 0.07118459