To cite the hyper2 package in publications, please use Hankin (2017).
Here we analyse results from the men’s javelin at the 2020 Summer Olympics.
javelin_table <- as.attemptstable(read.table("javelin.txt", header=TRUE))
a <- javelin_table # saves typing
options("bold_personal_best" = FALSE)
javelin_table
## An attemptstable:
## throw1 throw2 throw3 throw4 throw5 throw6
## Chopra 87.03 87.58 76.79 X X 84.24
## Vadlejch 83.98 X X 82.86 86.67 X
## Vesely 79.73 80.30 85.44 X 84.98 X
## Weber 85.30 77.90 78.00 83.10 85.15 75.72
## Nadeem 82.40 X 84.62 82.91 81.98 X
## Katkavets 82.49 81.03 83.71 79.24 X X
## Mardare 81.16 81.73 82.84 81.90 83.30 81.09
## Etelatalo 78.43 76.59 83.28 79.20 79.99 83.05
Above, we see that Chopra threw 87.03 on his first throw, 87.58 on his
second, and so on (all measurements are in meters; an X denotes a
no-throw). Vadlejch threw 83.98 on his first throw, etc. The print
option bold_personal_best controls whether each competitor’s personal
(row-wise) best attempt is printed in bold. Bold printing does not work
in R Markdown documents but does work in the terminal and the RStudio
console. We may convert javelin_table to a named vector whose elements
are the throw distances and whose names are the competitors:
javelin_vector <- as.vector(javelin_table)
javelin_vector
## Chopra Chopra Vadlejch Vesely Weber Weber Vesely Nadeem
## 87.58 87.03 86.67 85.44 85.30 85.15 84.98 84.62
## Chopra Vadlejch Katkavets Mardare Etelatalo Weber Etelatalo Nadeem
## 84.24 83.98 83.71 83.30 83.28 83.10 83.05 82.91
## Vadlejch Mardare Katkavets Nadeem Nadeem Mardare Mardare Mardare
## 82.86 82.84 82.49 82.40 81.98 81.90 81.73 81.16
## Mardare Katkavets Vesely Etelatalo Vesely Katkavets Etelatalo Etelatalo
## 81.09 81.03 80.30 79.99 79.73 79.24 79.20 78.43
## Weber Weber Chopra Etelatalo Weber Vadlejch Nadeem Vadlejch
## 78.00 77.90 76.79 76.59 75.72 NA NA NA
## Chopra Vesely Chopra Katkavets Vadlejch Vesely Nadeem Katkavets
## NA NA NA NA NA NA NA NA
o <- javelin_vector # saves typing
Above we see that Chopra threw the longest and second-longest throws,
of 87.58 and 87.03 respectively; Vadlejch threw the third-longest
throw of 86.67, and so on. NA entries correspond to no-throws. It is
easy to plot the data:
n <- length(levels(as.factor(names(o))))
plot(o, pch=16, col=rainbow(n)[as.factor(names(o))])
legend("topright", pch=16, col=rainbow(n), legend=levels(as.factor(names(o))))
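To make the named-vector representation concrete, here is a minimal base R sketch of the same idea, using the rows for Chopra and Vadlejch copied by hand from the table above. It assumes that an X is mapped to NA. The names raw, distances and flat are made up for this illustration, and this is not necessarily how as.vector() is implemented for attemptstable objects.

```r
# Two rows copied by hand from the attempts table; "X" marks a no-throw
raw <- rbind(
  Chopra   = c("87.03", "87.58", "76.79", "X", "X", "84.24"),
  Vadlejch = c("83.98", "X", "X", "82.86", "86.67", "X")
)

# Numeric matrix of distances, with no-throws left as NA (assumption)
distances <- matrix(NA_real_, nrow(raw), ncol(raw),
                    dimnames = list(rownames(raw), NULL))
distances[raw != "X"] <- as.numeric(raw[raw != "X"])

# Matrices are stored column by column, so the row names repeat in order
flat <- as.vector(distances)
names(flat) <- rep(rownames(raw), times = ncol(raw))

# Longest throw first, no-throws (NA) at the end
flat <- sort(flat, decreasing = TRUE, na.last = TRUE)
flat
```

The result has one element per attempt, named by competitor, so that names(flat) read from left to right gives the order in which the throws would be ranked.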
Now convert the attempts table to a hyper3 object, using function
suppfun(). However, note that there are two reasonable
interpretations of a no-throw. In the first, we simply ignore
no-throws [in the package docs, these are sometimes referred to as
“DNF”, or “Did not finish”, following Formula 1 terminology]; to do
this we pass dnf.last=FALSE:
javelin <- suppfun(javelin_table, dnf.last=FALSE)
javelin_maxp <- maxp(javelin)
javelin_maxp
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.093048 0.048181 0.092853 0.117342 0.173030 0.320628 0.114027 0.040890
Secondly, we may treat no-throws as being inferior to any actual
measured throw. I don’t think this is realistic: no one cares about
no-throws, and any model in which a thrower with five no-throws and
one excellent throw is given a small strength is certainly
defective. However, there are some circumstances in which this might
be reasonable [Formula 1 being one example], and this is implemented by
specifying dnf.last=TRUE:
javelin2 <- suppfun(javelin_table, dnf.last=TRUE)
javelin2_maxp <- maxp(javelin2)
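The difference between the two interpretations can be illustrated with ordinary sorting in base R. The sketch below uses hand-copied throws for Chopra and Vadlejch, with NA for a no-throw. It shows only the two rank orders and is not a description of how suppfun() constructs the likelihood; all variable names are hypothetical.

```r
# Throws for two competitors, copied by hand; NA marks a no-throw
throws <- setNames(
  c(87.03, 87.58, 76.79, NA, NA, 84.24,   # Chopra
    83.98, NA, NA, 82.86, 86.67, NA),     # Vadlejch
  rep(c("Chopra", "Vadlejch"), each = 6)
)

# Interpretation 1: no-throws are ignored (na.last = NA drops them)
ignore_nothrows <- names(throws)[order(throws, decreasing = TRUE, na.last = NA)]

# Interpretation 2: no-throws rank below every measured throw
# (the order among the no-throws themselves is arbitrary here)
nothrows_last <- names(throws)[order(throws, decreasing = TRUE, na.last = TRUE)]

ignore_nothrows
nothrows_last
```

Under the second ordering Vadlejch occupies three of the five bottom places, which suggests why his estimated strength might fall when dnf.last=TRUE is used.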
Investigation:
rbind(javelin_maxp, javelin2_maxp)
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely
## javelin_maxp 0.093048 0.048181 0.092853 0.11734 0.17303 0.320628 0.11403
## javelin2_maxp 0.105713 0.145507 0.089125 0.22323 0.10493 0.069671 0.09806
## Weber
## javelin_maxp 0.04089
## javelin2_maxp 0.16376
pie(javelin_maxp)
pie(javelin2_maxp)
dotchart(javelin_maxp, pch=16)
dotchart(javelin2_maxp, pch=16)
ordertransplot(javelin_maxp, javelin2_maxp)
equalp.test(javelin)
##
## Constrained support maximization
##
## data: javelin
## null hypothesis: Chopra = Etelatalo = Katkavets = Mardare = Nadeem = Vadlejch = Vesely = Weber
## null estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
## (argmax, constrained optimization)
## Support for null: -99.331 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.093048 0.048181 0.092853 0.117342 0.173030 0.320628 0.114027 0.040890
## (argmax, free optimization)
## Support for alternative: -94.897 + K
##
## degrees of freedom: 7
## support difference = 4.4337
## p-value: 0.26232
equalp.test(javelin2)
##
## Constrained support maximization
##
## data: javelin2
## null hypothesis: Chopra = Etelatalo = Katkavets = Mardare = Nadeem = Vadlejch = Vesely = Weber
## null estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
## (argmax, constrained optimization)
## Support for null: -123.17 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.105713 0.145507 0.089125 0.223231 0.104929 0.069671 0.098060 0.163765
## (argmax, free optimization)
## Support for alternative: -121.09 + K
##
## degrees of freedom: 7
## support difference = 2.0861
## p-value: 0.75975
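The p-values reported above appear to be consistent with the usual asymptotic result (Wilks’s theorem) that twice the support difference is approximately chi-square distributed, here with 7 degrees of freedom. The following sketch checks this in base R using numbers copied from the output above. It is a reconstruction, and not necessarily the computation performed inside equalp.test().

```r
# Support differences copied by hand from the two equalp.test() outputs
support_diff <- c(javelin = 4.4337, javelin2 = 2.0861)
degrees_of_freedom <- 7   # eight strengths constrained to sum to one

# Upper tail of the chi-square distribution at twice the support difference
p_values <- pchisq(2 * support_diff, df = degrees_of_freedom, lower.tail = FALSE)

# Compare with the reported p-values of 0.26232 and 0.75975
round(p_values, 4)
```

Neither interpretation of the no-throws gives strong evidence against the null of equal strengths.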
The competition is won by the player who has the longest throw: the sufficient statistic would be the maximum of each player’s six attempts. Bradley-Terry, which is fitted to the ranks of all the throws, cannot handle this if different players have different variances: in the competition itself, short throws are disregarded.
As an example, suppose two throwers have normally distributed distances; thrower A has a smaller mean distance than thrower B but a larger variance. Numerically:
set.seed(0)
thrower_A <- round(rnorm(6, 80.2, 2.6), 2)
thrower_B <- round(rnorm(6, 80.3, 1.3), 2)
D <- as.attemptstable(as.data.frame(rbind(thrower_A, thrower_B)))
D
## An attemptstable:
## V1 V2 V3 V4 V5 V6
## thrower_A 83.48 79.35 83.66 83.51 81.28 76.20
## thrower_B 79.09 79.92 80.29 83.43 81.29 79.26
apply(D, 1, max)
## thrower_A thrower_B
## 83.66 83.43
Above we see that thrower A wins, despite having a smaller mean distance.
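A single draw may of course be unrepresentative. As a rough check, the following Monte Carlo sketch (base R, using the same assumed normal distributions; the variable names, seed and number of replicates are arbitrary choices for this illustration) estimates how often thrower A’s best of six beats thrower B’s, and compares this with the exact probability that a single throw by A beats a single throw by B.

```r
set.seed(1)
n_sim <- 10000   # number of simulated competitions (arbitrary)

# Each row is one simulated competition of six throws
best_A <- apply(matrix(rnorm(6 * n_sim, 80.2, 2.6), ncol = 6), 1, max)
best_B <- apply(matrix(rnorm(6 * n_sim, 80.3, 1.3), ncol = 6), 1, max)

# Estimated probability that A's longest throw beats B's longest throw
p_best_of_six <- mean(best_A > best_B)

# Exact probability for a single throw: A - B ~ N(-0.1, 2.6^2 + 1.3^2)
p_single_throw <- pnorm(0, mean = 80.2 - 80.3, sd = sqrt(2.6^2 + 1.3^2),
                        lower.tail = FALSE)

c(single_throw = p_single_throw, best_of_six = p_best_of_six)
```

On a single throw B is very slightly favoured, but on the best of six A is favoured because of his larger variance, so a ranking based on individual throws and a ranking based on the longest throw can disagree.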
Chopra won the competition by virtue of his longest throw of 87.58 being longer than anyone else’s throw. Observe that his Bradley-Terry strength is small:
(M <- maxp(javelin))
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.093048 0.048181 0.092853 0.117342 0.173030 0.320628 0.114027 0.040890
Under javelin2 (no-throws ranked last), Mardare has the largest Bradley-Terry strength, but it does not differ significantly from 1/8:
specificp.test(javelin2, "Mardare", 1/8)
##
## Constrained support maximization
##
## data: H
## null hypothesis: sum p_i=1, Mardare = 0.125
## null estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.122818 0.162254 0.101619 0.125000 0.118717 0.081322 0.111904 0.176366
## (argmax, constrained optimization)
## Support for null: -122.07 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## Chopra Etelatalo Katkavets Mardare Nadeem Vadlejch Vesely Weber
## 0.105713 0.145507 0.089125 0.223231 0.104929 0.069671 0.098060 0.163765
## (argmax, free optimization)
## Support for alternative: -121.09 + K
##
## degrees of freedom: 1
## support difference = 0.98174
## p-value: 0.16114 (two sided)
It seems that Mardare’s consistent performance gives him a high BT strength, but not a long longest throw. How should BT be modified to account for this?
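The mismatch can be seen by placing each competitor’s longest throw and number of valid throws alongside the javelin2 strengths. The sketch below uses values copied by hand from the output above and computes Spearman rank correlations in base R. The variable names are hypothetical, and with only eight competitors the correlations are descriptive only.

```r
# Values copied by hand from the attempts table and the javelin2 estimate
summary_table <- data.frame(
  best_throw   = c(87.58, 83.28, 83.71, 83.30, 84.62, 86.67, 85.44, 85.30),
  valid_throws = c(4, 6, 4, 6, 4, 3, 4, 6),
  strength     = c(0.105713, 0.145507, 0.089125, 0.223231,
                   0.104929, 0.069671, 0.098060, 0.163765),
  row.names    = c("Chopra", "Etelatalo", "Katkavets", "Mardare",
                   "Nadeem", "Vadlejch", "Vesely", "Weber")
)

# Rank correlation of strength with longest throw and with valid throws
rho_best  <- cor(summary_table$best_throw, summary_table$strength,
                 method = "spearman")
rho_valid <- cor(summary_table$valid_throws, summary_table$strength,
                 method = "spearman")

summary_table[order(-summary_table$strength), ]
c(rho_best = rho_best, rho_valid = rho_valid)
```

In these data the javelin2 strength is negatively rank-correlated with the longest throw and positively rank-correlated with the number of valid throws: the three competitors with no no-throws (Mardare, Weber and Etelatalo) have the three largest strengths.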
The following lines create javelin.rda, which resides in the data/
directory of the package.
save(javelin, javelin2, javelin_table, javelin_vector, javelin_maxp, javelin2_maxp, file="javelin.rda")
https://en.wikipedia.org/w/index.php?title=Athletics_at_the_2020_Summer_Olympics_%E2%80%93_Men%27s_javelin_throw&oldid=1051014985
Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.