To cite the hyper2 package in publications, please use Hankin (2017). This short document shows how to apply hyper2 idiom to the salad dataset of the prefmod package (Hatzinger and Dittrich 2012). From salad.Rd:

The dataset contains the rankings of four salad dressings
concerning tartness by 32 judges, with values ranging
from 1 (most tart) to 4 (least tart).
head(salad)
##   A B C D
## 1 1 2 3 4
## 2 1 2 3 4
## 3 2 1 3 4
## 4 2 1 4 3
## 5 2 1 4 3
## 6 2 3 1 4
nrow(salad)
## [1] 32

From row 3, for example, we see that salad A was ranked as second most tart, B the most tart, C the third most tart, and D the least tart of the four. We may process this using two methods, explicit and slick. First, explicit:

H1 <- hyper2()   # start with the empty support function
for(i in seq_len(nrow(salad))){
   H1 <- H1 + suppfun(as.matrix(salad)[i,])   # add each judge's ranking
}
H1
## log(A^10 * (A + B)^-3 * (A + B + C)^-2 * (A + B + C + D)^-32 * (A + B + D)^-5 * (A +
## C)^-7 * (A + C + D)^-23 * (A + D)^-16 * B^31 * (B + C + D)^-2 * (B + D)^-1 * C^29 * (C +
## D)^-5 * D^26)
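The exponents above have a direct Plackett-Luce interpretation: each ranking is a sequence of choices, and each choice contributes the chosen salad's strength to the numerator and the total strength of the salads still in contention to the denominator. A minimal base-R sketch (independent of hyper2; the function name is my own) for a single ranking:

```r
# Log-likelihood contribution of one ranking under the Plackett-Luce model.
# 'ranks' gives each item's rank (1 = most tart); 'p' gives the strengths.
pl_loglik_row <- function(ranks, p) {
  ord <- order(ranks)               # items in order of finishing
  ll <- 0
  for (k in seq_along(ord)) {
    avail <- ord[k:length(ord)]     # items still in contention at stage k
    ll <- ll + log(p[[ord[k]]]) - log(sum(p[avail]))
  }
  ll
}

# Row 3 of salad (A=2, B=1, C=3, D=4) contributes the factor
# B/(A+B+C+D) * A/(A+C+D) * C/(C+D); at equal strengths:
p <- c(A = 0.25, B = 0.25, C = 0.25, D = 0.25)
pl_loglik_row(c(A = 2, B = 1, C = 3, D = 4), p)
## [1] -3.178054
```

Summing these contributions over all 32 rankings at \(p_A=p_B=p_C=p_D=\frac{1}{4}\) reproduces the null support of \(-101.7\) reported by equalp.test() below.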

And second, a slick method. We take the transpose of salad (things are clearer if we name the rows):

rownames(salad) <-
   paste("j",formatC(seq_len(nrow(salad)),width=2,format="d",flag="0"),sep="")
ts <- ordertable(t(salad))

Above, we see that object ts is an order table. It is just like the Formula 1 results table F1_table_2016, except that venues are judges and drivers are the salads [which are “competing” for the “who has the most tart dressing” prize]. Converting this to a support function is accomplished with suppfun():

(H2 <- suppfun(ts))
## log(A^10 * (A + B)^-3 * (A + B + C)^-2 * (A + B + C + D)^-32 * (A + B + D)^-5 * (A +
## C)^-7 * (A + C + D)^-23 * (A + D)^-16 * B^31 * (B + C + D)^-2 * (B + D)^-1 * C^29 * (C +
## D)^-5 * D^26)

Just to check:

H1 == H2
## [1] TRUE
equalp.test(H1)
## 
##  Constrained support maximization
## 
## data:  H1
## null hypothesis: A = B = C = D
## null estimate:
##    A    B    C    D 
## 0.25 0.25 0.25 0.25 
## (argmax, constrained optimization)
## Support for null:  -101.7 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##        A        B        C        D 
## 0.041861 0.641737 0.199516 0.116886 
## (argmax, free optimization)
## Support for alternative:  -76.415 + K
## 
## degrees of freedom: 3
## support difference = 25.282
## p-value: 6.0561e-11
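The quoted \(p\)-value follows from Wilks' theorem: twice the support difference is asymptotically \(\chi^2\) distributed with 3 degrees of freedom. A sketch using the support difference reported above:

```r
# Wilks' theorem: 2 * (support difference) ~ chi-squared with df = 3,
# using the support difference reported by equalp.test() above
support_diff <- 25.282
pchisq(2 * support_diff, df = 3, lower.tail = FALSE)
## matches the reported p-value of 6.0561e-11 (up to rounding of 25.282)
```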

The maximum likelihood estimate agrees to within about \(2\times 10^{-5}\) with that presented by Turner et al. (2020):

standardPL_PlackettLuce <- PlackettLuce(salad, npseudo = 0)
(p1 <- itempar(standardPL_PlackettLuce))
## Item response item parameters (PlackettLuce):
##      A      B      C      D 
## 0.0419 0.6418 0.1995 0.1169
(p2 <- maxp(H1))
##        A        B        C        D 
## 0.041861 0.641737 0.199516 0.116886
p1-p2
## Item response item parameters (PlackettLuce):
##         A         B         C         D 
## -2.21e-06  2.15e-05 -1.12e-05 -8.08e-06

We may use hyper2 to test some hypotheses. Firstly, \(H_0\colon p_B=\frac{1}{4}\), which we test with specificp.test():

specificp.test(H2,"B",1/4)
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: sum p_i=1, B = 0.25
## null estimate:
##        A        B        C        D 
## 0.096602 0.250000 0.435773 0.217625 
## (argmax, constrained optimization)
## Support for null:  -90.911 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##        A        B        C        D 
## 0.041861 0.641737 0.199516 0.116886 
## (argmax, free optimization)
## Support for alternative:  -76.415 + K
## 
## degrees of freedom: 1
## support difference = 14.495
## p-value: 7.2727e-08 (two sided)

Above we see a \(p\)-value of about \(7\times 10^{-8}\), clearly significant. However, we may also test the weaker hypothesis that \(p_B\) is equal to the second strongest, \(p_C\):

samep.test(H2,c("B","C"))
## 
##  Constrained support maximization
## 
## data:  H2
## null hypothesis: B = C
## null estimate:
##       A       B       C       D 
## 0.05612 0.39393 0.39393 0.15602 
## (argmax, constrained optimization)
## Support for null:  -82.144 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##        A        B        C        D 
## 0.041861 0.641737 0.199516 0.116886 
## (argmax, free optimization)
## Support for alternative:  -76.415 + K
## 
## degrees of freedom: 1
## support difference = 5.7286
## p-value: 0.00071218

We see a \(p\)-value of about \(7\times 10^{-4}\), much larger than the value from specificp.test() of \(H_0\colon p_B=\frac{1}{4}\). This is expected: under the null of samep.test(), \(p_C\) is free to vary alongside \(p_B\), so the null is less constrained and the result less significant.

1 Acid analysis

Turner et al. (2020) go on to assess whether a model of tartness using acetic and gluconic acid concentrations is appropriate; that is, whether the strengths are log-linear in the acid concentrations, \(p_i\propto\exp(v_1\,\mathrm{acetic}_i + v_2\,\mathrm{gluconic}_i)\). This is possible using the hyper2 formalism as follows.

features <- data.frame(salad = LETTERS[1:4],
                       acetic = c(0.5, 0.5, 1, 0),
                       gluconic = c(0, 10, 0, 10))
features
##   salad acetic gluconic
## 1     A    0.5        0
## 2     B    0.5       10
## 3     C    1.0        0
## 4     D    0.0       10
(M <- as.matrix(features[,-1]))
##      acetic gluconic
## [1,]    0.5        0
## [2,]    0.5       10
## [3,]    1.0        0
## [4,]    0.0       10
makestrengths <- function(v){
    out <- exp(M %*% v)     # log-linear strengths: p_i proportional to exp(M[i,] %*% v)
    out <- c(out/sum(out))  # normalize to sum to one
    out <- pmax(out,0)      # safety net (exp() is nonnegative, so this is a no-op)
    names(out) <- LETTERS[1:4]
    return(out)
}
makestrengths(c(2,1))
##          A          B          C          D 
## 3.3186e-05 7.3097e-01 9.0209e-05 2.6891e-01
f <- function(v){loglik(makestrengths(v),H2)}
(maxo <- optim(par=c(3, 0.3), fn=f,control=list(fnscale = -1)))
## $par
## [1] 3.27419 0.27392
## 
## $value
## [1] -76.452
## 
## $counts
## function gradient 
##       53       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
makestrengths(maxo$par)
##        A        B        C        D 
## 0.040608 0.628403 0.208736 0.122252
loglik(makestrengths(maxo$par),H1)
## [1] -76.452
maxp(H1,give=1)
## [[1]]
##        A        B        C        D 
## 0.041861 0.641737 0.199516 0.116886 
## 
## $`Log-likelihood`
## [1] -76.415
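The acid model attains a maximized support of \(-76.452\), only \(0.037\) units below the free maximum of \(-76.415\). The free model has three degrees of freedom (four strengths summing to one) against the acid model's two, so Wilks' theorem gives a one-degree-of-freedom likelihood-ratio test. A sketch using the values computed above:

```r
# Likelihood-ratio test of the two-parameter acid model against the
# free model, using the two maximized supports computed above
support_free <- -76.415   # from maxp(H1, give = 1)
support_acid <- -76.452   # from optim() under the acid model
pchisq(2 * (support_free - support_acid), df = 1, lower.tail = FALSE)
## about 0.79: the acid model is not rejected at any conventional level
```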

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.
Hatzinger, R., and R. Dittrich. 2012. “Prefmod: An R Package for Modeling Preferences Based on Paired Comparisons, Rankings, or Ratings.” Journal of Statistical Software 48: 1–31.
Turner, H. L., Jacob van Etten, David Firth, and Ioannis Kosmidis. 2020. “Modelling Rankings in R: The PlackettLuce Package.” Computational Statistics 35: 1027–57. https://doi.org/10.1007/s00180-020-00959-3.