(takes about ten minutes run without cache)

Here I use the concept of dominated set to simplify a Plackett-Luce likelihood function.

This analysis is essentially without value, see the very end for why

To cite the hyper2 package in publications, please use Hankin (2017). The global liveability ranking shows the nicest places to live. Here I apply Plackett-Luce likelihoods but use flawed logic to eliminate poorly-ranked cities.

The dataset comprises seven observations covering years 2015-2022. Each year, ten or eleven cities are ranked in order of liveability, according to a number of criteria such as prevalence of crime, healthcare, education, and infrastructure.

(glr <- readLines("global_liveability_ranking.txt"))
## [1] "2022 Vienna Copenhagen Zurich Calgary Vancouver Geneva Frankfurt Toronto Amsterdam Osaka"      
## [2] "2021 Auckland Osaka Adelaide Wellington Tokyo Perth Zurich Geneva Melbourne Brisbane"          
## [3] "2019 Vienna Melbourne Sydney Osaka Calgary Vancouver Tokyo Toronto Copenhagen Adelaide"        
## [4] "2018 Vienna Melbourne Osaka Calgary Sydney Vancouver Tokyo Toronto Copenhagen Adelaide"        
## [5] "2017 Melbourne Vienna Vancouver Toronto Adelaide Calgary Perth Auckland Helsinki Hamburg"      
## [6] "2016 Melbourne Vienna Vancouver Toronto Adelaide Calgary Perth Auckland Helsinki Hamburg"      
## [7] "2015 Melbourne Vienna Vancouver Toronto Adelaide Calgary Sydney Perth Auckland Helsinki Zurich"
(o <- glr  %>% strsplit(" ") %>% lapply(function(l){l[-1]}))
## [[1]]
##  [1] "Vienna"     "Copenhagen" "Zurich"     "Calgary"    "Vancouver" 
##  [6] "Geneva"     "Frankfurt"  "Toronto"    "Amsterdam"  "Osaka"     
## 
## [[2]]
##  [1] "Auckland"   "Osaka"      "Adelaide"   "Wellington" "Tokyo"     
##  [6] "Perth"      "Zurich"     "Geneva"     "Melbourne"  "Brisbane"  
## 
## [[3]]
##  [1] "Vienna"     "Melbourne"  "Sydney"     "Osaka"      "Calgary"   
##  [6] "Vancouver"  "Tokyo"      "Toronto"    "Copenhagen" "Adelaide"  
## 
## [[4]]
##  [1] "Vienna"     "Melbourne"  "Osaka"      "Calgary"    "Sydney"    
##  [6] "Vancouver"  "Tokyo"      "Toronto"    "Copenhagen" "Adelaide"  
## 
## [[5]]
##  [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
##  [7] "Perth"     "Auckland"  "Helsinki"  "Hamburg"  
## 
## [[6]]
##  [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
##  [7] "Perth"     "Auckland"  "Helsinki"  "Hamburg"  
## 
## [[7]]
##  [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
##  [7] "Sydney"    "Perth"     "Auckland"  "Helsinki"  "Zurich"

Will remove Helsinki, Brisbane, and Hamburg on account of their having very low rank scores (if we leave them in, the Hessian becomes non-positive-definite at the evaluate).

o %<>%  lapply(function(x){x[!(x %in% c("Hamburg","Helsinki","Brisbane"))]})
o
## [[1]]
##  [1] "Vienna"     "Copenhagen" "Zurich"     "Calgary"    "Vancouver" 
##  [6] "Geneva"     "Frankfurt"  "Toronto"    "Amsterdam"  "Osaka"     
## 
## [[2]]
## [1] "Auckland"   "Osaka"      "Adelaide"   "Wellington" "Tokyo"     
## [6] "Perth"      "Zurich"     "Geneva"     "Melbourne" 
## 
## [[3]]
##  [1] "Vienna"     "Melbourne"  "Sydney"     "Osaka"      "Calgary"   
##  [6] "Vancouver"  "Tokyo"      "Toronto"    "Copenhagen" "Adelaide"  
## 
## [[4]]
##  [1] "Vienna"     "Melbourne"  "Osaka"      "Calgary"    "Sydney"    
##  [6] "Vancouver"  "Tokyo"      "Toronto"    "Copenhagen" "Adelaide"  
## 
## [[5]]
## [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
## [7] "Perth"     "Auckland" 
## 
## [[6]]
## [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
## [7] "Perth"     "Auckland" 
## 
## [[7]]
##  [1] "Melbourne" "Vienna"    "Vancouver" "Toronto"   "Adelaide"  "Calgary"  
##  [7] "Sydney"    "Perth"     "Auckland"  "Zurich"

Now, turn it into a hyper2 object:

H <- hyper2()
for(i in seq_along(o)){
  H <- H  + race(o[[i]])
}

And do some tests:

(mH <- maxp(H))
##   Adelaide  Amsterdam   Auckland    Calgary Copenhagen  Frankfurt     Geneva 
##  0.0150413  0.0134252  0.0051853  0.0316141  0.0181273  0.0268233  0.0276787 
##  Melbourne      Osaka      Perth     Sydney      Tokyo    Toronto  Vancouver 
##  0.0874875  0.0295047  0.0115708  0.0412829  0.0337384  0.0291859  0.0650246 
##     Vienna Wellington     Zurich 
##  0.4885222  0.0655711  0.0102166
pie(mH)

equalp.test(H)
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: Adelaide = Amsterdam = Auckland = Calgary = Copenhagen = Frankfurt = Geneva = Melbourne = Osaka = Perth = Sydney = Tokyo = Toronto = Vancouver = Vienna = Wellington = Zurich
## null estimate:
##   Adelaide  Amsterdam   Auckland    Calgary Copenhagen  Frankfurt     Geneva 
##   0.058824   0.058824   0.058824   0.058824   0.058824   0.058824   0.058824 
##  Melbourne      Osaka      Perth     Sydney      Tokyo    Toronto  Vancouver 
##   0.058824   0.058824   0.058824   0.058824   0.058824   0.058824   0.058824 
##     Vienna Wellington     Zurich 
##   0.058824   0.058824   0.058824 
## (argmax, constrained optimization)
## Support for null:  -94.429 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
##   Adelaide  Amsterdam   Auckland    Calgary Copenhagen  Frankfurt     Geneva 
##  0.0150413  0.0134252  0.0051853  0.0316141  0.0181273  0.0268233  0.0276787 
##  Melbourne      Osaka      Perth     Sydney      Tokyo    Toronto  Vancouver 
##  0.0874875  0.0295047  0.0115708  0.0412829  0.0337384  0.0291859  0.0650246 
##     Vienna Wellington     Zurich 
##  0.4885222  0.0655711  0.0102166 
## (argmax, free optimization)
## Support for alternative:  -78.746 + K
## 
## degrees of freedom: 16
## support difference = 15.683
## p-value: 0.01208
jjm <- maxp(H, n=1, give=TRUE, hessian=TRUE)
jjm$hessian
##            Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## Adelaide      27390     13252    18321   12420      10469     13252  13422
## Amsterdam     13252     17693    19149   13292      13061     12849  13162
## Auckland      18321     19149    87546   17474      19130     19149  19346
## Calgary       12420     13292    17474   17513      13199     13292  13502
## Copenhagen    10469     13061    19130   13199      19435     13061  13271
## Frankfurt     13252     12849    19149   13292      13061     14241  13162
## Geneva        13422     13162    19346   13502      13271     13162  15700
## Melbourne     13363     13273    19307   13499      13235     13273  13197
## Osaka         13184     12139    19161   13249      13018     12874  13162
## Perth         14194     15049    10423   13374      15031     15049  15113
## Sydney        13441     13527    19113   13461      13418     13527  13737
## Tokyo         12837     13154    19193   13292      12705     13154  13271
## Toronto       11984     12732    18826   12856      12190     12924  13236
## Vancouver     12949     13041    19060   13090      12873     13041  13250
## Vienna        13207     13065    19106   13301      13044     13065  13275
## Wellington    13234     13120    19159   13371      13102     13120  13271
##            Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver Vienna
## Adelaide       13363 13184 14194  13441 12837   11984     12949  13207
## Amsterdam      13273 12139 15049  13527 13154   12732     13041  13065
## Auckland       19307 19161 10423  19113 19193   18826     19060  19106
## Calgary        13499 13249 13374  13461 13292   12856     13090  13301
## Copenhagen     13235 13018 15031  13418 12705   12190     12873  13044
## Frankfurt      13273 12874 15049  13527 13154   12924     13041  13065
## Geneva         13197 13162 15113  13737 13271   13236     13250  13275
## Melbourne      13792 13254 15073  13682 13218   13323     13275  13234
## Osaka          13254 15521 15049  13504 13086   12689     12997  13087
## Perth          15073 15049 36291  15013 15013   14727     14960  15007
## Sydney         13682 13504 15013  15161 13493   13436     13436  13490
## Tokyo          13218 13086 15013  13493 15325   12798     12966  13121
## Toronto        13323 12689 14727  13436 12798   18516     12785  13125
## Vancouver      13275 12997 14960  13436 12966   12785     14156  13076
## Vienna         13234 13087 15007  13490 13121   13125     13076  13045
## Wellington     13237 13120 15013  13549 13118   13195     13146  13089
##            Wellington
## Adelaide        13234
## Amsterdam       13120
## Auckland        19159
## Calgary         13371
## Copenhagen      13102
## Frankfurt       13120
## Geneva          13271
## Melbourne       13237
## Osaka           13120
## Perth           15013
## Sydney          13549
## Tokyo           13118
## Toronto         13195
## Vancouver       13146
## Vienna          13089
## Wellington      13317
diag(jjm$hessian)
##   Adelaide  Amsterdam   Auckland    Calgary Copenhagen  Frankfurt     Geneva 
##      27390      17693      87546      17513      19435      14241      15700 
##  Melbourne      Osaka      Perth     Sydney      Tokyo    Toronto  Vancouver 
##      13792      15521      36291      15161      15325      18516      14156 
##     Vienna Wellington 
##      13045      13317
eigen(jjm$hessian)$values
##  [1] 238481.617  57579.661  16426.937  13723.200   6643.586   5007.572
##  [7]   4632.761   3064.410   2326.895   2146.349   1753.594   1193.100
## [13]   1028.760    394.781    212.962     26.642

So the Hessian is positive-definite [all the eigenvalues are positive]. Also, the evaluate is robust, as indicated by consistency():

consistency(H)

Now we tether some of the cities together:

elastic <- list(
Austria     = c("Vienna"),
Denmark     = c("Copenhagen"),
Switzerland = c("Zurich","Geneva"),
Canada      = c("Calgary","Vancouver","Toronto"),
Germany     = c("Frankfurt","Hamburg"),
Netherlands = c("Amsterdam"),
Japan       = c("Osaka","Tokyo"),
Australia   = c("Melbourne","Adelaide","Perth","Brisbane","Sydney"),
NZ          = c("Auckland","Wellington"),
Finland     = c("Helsinki")
)

Now create a hyper3 object:

ML <- function(h){
   H3 <- hyper3()
   for(r in o){ H3 <- H3 + cheering3(r, e=list2nv(elastic), help=rep(h,10))}
   return(H3)
}
f <- function(h){maxp(ML(h), give=TRUE, n=1)$likes}

And plot a profile likelihood

l <- seq(from=1, to=5, len=10)
L <- sapply(l, f)

and plot it:

plot(log(l), L-max(L), type='b')

1 A coarser division

We will try dividing the countries into continents:

elastic_continent <- list(
Europe      = c("Vienna","Copenhagen","Zurich","Geneva","Frankfurt","Hamburg","Amsterdam","Helsinki"),
America     = c("Calgary","Vancouver","Toronto"),
Asia        = c("Osaka","Tokyo"),
Australasia = c("Melbourne","Adelaide","Perth","Brisbane","Sydney","Auckland","Wellington")
)

ec <- NULL
for(i in seq_along(elastic_continent)){
  jj <- rep(i,length(elastic_continent[[i]]))
  names(jj) <- elastic_continent[[i]]
  ec <- c(ec,jj)
}
ec 
##     Vienna Copenhagen     Zurich     Geneva  Frankfurt    Hamburg  Amsterdam 
##          1          1          1          1          1          1          1 
##   Helsinki    Calgary  Vancouver    Toronto      Osaka      Tokyo  Melbourne 
##          1          2          2          2          3          3          4 
##   Adelaide      Perth   Brisbane     Sydney   Auckland Wellington 
##          4          4          4          4          4          4
MLc <- function(h){
   H3 <- hyper3()
   for(r in o){ H3 <- H3 + cheering3(r,e=ec,help=rep(h,4))}
   return(H3)
}
fc <- function(h){maxp(ML(h),give=TRUE,n=1)$likes}
l <- seq(from=0.8,to=3,len=10)
L <- sapply(l,fc)
plot(l,L-max(L),type="b")
abline(v=1)

1.1 Country-level analysis

We need to replace each city with its country:

elastic
## $Austria
## [1] "Vienna"
## 
## $Denmark
## [1] "Copenhagen"
## 
## $Switzerland
## [1] "Zurich" "Geneva"
## 
## $Canada
## [1] "Calgary"   "Vancouver" "Toronto"  
## 
## $Germany
## [1] "Frankfurt" "Hamburg"  
## 
## $Netherlands
## [1] "Amsterdam"
## 
## $Japan
## [1] "Osaka" "Tokyo"
## 
## $Australia
## [1] "Melbourne" "Adelaide"  "Perth"     "Brisbane"  "Sydney"   
## 
## $NZ
## [1] "Auckland"   "Wellington"
## 
## $Finland
## [1] "Helsinki"
(jj <- list2nv(elastic,FALSE))
##        Vienna    Copenhagen        Zurich        Geneva       Calgary 
##     "Austria"     "Denmark" "Switzerland" "Switzerland"      "Canada" 
##     Vancouver       Toronto     Frankfurt       Hamburg     Amsterdam 
##      "Canada"      "Canada"     "Germany"     "Germany" "Netherlands" 
##         Osaka         Tokyo     Melbourne      Adelaide         Perth 
##       "Japan"       "Japan"   "Australia"   "Australia"   "Australia" 
##      Brisbane        Sydney      Auckland    Wellington      Helsinki 
##   "Australia"   "Australia"          "NZ"          "NZ"     "Finland"
o_countries <- lapply(o, function(x){jj[names(jj) %in% x]})
(o_countries <- lapply(o_countries, as.vector))
## [[1]]
##  [1] "Austria"     "Denmark"     "Switzerland" "Switzerland" "Canada"     
##  [6] "Canada"      "Canada"      "Germany"     "Netherlands" "Japan"      
## 
## [[2]]
## [1] "Switzerland" "Switzerland" "Japan"       "Japan"       "Australia"  
## [6] "Australia"   "Australia"   "NZ"          "NZ"         
## 
## [[3]]
##  [1] "Austria"   "Denmark"   "Canada"    "Canada"    "Canada"    "Japan"    
##  [7] "Japan"     "Australia" "Australia" "Australia"
## 
## [[4]]
##  [1] "Austria"   "Denmark"   "Canada"    "Canada"    "Canada"    "Japan"    
##  [7] "Japan"     "Australia" "Australia" "Australia"
## 
## [[5]]
## [1] "Austria"   "Canada"    "Canada"    "Canada"    "Australia" "Australia"
## [7] "Australia" "NZ"       
## 
## [[6]]
## [1] "Austria"   "Canada"    "Canada"    "Canada"    "Australia" "Australia"
## [7] "Australia" "NZ"       
## 
## [[7]]
##  [1] "Austria"     "Switzerland" "Canada"      "Canada"      "Canada"     
##  [6] "Australia"   "Australia"   "Australia"   "Australia"   "NZ"

We see that NZ and Australia are a dominated set [that is, they come last in every race], so we should remove them:

(o_countries <- lapply(o_countries, function(x){x[!(x %in% c("NZ","Australia","Japan"))]}))
## [[1]]
## [1] "Austria"     "Denmark"     "Switzerland" "Switzerland" "Canada"     
## [6] "Canada"      "Canada"      "Germany"     "Netherlands"
## 
## [[2]]
## [1] "Switzerland" "Switzerland"
## 
## [[3]]
## [1] "Austria" "Denmark" "Canada"  "Canada"  "Canada" 
## 
## [[4]]
## [1] "Austria" "Denmark" "Canada"  "Canada"  "Canada" 
## 
## [[5]]
## [1] "Austria" "Canada"  "Canada"  "Canada" 
## 
## [[6]]
## [1] "Austria" "Canada"  "Canada"  "Canada" 
## 
## [[7]]
## [1] "Austria"     "Switzerland" "Canada"      "Canada"      "Canada"

The next dominated set to remove would be Canada, Germany, Netherlands; and then Switzerland. This leaves us with Austria. So that was a big fat waste of time.

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.