(takes about ten minutes run without cache)
Here I use the concept of dominated set to simplify a Plackett-Luce likelihood function.
This analysis is essentially without value, see the very end for why
To cite the hyper2 package in publications, please use
Hankin (2017). The global liveability
ranking
shows the nicest places to live. Here I apply Plackett-Luce
likelihoods but use flawed logic to eliminate poorly-ranked cities.
The dataset comprises seven observations covering years 2015-2022. Each year, ten or eleven cities are ranked in order of liveability, according to a number of criteria such as prevalence of crime, healthcare, education, and infrastructure.
(glr <- readLines("global_liveability_ranking.txt"))
## [1] "2022 Vienna Copenhagen Zurich Calgary Vancouver Geneva Frankfurt Toronto Amsterdam Osaka"
## [2] "2021 Auckland Osaka Adelaide Wellington Tokyo Perth Zurich Geneva Melbourne Brisbane"
## [3] "2019 Vienna Melbourne Sydney Osaka Calgary Vancouver Tokyo Toronto Copenhagen Adelaide"
## [4] "2018 Vienna Melbourne Osaka Calgary Sydney Vancouver Tokyo Toronto Copenhagen Adelaide"
## [5] "2017 Melbourne Vienna Vancouver Toronto Adelaide Calgary Perth Auckland Helsinki Hamburg"
## [6] "2016 Melbourne Vienna Vancouver Toronto Adelaide Calgary Perth Auckland Helsinki Hamburg"
## [7] "2015 Melbourne Vienna Vancouver Toronto Adelaide Calgary Sydney Perth Auckland Helsinki Zurich"
(o <- glr %>% strsplit(" ") %>% lapply(function(l){l[-1]}))
## [[1]]
## [1] "Vienna" "Copenhagen" "Zurich" "Calgary" "Vancouver"
## [6] "Geneva" "Frankfurt" "Toronto" "Amsterdam" "Osaka"
##
## [[2]]
## [1] "Auckland" "Osaka" "Adelaide" "Wellington" "Tokyo"
## [6] "Perth" "Zurich" "Geneva" "Melbourne" "Brisbane"
##
## [[3]]
## [1] "Vienna" "Melbourne" "Sydney" "Osaka" "Calgary"
## [6] "Vancouver" "Tokyo" "Toronto" "Copenhagen" "Adelaide"
##
## [[4]]
## [1] "Vienna" "Melbourne" "Osaka" "Calgary" "Sydney"
## [6] "Vancouver" "Tokyo" "Toronto" "Copenhagen" "Adelaide"
##
## [[5]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Perth" "Auckland" "Helsinki" "Hamburg"
##
## [[6]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Perth" "Auckland" "Helsinki" "Hamburg"
##
## [[7]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Sydney" "Perth" "Auckland" "Helsinki" "Zurich"
Will remove Helsinki, Brisbane, and Hamburg on account of their having very low rank scores (if we leave them in, the Hessian becomes non-positive-definite at the evaluate).
o %<>% lapply(function(x){x[!(x %in% c("Hamburg","Helsinki","Brisbane"))]})
o
## [[1]]
## [1] "Vienna" "Copenhagen" "Zurich" "Calgary" "Vancouver"
## [6] "Geneva" "Frankfurt" "Toronto" "Amsterdam" "Osaka"
##
## [[2]]
## [1] "Auckland" "Osaka" "Adelaide" "Wellington" "Tokyo"
## [6] "Perth" "Zurich" "Geneva" "Melbourne"
##
## [[3]]
## [1] "Vienna" "Melbourne" "Sydney" "Osaka" "Calgary"
## [6] "Vancouver" "Tokyo" "Toronto" "Copenhagen" "Adelaide"
##
## [[4]]
## [1] "Vienna" "Melbourne" "Osaka" "Calgary" "Sydney"
## [6] "Vancouver" "Tokyo" "Toronto" "Copenhagen" "Adelaide"
##
## [[5]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Perth" "Auckland"
##
## [[6]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Perth" "Auckland"
##
## [[7]]
## [1] "Melbourne" "Vienna" "Vancouver" "Toronto" "Adelaide" "Calgary"
## [7] "Sydney" "Perth" "Auckland" "Zurich"
Now, turn it into a hyper2 object:
H <- hyper2()
for(i in seq_along(o)){
H <- H + race(o[[i]])
}
And do some tests:
(mH <- maxp(H))
## Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## 0.0150413 0.0134252 0.0051853 0.0316141 0.0181273 0.0268233 0.0276787
## Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver
## 0.0874875 0.0295047 0.0115708 0.0412829 0.0337384 0.0291859 0.0650246
## Vienna Wellington Zurich
## 0.4885222 0.0655711 0.0102166
pie(mH)
equalp.test(H)
##
## Constrained support maximization
##
## data: H
## null hypothesis: Adelaide = Amsterdam = Auckland = Calgary = Copenhagen = Frankfurt = Geneva = Melbourne = Osaka = Perth = Sydney = Tokyo = Toronto = Vancouver = Vienna = Wellington = Zurich
## null estimate:
## Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## 0.058824 0.058824 0.058824 0.058824 0.058824 0.058824 0.058824
## Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver
## 0.058824 0.058824 0.058824 0.058824 0.058824 0.058824 0.058824
## Vienna Wellington Zurich
## 0.058824 0.058824 0.058824
## (argmax, constrained optimization)
## Support for null: -94.429 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## 0.0150413 0.0134252 0.0051853 0.0316141 0.0181273 0.0268233 0.0276787
## Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver
## 0.0874875 0.0295047 0.0115708 0.0412829 0.0337384 0.0291859 0.0650246
## Vienna Wellington Zurich
## 0.4885222 0.0655711 0.0102166
## (argmax, free optimization)
## Support for alternative: -78.746 + K
##
## degrees of freedom: 16
## support difference = 15.683
## p-value: 0.01208
jjm <- maxp(H, n=1, give=TRUE, hessian=TRUE)
jjm$hessian
## Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## Adelaide 27390 13252 18321 12420 10469 13252 13422
## Amsterdam 13252 17693 19149 13292 13061 12849 13162
## Auckland 18321 19149 87546 17474 19130 19149 19346
## Calgary 12420 13292 17474 17513 13199 13292 13502
## Copenhagen 10469 13061 19130 13199 19435 13061 13271
## Frankfurt 13252 12849 19149 13292 13061 14241 13162
## Geneva 13422 13162 19346 13502 13271 13162 15700
## Melbourne 13363 13273 19307 13499 13235 13273 13197
## Osaka 13184 12139 19161 13249 13018 12874 13162
## Perth 14194 15049 10423 13374 15031 15049 15113
## Sydney 13441 13527 19113 13461 13418 13527 13737
## Tokyo 12837 13154 19193 13292 12705 13154 13271
## Toronto 11984 12732 18826 12856 12190 12924 13236
## Vancouver 12949 13041 19060 13090 12873 13041 13250
## Vienna 13207 13065 19106 13301 13044 13065 13275
## Wellington 13234 13120 19159 13371 13102 13120 13271
## Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver Vienna
## Adelaide 13363 13184 14194 13441 12837 11984 12949 13207
## Amsterdam 13273 12139 15049 13527 13154 12732 13041 13065
## Auckland 19307 19161 10423 19113 19193 18826 19060 19106
## Calgary 13499 13249 13374 13461 13292 12856 13090 13301
## Copenhagen 13235 13018 15031 13418 12705 12190 12873 13044
## Frankfurt 13273 12874 15049 13527 13154 12924 13041 13065
## Geneva 13197 13162 15113 13737 13271 13236 13250 13275
## Melbourne 13792 13254 15073 13682 13218 13323 13275 13234
## Osaka 13254 15521 15049 13504 13086 12689 12997 13087
## Perth 15073 15049 36291 15013 15013 14727 14960 15007
## Sydney 13682 13504 15013 15161 13493 13436 13436 13490
## Tokyo 13218 13086 15013 13493 15325 12798 12966 13121
## Toronto 13323 12689 14727 13436 12798 18516 12785 13125
## Vancouver 13275 12997 14960 13436 12966 12785 14156 13076
## Vienna 13234 13087 15007 13490 13121 13125 13076 13045
## Wellington 13237 13120 15013 13549 13118 13195 13146 13089
## Wellington
## Adelaide 13234
## Amsterdam 13120
## Auckland 19159
## Calgary 13371
## Copenhagen 13102
## Frankfurt 13120
## Geneva 13271
## Melbourne 13237
## Osaka 13120
## Perth 15013
## Sydney 13549
## Tokyo 13118
## Toronto 13195
## Vancouver 13146
## Vienna 13089
## Wellington 13317
diag(jjm$hessian)
## Adelaide Amsterdam Auckland Calgary Copenhagen Frankfurt Geneva
## 27390 17693 87546 17513 19435 14241 15700
## Melbourne Osaka Perth Sydney Tokyo Toronto Vancouver
## 13792 15521 36291 15161 15325 18516 14156
## Vienna Wellington
## 13045 13317
eigen(jjm$hessian)$values
## [1] 238481.617 57579.661 16426.937 13723.200 6643.586 5007.572
## [7] 4632.761 3064.410 2326.895 2146.349 1753.594 1193.100
## [13] 1028.760 394.781 212.962 26.642
So the Hessian is positive-definite [all the eigenvalues are
positive]. Also, the evaluate is robust, as indicated by
consistency():
consistency(H)
Now we tether some of the cities together:
elastic <- list(
Austria = c("Vienna"),
Denmark = c("Copenhagen"),
Switzerland = c("Zurich","Geneva"),
Canada = c("Calgary","Vancouver","Toronto"),
Germany = c("Frankfurt","Hamburg"),
Netherlands = c("Amsterdam"),
Japan = c("Osaka","Tokyo"),
Australia = c("Melbourne","Adelaide","Perth","Brisbane","Sydney"),
NZ = c("Auckland","Wellington"),
Finland = c("Helsinki")
)
Now create a hyper3 object:
ML <- function(h){
H3 <- hyper3()
for(r in o){ H3 <- H3 + cheering3(r, e=list2nv(elastic), help=rep(h,10))}
return(H3)
}
f <- function(h){maxp(ML(h), give=TRUE, n=1)$likes}
And plot a profile likelihood
l <- seq(from=1, to=5, len=10)
L <- sapply(l, f)
and plot it:
plot(log(l), L-max(L), type='b')
We will try dividing the countries into continents:
elastic_continent <- list(
Europe = c("Vienna","Copenhagen","Zurich","Geneva","Frankfurt","Hamburg","Amsterdam","Helsinki"),
America = c("Calgary","Vancouver","Toronto"),
Asia = c("Osaka","Tokyo"),
Australasia = c("Melbourne","Adelaide","Perth","Brisbane","Sydney","Auckland","Wellington")
)
ec <- NULL
for(i in seq_along(elastic_continent)){
jj <- rep(i,length(elastic_continent[[i]]))
names(jj) <- elastic_continent[[i]]
ec <- c(ec,jj)
}
ec
## Vienna Copenhagen Zurich Geneva Frankfurt Hamburg Amsterdam
## 1 1 1 1 1 1 1
## Helsinki Calgary Vancouver Toronto Osaka Tokyo Melbourne
## 1 2 2 2 3 3 4
## Adelaide Perth Brisbane Sydney Auckland Wellington
## 4 4 4 4 4 4
MLc <- function(h){
H3 <- hyper3()
for(r in o){ H3 <- H3 + cheering3(r,e=ec,help=rep(h,4))}
return(H3)
}
fc <- function(h){maxp(ML(h),give=TRUE,n=1)$likes}
l <- seq(from=0.8,to=3,len=10)
L <- sapply(l,fc)
plot(l,L-max(L),type="b")
abline(v=1)
We need to replace each city with its country:
elastic
## $Austria
## [1] "Vienna"
##
## $Denmark
## [1] "Copenhagen"
##
## $Switzerland
## [1] "Zurich" "Geneva"
##
## $Canada
## [1] "Calgary" "Vancouver" "Toronto"
##
## $Germany
## [1] "Frankfurt" "Hamburg"
##
## $Netherlands
## [1] "Amsterdam"
##
## $Japan
## [1] "Osaka" "Tokyo"
##
## $Australia
## [1] "Melbourne" "Adelaide" "Perth" "Brisbane" "Sydney"
##
## $NZ
## [1] "Auckland" "Wellington"
##
## $Finland
## [1] "Helsinki"
(jj <- list2nv(elastic,FALSE))
## Vienna Copenhagen Zurich Geneva Calgary
## "Austria" "Denmark" "Switzerland" "Switzerland" "Canada"
## Vancouver Toronto Frankfurt Hamburg Amsterdam
## "Canada" "Canada" "Germany" "Germany" "Netherlands"
## Osaka Tokyo Melbourne Adelaide Perth
## "Japan" "Japan" "Australia" "Australia" "Australia"
## Brisbane Sydney Auckland Wellington Helsinki
## "Australia" "Australia" "NZ" "NZ" "Finland"
o_countries <- lapply(o, function(x){jj[names(jj) %in% x]})
(o_countries <- lapply(o_countries, as.vector))
## [[1]]
## [1] "Austria" "Denmark" "Switzerland" "Switzerland" "Canada"
## [6] "Canada" "Canada" "Germany" "Netherlands" "Japan"
##
## [[2]]
## [1] "Switzerland" "Switzerland" "Japan" "Japan" "Australia"
## [6] "Australia" "Australia" "NZ" "NZ"
##
## [[3]]
## [1] "Austria" "Denmark" "Canada" "Canada" "Canada" "Japan"
## [7] "Japan" "Australia" "Australia" "Australia"
##
## [[4]]
## [1] "Austria" "Denmark" "Canada" "Canada" "Canada" "Japan"
## [7] "Japan" "Australia" "Australia" "Australia"
##
## [[5]]
## [1] "Austria" "Canada" "Canada" "Canada" "Australia" "Australia"
## [7] "Australia" "NZ"
##
## [[6]]
## [1] "Austria" "Canada" "Canada" "Canada" "Australia" "Australia"
## [7] "Australia" "NZ"
##
## [[7]]
## [1] "Austria" "Switzerland" "Canada" "Canada" "Canada"
## [6] "Australia" "Australia" "Australia" "Australia" "NZ"
We see that NZ and Australia are a dominated set [that is, they come last in every race], so we should remove them:
(o_countries <- lapply(o_countries, function(x){x[!(x %in% c("NZ","Australia","Japan"))]}))
## [[1]]
## [1] "Austria" "Denmark" "Switzerland" "Switzerland" "Canada"
## [6] "Canada" "Canada" "Germany" "Netherlands"
##
## [[2]]
## [1] "Switzerland" "Switzerland"
##
## [[3]]
## [1] "Austria" "Denmark" "Canada" "Canada" "Canada"
##
## [[4]]
## [1] "Austria" "Denmark" "Canada" "Canada" "Canada"
##
## [[5]]
## [1] "Austria" "Canada" "Canada" "Canada"
##
## [[6]]
## [1] "Austria" "Canada" "Canada" "Canada"
##
## [[7]]
## [1] "Austria" "Switzerland" "Canada" "Canada" "Canada"
The next dominated set to remove would be Canada, Germany, Netherlands; and then Switzerland. This leaves us with Austria. So that was a big fat waste of time.
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.