To cite the hyper2 package in publications, please use Hankin (2017).
In this short vignette I use the hyper2 package to analyse an
influential Counter-Strike match between two professional teams (FaZe
Clan and Cloud9) in the major championships at Boston 2018. I use
likelihood methods to demonstrate that one player, Skadoodle, has a
legitimate claim to be the strongest player on the field; but the data
is indistinguishable from a null of equal player strength.
Counter-strike dataset kindly supplied by Zachary Hankin.
E-sports are a form of competition using video games. E-sports are becoming increasingly popular, with high-profile tournaments attracting over 400 million viewers, and prize pools exceeding US$20m.
Counter Strike: Global Offensive (CS-GO) is a multiplayer first-person shooter game in which two teams of five compete in an immersive virtual reality combat environment. CS-GO is distinguished by the ability to download detailed gamefiles in which every (sic) aspect of an entire match is recorded, it being possible to replay the match at will.
Statistical analysis of such gamefiles is extremely difficult, primarily due to complex gameplay features such as cooperative teamwork, within-team communication, and real-time strategic fluidity.
“ELEAGUE Major: Boston 2018”, here denoted “Boston 2018”, was a major championship held in Boston Massachusetts from 26-28 January 2018 featuring 24 professional teams; the purse was US$1000000.
The final pitted FaZe Clan against Cloud9, with Cloud9 beating favourites FaZe Clan 2-1.
The dataset used here is a list whose six elements correspond to six rounds of play, which are assumed to be statistically independent. The R idiom is:
zacslist <- list(
round1 = c("Skadoodle","olofmeister","tarik","GuardiaN","RUSH",
"rain", "Stewie2k","karrigan","autimatic","NiKo"),
round2 = c("karrigan", "NiKo", "Stewie2K", "RUSH", "rain", "GuardiaN",
"autimatic", "olofmeister"),
round3 = c("rain","tarik","autimatic", "karrigan","RUSH","GuardiaN",
"Stewie2K","NiKo","olofmeister"),
round4 = c("rain","GuardiaN","karrigan", "NiKo","olofmeister"),
round5 = c("olofmeister","rain","karrigan", "tarik","Stewie2K","autimatic"),
round6 = c("GuardiaN","karrigan")
)
(data kindly supplied by Zachary Hankin). Thus in round 1, the death order was Skadoodle first, followed by olofmeister, then tarik.
Suppose team1 = (a,b,c,d) and team2 = (x,y,z); suppose the
deathorder were (a,b,y,c). Then the likelihood function for a’s
death would be (x+y+z)/(a+b+c+d+x+y+z) [that is, the likelihood
function for a Bernoulli trial between team (x,y,z) and
(a,b,c,d)]. Note the appearance of (a) in the denominator: he
[surely!] was on the losing team. The likelihood function for the
full deathorder is then
(x+y+z)/(a+b+c+d+x+y+z) * (x+y+z)/(b+c+d+x+y+z) * (c+d)/(c+d+x+y+z) * (x+z)/(c+d+x+z)
where the four terms correspond to the deaths of a,b,y,c
respectively. See how the denominator gets shorter as the teams die
one by one.
team1 <- c("autimatic","tarik", "Skadoodle","Stewie2k","RUSH") #Cloud9
team2 <- c("NiKo","olofmeister","karrigan","GuardiaN","rain") # FaZe Clan
Consider the following function:
`counterstrike_maker` <- function(team1,team2,deathorder){
if(identical(sort(c(team1,team2)),sort(deathorder))){
deathorder <- deathorder[-length(deathorder)] # last player not killed
}
H <- hyper2(pnames=c(team1,team2))
for(killed_player in deathorder){
if(killed_player %in% team1){
H[team2] %<>% inc
H[c(team1,team2)] %<>% dec
team1 %<>% "["(team1 != killed_player)
} else {
H[team1] %<>% inc
H[c(team1,team2)] %<>% dec
team2 %<>% "["(team2 != killed_player)
}
}
return(H)
}
Function counterstrike_maker() defined above returns a
log-likelihood function for the strengths of the players in team1
and team2 above. The function needs the identities of all players
on each team. It assumes that players are always killed by the
opposing team; if a player is killed by his own team, function
counterstrike_maker() may return an error. The function does not
have strong error-checking functionality.
In the function, the deathorder argument specifies the order in
which players were killed (the first element is the first player to be
killed and so on). Note that the identity of the killer is not needed
as it is assumed that a death is due to the combined strength of the
team, rather than the individual shooter who actually fired the shot.
Note that it is not required for all the players to be killed;
short rounds are OK (but are not as statistically informative).
We can use it:
counterstrike <- hyper2(pnames=c(team1,team2))
for(i in zacslist){
counterstrike <- counterstrike + (counterstrike_maker(team1,team2, deathorder=i))
}
then plot it:
counterstrike_maxp <- maxp(counterstrike)
dotchart(counterstrike_maxp,pch=16,main='observed data')
showing that Skadoodle has the highest estimated strength.
it is possible to analyse synthetic data generated randomly by
function rdeath():
`rdeath` <- function(team1,team2){
out <- NULL
while(length(team1)>0 & length(team2)>0){
if(runif(1)>length(team1)/(length(c(team1,team2)))){ ## killed person is on team1
killed_player <- sample(team1,1)
team1 %<>% "["(team1 != killed_player)
} else { ## killed person is on team2
killed_player <- sample(team2,1)
team2 %<>% "["(team2 != killed_player)
}
out <- c(out,killed_player)
}
return(out)
}
Hrand <- hyper2(pnames=c(team1,team2))
for(i in 1:6){
pp <- rdeath(team1,team2)
Hrand <- Hrand + counterstrike_maker(team1,team2, pp)
}
dotchart(maxp(Hrand),pch=16,main='synthetic data')
We now run \(n=1000\) trials on the assumption of equal player strength (that is, random deathorder) and for each trial calculate the log-likelihood ratio [for the hypothesis p=equal_strengths, vs the hypothesis p=maxp(Hrand)]. It then draws a histogram of the loglikelihood ratio thus obtained, and superimposes a line showing the observed loglikelihood ratio.
datapoint <- loglik(indep(maxp(counterstrike)),counterstrike) - loglik(indep(equalp(counterstrike)),counterstrike)
n <- 100 # takes about 5 minutes for n=1000 trials
distrib <- rep(0,n)
for(i in seq_len(n)){
Hrand <- hyper2(pnames=c(team1,team2))
for(jj in 1:6){
pp <- rdeath(team1,team2)
Hrand <- Hrand + counterstrike_maker(team1,team2, pp)
}
distrib[i] <- loglik(indep(maxp(Hrand)),Hrand) - loglik(indep(equalp(counterstrike)),Hrand)
}
hist(distrib)
abline(v=datapoint,lwd=6)
Thus we can see that the observation is not significant: from this perspective, the deathorder may as well be totally random.
Following lines create counterstrike.rda, residing in the data/ directory of the package.
save(counterstrike,counterstrike_maxp,zacslist,file="counterstrike.rda")
I now analyse every game played by Cloud9 in the tournament.
https://www.hltv.org/stats/teams/matches/5752/cloud9?event=3247
As far as I can tell this is every match played at ELEAGUE Major 2018
in Boston by Cloud9.
a <- read.table("cloud9_boston_2018.txt",header=TRUE)
a
## date opponent map result
## 1 28/01/2018 FaZe inferno W
## 2 28/01/2018 FaZe overpass W
## 3 28/01/2018 FaZe mirage L
## 4 28/01/2018 SK inferno W
## 5 28/01/2018 SK cobblestone L
## 6 28/01/2018 SK mirage W
## 7 28/01/2018 G2 overpass W
## 8 28/01/2018 G2 mirage W
## 9 28/01/2018 Vega mirage W
## 10 28/01/2018 Astralis train W
## 11 28/01/2018 Virtus mirage W
## 12 28/01/2018 Space cobblestone L
## 13 28/01/2018 G2 cache L
## deathorder
## 1 Skadoodle, RUSH, tarik, autimatic, Stewie2K
## 2 RUSH, Stewie2K, Skadoodle, tarik, autimatic
## 3 Skadoodle, RUSH, autimatic, Stewie2K, tarik
## 4 Stewie2K, autimatic, tarik, RUSH, Skadoodle
## 5 Skadoodle, Stewie2K, autimatic, tarik, RUSH
## 6 Skadoodle, autimatic, RUSH, tarik, Stewie2K
## 7 RUSH, Skadoodle, autimatic, Stewie2K, tarik
## 8 Stewie2K, autimatic, tarik, RUSH, Skadoodle
## 9 tarik, RUSH, Stewie2K, autimatic, Skadoodle
## 10 Skadoodle, RUSH, autimatic, Stewie2K, tarik
## 11 Skadoodle, RUSH, tarik, autimatic, Stewie2K
## 12 RUSH, Skadoodle, autimatic, tarik, Stewie2K
## 13 tarik, Stewie2K, RUSH, autimatic, Skadoodle
Above we see from the first line that on 28 Jan 2018, playing FaZe
on the Inferno map we have a win, with deathorder Skadoodle, RUSH,
etc.
We will look at the effect of Skadoodle on Stewie2K.
H <- hyper2()
for(i in seq_len(nrow(a))){
d <- strsplit(a$deathorder[i],", ")[[1]]
H <- H + suppfun(d)
}
H
## log(RUSH^12 * (RUSH + Skadoodle)^-2 * (RUSH + Skadoodle + Stewie2K +
## autimatic)^-2 * (RUSH + Skadoodle + Stewie2K + autimatic + tarik)^-13 *
## (RUSH + Skadoodle + autimatic)^-1 * (RUSH + Skadoodle + autimatic +
## tarik)^-2 * (RUSH + Skadoodle + tarik)^-2 * (RUSH + Stewie2K +
## autimatic + tarik)^-6 * (RUSH + Stewie2K + tarik)^-1 * (RUSH +
## autimatic + tarik)^-1 * (RUSH + tarik)^-1 * Skadoodle^9 * (Skadoodle +
## Stewie2K + autimatic)^-1 * (Skadoodle + Stewie2K + autimatic +
## tarik)^-3 * (Skadoodle + autimatic)^-2 * (Skadoodle + autimatic +
## tarik)^-1 * Stewie2K^9 * (Stewie2K + autimatic)^-2 * (Stewie2K +
## autimatic + tarik)^-6 * (Stewie2K + tarik)^-5 * autimatic^12 *
## (autimatic + tarik)^-1 * tarik^10)
samep.test(H,c("Skadoodle","Stewie2K"))
##
## Constrained support maximization
##
## data: H
## null hypothesis: Skadoodle = Stewie2K
## null estimate:
## autimatic RUSH Skadoodle Stewie2K tarik
## 0.19538 0.30628 0.17498 0.17498 0.14837
## (argmax, constrained optimization)
## Support for null: -60.944 + K
##
## alternative hypothesis: sum p_i=1
## alternative estimate:
## autimatic RUSH Skadoodle Stewie2K tarik
## 0.18900 0.30759 0.23541 0.13027 0.13773
## (argmax, free optimization)
## Support for alternative: -60.354 + K
##
## degrees of freedom: 1
## support difference = 0.59006
## p-value: 0.27733
ec <-c(autimatic=1, RUSH=1, Skadoodle=2, Stewie2K=2, tarik=1)
mk <- function(lam){
H3 <- hyper3()
for(i in seq_len(nrow(a))){
d <- strsplit(a$deathorder[i],", ")[[1]]
H3 <- H3 + cheering3(d, e=ec, help=c(1,lam))
}
return(H3)
}
Find the MLE for \(\lambda\):
f <- function(l){maxp(mk(l), give=1)$`Log-likelihood`}
optlam <- optimize(f,c(0.09, 0.14), maximum = TRUE)
optlam
## $maximum
## [1] 0.10402
##
## $objective
## [1] -54.763
And the BT strengths at the evaluate:
maxp(mk(optlam$maximum))
## autimatic RUSH Skadoodle Stewie2K tarik
## 0.102781 0.184221 0.365279 0.279514 0.068205
Now some profile likelihood
maxp(mk(0.5),give=1)
## [[1]]
## autimatic RUSH Skadoodle Stewie2K tarik
## 0.16193 0.26848 0.27956 0.17518 0.11485
##
## $`Log-likelihood`
## [1] -57.37
maxp(mk(1.01),give=1)
## [[1]]
## autimatic RUSH Skadoodle Stewie2K tarik
## 0.18938 0.30815 0.23476 0.12965 0.13806
##
## $`Log-likelihood`
## [1] -60.406
f <- function(l){maxp(mk(l), give=1)$`Log-likelihood`}
lam <- exp(seq(from = -4,to = log(1.1), len = 33))
L <- sapply(lam, f)
g <- function(l){f(l) - optlam$objective + 2}
jj1 <- uniroot(g, interval = exp(c(-4,-3)))
jj2 <- uniroot(g, interval = exp(c(-1,-0)))
jj1
## $root
## [1] 0.022161
##
## $f.root
## [1] -0.00090392
##
## $iter
## [1] 4
##
## $init.it
## [1] NA
##
## $estim.prec
## [1] 6.1035e-05
jj2
## $root
## [1] 0.41435
##
## $f.root
## [1] 1.1667e-07
##
## $iter
## [1] 3
##
## $init.it
## [1] NA
##
## $estim.prec
## [1] 8.9495e-05
plot(log(lam), L - max(L), type='l')
segments(x0=log(optlam$maximum),y0 = -0.5, y1 = 0.1,col='red')
abline(h=c(0,-2), col = 'gray')
segments(x0 = log(jj1$root), y0 = -2.2, y1 = -1.8, col = 'blue')
segments(x0 = log(jj2$root), y0 = -2.2, y1 = -1.8, col = 'blue')
abline(v = 0, lty = 2)
hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.