To cite the hyper2 package in publications, please use Hankin (2017). In this short vignette I use the hyper2 package to analyse an influential Counter-Strike match between two professional teams (FaZe Clan and Cloud9) in the major championships at Boston 2018. I use likelihood methods to demonstrate that one player, Skadoodle, has a legitimate claim to be the strongest player on the field; but the data is indistinguishable from a null of equal player strength. Counter-strike dataset kindly supplied by Zachary Hankin.

1 Counterstrike

E-sports are a form of competition using video games. E-sports are becoming increasingly popular, with high-profile tournaments attracting over 400 million viewers, and prize pools exceeding US$20m.

Counter Strike: Global Offensive (CS-GO) is a multiplayer first-person shooter game in which two teams of five compete in an immersive virtual reality combat environment. CS-GO is distinguished by the ability to download detailed gamefiles in which every (sic) aspect of an entire match is recorded, it being possible to replay the match at will.

Statistical analysis of such gamefiles is extremely difficult, primarily due to complex gameplay features such as cooperative teamwork, within-team communication, and real-time strategic fluidity.

1.1 Boston 2018

“ELEAGUE Major: Boston 2018”, here denoted “Boston 2018”, was a major championship held in Boston Massachusetts from 26-28 January 2018 featuring 24 professional teams; the purse was US$1000000.

The final pitted FaZe Clan against Cloud9, with Cloud9 beating favourites FaZe Clan 2-1.

2 Dataset

The dataset used here is a list whose six elements correspond to six rounds of play, which are assumed to be statistically independent. The R idiom is:

zacslist <- list(
    round1 = c("Skadoodle","olofmeister","tarik","GuardiaN","RUSH",
         "rain", "Stewie2k","karrigan","autimatic","NiKo"),
    round2 = c("karrigan", "NiKo", "Stewie2K", "RUSH", "rain", "GuardiaN",
         "autimatic", "olofmeister"),
    round3 = c("rain","tarik","autimatic", "karrigan","RUSH","GuardiaN",
         "Stewie2K","NiKo","olofmeister"),
    round4 = c("rain","GuardiaN","karrigan", "NiKo","olofmeister"),
    round5 = c("olofmeister","rain","karrigan", "tarik","Stewie2K","autimatic"),
    round6 = c("GuardiaN","karrigan")
)

(data kindly supplied by Zachary Hankin). Thus in round 1, the death order was Skadoodle first, followed by olofmeister, then tarik.

2.1 Likelihood function for a simple deathorder statistic

Suppose team1 = (a,b,c,d) and team2 = (x,y,z); suppose the deathorder were (a,b,y,c). Then the likelihood function for a’s death would be (x+y+z)/(a+b+c+d+x+y+z) [that is, the likelihood function for a Bernoulli trial between team (x,y,z) and (a,b,c,d)]. Note the appearance of (a) in the denominator: he [surely!] was on the losing team. The likelihood function for the full deathorder is then

 (x+y+z)/(a+b+c+d+x+y+z) * (x+y+z)/(b+c+d+x+y+z) * (c+d)/(c+d+x+y+z) * (x+z)/(c+d+x+z)

where the four terms correspond to the deaths of a,b,y,c respectively. See how the denominator gets shorter as the teams die one by one.

3 Likelihood function for the observed dataset

team1  <- c("autimatic","tarik", "Skadoodle","Stewie2k","RUSH")   #Cloud9
team2 <- c("NiKo","olofmeister","karrigan","GuardiaN","rain")  # FaZe Clan

Consider the following function:

`counterstrike_maker` <- function(team1,team2,deathorder){

  if(identical(sort(c(team1,team2)),sort(deathorder))){
    deathorder <- deathorder[-length(deathorder)]  # last player not killed
  }

  H <- hyper2(pnames=c(team1,team2))
  
  for(killed_player in deathorder){
    if(killed_player %in% team1){
      H[team2] %<>% inc
      H[c(team1,team2)] %<>% dec
      team1 %<>% "["(team1 != killed_player)      
    } else {
      H[team1] %<>% inc
      H[c(team1,team2)] %<>% dec
      team2 %<>% "["(team2 != killed_player)      
    }
  }
  return(H)
}

Function counterstrike_maker() defined above returns a log-likelihood function for the strengths of the players in team1 and team2 above. The function needs the identities of all players on each team. It assumes that players are always killed by the opposing team; if a player is killed by his own team, function counterstrike_maker() may return an error. The function does not have strong error-checking functionality.

In the function, the deathorder argument specifies the order in which players were killed (the first element is the first player to be killed and so on). Note that the identity of the killer is not needed as it is assumed that a death is due to the combined strength of the team, rather than the individual shooter who actually fired the shot. Note that it is not required for all the players to be killed; short rounds are OK (but are not as statistically informative).

We can use it:

counterstrike <- hyper2(pnames=c(team1,team2))
for(i in zacslist){
  counterstrike <- counterstrike + (counterstrike_maker(team1,team2, deathorder=i))
}

then plot it:

counterstrike_maxp <- maxp(counterstrike)
dotchart(counterstrike_maxp,pch=16,main='observed data')

showing that Skadoodle has the highest estimated strength.

4 Random data

it is possible to analyse synthetic data generated randomly by function rdeath():

`rdeath` <- function(team1,team2){
  out <- NULL
  while(length(team1)>0 & length(team2)>0){
    if(runif(1)>length(team1)/(length(c(team1,team2)))){ ## killed person is on team1
      killed_player <- sample(team1,1)
      team1 %<>% "["(team1 != killed_player)      
    } else { ## killed person is on team2
      killed_player <- sample(team2,1)
      team2 %<>% "["(team2 != killed_player) 
    }
    out <- c(out,killed_player)
  }
  return(out)
}

Hrand <- hyper2(pnames=c(team1,team2))
for(i in 1:6){
  pp <- rdeath(team1,team2)
  Hrand <- Hrand + counterstrike_maker(team1,team2, pp)
}

dotchart(maxp(Hrand),pch=16,main='synthetic data')

We now run \(n=1000\) trials on the assumption of equal player strength (that is, random deathorder) and for each trial calculate the log-likelihood ratio [for the hypothesis p=equal_strengths, vs the hypothesis p=maxp(Hrand)]. It then draws a histogram of the loglikelihood ratio thus obtained, and superimposes a line showing the observed loglikelihood ratio.

datapoint <- loglik(indep(maxp(counterstrike)),counterstrike) - loglik(indep(equalp(counterstrike)),counterstrike)
    
n <- 100  # takes about 5 minutes for n=1000 trials
distrib <- rep(0,n)
for(i in seq_len(n)){
  Hrand <- hyper2(pnames=c(team1,team2))
  for(jj in 1:6){
    pp <- rdeath(team1,team2)
    Hrand <- Hrand + counterstrike_maker(team1,team2, pp)
    }
  distrib[i] <- loglik(indep(maxp(Hrand)),Hrand) - loglik(indep(equalp(counterstrike)),Hrand)
}

hist(distrib)
abline(v=datapoint,lwd=6)

Thus we can see that the observation is not significant: from this perspective, the deathorder may as well be totally random.

4.0.1 Package dataset

Following lines create counterstrike.rda, residing in the data/ directory of the package.

save(counterstrike,counterstrike_maxp,zacslist,file="counterstrike.rda")

4.1 A red bus blue bus analysis.

I now analyse every game played by Cloud9 in the tournament.

https://www.hltv.org/stats/teams/matches/5752/cloud9?event=3247

As far as I can tell this is every match played at ELEAGUE Major 2018 in Boston by Cloud9.

a <- read.table("cloud9_boston_2018.txt",header=TRUE)
a
##          date opponent         map result
## 1  28/01/2018     FaZe     inferno      W
## 2  28/01/2018     FaZe    overpass      W
## 3  28/01/2018     FaZe      mirage      L
## 4  28/01/2018       SK     inferno      W
## 5  28/01/2018       SK cobblestone      L
## 6  28/01/2018       SK      mirage      W
## 7  28/01/2018       G2    overpass      W
## 8  28/01/2018       G2      mirage      W
## 9  28/01/2018     Vega      mirage      W
## 10 28/01/2018 Astralis       train      W
## 11 28/01/2018   Virtus      mirage      W
## 12 28/01/2018    Space cobblestone      L
## 13 28/01/2018       G2       cache      L
##                                     deathorder
## 1  Skadoodle, RUSH, tarik, autimatic, Stewie2K
## 2  RUSH, Stewie2K, Skadoodle, tarik, autimatic
## 3  Skadoodle, RUSH, autimatic, Stewie2K, tarik
## 4  Stewie2K, autimatic, tarik, RUSH, Skadoodle
## 5  Skadoodle, Stewie2K, autimatic, tarik, RUSH
## 6  Skadoodle, autimatic, RUSH, tarik, Stewie2K
## 7  RUSH, Skadoodle, autimatic, Stewie2K, tarik
## 8  Stewie2K, autimatic, tarik, RUSH, Skadoodle
## 9  tarik, RUSH, Stewie2K, autimatic, Skadoodle
## 10 Skadoodle, RUSH, autimatic, Stewie2K, tarik
## 11 Skadoodle, RUSH, tarik, autimatic, Stewie2K
## 12 RUSH, Skadoodle, autimatic, tarik, Stewie2K
## 13 tarik, Stewie2K, RUSH, autimatic, Skadoodle

Above we see from the first line that on 28 Jan 2018, playing FaZe on the Inferno map we have a win, with deathorder Skadoodle, RUSH, etc.

We will look at the effect of Skadoodle on Stewie2K.

H <- hyper2()
  for(i in seq_len(nrow(a))){
      d <- strsplit(a$deathorder[i],", ")[[1]]
      H <- H + suppfun(d)
  }
H
## log(RUSH^12 * (RUSH + Skadoodle)^-2 * (RUSH + Skadoodle + Stewie2K +
## autimatic)^-2 * (RUSH + Skadoodle + Stewie2K + autimatic + tarik)^-13 *
## (RUSH + Skadoodle + autimatic)^-1 * (RUSH + Skadoodle + autimatic +
## tarik)^-2 * (RUSH + Skadoodle + tarik)^-2 * (RUSH + Stewie2K +
## autimatic + tarik)^-6 * (RUSH + Stewie2K + tarik)^-1 * (RUSH +
## autimatic + tarik)^-1 * (RUSH + tarik)^-1 * Skadoodle^9 * (Skadoodle +
## Stewie2K + autimatic)^-1 * (Skadoodle + Stewie2K + autimatic +
## tarik)^-3 * (Skadoodle + autimatic)^-2 * (Skadoodle + autimatic +
## tarik)^-1 * Stewie2K^9 * (Stewie2K + autimatic)^-2 * (Stewie2K +
## autimatic + tarik)^-6 * (Stewie2K + tarik)^-5 * autimatic^12 *
## (autimatic + tarik)^-1 * tarik^10)
samep.test(H,c("Skadoodle","Stewie2K"))
## 
##  Constrained support maximization
## 
## data:  H
## null hypothesis: Skadoodle = Stewie2K
## null estimate:
## autimatic      RUSH Skadoodle  Stewie2K     tarik 
##   0.19538   0.30628   0.17498   0.17498   0.14837 
## (argmax, constrained optimization)
## Support for null:  -60.944 + K
## 
## alternative hypothesis:  sum p_i=1 
## alternative estimate:
## autimatic      RUSH Skadoodle  Stewie2K     tarik 
##   0.18900   0.30759   0.23541   0.13027   0.13773 
## (argmax, free optimization)
## Support for alternative:  -60.354 + K
## 
## degrees of freedom: 1
## support difference = 0.59006
## p-value: 0.27733
ec <-c(autimatic=1, RUSH=1, Skadoodle=2, Stewie2K=2, tarik=1)
mk <- function(lam){
  H3 <- hyper3()
  for(i in seq_len(nrow(a))){
    
    d <- strsplit(a$deathorder[i],", ")[[1]]
    H3 <- H3 + cheering3(d, e=ec, help=c(1,lam))
  }
  return(H3)
}

Find the MLE for \(\lambda\):

f <- function(l){maxp(mk(l), give=1)$`Log-likelihood`}
optlam <- optimize(f,c(0.09, 0.14), maximum = TRUE)
optlam
## $maximum
## [1] 0.10402
## 
## $objective
## [1] -54.763

And the BT strengths at the evaluate:

maxp(mk(optlam$maximum))
## autimatic      RUSH Skadoodle  Stewie2K     tarik 
##  0.102781  0.184221  0.365279  0.279514  0.068205

Now some profile likelihood

maxp(mk(0.5),give=1)
## [[1]]
## autimatic      RUSH Skadoodle  Stewie2K     tarik 
##   0.16193   0.26848   0.27956   0.17518   0.11485 
## 
## $`Log-likelihood`
## [1] -57.37
maxp(mk(1.01),give=1)
## [[1]]
## autimatic      RUSH Skadoodle  Stewie2K     tarik 
##   0.18938   0.30815   0.23476   0.12965   0.13806 
## 
## $`Log-likelihood`
## [1] -60.406
f <- function(l){maxp(mk(l), give=1)$`Log-likelihood`}
lam <- exp(seq(from = -4,to = log(1.1), len = 33))
L <- sapply(lam, f)
g <- function(l){f(l) - optlam$objective  + 2}
jj1 <- uniroot(g, interval = exp(c(-4,-3)))
jj2 <- uniroot(g, interval = exp(c(-1,-0)))
jj1
## $root
## [1] 0.022161
## 
## $f.root
## [1] -0.00090392
## 
## $iter
## [1] 4
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.1035e-05
jj2
## $root
## [1] 0.41435
## 
## $f.root
## [1] 1.1667e-07
## 
## $iter
## [1] 3
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 8.9495e-05
plot(log(lam), L - max(L), type='l')
segments(x0=log(optlam$maximum),y0 = -0.5, y1 = 0.1,col='red')
abline(h=c(0,-2), col = 'gray')
segments(x0 = log(jj1$root), y0 = -2.2, y1 = -1.8, col = 'blue')
segments(x0 = log(jj2$root), y0 = -2.2, y1 = -1.8, col = 'blue')
abline(v = 0, lty = 2)

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.