Overwatch is a 6-a-side online team-based first-person shooter (fps) game in which two teams battle to achieve victory over each other by collectively competing to complete a set objective. Players on each team pick a unique character or ‘hero’ (limited to one each per team), from a roster of 30 (as of 2019).
Characters are categorized into 3 groups depending upon their designed role in a team:
Tanks survive longer and protect teammates from damage,
Supports restore the health of teammates, and
Damage are designed to be more mobile and deal higher damage to enemy heroes.
The dataset was obtained from Overwatch League Statslab and contains in-game statistics collected during the Overwatch league (OWL) 2019 professional esports tournament; including a record of how long each hero was played for by each player in each tournament match, which indicates the popularity of the character in the ‘meta’ - the generally agreed upon best strategy.
The game’s esports scene reached peak popularity around 2018-19, during which time a specific strategy, termed GOATS (after the North American team which discovered it), was discovered and became the popular choice for professional teams. This strategy composed of picking 3 specific tank, and 3 specific support heroes, neglecting mobile dps characters in favor of increased ability to survive. This was controversial as the characters used in this strategy lacked heavy aim requirements - a primary focal point of other fps games. Furthermore, as only a few specific characters were used in this strategy, other characters were less likely to be picked, reducing variety within the game.
note: the acronyms OWL and GOATS are unrelated to the animals
The popularity of GOATS triggered a wave of attempts by the developers of the game to decrease the power of the strategy during 2019, culminating with the introduction of a ‘role lock’ in early 2020; whereby, each team is required to pick 2 heroes from each role. This was likely done to increase the variety of characters picked.
The game experienced a sudden decline in popularity going into 2020, believed to be partially due to the implementation of a role lock. Thus, it is important to review if this was indeed a necessary change to de-throne a single dominating strategy, or an unneeded action with severe negative long-term impact on the game’s popularity.
The primary aim of this project is to visualize the distribution of hero popularity, and the severity of dominance of a single strategy during the OWL 2019 tournament. This will allow us to observe the effect on overall hero pick variety, and thus evaluate the game’s developer’s decision to implement a role lock (limitations on the number of each character type that can be played).
The /data folder contains data required for the project, and /figs contains images required for the project, as well as visualization outputs.
A codebook describing variables and functions within this project is located at /codebook.txt, and a description of csv data column titles is located in /data/rawdata/datasource.txt
The renv package was used to store package versions used within the project, future-proofing the project against future package updates.
Package versions used in this project are listed within the file /renv.lock
#Load packages with renv
install.packages("renv")
library(renv)
renv::restore()
#Import packages
library(tidyverse)
library(ggplot2)
library(here)
library(readr)
library(showtext)
The dataset consists of three separate csv files, so the initial step was to combine these into a single dataframe to facilitate simpler and easier data retrieval.
#Read raw data files and combine into a single csv file
combinedData <- list.files(path = here("data", "rawdata"),
pattern = "*.csv", full.names = TRUE) %>%
lapply(read_csv) %>%
bind_rows
#Save combined data set
combinedData_FileName = paste(here("data"), "/phs_2019_combined.csv", sep = "")
write.csv(combinedData, combinedData_FileName) #Write dataset to new .csv in /data
combinedData
The combined dataset contains a large amount of unnecessary information; thus, it was cleaned by extracting required data to a new data frame. This increases readability and decreases data frame processing time.
Furthermore, the play_time data was converted from seconds to hours to increase readability of the final visualization. This code was saved as a function, allowing it to be used on differently grouped data later in the project.
#Obtain data to plot hero playtime
heroData <- combinedData[combinedData$hero != "All Heroes", ] #Remove rows with "All Heroes" in hero column
playTimeData <- heroData[heroData$stat_name == "Time Played", ] %>% #Remove rows without "Time Played" in stat_name column
group_by(hero) %>% #Sum playtime for each hero
summarise(play_time = sum(stat_amount)) %>%
ungroup()
#Creates the function clean_pt; which, reorders and converts play time data into hours
clean_pt <- function(df){
df %>%
arrange(., desc(play_time)) %>% #Arrange rows by descending value of play_time
mutate(play_time = play_time/3600) %>% #Convert play_time from seconds to hours and round to the nearest whole number
mutate(rounded_play_time = round(play_time)) %>%
transform(rounded_play_time = as.character(rounded_play_time))
}
playTimeData <- clean_pt(playTimeData)
Colours of bars in the visualization correspond to colours of the heroes within the game, increasing glance value for those familiar with the game. This is important as those most likely to be interested in the data are those familiar with the game.
Some colours with low contrast to the white plot background were increased in saturation to facilitate easier plot reading.
#Aesthetics for graph
#Retrieve colour associated with each hero and add to playTimeData df as column hero_colour
heroColourLocation <- here("data", "hero_colours.csv")
heroColourData <- read.csv(file = heroColourLocation, encoding = "UTF-8")
colnames(heroColourData)[1] <- gsub('^.........','',colnames(heroColourData)[1]) #Fix error with imported column name
#Creates the function add_colour; which, for each hero in the dataframe, writes the related colour hex value to a column called hero_colour
add_colour <- function(df){
i = 0
for (h in df$hero){
i = i + 1
hrow <- which(heroColourData$hero == h)
df$hero_colour[i] <- paste(heroColourData$hex_value[hrow])
}
df
}
playTimeData <- add_colour(playTimeData)
A custom text was imported and applied to the visualization using the showtext package. The font is similar to fonts used within the game, making the visualization appear more professional and/or official, while increasing overall appeal.
#Load new custom font for showtext package
font_add_google(name = "Bebas Neue", family = "Bebas Neue")
showtext_auto()
Similar graph and theme options, which were used multiple times within the code, were saved as variables to facilitate easier adjustment.
#Define graph aesthetic variables
backgroundColour = "white"
textColour = "#5c5c5c"
textColourFaded = "#5c5c5c8a"
barLabelSize = 11
A bar chart was most appropriate for the data to compare a grouped continuous variable. Furthermore, it is simple and commonly used, increasing ease of data interpretation.
Only 15 out of the 30 heroes were included in the plot in order to increase visual clarity; while the idea that the visualization portrays was maintained by ordering heroes by play time. This allows the distribution of the hero ‘popularity ranking’ to be visualized to see if a meta had formed (suggested by a non-smooth distribution, dominated by a few popular heroes).
#Define data to plot
nheros <- 15:1 #Define number of heroes to show on graph
hpt <- ggplot(playTimeData[nheros,], aes(x = hero, y = play_time, fill = hero)) #Assign hero play time plot data to a variable
#Align text bar label with numerical bar label. Two labels are used to allow font size difference
i = 0
hourLabel <- c(0)
for (i in nheros){ #If playtime has more than 2 digits, increases spacing of HOURS label to align with numerical bar label
hourLabel[i] <- "HOURS"
if (playTimeData$play_time[i] > 99){
hourLabel[i] <- " HOURS"
}
}
playTimeData
A minimalist and clean design was used to reduce clutter, emphasizing clarity while maintaining a modern appeal.
Text contrast with the background is used to guide the viewer to important information, such as the title and hero names. Less contrasting colour values are used for less important information. If the viewer is specifically looking for this information they will be able to find it, but lowered contrast prevents it becoming overbearing to the clarity of the overall design.
Bar labels allow for accurate data readings while reducing the number of axis ticks required, allowing for a simpler plot. Number values are larger in size than the text unit label, guiding viewer attention to the more important information in a similar way to contrast (as described above). Number values were rounded to the nearest whole number, and previously converted from seconds to hours to increase readability.
Low opacity bar limit shadow plots preserve plot structure when removing axis lines, allowing for a more appealing design while maintaining the ability to accurately visually compare data.
#Plot graph and customize various plot aesthetics
graph <- hpt +
geom_bar(stat = "identity", mapping = aes(x = hero, y = playTimeData$play_time[1]), fill = "#e6e6e6") +
geom_bar(stat = "identity", show.legend = FALSE) +
geom_text(aes(label = rounded_play_time), y = 2, family = "Bebas Neue", colour = "white", size = barLabelSize, hjust = 0) +
geom_text(aes(label = rev(hourLabel)), y = 2, family = "Bebas Neue", colour = "white", size = 6.3, hjust = -0.80, vjust = 0.85) +
scale_x_discrete(limits = playTimeData$hero[nheros]) +
scale_y_continuous(limits = c(0, (playTimeData$play_time[1] + 20)), expand = c(0, 0)) +
xlab("HERO") +
ylab("PLAY TIME (hours)") +
labs(title = "Overwatch League Character Usage", subtitle = "by playtime during OWL 2019 stages 1-3", size = 40) +
theme_bw() +
coord_flip() +
scale_fill_manual(values = playTimeData$hero_colour, limits = playTimeData$hero) +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = backgroundColour),
plot.background = element_rect(fill = backgroundColour),
panel.background = element_rect(fill = backgroundColour),
text = element_text(family = "Bebas Neue", colour = textColour),
plot.title = element_text(size = 35, colour = "#505050"),
plot.subtitle = element_text(size = 25, colour = textColourFaded),
axis.title.x = element_text(size = 30, margin = margin(t=20), colour = textColourFaded, hjust = 0.42),
axis.title.y = element_text(size = 40, margin = margin(r=19), colour = "#5c5c5c4f"),
axis.ticks.x = element_line(colour = textColourFaded, size = 0.8),
axis.ticks.y = element_blank(),
axis.ticks.length.x = unit(0.2, "cm"),
axis.ticks.length.y = unit(0.3, "cm"),
axis.text.x = element_text(colour = textColourFaded, size = 16, angle = 55, vjust = 0.9, hjust = 0.95),
axis.text.y = element_text(colour = textColour, size = 24))
#Display graph
print(graph)
I encountered an issue with preserving custom fonts when saving the plot, which was fixed by plotting to a windows graphics device using windows() before saving this output as a .png. However, I am unable to test whether this fix would work on other operating systems, as I currently only have access to windows. Nevertheless, my research suggests the quartz() command is a possible fix to replace windows() for mac users.
#Save plot to figs folder
windows(width = 1200, height = 1200) #Open windows graphics device
print(graph)
dev.print(file = here("figs", "PlayTimeGraph.png"), device = png, width = 1200, height = 1200)
#Possible alternate fix for mac operating systems to preserve font
quartz(width = 1200, height = 1200)
print(graph)
quartz_save(file = here("figs", "PlayTimeGraph.png"), type = "png", dpi = 300)
The plot shows a severe difference between the play times of the top 6 heroes compared to the rest of the 30-hero roster. In combination with the knowledge that the GOATS strategy contains these same top 6 most popular heroes: Lucio, Brigitte, Zarya, D.Va, Zenyatta, and Reinhardt, this suggests that this single strategy completely dominated the competitive Overwatch scene during 2019. This heavily supports that the decision to implement a role lock was essential to preserve hero pick variety within the game. Nevertheless, although this plot suggests at imbalance in hero variety caused by significant use of the GOATS strategy, through correlation of popular heroes, it does not explicitly show how muchh each strategy was used, but rather each hero.
This interpretation of the plot, suggesting the severely differing play times of heroes was due to the popularity of single strategy, could be tested by plotting play times of heroes with each other hero as a teammate. The expectation is that the top 6 heroes were played significantly more together than when paired with any of the rest of the hero roster. However, the information required to plot this could not be extracted from the dataset available.
Equally, the popularity of the strategies in a competitive environment suggests their ability to win games. Thus, the win rates of heroes should correspond to their play time, and popularity. Nevertheless, this information was also not available from the dataset.
Furthermore, this plot does not show if any change in dominance of the strategy occured during the tournament. Thus, an animated plot could accommodate visualization of changes in the hero popularity distribution over the duration of 2019 OWL, and the effect of tournament stage type. This extension has been carried out below.
Play time data was grouped by both stage and hero, and previously saved functions were used to prepare the data to be plotted.
library(gganimate)
library(gifski)
#Obtain data to plot play time by season from combined dataset using previously created functions
stagePlayTimeData <- heroData[heroData$stat_name == "Time Played", ] %>%
group_by(stage, hero) %>%
summarise(play_time = sum(stat_amount)) %>%
ungroup() %>%
clean_pt() %>%
add_colour()
#Calculate total playtime in each season
stageTime <- heroData[heroData$stat_name == "Time Played", ] %>%
group_by(stage) %>%
summarise(play_time = sum(stat_amount)) %>%
ungroup() %>%
clean_pt()
#Divide hero play times by total playtime for the corresponding season, giving a percentage of total season play time the hero was used
j = 0
for (s in stagePlayTimeData$stage){
j = j + 1
srow <- which(stageTime$stage == s)
stagePlayTimeData$percent_play_time[j] = (stagePlayTimeData$play_time[j] / stageTime$play_time[srow]) * 100 * 6
}
#Define data to plot
shpt <- ggplot(stagePlayTimeData, aes(x = hero, y = percent_play_time, fill = hero))
stagePlayTimeData
Because the data does not smoothly correlate between states, I decided not to reorder the plot for each state to reduce confusion in the visualization. This has the downside of making it harder to view the specific popularity ranking of each hero. Similarly, a line graph may be more appropriate to show changes in popularity over time. However, the main purpose of this visualization is to evaluate the dominance in popularity of a single strategy, and thus visualize the formation of a gap in popularity between heroes, which a bar chart better achieves. For similar reasons, all heroes were included in the graph, as the top popularity rankings did not remain constant unlike the previous visualization.
#Plot graph and customize various plot aesthetics
staticAnimGraph <- shpt +
geom_bar(stat = "identity", show.legend = FALSE) +
scale_y_continuous(limits = c(0, 100), expand = c(0, 0)) +
scale_x_discrete(limits = rev, expand = c(0, 0)) +
xlab("HERO") +
ylab("PLAY TIME (hours)") +
labs(title = "Overwatch League Character Usage", subtitle = ("playtime during {closest_state}"), size = 40) +
theme_bw() +
coord_flip() +
scale_fill_manual(values = playTimeData$hero_colour, limits = playTimeData$hero) +
geom_vline(xintercept = seq(1.5, length = 32), lwd = 2, colour = "white") +
geom_vline(xintercept = seq(-1.5, length = 32), lwd = 2, colour = "white") +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.margin = unit(c(1, 1, 1, 1), "cm"),
axis.line = element_line(colour = backgroundColour),
plot.background = element_rect(fill = backgroundColour),
panel.background = element_rect(fill = "#e6e6e6"),
text = element_text(family = "Bebas Neue", colour = textColour),
plot.title = element_text(size = 30, colour = "#505050", vjust = 2),
plot.subtitle = element_text(size = 20, colour = textColourFaded),
axis.title.x = element_text(size = 30, margin = margin(t=20), colour = textColourFaded, hjust = 0.42),
axis.title.y = element_text(size = 40, margin = margin(r=19), colour = "#5c5c5c4f"),
axis.ticks.x = element_line(colour = textColourFaded, size = 0.8),
axis.ticks.y = element_blank(),
axis.ticks.length.x = unit(0.2, "cm"),
axis.ticks.length.y = unit(0.3, "cm"),
axis.text.x = element_text(colour = textColourFaded, size = 16, angle = 55, vjust = 0.9, hjust = 0.95),
axis.text.y = element_text(colour = textColour, size = 20)) +
transition_states(stage, transition_length = 2, state_length = 3)
#Render the animated graph as gif and save to figs folder
animate(staticAnimGraph, 200, fps = 20, width = 900, height = 1200,
renderer = gifski_renderer(here("figs", "SeasonPlayTimeGraph.gif")))
The GOATS strategy appears to remain dominant throughout most of 2019. However, towards the end of the year, during OWL stage 3 and its title matches, the popularity of heroes appears to shift away from the previously dominant 6. This suggests that another strategy may have naturally emerged to counter GOATS, and thus the introduction of a ‘role lock’ may not have been necessary to increase the variation in hero picks within the game. Hence, this decision by the developers may have been an unnecessary detriment to the game’s popularity in the years after 2019.
Additionally, the difference between play times of the top 6 heroes and the other 24 seems larger in title matches for stages 1 and 2 than regular matches. This indicates either less experimentation by teams during these more important matches, or that the teams using GOATS were the ones that were winning, thus reaching the finals and suggesting the strength of GOATS over the other strategies.
The non-animated plot better represents the dominance of one popular strategy throughout the tournament, while the animated plot represents changes throughout. However, the visualisation takes longer to interpret as due to it being animated, not all data is immediately available to the reader. Therefore, the animated plot shows more information, but the simplicity of the non-animted plot allows it to quickly demonstrate the existence of imbalanced hero pick rates due to formation of a popular strategy.
In order to assess whether the teams reaching title matches were those using the GOATS strategy, hero play times could be plotted and grouped by team. Furthermore, hero pick distribution could be plotted against viewer amounts for the respective stages, investigating a possible link between game popularity and hero pick variety.
Repo: https://github.com/PeterBeech/PSY6422_project
Overwatch stats lab: https://www.overwatchleague.com/en-us/statstab
Article about the GOATS strategy: https://doesports.com/overwatch/news/overwatch-goats-explained