Mining player behaviour from a subscription-based online game

With tens of terabytes of data being generated daily within the videogames industry, data mining has become crucial to gaining an advantage over competitors. This research sought to address the challenge of using data-mining techniques to improve player retention in a subscription-based online game, specifically World of Warcraft, by proposing a system with three main characteristics: churn prediction, player-activity prediction and player clustering.


To achieve this objective, the study relied on a publicly available dataset sourced from the above-mentioned game. The dataset provides logs of key in-game player details recorded every 10 minutes over three years (2006-2008). These details include player location, level, character, and whether the player is a member of a guild. From these attributes, other important features were derived, such as playtime and downtime, all of which were deemed beneficial to the characteristics identified above.
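As an illustration of how features such as playtime and downtime might be derived from 10-minute logs, the following is a minimal sketch rather than the study's actual procedure. It assumes each log entry is a (player, timestamp) pair recorded when the player was seen online, that each sighting counts as 10 minutes of play, and that downtime is the offline time between a player's first and last sighting. The function name and the data layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sampling interval stated in the text: one log entry every 10 minutes.
SNAPSHOT_INTERVAL = timedelta(minutes=10)


def playtime_and_downtime(snapshots):
    """Derive per-player playtime and downtime from online sightings.

    snapshots: iterable of (player_id, datetime) pairs (assumed layout).
    Returns {player_id: (playtime, downtime)} as timedelta values.
    """
    seen = defaultdict(set)
    for player_id, timestamp in snapshots:
        seen[player_id].add(timestamp)  # a set ignores duplicate entries

    result = {}
    for player_id, stamps in seen.items():
        # Assumption: every sighting represents one full interval of play.
        playtime = len(stamps) * SNAPSHOT_INTERVAL
        # Observed span runs from the first sighting to the end of the last.
        span = max(stamps) - min(stamps) + SNAPSHOT_INTERVAL
        result[player_id] = (playtime, span - playtime)
    return result


example_log = [
    ("player_a", datetime(2008, 1, 1, 12, 0)),
    ("player_a", datetime(2008, 1, 1, 12, 10)),
    ("player_a", datetime(2008, 1, 1, 13, 0)),
]
features = playtime_and_downtime(example_log)
```

In this example, the three sightings give 30 minutes of playtime within a 70-minute span, leaving 40 minutes of downtime.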

World of Warcraft was chosen for this study because it is generally considered one of the most successful and long-standing massively multiplayer online role-playing games, while also being subscription-based. This research focused on the data recorded in 2008, which was one of the most successful years for World of Warcraft owing to a much-anticipated update. The 2008 dataset made it possible to trace the progress of around 40,000 players through the year.


The aim of including churn prediction was to identify players likely to leave the game (churn) in the near future, by calculating the probability, expressed as a percentage, of a player staying or churning in the coming month. Player-activity prediction made it possible to forecast certain features of the player over the ensuing month, including level progression, map activity and playtime, giving further insight into the player's development. Implementing these two characteristics entailed experimenting with different machine learning algorithms, including neural networks, regression models, ensemble models and support vector machines. These experiments helped identify the most effective player features and the optimal amount of historical data needed to obtain the best results.
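To make the notion of "amount of historical data" concrete, the sketch below shows one way training examples for a churn model could be assembled. It is an illustration under stated assumptions, not the study's method: it assumes playtime has already been aggregated per month, and it labels a player as churned when no playtime is recorded in the target month. The function name, the data layout and the churn definition are all hypothetical.

```python
def build_churn_examples(monthly_playtime, history_months):
    """Build (player_id, feature_window, label) training examples.

    monthly_playtime: {player_id: [hours in month 0, month 1, ...]}
    history_months: number of preceding months used as features.
    label: 1 if the player has no playtime in the target month, else 0
           (an assumed definition of churn).
    """
    examples = []
    for player_id, hours in monthly_playtime.items():
        for target in range(history_months, len(hours)):
            window = hours[target - history_months:target]
            if sum(window) == 0:
                # Player was inactive for the whole window: nothing to predict.
                continue
            label = 1 if hours[target] == 0 else 0
            examples.append((player_id, window, label))
    return examples


playtime_by_month = {
    "player_a": [5, 3, 0, 0],  # stops playing after month 1
    "player_b": [2, 4, 6, 1],  # active throughout
}
examples = build_churn_examples(playtime_by_month, history_months=2)
```

Varying `history_months` and comparing the resulting models is one simple way to explore how much history a classifier needs; the resulting windows could be fed to any of the model families mentioned above.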

  
Lastly, the player clustering feature was included to serve two purposes. The first was to identify players with similar interests and patterns, so that they could be profiled as groups rather than treated as one generalised player base. The second was to support the other two characteristics by exploring the potential benefits of developing separate prediction models for different player clusters. Clustering was addressed by investigating different clustering approaches and features, which yielded distinct clusters from the dense dataset.
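The text does not name the clustering approaches that were investigated. Purely as an illustration of how players might be grouped by behaviour, the sketch below uses k-means, which is an assumption and not necessarily one of the methods used in the study. The two features (weekly playtime in hours and levels gained) and the starting centroids are likewise hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=50):
    """Plain k-means: returns (assignment, centroids).

    points: list of numeric tuples, one per player.
    initial_centroids: list of k starting centroids (supplied explicitly
    so that the result is deterministic).
    """
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignment, centroids


# Hypothetical features: (weekly playtime in hours, levels gained).
players = [(1.0, 1.0), (1.2, 0.8), (10.0, 10.0), (10.5, 9.5)]
assignment, centroids = kmeans(players, [(1.0, 1.0), (10.0, 10.0)])
```

Here the first two players fall into one cluster and the last two into another. In practice, features on different scales would normally be standardised first, and separate prediction models could then be trained per cluster.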

 
This research promises to make it easier to identify which players are about to quit, while also yielding further knowledge about the player base. By providing a means to understand players and to predict their behaviour and future actions, the system could support more personalised services tailored to each player.

Figure 1. Screenshot of the game World of Warcraft, which was the case study in this project

Student: Daniel Calafato

Supervisor: Dr Joel Azzopardi