Pokémon is a fictional universe centered on capturing, training, and battling creatures known as Pokémon, each with unique traits and abilities. “Competitive Pokémon” refers to two players battling their teams of Pokémon against each other, making strategic choices to knock out every Pokémon on the opponent's team. The goal of my project is to evaluate any given turn of a battle and predict the eventual winner. Chess, another strategy game, uses an evaluation bar to show each side's advantage in any board state. In the same way, my neural net takes any given turn and estimates the player's win percentage. It takes in turn information such as health, stats, and type for every Pokémon on both teams. Overall this results in a prediction accuracy of roughly 74%, with accuracy increasing as the turn number increases.
The goal is to predict the outcome of a competitive Pokémon match using any available information in a given game state. I was inspired to do this project by my interest in both chess and Pokémon. Chess commonly uses an evaluation metric for board states, so I wanted to create something similar for a Pokémon match. An evaluation is also important for building bots that play at a high level. Put simply, you could make a bot that checks the evaluation of the state resulting from each possible move choice and then picks the move with the highest likelihood of winning. I find this particularly useful because I have created bots that play Pokémon in the past, and this neural net could be used to build a more intelligent bot.
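The bot idea described above can be sketched as a one-step greedy search. This is only an illustration: `simulate` (which returns the state that follows a move) and `evaluate` (which returns a win probability for a state) are hypothetical callables and are not part of this project or of poke-env.

```python
from typing import Callable, Iterable, Optional, TypeVar

State = TypeVar("State")
Move = TypeVar("Move")


def pick_best_move(
    state: State,
    legal_moves: Iterable[Move],
    simulate: Callable[[State, Move], State],  # hypothetical: state after a move
    evaluate: Callable[[State], float],        # hypothetical: win probability
) -> Optional[Move]:
    """Return the move whose resulting state has the highest evaluation."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves:
        score = evaluate(simulate(state, move))
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

A real battle also depends on the opponent's simultaneous choice and on randomness, so a practical bot would likely need to average the evaluation over several simulated outcomes per move.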
PyTorch, scikit-learn - Used to build the base of the neural net and to make it efficient.
Matplotlib - Used for data visualization (graphing).
Pokémon Showdown - An open source online Pokémon battling simulator. This is how I interface with the game of Pokémon, collect data from matches, and look up the stats and types of a Pokémon.
poke-env - A library that interfaces with Pokémon Showdown, making it possible to play thousands of games for the purpose of collecting data.
As stated before, the goal is to use available information from a game state to make a prediction. Fully defining the state of a Pokémon game can become very convoluted. Many factors, such as weather, terrain, and hazards, were ignored for simplicity’s sake. I decided to focus on what I thought to be the three most important factors of a Pokémon battle. The first and most important is Pokémon health. In any given state we can see every Pokémon's health as a value between 0 and 1, with 1 meaning a fully healthy Pokémon and 0 meaning it has fainted and is unable to battle. Figure 1 shows turn 1 of a Pokémon battle and the associated collected data.
Figure 1: The first turn of a battle as represented by the actual game (left) and by the input data (right).
The next piece of information I included was Pokémon types. In this game every Pokémon has 1 or 2 types. These types follow a rock-paper-scissors style of strategy, where every type performs well against some types and poorly against others. If one team has types that counter the other team's, it is much more likely to win the match. To represent this information inside the neural net, I used one-hot encoding to define the types of each Pokémon. There are 18 different types, so I use an array of length 18 to represent them. For example, take the Pokémon called Sandslash. It is a Ground type, so its type is represented as [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0].
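A minimal sketch of this encoding follows. The type order is an assumption: it is the conventional type chart order, chosen because it places Ground at index 8 as in the Sandslash example. Marking both slots for a dual-type Pokémon is also an assumption, since the text only shows a single-type case.

```python
# Assumed type order (conventional type chart order); Ground lands at index 8.
TYPE_ORDER = [
    "normal", "fire", "water", "electric", "grass", "ice",
    "fighting", "poison", "ground", "flying", "psychic", "bug",
    "rock", "ghost", "dragon", "dark", "steel", "fairy",
]


def encode_types(types):
    """Return an 18-element 0/1 vector with a 1 for each of the Pokémon's types."""
    vector = [0] * len(TYPE_ORDER)
    for type_name in types:
        vector[TYPE_ORDER.index(type_name.lower())] = 1
    return vector


sandslash = encode_types(["Ground"])
```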
The last piece of information I included was a Pokémon's stats. Besides HP, every Pokémon has 5 stats that determine how much damage it deals and how much damage it takes: Attack, Defense, Sp. Attack, Sp. Defense, and Speed. These help determine the outcome of the game. For example, if one team has a Pokémon with high Speed and Attack and the opposing team's Pokémon have low defenses, the opposing team is at a disadvantage. I included each stat in the array after dividing it by 200 to normalize it. An example stat array looks like [0.99, 0.99, 0.725, 0.725, 0.685].
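The normalization can be sketched as below. The dictionary keys and the raw values are illustrative assumptions; the raw numbers are simply the example array above multiplied back by 200.

```python
STAT_SCALE = 200.0
# Assumed key names, in the order Attack, Defense, Sp. Attack, Sp. Defense, Speed.
STAT_ORDER = ("atk", "def", "spa", "spd", "spe")


def normalize_stats(stats):
    """Divide each of the five stats by 200 and return them in a fixed order."""
    return [stats[name] / STAT_SCALE for name in STAT_ORDER]


# Raw values back-computed from the example array in the text.
example = normalize_stats({"atk": 198, "def": 198, "spa": 145, "spd": 145, "spe": 137})
```

Dividing by a fixed constant does not guarantee values below 1, since a stat above 200 would map to a value greater than 1.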
For the actual collection of data, I used previously made Pokémon bots and had them play against each other on a local Pokémon Showdown server. During each game, the information associated with every turn is recorded. After the game concludes, one turn is chosen at random to include in the data set and is labeled as a win or a loss. I chose not to take multiple samples from one game so that all data samples are independent of one another, which is useful for testing and validation.
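The sampling step described above might look like the following sketch. The function name and the shape of `turn_features` (one feature list per recorded turn) are assumptions for illustration.

```python
import random


def sample_one_turn(turn_features, player_won, rng=random):
    """Pick one recorded turn at random and label it with the game's outcome.

    turn_features: list with one feature vector per turn of a finished game.
    player_won: True if the player won that game.
    """
    if not turn_features:
        raise ValueError("the game has no recorded turns")
    features = rng.choice(turn_features)
    label = 1 if player_won else 0
    return features, label
```

Passing a seeded `random.Random` instance as `rng` makes the sampled data set reproducible.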
The neural net has an input size of 289. This is 1 (turn number) + (18 (Pokémon type) + 6 (Pokémon health and stats)) * 12, because there are 6 Pokémon on each team and 12 in total. It uses two hidden layers of size 64 and 32 and an output size of 1. The hidden layers use a ReLU activation function, which is cheap to compute and lets the network extract non-linear features. The output uses a sigmoid activation to produce a probability. The model is trained with a binary cross-entropy loss function because predicting a win or a loss is a binary classification problem.
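The architecture described above can be sketched in PyTorch as follows. The constant names and the `build_model` helper are my own for illustration; details the text does not state, such as weight initialization, are left at PyTorch defaults.

```python
import torch
from torch import nn

NUM_TYPES = 18          # length of the type encoding per Pokémon
HEALTH_AND_STATS = 6    # health + 5 stats per Pokémon
POKEMON_PER_BATTLE = 12 # 6 per team, 2 teams
INPUT_SIZE = 1 + (NUM_TYPES + HEALTH_AND_STATS) * POKEMON_PER_BATTLE  # 1 + 24 * 12 = 289


def build_model() -> nn.Module:
    """Two hidden layers (64 and 32 units, ReLU) and a single sigmoid output."""
    return nn.Sequential(
        nn.Linear(INPUT_SIZE, 64),
        nn.ReLU(),
        nn.Linear(64, 32),
        nn.ReLU(),
        nn.Linear(32, 1),
        nn.Sigmoid(),  # output is a win probability in [0, 1]
    )


# Binary cross-entropy on the sigmoid output, with targets of 0 (loss) or 1 (win).
loss_fn = nn.BCELoss()
```

A common alternative is to drop the final `nn.Sigmoid` and use `nn.BCEWithLogitsLoss`, which is numerically more stable, but the version above follows the description more directly.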
The overall result was about 74% accuracy for predicting the outcome of the game. When I added the type encoding to the neural net's input, the model had massive overfitting problems with the data. To diagnose this, I implemented cross-validation so that I could see the validation error alongside the training error.
The overfitting led to very poor generalization, with around 60% accuracy. Reducing the number of epochs to around 30 improved generalization a lot.
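Tracking training and validation error together, as described above, might look like the sketch below. It uses a single held-out validation split rather than full cross-validation, and the Adam optimizer, learning rate, and full-batch updates are assumptions that the text does not specify.

```python
import torch
from torch import nn


def train_with_validation(model, x_train, y_train, x_val, y_val, epochs=30, lr=1e-3):
    """Train for a fixed number of epochs and record (train_loss, val_loss) per epoch."""
    loss_fn = nn.BCELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # assumed optimizer
    history = []
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        train_loss = loss_fn(model(x_train), y_train)
        train_loss.backward()
        optimizer.step()

        # Validation loss is measured without updating the weights.
        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(x_val), y_val)
        history.append((train_loss.item(), val_loss.item()))
    return history
```

If the validation loss starts rising while the training loss keeps falling, the model is overfitting, and stopping near the epoch where the validation loss bottoms out is one way to pick a value such as 30.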
The graph shows the final model, which includes health, types, and stats. There are still some overfitting problems, but they are not as pronounced.
This is a graph of prediction accuracy on the y-axis against turn number on the x-axis. The graph shows a linear relationship between the two. This makes sense because it is easier to predict the outcome the closer a turn is to the end of the game.
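The data behind such a graph can be computed by grouping predictions by turn number, as in the sketch below. The function name and the 0.5 decision threshold are assumptions for illustration.

```python
from collections import defaultdict


def accuracy_by_turn(turns, probabilities, labels, threshold=0.5):
    """Return {turn_number: accuracy} for predictions grouped by turn number."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for turn, probability, label in zip(turns, probabilities, labels):
        predicted = 1 if probability >= threshold else 0
        total[turn] += 1
        correct[turn] += int(predicted == label)
    return {turn: correct[turn] / total[turn] for turn in sorted(total)}
```

The resulting dictionary can be passed to Matplotlib, for example with `plt.plot(list(result.keys()), list(result.values()))`.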
The development of a neural network to predict the outcome of competitive Pokémon matches demonstrates a novel application of machine learning to a dynamic strategy game. By using aspects of the game state like health, type matchups, and stats, this project achieves a prediction accuracy of 74%, with improved performance as the battle progresses. Despite challenges like overfitting and the inherent complexity of Pokémon battles, careful preprocessing and model adjustments enabled meaningful insights into the relationship between game state information and match outcomes. This work not only parallels similar advancements in chess evaluation systems but also paves the way for creating intelligent bots capable of high levels of play.