Wednesday, September 6, 2017

How to Create a Bot to Play Games Like TicTacToe Using Statistical Reinforced Machine Learning

Creating your own bot that can play TicTacToe can be done in a few simple steps:

  1. First you'll need to sign into a Bot Libre account.
  2. Once you're signed in, create a bot and then import the script TicTacToeText.
  3. Type start in the chat, and your bot will create a board to play TicTacToe against you and will eventually learn not to lose.

What the Script does

Script link:

The TicTacToeText script is written in Self and is made up of multiple functions that each carry out a specific task. They are called when a pattern matches what you say. The first function, "start", is called when you enter "tictactoe". It has the bot ask whether you want to be X or O. When you enter X or O, the function "whoStarts" is called and the bot creates the board using a function called "drawBoard". If you entered X, the bot returns the board for you to make a move; if you entered O, it makes a move and then returns the board. The whoStarts function also sets the conversation topic. The game has now started, and from here until the end of the game, whatever you enter, the function "play" will be called, unless you type start.


The drawBoard function converts between how the bot sees the game, a string of nine "_" characters, and how you see the game, more or less a 3x3 grid. The function assigns a variable to each space using A, B, C and 1, 2, 3, so each space has a coordinate. Next, it creates a table of three rows with three cells each to make a board with coordinates A1, A2, A3 in the top row, B1, B2, B3 in the middle row, and C1, C2, C3 in the bottom row. Then the function adds a link to each cell so you can click on a space in the table to make your move.
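
The coordinate mapping described above can be sketched in Python rather than Self. This is a minimal sketch, not the script's code: the names coord_to_index, index_to_coord and draw_board are hypothetical, and the board is rendered as plain text instead of a table with links.

```python
ROWS = "ABC"  # A is the top row, C is the bottom row
COLS = "123"  # 1 is the left column, 3 is the right column


def coord_to_index(coord):
    """Map a coordinate such as 'B2' to a position 0..8 in the board string."""
    row = ROWS.index(coord[0].upper())
    col = COLS.index(coord[1])
    return row * 3 + col


def index_to_coord(index):
    """Map a position 0..8 in the board string back to a coordinate."""
    return ROWS[index // 3] + COLS[index % 3]


def draw_board(board):
    """Render the 9-character board string as three rows of text."""
    rows = [" ".join(board[r * 3:r * 3 + 3]) for r in range(3)]
    return "\n".join(rows)


if __name__ == "__main__":
    board = "_" * 9  # how the bot sees an empty game
    print(draw_board(board))
    print(coord_to_index("B2"), index_to_coord(8))
```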


The play function takes the previous board, adds your move to create a new board, checks if you won, and then calls a function for the bot to make a move. When you click on a space, the function is given that space's coordinate. The space at that coordinate is checked to see if it contains an X, O, or _, and if it's not a _ the function will return "Space is taken." Otherwise, your move is added to the board and another function is used to check whether you won. If you did not, the "makeMove" function is carried out.
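
As a rough illustration of the check described above, here is a Python sketch of adding a move to the board string. The function name apply_move and its return shape are assumptions for the example, not part of the script.

```python
def apply_move(board, coord, mark):
    """Return (board, message); message is None when the move was accepted."""
    # Convert a coordinate such as 'B2' to a position 0..8 in the board string.
    index = "ABC".index(coord[0].upper()) * 3 + "123".index(coord[1])
    if board[index] != "_":
        # The space already holds an X or an O, so the board is left unchanged.
        return board, "Space is taken."
    return board[:index] + mark + board[index + 1:], None


if __name__ == "__main__":
    board, message = apply_move("_" * 9, "B2", "X")
    print(board, message)
    print(apply_move(board, "B2", "O"))
```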


The makeMove function is where the bot makes its move. It uses other functions to find a good move, then updates the board with its move and returns the board to you in the chat so you can make your move. Before it returns the board, it checks whether it has won using a function called "checkGameOver", or, if there are no moves left, it says "Tie game" along with giving you the board.


One of the functions makeMove uses to find a good move is called lookAhead. Essentially, it uses loops to look ahead and see if it can win within a couple of moves or block you from winning. First, it determines every move it can make and, using the checkGameOver function, checks whether it wins with any of those moves. Second, for every move it can make, the function looks at every move you, the player, can make. It again uses checkGameOver, only this time to see if you can win, and if you can, the bot will move there to block you. Finally, the function looks another move ahead, examining every possible move the bot can make for every possible move you could have made, to see if it can win in two ways, and if so it makes that first move. It looks for two ways so that it only makes that move when the bot's win is guaranteed: even if you block one of the ways it can win, there is still the second way.
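
The three checks can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the script's implementation: the helper names are hypothetical, the board is the nine-character string described earlier, and the last step only looks for a move that leaves the bot with two winning moves at once.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def wins(board, mark):
    """True if mark holds all three spaces of any line."""
    return any(all(board[i] == mark for i in line) for line in LINES)


def empty(board):
    return [i for i, cell in enumerate(board) if cell == "_"]


def place(board, index, mark):
    return board[:index] + mark + board[index + 1:]


def winning_moves(board, mark):
    """Every empty space where mark would win immediately."""
    return [i for i in empty(board) if wins(place(board, i, mark), mark)]


def look_ahead(board, bot, player):
    """Return a position 0..8 to play, or None if no forced move is found."""
    own = winning_moves(board, bot)
    if own:                      # 1. win right away
        return own[0]
    threats = winning_moves(board, player)
    if threats:                  # 2. block the player's winning space
        return threats[0]
    for i in empty(board):       # 3. a move that creates two ways to win
        if len(winning_moves(place(board, i, bot), bot)) >= 2:
            return i
    return None


if __name__ == "__main__":
    print(look_ahead("XO__X___O", "X", "O"))
```

Returning None corresponds to the case where the bot has to fall back on stored board values to choose a move.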


If the lookAhead function concludes that the bot can't win and there are no moves where the player can win for it to block, the makeMove function calls the findGoodMove function to find the best move for the bot, meaning the move that has most often led to its win and your loss. To determine this, every possible board is assigned a value. The bot establishes every possible move it can make and looks up the value of the resulting board, which is stored on the conversation. For every move it can make, every move you can make is established as well and its value looked up. Then the function looks at the value of every move you can make in order to find the best one. After that, it finds the difference between the value of its possible move and your worst value, and repeats this for all of its possible moves. Finally, it looks for the highest difference in order to determine the bot's statistically best move.
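
A hedged Python sketch of this selection follows. It makes several assumptions that the text does not confirm: board values live in a plain dictionary (standing in for values stored on the conversation), unseen boards count as 0, and each bot move is scored against the highest-valued reply available to the player. The name find_good_move is hypothetical.

```python
def empty(board):
    return [i for i, cell in enumerate(board) if cell == "_"]


def place(board, index, mark):
    return board[:index] + mark + board[index + 1:]


def find_good_move(board, bot, player, values):
    """Pick the bot move with the largest (own value - strongest reply value)."""
    best_move, best_score = None, None
    for i in empty(board):
        after_bot = place(board, i, bot)
        # Value of every board the player could create in reply (assumed 0 if unseen).
        replies = [values.get(place(after_bot, j, player), 0.0)
                   for j in empty(after_bot)]
        strongest_reply = max(replies, default=0.0)
        score = values.get(after_bot, 0.0) - strongest_reply
        if best_score is None or score > best_score:
            best_move, best_score = i, score
    return best_move


if __name__ == "__main__":
    learned = {"____X____": 0.6, "O___X____": 0.9, "X________": 0.2}
    print(find_good_move("_" * 9, "X", "O", learned))
```

In this example the center looks good on its own (0.6), but a learned reply worth 0.9 to the player drags its score below that of the corner.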


To find out whether the bot has won or lost, the checkGameOver function is used. It contains every possible way TicTacToe can be won. When it's called and someone has won, it returns either true or false depending on who won.
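
The eight ways to win can be written out as a short Python sketch. Unlike the function described above, this hypothetical check_game_over returns the winning mark (or "tie") instead of true or false, which is an assumption made to keep the example compact.

```python
# Every way TicTacToe can be won, as positions in the 9-character board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows A, B, C
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns 1, 2, 3
         (0, 4, 8), (2, 4, 6)]              # diagonals


def check_game_over(board):
    """Return 'X' or 'O' for a winner, 'tie' for a full board, else None."""
    for a, b, c in LINES:
        if board[a] != "_" and board[a] == board[b] == board[c]:
            return board[a]
    if "_" not in board:
        return "tie"
    return None


if __name__ == "__main__":
    print(check_game_over("XXXOO____"))
    print(check_game_over("XOXXOOOXX"))
```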


In the makeMove function, when the bot wins, the countWins function is called. It takes the board from every move the bot made and assigns it a value: according to the board's significance to the win, it adds a value to half of the board's previous value. This means the first board, with only one move, is less significant and has less added to it than the final board, where the bot won. Then, for every move the player made, each board is assigned a value: according to its significance to the loss, half of a value is subtracted from half of the board's previous value. The value being added and subtracted starts at half of 1 divided by the number of boards in the game. After it has been applied to a board's value, it is increased again by that same starting amount, so that by the time it is applied to the winning board it equals half of 1, or 0.5. So, the more the bot wins, the greater the values of the bot's boards and the lower the values of the player's boards.
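
One reading of this update rule, sketched in Python. The assumptions are significant: the amount is taken to grow linearly (step, 2 x step, ...) so that it reaches 0.5 on the last board, each list of boards gets its own ramp, the same amount is used for the bot's and the player's boards, and values live in a plain dictionary. The names update_values, count_wins and count_losses are hypothetical.

```python
def update_values(values, boards, sign):
    """Halve each board's old value, then add (sign=+1) or subtract (sign=-1)
    an amount that grows from 0.5/n on the first board to 0.5 on the last."""
    if not boards:
        return
    step = 0.5 / len(boards)
    amount = step
    for board in boards:            # boards are in the order they were played
        values[board] = values.get(board, 0.0) / 2 + sign * amount
        amount += step              # later boards matter more to the result


def count_wins(values, bot_boards, player_boards):
    update_values(values, bot_boards, +1)
    update_values(values, player_boards, -1)


def count_losses(values, bot_boards, player_boards):
    # The opposite of count_wins: the bot's boards go down, the player's go up.
    update_values(values, bot_boards, -1)
    update_values(values, player_boards, +1)


if __name__ == "__main__":
    values = {}
    count_wins(values, ["b1", "b2", "b3", "b4"], ["p1", "p2"])
    print(values)
```

Because the old value is halved before the amount is applied, a board's value can approach but never pass 1 or -1 under these assumptions.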


In the play function, when the bot loses, the countLosses function is called. Essentially, it does the opposite of the countWins function: the value is subtracted from the bot's boards and added to the player's boards. So, the more the bot loses, the lower the values of the bot's boards and the higher the values of the player's boards. Therefore, as the bot plays, the values of the boards will increase and decrease until eventually all the good moves have a positive value between 0 and 1 and all the bad moves have a negative value between -1 and 0.


The main thing the endGame function does is reset the boards counted from the previous game since the values have already been calculated. This ensures that when finding the value by dividing 1 by the number of boards, it won't include boards from the previous game.

Machine Learning

This script uses reinforced machine learning to learn how to play better the more games it plays. It does this using the bot's knowledge base, storing each board that it has ever played. It tracks a score for each board that estimates how likely it is to win or lose from that board. This lets it learn from good and bad moves and not make the same mistake twice.

Since there are 5,478 possible legal TicTacToe boards, it may take a while for the bot to play perfectly. The script does not currently rotate and flip boards to merge equivalent boards; if it did, it would only need to learn 765 boards. Also, the look ahead greatly reduces the number of boards it needs to learn. Ignoring look ahead boards, it only needs to learn 96 boards.
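
Rotating and flipping could be sketched in Python as below. This is not something the script does; it is an illustration that maps each board to one canonical form, so all eight rotations and mirror images share a single stored value. The enumeration assumes X always moves first and that play stops as soon as someone wins; all names are hypothetical.

```python
ROTATE = (6, 3, 0, 7, 4, 1, 8, 5, 2)   # rotate the grid 90 degrees clockwise
MIRROR = (2, 1, 0, 5, 4, 3, 8, 7, 6)   # flip the grid left to right
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def transform(board, perm):
    return "".join(board[i] for i in perm)


def canonical(board):
    """Smallest string among the 8 rotations and reflections of board."""
    variants = []
    for b in (board, transform(board, MIRROR)):
        for _ in range(4):
            variants.append(b)
            b = transform(b, ROTATE)
    return min(variants)


def has_winner(board):
    return any(board[a] != "_" and board[a] == board[b] == board[c]
               for a, b, c in LINES)


def reachable_boards():
    """Every board reachable from the empty board, X first, stopping at a win."""
    start = "_" * 9
    seen, frontier = {start}, [start]
    while frontier:
        board = frontier.pop()
        if has_winner(board):
            continue
        mark = "X" if board.count("X") == board.count("O") else "O"
        for i, cell in enumerate(board):
            if cell == "_":
                nxt = board[:i] + mark + board[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


if __name__ == "__main__":
    boards = reachable_boards()
    print(len(boards), len({canonical(b) for b in boards}))
```

Storing values under canonical(board) instead of the raw board string is what would shrink the set of boards to learn down to the 765 equivalence classes.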

The script also has a "tictactoe autoplay *" function that lets it play against itself, so it can learn on its own.

The same functions and algorithms could be applied to most board games. However, the number of possible boards for games like Chess would require the bot to learn too many boards to be perfect. For more complex games it would be better to use a neural network, or to use strategy functions optimized with genetic algorithms.


Script link:

TicTacToeCommands is another script, similar to TicTacToeText, except that it replaces the avatar with the board. So the game doesn't take place in the conversation like it does with the TicTacToeText script.

TicTacToeText TicTacToeCommands

Play TicTacToe now with the GamesBot, or Brain Bot.
