An algorithm trained with genetic evolution to play the most iconic video game ever. Scroll down to see how it works. I had a Java clone lying around for some time when I found out the BizHawk emulator lets you run Lua scripts, so I did some research, ported the script, and let it run wild.
A bit of Tetris history. Once you reach level 29, you enter the so-called kill screen. The blocks double their speed, moving down one line per frame, making it impossible for humans to successfully continue playing. With expert play, you can hit 999,999 points just before reaching the kill screen. This was done as a safeguard to prevent the game from crashing.
In the 8-bit era, memory was limited and a fixed amount of memory was allocated to each variable. For example, the number keeping track of the level uses 5 bits (11111 being 31). If the level goes higher, other bits of memory get overwritten: bits that might store information for something else. This results in the unusual colour schemes in the second and third picture, and eventually in the game crashing.
So by making sure humans can't get that far, the game is kept safe from crashing.
The Lua script circumvents this limitation: it just keeps smashing buttons until the game eventually crashes, after completing around 1500 lines.
You can find an unedited 17-minute video on YouTube of it playing NES Tetris to 999,999.
How does it work?
Each time the game presents a new block, the algorithm needs to figure out where to place it. To do this, a heuristic function is designed to give an objective score by calculating a value for several properties of the playing field. To come up with this heuristic, just think about how you would play Tetris and what it takes to do well: making lines seems good, as does having no holes in your playing field and keeping the top smooth rather than very bumpy. Then write a little function to measure each of these metrics.
| Metric | Description |
| --- | --- |
| Landing height | Number of rows the piece can move down |
| Lines cleared | Number of lines cleared |
| Row transitions | Number of times we go from an empty square to a filled square or vice versa in each row |
| Column transitions | Same, but for columns |
| Holes | Number of empty squares with a filled square above them |
| Wells | Number of stacked empty squares |
| Board height | Total height of all columns |
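Most of these metrics are simple scans over the grid. Here is a sketch in Java, assuming a 20x10 grid of 0/1 ints with row 0 at the top; the method names and exact definitions are illustrative, not necessarily the original implementation:

```java
// Heuristic metric sketches over a 20x10 grid, g[r][c] == 1 meaning filled.
public class Heuristic {
    static final int ROWS = 20, COLS = 10;

    // Empty squares with at least one filled square above them.
    static int holes(int[][] g) {
        int holes = 0;
        for (int c = 0; c < COLS; c++) {
            boolean seenFilled = false;
            for (int r = 0; r < ROWS; r++) {
                if (g[r][c] == 1) seenFilled = true;
                else if (seenFilled) holes++;
            }
        }
        return holes;
    }

    // Empty<->filled changes while scanning each row left to right.
    static int rowTransitions(int[][] g) {
        int t = 0;
        for (int r = 0; r < ROWS; r++)
            for (int c = 1; c < COLS; c++)
                if (g[r][c] != g[r][c - 1]) t++;
        return t;
    }

    // Same, scanning each column top to bottom.
    static int columnTransitions(int[][] g) {
        int t = 0;
        for (int c = 0; c < COLS; c++)
            for (int r = 1; r < ROWS; r++)
                if (g[r][c] != g[r - 1][c]) t++;
        return t;
    }

    // Sum of all column heights (topmost filled square to the floor).
    static int boardHeight(int[][] g) {
        int total = 0;
        for (int c = 0; c < COLS; c++)
            for (int r = 0; r < ROWS; r++)
                if (g[r][c] == 1) { total += ROWS - r; break; }
        return total;
    }
}
```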
For example, given this next piece and board state
we calculate the score for every possible position and orientation of this piece and keep track of the best result. The position and orientation yielding the best score is where we put the piece: in our example, the one with the red border.
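The search loop itself can be sketched as follows. This is a simplification: the piece shapes are given as cell offsets per orientation, and the `score` function here is a stand-in for the full weighted heuristic (both are illustrative assumptions, not the original code):

```java
// Try every orientation and column for a piece, drop it, score the
// resulting board, and keep the placement with the best score.
public class Placement {
    static final int ROWS = 20, COLS = 10;

    // Stand-in heuristic: fewer holes and a lower board are better.
    static int score(int[][] g) {
        int holes = 0, height = 0;
        for (int c = 0; c < COLS; c++) {
            boolean filled = false;
            for (int r = 0; r < ROWS; r++) {
                if (g[r][c] == 1) { if (!filled) height += ROWS - r; filled = true; }
                else if (filled) holes++;
            }
        }
        return -(10 * holes + height);
    }

    // orientations: for each rotation, a list of {row, col} cell offsets.
    static int[][] bestPlacement(int[][] grid, int[][][] orientations) {
        int best = Integer.MIN_VALUE;
        int[][] bestBoard = null;
        for (int[][] cells : orientations) {
            int width = 0;
            for (int[] c : cells) width = Math.max(width, c[1] + 1);
            for (int x = 0; x + width <= COLS; x++) {
                int y = dropRow(grid, cells, x);
                if (y < 0) continue;              // no room in this column
                int[][] g = place(grid, cells, y, x);
                int s = score(g);
                if (s > best) { best = s; bestBoard = g; }
            }
        }
        return bestBoard;
    }

    // Lowest origin row where the piece rests without overlapping.
    static int dropRow(int[][] g, int[][] cells, int x) {
        int last = -1;
        for (int y = 0; ; y++) {
            for (int[] c : cells) {
                int r = y + c[0], col = x + c[1];
                if (r >= ROWS || g[r][col] == 1) return last;
            }
            last = y;
        }
    }

    static int[][] place(int[][] g, int[][] cells, int y, int x) {
        int[][] out = new int[ROWS][COLS];
        for (int r = 0; r < ROWS; r++) out[r] = g[r].clone();
        for (int[] c : cells) out[y + c[0]][x + c[1]] = 1;
        return out;
    }
}
```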
This, however, is only half the puzzle. Not every metric in our heuristic is equally important, so we assign weights to them.
Score = a x (landing height) + b x (lines cleared) + c x (row transitions) + ...
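In code this is just a dot product of the weight vector and the measured feature values; the feature order follows the table above, and the actual trained weight values are not shown here:

```java
// Weighted heuristic score: one weight per feature, in the order
// {landing height, lines cleared, row transitions, column transitions,
//  holes, wells, board height}.
public class WeightedScore {
    static double score(double[] weights, double[] features) {
        double s = 0;
        for (int i = 0; i < weights.length; i++)
            s += weights[i] * features[i];
        return s;
    }
}
```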
To determine the weights a, b, c, ... we use a genetic algorithm mimicking natural selection. There are plenty of ways to do this and parameters to alter, such as population size, the percentage of the previous generation to replace, the amount of randomness introduced, or how to create offspring. Some common traits are:
- Define a fitness function to see how well each individual performs
- Randomly generate the first generation
- Use the best performing individuals as parents, producing offspring by crossover breeding: each child gets some genes (our weights) from each parent
- Replace the worst performing part of the current generation with the new offspring, forming the next generation
- Add some randomness, either by mutating offspring or by replacing part of the generation with newly generated random individuals. This prevents getting stuck through inbreeding: passing on the same genes over and over.
For our Tetris game, we start by randomly generating a first generation of 1000 individuals. For each individual in this generation, we run 100 Tetris games (capped at 500 pieces to reduce runtime) and keep track of the in-game scores reached: our fitness. We take the 10% best scoring individuals and let them be the parents of our next generation. Using some of the weights of each parent, we create new individuals, with a small added chance of mutation to prevent getting stuck down the same path, until we have 300 of them. We replace the 30% worst performing individuals with this newly generated set, giving us our next generation. Repeat until we are satisfied with the performance.
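One generation step with those numbers can be sketched like this. The mutation rate, gene ranges and the fitness function are placeholders: running 100 Tetris games per individual is replaced by a toy objective so the sketch stays self-contained:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// One GA generation: 1000 individuals, top 10% as parents, 300 crossover
// offspring with a small mutation chance, replacing the worst 30%.
public class Evolution {
    static final int POP = 1000, GENES = 7, OFFSPRING = 300; // worst 30% replaced
    static final double MUTATION_RATE = 0.05;                // placeholder rate
    static final Random rng = new Random(42);

    // Stand-in fitness: the real project averages the score of 100 Tetris
    // games; here we just reward weight vectors close to 0.5 in every gene.
    static double fitness(double[] genes) {
        double d = 0;
        for (double g : genes) d += (g - 0.5) * (g - 0.5);
        return -d;
    }

    static double[][] randomPopulation() {
        double[][] pop = new double[POP][GENES];
        for (double[] ind : pop)
            for (int i = 0; i < GENES; i++) ind[i] = rng.nextDouble() * 2 - 1;
        return pop;
    }

    static double[][] nextGeneration(double[][] pop) {
        // Sort best-first so the top 10% can serve as parents.
        Arrays.sort(pop, Comparator.comparingDouble(Evolution::fitness).reversed());
        int parents = POP / 10;
        double[][] next = new double[POP][];
        // Keep the best 70% unchanged.
        for (int i = 0; i < POP - OFFSPRING; i++) next[i] = pop[i];
        // Fill the remaining 30% with offspring of two random parents.
        for (int i = POP - OFFSPRING; i < POP; i++) {
            double[] a = pop[rng.nextInt(parents)];
            double[] b = pop[rng.nextInt(parents)];
            double[] child = new double[GENES];
            for (int g = 0; g < GENES; g++) {
                child[g] = rng.nextBoolean() ? a[g] : b[g];  // crossover
                if (rng.nextDouble() < MUTATION_RATE)
                    child[g] += rng.nextGaussian() * 0.1;    // mutation
            }
            next[i] = child;
        }
        return next;
    }
}
```

Because the best 70% survive unchanged, the best fitness in the population can never get worse from one generation to the next.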
We plug the weights of the best performing individual of our final generation into our heuristic function and we are all set to destroy Tetris.
I first implemented a Java Tetris clone, to which I added the AI described above. The program is optimised for speed, using bit operations wherever possible. Most notably, the playing field is a single array of 20 integers instead of a regular 20x10 two-dimensional array. Each row is represented by an integer in [0, 1023], which gives you a 10-bit representation: one bit for each square, 1 if the square is occupied, 0 if it's empty.
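With that representation, common operations become single-integer bit tricks: a full row equals 1023, and clearing lines is one pass over the array. A sketch (illustrative, not the original code):

```java
// Bitboard sketch: 20 rows, each an int in [0, 1023]; bit c set means
// column c of that row is filled.
public class BitBoard {
    static final int ROWS = 20;
    static final int FULL_ROW = (1 << 10) - 1; // 0b1111111111 == 1023

    static boolean isFilled(int[] board, int row, int col) {
        return (board[row] & (1 << col)) != 0;
    }

    static void fill(int[] board, int row, int col) {
        board[row] |= 1 << col;
    }

    // Drop full rows, shift everything above down, return lines cleared.
    static int clearLines(int[] board) {
        int cleared = 0, write = ROWS - 1;
        for (int read = ROWS - 1; read >= 0; read--) {
            if (board[read] == FULL_ROW) { cleared++; continue; }
            board[write--] = board[read];
        }
        while (write >= 0) board[write--] = 0; // fresh empty rows on top
        return cleared;
    }
}
```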
It clears well over 1000 lines per second, which translates to placing about 5 pieces per millisecond. The algorithm is able to clear several million lines before hitting a fatal combination of bad pieces. For comparison, you only need a few hundred lines to reach the maximum score in NES Tetris, which would take only a fraction of a second.
About a year later, I ported the algorithm to Lua to work with the BizHawk NES emulator.
The Lua version reads directly from memory to get the state of the playing field, the game state and the next piece. Luckily, a RAM map of NES Tetris exists. Mostly used for tool-assisted speedruns, it also works great for this kind of coding project.
| Address | Description |
| --- | --- |
| 0048 | Value 8: end of spawn delay; prepare next piece |
| 00BF | PieceID of next piece (valid orientations: $02, $07, $08, $0A, $0B, $0E and $12) |
| 0400-04C7 | Playing field, 20x10 |
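The 200 playfield bytes read from $0400-$04C7 (row-major, 20x10) can be packed straight into the bit-per-square rows described earlier. Note the empty-tile value 0xEF below is an assumption borrowed from common NES Tetris scripts, not something stated above:

```java
// Decode the raw playfield bytes into 20 bitboard rows.
// ASSUMPTION: a byte equal to 0xEF marks an empty square; anything
// else is treated as a filled tile id.
public class FieldDecoder {
    static final int ROWS = 20, COLS = 10, EMPTY_TILE = 0xEF;

    static int[] decode(int[] ram) { // 200 bytes, ram[r * 10 + c]
        int[] rows = new int[ROWS];
        for (int r = 0; r < ROWS; r++)
            for (int c = 0; c < COLS; c++)
                if (ram[r * COLS + c] != EMPTY_TILE)
                    rows[r] |= 1 << c;
        return rows;
    }
}
```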
After calculating the best placement, the script sends button presses to the emulator to rotate and position the piece. The result is a Lua script that plays NES Tetris until memory errors occur.