As of today I have coded what I think is an accurate simulator for a finite board version of Conway's Game of Life. The actual version specifies an infinite board, but this is close enough. I should implement unit tests at some point to make sure everything is simulated correctly, especially with regards to the literal edge cases of board wraparound. This checks off the first two goals on the Project Outline, with the third having significant progress.

Overview

First, I will give an overview of the whole system. The program relies on raylib for graphics functions, with the raylib-zig bindings being used. It's only utilized in the program's main function, which mainly sets up the raylib window, establishes necessary memory, and runs an update loop. Inside the update loop, the board is stepped one tick and its bits are then translated into a byte-sized framebuffer to be drawn to the screen. This can be done on a keypress or every frame. All functionality related to the state update is in the function updateBoard.

The entire state of the board, contained in the variable board, is an array of bytes with one bit allocated for each cell on the game board. I learned about similar techniques like bitsets from the website One Million Checkboxes, more specifically the developer's writeup about the site (really watching a video of someone else covering it). Using such data structures allows maximizing the limited storage available on microcontrollers. Without using one, storing the entire board as an array of one-byte boolean values would require either a microcontrollers with a large amount of persistent memory or an external memory chip. The downside is that doing anything requires large amount of bitwise operations. However, these operations are some of the fastest for a CPU to execute, so this is OK. A slice, which to my understanding are just pointers into an array with a length attached, along with the width and height of the bitmap, are all that is passed into updateBoard.

updateBoard is not a very complicated function. It essentially just loops over each row of the board and ticks it one step forward before saving it back to the bitfield and moving one row down. It starts by saving a copy of the first row to be used when computing the final row of the board. For the screen I'm predicting to use (which will be explained later), one row is 360 pixels, or 45 bytes. The microcontroller I'm looking at has 4kB RAM available, so there should be no issue saving that. In addition, another row length array is initialized to hold the top row of pixels. This is to enable each updated row to be immediately written back to the array containing the board. Two slices are also created to hold references to the active and bottom rows. The last two row updates are broken out of the loop to deal with the special case of the bottom row without needing a branch instruction to be evaluated every loop.

The real core of the program is the function stepRow. Its job is to take three slices, each representing one row on the board, top, middle, and bottom. Since the next state of any cell is only dependent the cell and its neighbors, this information is enough to tell us the next state of each cell in the middle row. The function loops over each cell, performing this series of tasks:

Based on a table lookup, set a certain bit in a storage byte.
Once the byte is full, store it into the bitset.
Load a column into the lookup byte (this will be explained shortly).
The lookup byte is how the program actually determines the next state of the cell. It relies on two pregenerated bitsets made by Zig's comptime execution features. Remember how the next state of each cell only depends on the current states of that cell and its neighbors? On a 2D grid, each cell has eight neighbors, one for each bit in a byte. At comptime, two 32 byte, or 256 bit, arrays are generated. This means that any byte is a valid index into this bitset. The value at each index corresponds to the next state of the cell given the states of neighbors indicated by the byte. Since the rules for a cell differ based on whether the cell is alive or dead, two tables are needed.

Now that I think about it, this could be improved by having one table with a nine bit index, the 9th bit being the state of the cell being evaluated. This would massively simplify the logic needed to shift in a new column, requiring no bitwise rotate operations that do not get translated down into a single instruction (even though both x86 and MSP430 both have an instruction to do so). It would also cut down one of two uses of if statements in the updateBoard process. I will need to implement this tomorrow and compare the performance.

As the code is implemented now though, an if statement is used to determine which table to lookup from. The only remaining issue, and one of the biggest challenges I faced during this step (and which a 9-bit lookup would have made a lot easier), is figuring out how to pack the 8 neighboring cells into a byte in such a way that makes adding in a new row easy.

At first I thought about using a lllmmrrr byte order, but realized that this would take many instructions to add new byte. Again, I now think that there may have been a way to make this byte order work, but since I am already tearing up this system, there is no need to go into further mistaken detail.

The byte order that is currently being used is a lllrrrmm byte order. The process for shifting in four new bits (because we need to recover the old middle cell) is as follows.

Shift left by three. rrrmm000
Rotate left by one. rrmm000r
Clear the last four bits by ANDing with 0xF0. rrmm0000
Fill in the four bits as follows. rrmm(n)(n)(n)(n)
- Bit three: old middle cell
- Bit two: right middle
- Bit one: right top
- Bit zero: right bottom
Rotate left by two. mm(n)(n)(n)(n)rr

This process is done by two functions, shiftInRight and shiftInRightOverflow. The latter is only used in situations in which getting the old middle cell involves wrapping around the board. The stepRow function calls one of these two for each column of three cells to add it to the lookup byte and discard the leftmost row, effectively moving the right one column at a time. Besides two small helper functions, that is everything in the program.

Problem 1: Corrupted Mess

The first major issue I ran into was the output of the simulator looking similar to this:
Pasted image 20241130232126.png
Note the alternating lines of white and black. While some versions of this error had a narrow region of some cellular-autonoma-like activity, most of the screen was glitched out. While I was troubleshooting, I noticed that the lines could be interpreted as an alternating binary sequence. Once such sequence could be 0xAA, 10101010, which just so happens to be the value that Zig fills any uninitialized value with in debug mode. There were a lot of other signs that pointed to this:

The lines going away when compiled in a mode that turned off this safety feature.
GDB showing a slice of memory full of 0xAA values
What eventually led me to see the issue was to print the index that was being written to each time a new byte was saved. I realized that bytes were not being saved correctly, leaving many locations undefined and with a value of 0xAA. This was due to using the lookup variable instead of the loop counter in the check for whether or not to save the byte. Once it was changed to the correct variable, the program worked much better.

Problem 2: Unknown Rules

However, there was still an issue. Although the simulator looked mostly correct, I noticed some behavior that did not match the rules of Conway's Game of Life. For example, a vertical line of two cells was persisting forever instead of both cells dying in one step. In debugging this, I printed out cell state, the lookup byte used to find the next cell's state, and the result of the lookup. What I discovered was that each lookup was returning the correct value for the given state, but that the states did not line up with what was actually on the board. I noticed that the problem was that the bits corresponding to the left column would have incorrectly set bits. After thinking about it for a while, I realized that the method I had been using to shift new bits into the lookup byte was incorrect. Previously, I was looking in column while shifting in column for the central cell that was not included in the last lookup. However, if column is the column to the right of the area currently in the lookup byte, the column with the desired cell would be column . After changing my code to reflect this, the simulator runs as desired to my knowledge. I should implement tests to verify that everything works correctly, as I disabled some of the tests I had previously written after fixing this issue.

11-30-24

Overview

Problem 1: Corrupted Mess

Problem 2: Unknown Rules