Triangles Revamped

Last weekend I finished v1.0.0 of my new triangles program.

It was depressing to see that there were 3 other versions collecting dust on my neglected GitHub so I deleted them. I choose to forget. I live the noble lie that existence is worth the cost.

What Does It Do?

It takes an image as input, and tries to draw it using a fixed number of triangles to the best of its abilities. It uses a genetic algorithm to arrange the size, shape and color of the triangles. It also takes in a bunch of different parameters, but we’ll get into those later.

At least I tried

How Well Does It Work?

Not that well, which is supremely disappointing. I have two theories on why:

  1. Pure Genetic Algorithms may not work well for this use case
  2. The hyper-parameters might need tuning (which is step II)

I’ll be trying to optimize the hyper parameters as step two of this program, and I’ll try a pseudo-GA approach to see if the output can be improved; I’m certain that it can.

How Does It Work?

The way genetic algorithms work is that they encode the solution space as a string of bits, and manipulates these bits to search through the solution space to find a satisficing solution. In other words, genetic algorithms are a nature-inspired version of parallel hill climbing/gradient descent. To get started, we have to know how to encode our solution space as a string of bits, and how to manipulate this string of bits to improve our solution.

Encoding the Solution Space

The end goal is to have an image drawn with a finite number of translucent triangles. I choose to represent this a background color, followed by a list of triangles. This would then be drawn on a “blank” image of the same size. Encoding it this way reduces the number of bits needed to represent the image, and thus the solution space we need to search. Another thing important to the success of genetic algorithms is that the bit locations themselves have meaning. Since the triangles will be drawn in order, and order matters when you have transparency, a list is ideal way to provide this bit-location semantics.

Basic representation of triangle image genome

Background Color

The first 32 bits of my image encode the RGBA value of the image background color. Each byte (8 bits) represents one color value, and one translucency value, giving us 8 bit color depth, or ~16.7 million colors to choose from (not including transparency). Unfortunately, this also means finding the right background color has ~4 billion possible solutions.

Background Color Genome Representation

For those not in the know, “alpha” is how the translucency parameter is typically referred to. Adding an alpha channel means that this program can handle transparent images in addition to images with a defined background.

The program will read this value first, transform it into a color and set every pixel in the image to that color. For example, if the bit string looked like this:

\[11000100011000010110111111111111\]

We’d break it up into the component RGBA bytes:

\begin{align} r &= [11000100] \\ &= 196 \\\\ b &= [01100001] \\ &=  97 \\\\ g &= [01101111] \\ &= 111 \\\\ a &= [11111111] \\ &= 255 \end{align}

The rgb value is (196,97,111) which renders a nice red color. Opacity is determined as a value between 0 and 1, with 0 being invisible and 1 being completely opaque. The alpha value can be any number [0,255]. To derive opacity we simply divide by the maximum number. In this case:

$$255 \div 255 = 1$$

So the background is totally opaque. If our source image was 500×500 pixels, this bit string would produce an image that looks like this:

Red RGBA example image

This RGBA encoding is also how each triangle’s color will be represented.

Triangles

Triangles are encoded as a set of set of 3, (x,y) points, an RGBA color and a bit that encodes whether or not the triangle is active or not. Each triangle can be encoded in 11 bytes, or 88 bits.

Triangle bit encoding

Points

As you may know, triangles have three points, the question is, how to render them onto an image? Images can be thought of as a grid of pixels. Using discrete Cartesian coordinates, we can address pixels on this 2 dimensional grid.

In this program, the origin is at the upper-left corner of an image, with the x-axis indicating width offset and the y-axis the height offset. This means that the upper-left pixel can be addressed by the point (0,0). If the image was 100 pixels tall, and 100 pixels wide; the bottom-right pixel could be addressed by the point (99,99).

Here is some pixel art I made that is 16×16 pixels (scaled up to 128×128):

Amy 16x16 Pixel Art

The point (0,0) is purple. The point (0, 2) is gray. The point (1,4) is yellow. You get the idea.

Each point is represented by an (x,y) value:

Point Encoding

This is actually a lie. I tried to directly encode the points using ten bits, allowing me to directly address pixels in images up to 1024×1024 in size. I later switched to encoding the width/height percentage, where Why?

  1. If I use images smaller than that the bits are wasted (while still expanding the search space).
  2. If I uses images smaller than 1024×1024 the addressing of pixels would no longer be distributed evenly, distorting the algorithm.
  3. I cannot use the algorithm on images of any size.
  4. Making the numbers 8 bit gives nice byte alignment (without limiting me to 256×256 images, which are quite small).

So instead the (x, y) point indicates the percent width and height of the image. Since each point is 8 bits, they represent 1/255th of the image. For example, if the (x,y) values were (123, 16) and the source image was 512×512 pixels:

\begin{align} x &= [01111011] \\ &= 123 \\\\ 4 &= [00010000] \\ &= 16 \\\\  pixel_{x} &= (123 \div 256) \times 512 \\ &= (~0.48) \times 512  \\ &= 246 \\\\  pixel_{y} &= (16 \div 256) \times 512 \\ &= (0.0625) \times 512  \\ &= 32 \\\\ pixel_{x,y} &= (246,32) \end{align}

Using these 3 points, we can define which pixels are inside of the triangle, and therefore color them appropriately to “draw” them.

Color

Color is done much the same as the background color, with one major difference.

Triangle color encoding

Like before, triangles have an 8-bit RGB color profile. Unlike before, they only have 7 bits to determine their opacity. I did this so I could sneak in bool value, while maintaining byte alignment. I don’t think the triangles suffer from only having 128 opacity states versus 256.

The “visible” bit determines if the triangle is drawn at all. Although this could be achieved by setting all the opacity bits to 0 – we will see why it’s necessary below.

Putting it all together

Here’s what a bit string – or “genome” – of a red image with a centered, 50% transparent, light blue triangle would look like:

\[11000100011000010110111111111111010000001100000010000000\\01000000110000001100000000110000001100001111111101111111\]

Again, notice how the dimensions aren’t specified. They depend on the source image. The encoding is dimension agnostic – which is a nice feature.

The size of a genome, in bytes: \(4 + ( \text{number of triangles} \times 11)\)

Multiply by 8 to get the number of bits. This means a drawing with 2 triangles has \(2^{208}\) possible configurations.

Manipulating the Bits

So now we know how to encode solutions are bit strings, which I will affectionately name “genomes.” Given that even simple genomes can represent an incredibly large search space, we need an intelligent way to manipulate the strings, and narrow the search space. Unfortunately, this is not an AI post, this is a genetic algorithms post – we only have the power of natural selection. Natural selection is not very smart. It can be effective at find solutions but only if you harness the power of the only thing it cares about – reproductive fitness. Elizier Yudkowsky puts it nicely:

within each species, the blind idiot god is purely obsessed with inclusive genetic fitness. No quality is valued, not even survival, except insofar as it increases reproductive fitness. There’s no point in an organism with steel skin if it ends up having 1% less reproductive capacity.

from his blog post: Thou Art Godshatter

 

This means that the most important part of the algorithm, besides the encoding, is the fitness function. We’ll get there eventually, but first we need to know the life  of an Artist.

Artists

To keep the nature analogy going, artists are the creatures produced by the genome. They are very similar – since an artist is defined by its genome, but they are not the same. An artist is an extension of the genome in three ways.

  1. Artists are finite. They, not their genomes, are the ones that live and die. They are the ones that reproduce. The same genome may appear in different artists, multiple artists, in different places and different times. Artists live once.
  2. Artists live in a place. They have a location within their environment.
  3. Artists have two copies of their genome, a dominant and recessive strand.

When an artist is created, it is given two, equal length, random bit strings. One we’ll label “dominant” and the other “recessive.” These random strings are modified so that the “visible” bit of each triangle, except the first, is set to 0.

Population & Location

The first thing we do is create a population of artists. As stated above, these artists start with totally random bit-strings, or genotypes.

These artists are distributed on a map. This map is really an adjacency matrix, of \(N\) dimensions. This means that each artist is adjacent to \(N \)other artists. i.e. if \(N=4\) it simulates a 2D grid, wraparound grid. Artists are only allowed to mate with artists that are adjacent to them.

Crossover

Each artist has a chance to crossover \((~70%)\). A random point in the genome is picked, and the dominant and recessive genomes swap their bits after that point.

Genome Crossover

This crossover can be set to happen at the bit, byte or triangle alignment. i.e. if set to triangle alignment, triangles are never split, and always swapped intact.

Why?

Good question. The reason seems to be “because nature does it.” To be fair, that’s why anybody does genetic algorithms in the first place, but there is another reason why crossover might be valuable. Since GAs operate on bit strings, it stands to reason that there would be valuable subsequences of bits. Crossover allows these sub sequences to be recombined, and swapped around.

Mutation

Each artist has a chance \((~0.05%)\) to mutate at each bit. If a mutation occurs, the bit is flipped. i.e. if it is a 1 it becomes a 0, and if it is a 0 becomes a 1.

Why?

Mutation is important because it prevents the algorithm from being stuck. If the initial population of artists did not have the sequence present, then, without mutation, that sequence can never be created. Just like in nature, mutation is usually harmful, but seemingly necessary for advancement.

Fitness

The most important part of a genetic algorithm is how “fitness” is measured and reproduction is handled. This makes sense, natural selection is purely concerned with “reproductive fitness” which has both of these words in it.

Fitness is a measure of how close a solution is to ideal. This means you have to know what a good, or perfect solution, looks like before you start. This is tricky, because if you knew what a good solution was, you wouldn’t need to find one. In this case we need another function, one that takes the bit string and maps it back to the original domain. This domain lends itself to evaluation in a way the genome doesn’t.

Expressing Yourself

In biology, a genotype is the DNA, the sequence of ribonucleic acids that make up a gene. The phenotype is the expression of the DNA, or gene. In this algorithm, the bit strings, or the list of triangles is like the genotype. If we take this information and convert it into an image, that is like the phenotype.

In this case, the background color for the image is determined by the RGBA background color value encoded in the dominant genome. Then, each triangle is examined in order from both the dominant and recessive genomes. If the triangle from the dominant genome is “visible” (the visible bit is set to 1) then it is drawn. If the dominant triangle is not visible and the recessive triangle is visible then the recessive triangle is drawn. If neither are visible, no triangle is drawn. Repeat until we run out of triangles. This gives us our artist’s image.

Calculating Image Distance

We can compare this image to the source image to see how close it is. The closer it is to the source image, the better. In this program I settled on using the per-pixel root mean squared error for my fitness function. The larger the error, the more different the images, and the lower the fitness. If you’re interested in learning more, I have a previous post you might want to check out.

I also tried adding a custom image hash and an image hash based off of a discrete cosine transform – but neither seemed to improve the fitness function so I dropped them as they are expensive to calculate.

Reproduction

Once all the artists have been assigned a fitness, it is time for them to reproduce. Again, to harness natural selections powers we must allow more fit artists to reproduce with greater frequency. The question is, how much more often?

If we allow high fitness artists to reproduce too prolifically, we may converge too quickly as they take over the gene pool in rapid succession. If we don’t allow them any opportunity to reproduce more often then the algorithm will make progress only by blind luck.

The method I chose was a sigma-scaling. The standard deviation of fitness is calculated, and artists are allowed to reproduce in proportion to their z-factor. This has two benefits: when the variance is very high (in the beginning) a lucky artist won’t prematurely dominate the gene pool. In later generations, when the program has converged, the variance will be low, but there will still be artists that are several standard deviations from the mean and so selective pressure will remain.

$$ExpectedReproduction(a, t)= \left\{ 1 + \frac{f(a)-\bar f(t)}{2\sigma(t)} \right\}$$

Where \(a\) is the artist, \(f(a)\) is the fitness of the artist, \(\bar f(t)\) is the average fitness of the population at time t, \(\sigma (t)\) is the standard deviation of the fitness of the population at time t. If \(\sigma (t) =0\) then ExpectedReproduction is 1. If ExpectedReproduction is less than 0.1 it is set to 0.1 out of kindness.

Using a stochastic process artists are selected to mate such that they will reproduce at least \( \lfloor ExpRepr(i,t) \rfloor \) times and at most \( \lceil ExpRepr(i,t)\rceil\) times.

Mating

Artists are only allowed to mate with artists adjacent to them, and these are chosen at random. Their offspring inherit the dominant genome from the mating parent, and the recessive genome from the selected mate.

There is also an elitism parameter that can be set. If set, the top \(X\) members of each generation are cloned into the next generation, bypassing mating. This can help the program converge faster.

Finally

All the offspring of the current generation are distributed onto the map, roughly where their parents were; The current generation are deleted and the process starts over.

What’s Next?

I’ve been running the program on images and I’m not super impressed with the results. I plan on tinkering with it to see if there is any obvious improvements that I can make (that are still in the spirit of genetic algorithms). Most important is to do some actual research on which hyper parameters perform best. I’ve never machined learned parameters before so wish me luck!

  1. Test which hyper-parameters are the best
    1. This is the science part where I run the program many times to determine which hyper parameters produce the best looking image
  2. Switch to OpenCV so I can use CUDA to compare images and rasterize them
    1. This should make the code much faster – probably requires rewriting nearly all of it though.
  3. Hook it up to a webcam
    1. Have the “environment” change each frame so that the “artists” are constantly evolving.

Genetic Algorithms

I’m re-re-re-re-starting trying to build a program that uses a genetic algorithm to draw images using triangles. Like my blog that I keep “restarting.” Like everything in my pathetic life. This sushi is like your life, many things started and nothing finishedLast week I read An Introduction to Genetic Algorithms by Melanie Mitchell as a primer. It’s a good book that talks about the history, various technics and mathematics behind understanding GAs. That said, I could never shake the feeling that the central theme of the book was “They’re an interesting way to program things, but we’re not sure what they’re good for and are most likely just a bad form a hill climbing.” I wasn’t too bothered by this, because my goal with this program is to learn more about GAs and not build scalable, efficient machine learning algorithms.

I had an old version of this program that made pretty good images (see the panda on this earlier post, or this one). I then remade a shitty version of the program, like 2 years ago, that I managed to get working again as well. It takes longer and produces worse results, more proof that I am declining in my old age.

Crystal maiden image compared to the same image rendered only in triangles

I sketched out a rough plan of my revised algorithm over the weekend, and now it’s finally time to implement. Too bad it’s been ages since I’ve written any code let alone ceepeepee. Wish me luck.

Punk Rock Songs

I heard the lyric “A punk rock song won’t ever change the world but I can tell you about a couple that changed me” from the song Fuck Shit Up by Wingnut Dishwashers Union and envisioned it embroidered on a piece of cloth. Instead I turned it into my second piece of pixel art.

I’m not sure where the trans themes got mixed in. I suspect it has to do with the part of the lyric “that changed me” and having the idea while waiting for electrolysis. I was always into punk, but it’s definitely not responsible for me transitioning.
A punk rock song won't ever change the world but I can tell you about a couple that changed me