Randomized Image Matching

I’ve been working on another project that I haven’t yet written about: Randomized Image Matching. At a high-level, this project uses a greedy algorithm with simulated annealing to reproduce an image using overlapping semi-transparent polygons.

In English, that means that the program will randomly lay out semi-transparent triangles or squares and then randomly move them around to try to get the resulting image to look as much like the selected image as possible. The way to understand it is to watch it in action:

F# Image Map Main Window

From left to right:

  1. Last: the iteration the program is trying out right now.
  2. Best: the closest match so far.
  3. Error: how far away is best match is from the original? White pixels are farther away.
  4. Original: the image that we’re trying to match.

As the program runs, the Last image will change rapidly as the program tries out many slightly different variations (in the screenshot, it was averaging about 86 iterations per second). When it finds a version that has less error (meaning that it is closer to the original), that version will become the current best image.

This project is inspired by Roger Alsing’s Evolisa. When I first saw Roger Alsing’s version, I thought it was quite an interesting idea and definitely something that would be fun to work on and optimize. I, of course, figured that I could create a version that iterates and converges faster. I also thought it would be a good project to try running in parallel across multiple processors and multiple machines.

Implementations

I’ve developed two different implementations: one in OCaml and C that runs on UNIX (but could probably be ported to Windows using cygwin and OCaml for Windows) and another version using F#/C# that runs on Windows (but could probably use Mono).

OCaml/C

The first version one uses a combination of OCaml and C. The high-level logic such as generating permutations and displaying the preview is all in OCaml, which is well-suited for that sort of task. Lower-level logic is delegated to C. The main branch delegates the actual image processing (drawing triangles, calculating error, etc.) to ImageMagick, a C library with a convenient OCaml bindings. I’m working on a new branch with my own implementations of image processing and error calculations. That will hopefully be faster and will definitely be more fun.

The biggest limitation with this version is the lack of real multi-threading in OCaml. If I want to use all four cores in my machine, I need to run multiple processes. In response to that, I started work on a simple distributed OCaml map/reduce framework. It sounds fancy, but it’s pretty simple. It uses an RPC library to create a simple networking interface that clients use to connect to a central server to retrieve work items, process the item, then upload the results back to the server. My goal is to include git-based distribution with the framework so that clients can automatically update, rebuild, and reconnect after someone pushes an update.

F#/C#

This version uses a combination of F# and C#, with most of the logic and display in F# and the parts that deal with pointers in C# using unsafe. This is the first real F# program that I’ve written, so it was a good learning experience. I know the .NET APIs very well so I was able to focus just on the language.

The program shows a real-time display of the image as it evolves. This helps really understand what’s happening behind the scenes and looks very cool. The preview also gave me a chance to try out OpenTK (which I’m quite happy with) and to dust off my OpenGL skills.

This version uses rectangles rather than triangles to make the drawing code simpler. I don’t think it will affect the speed of convergence. I’ve limited it to grayscale for simplicity and performance. Permutations are generated by randomly moving edges of existing polygons or changing their color. After a certain number of iterations, a new rectangle is added even if it decreases fitness (this is almost simulated annealing).

What’s Next?

  • Optimizations to improve the speed of iteration
    • Evaluate new image drawing methods
      • Draw on GPU using OpenGL, then fetch rendered image back to main memory
      • Draw rectangles to bitmap in memory, then compare each pixel against original. This might be faster for larger numbers of squares. An object pool will prevent rapidly churning memory.
    • Faster color blending during drawing
      • Precomputed alpha
      • C version of blend function to use SSE or MMX
  • Try to to profile the program to identify slowdowns.
  • Distributed version that runs on multiple computers at once.
  • Enhancements to the display:
    • Add a chart showing fitness over time and the rate of convergence
  • Tweak heuristics for faster convergence
    • Change annealing to move rectangles in addition to slowly adding them.
    • Change the rate of rectangle addition and vary it by the number of total iterations. I’m not sure yet if it should get faster or slower over time, but I think it should start out much faster than it does now.
  • Enable color
  • UI to select the target image and maybe control some of the constants (annealing rate etc.)