Pindar Van Arman is an AI Artist exploring the intersection of human and artificial creativity. Winner of the Robot Art Prize in 2018, his robots use a broad array of deep learning, generative algorithms, and feedback loops to bring his AI creations into the material world one brush stroke at a time.
While it remains a theory, Marvin Minsky’s concept of a “Society of Mind” works well in the realm of artificial creativity. Minsky, the co-founder of MIT’s Artificial Intelligence Lab, proposed that our minds are not a single super intelligence, but rather a collection of smaller intelligent agents that surface as needed and compete with each other to solve problems. Making art with my painting robots is a problem just like any other, and Minsky’s approach is remarkably effective at solving it. Instead of making a single super creative algorithm, I find it much easier to equip my robots with a collection of smaller creative agents that work with and compete against one another.
To be precise, my robots currently have 24 of these creative agents. Some are simple, like procedural algorithms that measure contrast. Others are a little more complex, like the ones that plan brush stroke paths. The most complex are arrays of neural networks. But at the heart of how it all works are feedback loops. As my robots paint with a brush on canvas, they are constantly taking photos of their progress. With each photo they analyze the direction that the painting is heading in, make adjustments, paint a little more, and take another photo in a creative feedback loop.
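A minimal sketch of such a loop, under stated assumptions: the four callables, the flat grayscale `Image` type, and the toy four-pixel canvas are all hypothetical stand-ins, not the robot's actual code. It only shows the photograph, analyze, paint, repeat cycle described above.

```python
from typing import Callable, List

# Hypothetical stand-in for a photo: a flat list of grayscale values in [0, 1].
Image = List[float]


def feedback_loop(
    photograph: Callable[[], Image],
    plan_strokes: Callable[[Image], List[int]],
    paint: Callable[[List[int]], None],
    is_done: Callable[[Image], bool],
    max_loops: int = 10_000,
) -> int:
    """Photograph, analyze, paint a little, repeat. Returns the loops run."""
    loops = 0
    while loops < max_loops:
        image = photograph()            # take a photo of the progress so far
        if is_done(image):              # nothing left worth adjusting
            break
        paint(plan_strokes(image))      # adjust course and paint a little more
        loops += 1
    return loops


# Toy simulation: a four-pixel "canvas" that starts white.
goal: Image = [0.2, 0.8, 0.5, 0.0]
canvas: Image = [1.0, 1.0, 1.0, 1.0]


def worst_pixel(image: Image) -> List[int]:
    # Plan one stroke on the pixel that differs most from the goal.
    return [max(range(len(goal)), key=lambda i: abs(image[i] - goal[i]))]


def apply_strokes(strokes: List[int]) -> None:
    # Assume each stroke closes half the remaining gap at that pixel.
    for i in strokes:
        canvas[i] += 0.5 * (goal[i] - canvas[i])


def close_enough(image: Image) -> bool:
    return max(abs(a - b) for a, b in zip(image, goal)) < 0.05


loops_run = feedback_loop(lambda: list(canvas), worst_pixel, apply_strokes, close_enough)
```

In the toy run, the canvas converges on the goal after a handful of loops. A real robot would replace the lambdas with a camera capture, a stroke planner, and motor commands.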
To demonstrate these algorithms and several others, here is a quick breakdown of my most recent painting, Portrait 18,384. The portrait is named for the 18,384 brush strokes it contains. For some more numbers, it took almost a week to complete and used 4 reference photos, 5 neural networks, and 2,298 feedback loops.
But these are just numbers. The big question is how did the photos on the left become the painting on the right?
It began with my son participating in a photo shoot. My robot then used Viola-Jones Facial Detection (1) as well as a custom algorithm I wrote to Crop a Balanced Composition (2) from the photos.
A couple of other custom algorithms measured each image’s beauty by analyzing its Facial Symmetry (3) and Technical Quality (4). They asked questions such as whether both eyes could be clearly seen and whether the image had good contrast. Having selected its favorite image to work from, the robot fired up the remaining creative agents and got ready to paint. The following is an image of the robot right before the first stroke, where six of these creative agents are visualized on the right.
Reading from the top left, the creative agent visualizations are:
- Canvas Image (5) : the image that will be used for feedback loops
- Goal (6) : the image that the robot will be working towards
- Heatmap (7) : a measurement of the difference between the Canvas Image, Goal, and GANs
- CNN (8) : a style transfer of the photo with previously painted images
- Stroke Plan (9) : an image of the next set of strokes being planned
- GAN (10) : images generated from source photographs
Influenced by all of these, and by several more agents that are not visualized, the robot begins to calculate its next strokes.
It does this by looking at the Heatmap to find the areas of the canvas that are most different from the Goal. It then references the Canvas Image and Selects a Paint (11) that it thinks will make that area look more like the Goal.
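A simplified sketch of this step, with assumptions labeled: here the heatmap compares only the Canvas Image with the Goal (the robot's Heatmap also takes the GANs into account), color difference is plain squared RGB distance, and the palette names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]          # RGB, 0-255
Image = List[List[Color]]             # rows of pixels


def color_distance(a: Color, b: Color) -> int:
    """Squared Euclidean distance in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_heatmap(canvas: Image, goal: Image) -> List[List[int]]:
    """Per-pixel difference between what is painted and what is wanted."""
    return [
        [color_distance(c, g) for c, g in zip(canvas_row, goal_row)]
        for canvas_row, goal_row in zip(canvas, goal)
    ]


def hottest_pixel(heat: List[List[int]]) -> Tuple[int, int]:
    """(row, col) of the largest heat value."""
    return max(
        ((r, c) for r, row in enumerate(heat) for c in range(len(row))),
        key=lambda rc: heat[rc[0]][rc[1]],
    )


def select_paint(target: Color, palette: Dict[str, Color]) -> str:
    """Name of the palette paint closest to the target color."""
    return min(palette, key=lambda name: color_distance(palette[name], target))


WHITE = (255, 255, 255)
canvas = [[WHITE, WHITE], [WHITE, WHITE]]
goal = [[WHITE, (250, 250, 250)], [(20, 20, 30), (200, 180, 170)]]
palette = {  # hypothetical paints and RGB values
    "titanium white": (250, 250, 250),
    "ivory black": (20, 20, 20),
    "burnt sienna": (138, 54, 15),
}

row, col = hottest_pixel(build_heatmap(canvas, goal))
paint = select_paint(goal[row][col], palette)
```

On this blank toy canvas the hottest pixel is the dark one at row 1, column 0, and the closest paint is the black. A perceptual color space would likely give better matches than raw RGB, but the selection logic would stay the same.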
Multiple path planning algorithms within the Stroke Plan then compete with each other to calculate how to apply the brush strokes. These algorithms include Hough Lines (12) for painting along edges, a custom Dot Detection Algorithm (13) for dots, and a second custom algorithm I call Rain Fill (14) that paints in large areas.
The robot analyzes the shapes that need to be painted, selects the most appropriate stroke generator and adds strokes from it to the Stroke Plan.
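One way to picture this competition is as a bidding round. The sketch below is purely illustrative: the `Region` bounding box, the bid formulas, and every threshold are invented for the example, and it does not run a Hough transform or any real stroke generator. It only mimics how the most appropriate generator might be chosen for a shape.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Region:
    """Bounding box, in pixels, of a shape that still needs paint."""
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height

    @property
    def elongation(self) -> float:
        return max(self.width, self.height) / min(self.width, self.height)


# Each generator bids between 0 and 1 for a region. Thresholds are made up.
BIDDERS: Dict[str, Callable[[Region], float]] = {
    "Dot Detection": lambda r: 1.0 if r.area <= 16 else 0.0,
    "Hough Lines": lambda r: min(r.elongation / 10.0, 1.0),
    "Rain Fill": lambda r: min(r.area / 10_000.0, 1.0),
}


def pick_generator(region: Region) -> str:
    """Return the name of the highest-bidding stroke generator."""
    return max(BIDDERS, key=lambda name: BIDDERS[name](region))
```

With these made-up bids, a 3x3 speck goes to Dot Detection, a 200x4 sliver to Hough Lines, and a 300x250 block to Rain Fill.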
The first phase of the painting then begins.
In the following time lapse you can watch visualizations of six of the creative agents changing in real time as the image emerges on the canvas. The easiest to understand is the Heatmap (Row 2, Col 1). Notice how the bright reds lighten as paint is applied to those areas. You can also see the variation in the types of strokes used within the Stroke Plan (Row 3, Col 1). In the beginning it is mostly filling in large areas with Rain Fill, but toward the end you can see a couple of frames where it decided to use Dot Detection and Hough Lines for some details.
It is sometimes said that all stories have a beginning, a middle, and an end. The time lapse above was the beginning. The painting is about to enter the middle of its story.
Running in the background this entire time are several other algorithms, including creative agents that Measure Contrast (15), Monitor Complementary Colors (16), Balance Composition (17), and one I call Horror Vacui (18), which simply tries to fill every empty space with something. In this second phase, the agent that Balances Composition detects that the painting is too focused on the left of the canvas. Because of this, it decides to re-imagine the Goal and paint more to the right. To create the new Goal, the robot puts several of its agents into competition, including one that paints directly from the GAN, another that Paints Randomly (19), and another that Creates Abstractions (20). The abstraction itself is created by a deeper level of creative agents, including one that uses Viola-Jones to detect and Tile the Faces (21), another that Merges Faces (22) by lining up the eyes, and a third that combines a detailed image from the photographs with the less detailed CNNs and GANs. In this case, the robot decides to Create Abstraction by overlaying a detailed image from the original photos with an image from the GAN.
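As a rough illustration of how an agent could notice that a painting leans too far left, the sketch below computes the paint-weighted center along the horizontal axis. The coverage grid, the function names, and the 0.1 tolerance are assumptions made for the example, not the Balance Composition agent itself.

```python
from typing import List

Coverage = List[List[float]]  # per-cell amount of paint; 0.0 means bare canvas


def horizontal_center(coverage: Coverage) -> float:
    """Paint-weighted center along x: 0.0 is the left edge, 1.0 the right."""
    width = len(coverage[0])
    total = sum(sum(row) for row in coverage)
    if total == 0 or width < 2:
        return 0.5  # nothing painted yet, or a single column: treat as centered
    weighted = sum(x * amount for row in coverage for x, amount in enumerate(row))
    return weighted / (total * (width - 1))


def is_unbalanced(coverage: Coverage, tolerance: float = 0.1) -> bool:
    """True when the paint's center drifts more than `tolerance` from the middle."""
    return abs(horizontal_center(coverage) - 0.5) > tolerance


left_heavy = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
balanced = [[1.0, 0.0, 0.0, 1.0]]
```

Here `is_unbalanced(left_heavy)` is true because the center sits well left of the middle, which is the kind of signal that could trigger a new Goal.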
With a new Goal in mind (Row 1, Col 2), the robot continues its Feedback Loops into a second phase of the painting.
Hours later, the creative agent tasked with Measuring Contrast notices that the painting lacks contrast. This triggers the process to create another Goal with more contrast. To do so, the robot refers to its library of available imagery and finds a GAN image that is suitable. It then makes this image the new Goal (Row 1, Col 2). This was a brief phase, but it achieved its purpose, as seen in the time lapse below.
We are now entering the end of the painting’s story.
A creative agent I call the Paint Map (23) detects that at least one layer of paint has covered the entire canvas. This triggers the creation of one last Goal to finish up details. Once again the robot refers to the Canvas Image to see what has already been painted. Using Viola-Jones, it detects the position of the eyes in the painting. The agents responsible for creating Goal images compete one last time to produce an image that will line up with these eyes. An image created with Tiling Faces and CNN style transfers is selected and made the new Goal (Row 1, Col 2). The robot then continues painting with Feedback Loops.
Deciding when the robot stops is the job of the final creative agent, which I call I’ve Done My Best (24). Using multiple factors, including time spent painting and several measurements by agents mentioned previously, the robot just stops painting. It has done its best. I am never sure when this will happen and sometimes grow impatient waiting for it. But it eventually happens and the artwork is complete.
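A stopping rule of this kind could be sketched as below. The three inputs, the week-long time budget, and the heat target are all hypothetical choices for illustration; the actual agent weighs factors that are not spelled out here.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    hours_painting: float
    coverage: float   # fraction of the canvas with at least one layer of paint
    mean_heat: float  # 0.0 = canvas matches the Goal, 1.0 = maximally different


def done_my_best(p: Progress, max_hours: float = 168.0, heat_target: float = 0.1) -> bool:
    """Stop when time runs out, or when the canvas is covered and close to the Goal."""
    if p.hours_painting >= max_hours:
        return True
    return p.coverage >= 1.0 and p.mean_heat <= heat_target
```

For example, `done_my_best(Progress(120, 1.0, 0.05))` returns `True`, while an early, half-covered canvas such as `Progress(10, 0.6, 0.5)` keeps the robot painting.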
Artist Paul Klee once described the creative process of painters as a feedback loop. Painters make a mark, step back from the canvas to see the effect the mark had, then make the next mark in a creative feedback loop. My robots are doing precisely this. And they are not simply making random decisions from complex data. Though serendipity does play a role, their aesthetic decisions are based more on purpose.
My robots currently use only two dozen creative agents, though this number goes up and down as I experiment with them. Seeing the creativity that they are capable of with just these 24, I cannot help but wonder if human creativity is similarly built. And if one insists that this robot’s process is just a complex generative algorithm, couldn’t the same be said for human artists? Perhaps one could argue that the only difference between human and computational creativity is that the human mind has a lot more than 24 algorithms.
When machines catch up, it will be amazing to see what they begin creating.
Pindar Van Arman