I happened to be busy with Utena at the same time that AI image synthesis took off. I had fun playing around with early systems that produced... questionable... results. OK, horrible results. Farther down are a few images that are less bad.
In 2022, I fed simple prompts to the AI picture-generating program “DALL-E mini”. The name comes from OpenAI’s DALL-E, itself a blend of the painter Salvador Dalí and Pixar’s WALL-E, and it suits the surreally distorted images. (Soon after, it was renamed Craiyon to avoid confusion with the full-size DALL-E, which is far more capable.) Every one of these pictures is disturbing. As of March 2023, Craiyon has been improved and can sometimes produce a crude but recognizable Utena (right).
The newer Stable Diffusion picture generator produces much higher quality images, but seems to know less about Utena. Most image generators actively try to avoid drawing copyrighted characters (with little success).
In Paris? Why not? Though I don’t see anything Parisian in the pictures. The program has a vague idea of what Utena looks like, and a wisp of a vague idea of Anthy. I was surprised that it tended to make Utena dark-skinned to match Anthy. It was trained on data scraped from the wild internet, so I expected it to share the internet’s skin color biases. Instead, it seems to enjoy homogenizing people. Do minorities stick together so strongly that if there’s one, DALL-E mini thinks there should be more?
It’s actually kind of impressive that DALL-E mini knows anything at all about Utena. It will try to draw pictures for any prompt. That said, it came up with nothing recognizable for other Utena characters, or for places like “ohtori academy”. It was slightly better at depicting love than depicting Paris—it drew the characters close together, often melting into each other.
Backstabbing is apparently too difficult a concept for the poor program, but I had to include this outlandish image. I want to send it back in time and have Picasso and Kandinsky discuss it together.
This is by far the best picture I got in many attempts, and the best illustration of its prompt. It’s a fair depiction of Anthy, and of what her expression might be after she murders Utena in an alternate timeline. She looks triumphantly smug: “I’m better than you, and your lifeless body is the proof of it.” It’s not entirely out of character, which is a terrifically lucky guess for a computer program that doesn’t know a damn thing.
Stable Diffusion is much, much better, but still weak. I tried dozens of prompts, and this is the least bad image—the only one to get hair color and skin color somewhat close. Stable Diffusion does seem to know that they are both women in an anime show, and it knows that Utena is taller... and that’s about it. The program has a better sense of anatomy, but Anthy has a deformed right side and hand, and Utena has a supernumerary arm.
The prompt, by the way, includes “photorealistic”. It seems to help, but... it’s not all the way there.
It took me hundreds of tries to get this. I went through a wide range of prompts—this one uses “radio telescope” for the antenna. It is the first revolutionary squirrel antenna that looks the part, and even here it’s not entirely clear that the antenna is supposed to rotate. Stable Diffusion does not seem to know how to depict motion. Click through for a larger view.
I like the deformed mad scientist squirrel and the rough-and-ready look of the antenna.
This is a more serious picture that I generated by AI in July 2023. It is based on Utena’s disheveled hair from episode 33. Her hair is rendered as if made of paper that swirls into rosettes. Her thumbnail has turned into a rose petal. The rendering is... not uniformly good. A human artist could do much better. But I like the image overall.
I can interpret the image in my own way, of course, but I’ll let you think about it yourself. I’m sure people will come up with different ideas. I’ll just mention that, though there are a few loose petals, and though Utena is flat on her back, her roses are not scattered. At the moment.
Jay Scott <email@example.com>
first posted 9 June 2022
updated 23 July 2023