Stable Attribution Identifies the Art Behind AI Images
Shortly after their first public releases, text-to-image artificial intelligence models like Stable Diffusion and Midjourney became focal points in debates around the ethics of their use. Anton Troynikov is a cofounder of Chroma, a startup working to improve AI interpretability (that is, making what goes on under the hood of AI systems a little less mysterious). With AI art generators, Troynikov and others at Chroma saw an opportunity to build a tool that would make it easier to address some of the thorny attribution issues that have emerged. Troynikov answered five quick questions on the project, called Stable Attribution, and how he thinks artists and AI engineers can stop talking past each other on the topic of AI-generated art.
What were your first impressions of AI art generators when they were released?
Anton Troynikov: I started paying attention to the AI art discourse after Stable Diffusion was released and a lot more people got access to the model. And I started to realize pretty quickly that people on both sides of the conversation were talking past each other. I wanted to see if there was a technical solution to the problem of making sure that technologists and creatives were not antagonists to one another.
What's your goal with Stable Attribution?
Troynikov: I wanted to demonstrate that this problem is not technically infeasible to tackle. After talking to a bunch of people, especially on the creative side, but also on the technology and research side, we felt it was the right thing to just go ahead and see what kind of reaction we'd get when we launched it.
What's the short version for how Stable Attribution works?
Troynikov: Stable Diffusion is in a class of models called latent diffusion models. Latent diffusion models encode images and their text captions into vectors (basically a unique numerical representation for each image). During training time, the model adds random values (noise) to the vectors. And then you train a model to go from a slightly more noisy vector to a slightly less noisy vector. In other words, the model tries to reproduce the original numerical representation of every image in its training set, based on that image's accompanying text caption.
The thinking was, because these numerical representations come from these pretrained models that turn images into vectors and back, the idea is basically, "Okay, it's trying to reproduce images as similarly as possible." So a generated image wants to be similar to the images that most influenced it, by having a similar numerical representation. That's the very short explanation.
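To make the idea of "similar numerical representation" concrete, here is a minimal sketch of a nearest-neighbor lookup over image vectors. It is not Stable Attribution's actual code: the three-dimensional vectors, the example.com URLs, the function names, and the choice of cosine similarity as the measure are all assumptions made for illustration. A real system would work with much larger vectors produced by a pretrained encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(generated, training, k=2):
    """Return the k training entries whose vectors are closest to `generated`."""
    scored = [(url, cosine_similarity(generated, vec))
              for url, vec in training.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical training-set vectors, keyed by placeholder image URLs.
training_vectors = {
    "https://example.com/a.png": [1.0, 0.0, 0.0],
    "https://example.com/b.png": [0.0, 1.0, 0.0],
    "https://example.com/c.png": [0.7, 0.7, 0.0],
}

# Hypothetical vector for a newly generated image.
generated_vector = [0.9, 0.1, 0.0]

top_matches = most_similar(generated_vector, training_vectors, k=2)
for url, score in top_matches:
    print(f"{url}: {score:.3f}")
```

In this toy data the generated vector points mostly in the same direction as image "a", so "a" ranks first and the blended image "c" ranks second.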
How do you make that final step and determine who the artists and creators are?
Troynikov: We would really like to be able to attribute directly back to the human who created the source images. What we have, and what's available in the public training data set of Stable Diffusion, are URLs for images, and those URLs often come from a CDN [content delivery network]. The owners of the sites where those images appear and the owners and operators of those CDNs could make that connection.
We do have a little submission form on the site. If people recognize who the creator is, they can submit it to us, and we'll try to link it back.
How do you see generative AI like this, alongside the ability to attribute source images to their creators, affecting artistic creation?
Troynikov: I think there's two things you could do. One is, by being able to do attribution, you can then proportionately compensate the contributors to your training set based on their contribution to any given generation. The other really interesting thing is, if you have attribution in generative models, it turns them from just a generator into a search engine. You can iteratively find that aesthetic that you like and then link back to the things that are contributing to the generation of that image.
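As a rough illustration of what proportionate compensation could look like, the sketch below splits a fixed payout among contributors according to their attribution scores. The scores, contributor names, payout pool, and the simple "share equals score divided by total" rule are all hypothetical; Troynikov does not describe a specific payment scheme.

```python
def split_payout(attribution_scores, pool):
    """Divide `pool` among contributors in proportion to their scores."""
    total = sum(attribution_scores.values())
    if total <= 0:
        # No measurable contribution: nobody is paid from this pool.
        return {name: 0.0 for name in attribution_scores}
    return {name: pool * score / total
            for name, score in attribution_scores.items()}

# Hypothetical attribution scores for one generated image.
scores = {"artist_a": 0.5, "artist_b": 0.25, "artist_c": 0.25}

# Hypothetical amount to distribute for this generation.
payouts = split_payout(scores, pool=100.0)
for name, amount in payouts.items():
    print(f"{name}: {amount:.2f}")
```

Under these assumptions the contributor with twice the attribution score receives twice the payout, and the shares always add up to the full pool.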
Anton Troynikov is the cofounder of Chroma, an AI company focused on understanding the behavior of AI through data. Previously, Troynikov worked on robotics with a focus on 3D computer vision. He does not believe AI is going to kill us all.