Last year the Internet got its first taste of image-generating artificial intelligence. Suddenly, technology that had once been offered only to specialists was available to anyone with a web connection. The enthusiasm shows no signs of abating, and AI-generated images have won a major photography competition, created the title credits of a television series and tricked people into believing the pope stepped out in a fashionable puffer coat. Yet critics have noted how training the algorithms on existing works could potentially infringe on copyright, and using them could put artists' jobs in jeopardy. Generative AI also risks supercharging fake news: the pope coat was fun, but a generated photograph supposedly showing an attack on the Pentagon briefly inspired a dip in the stock market.
How did programs such as DALL-E 2, Midjourney and Stable Diffusion get to be so good all at once? Although AI has been in development for decades, the most popular of today's image generators use a technique called a diffusion model, which is relatively new on the AI scene. Here's how it works:
This article was originally published with the title "How AI Generates Images from Text" in Scientific American 329, 3, 66-67 (October 2023)
doi:10.1038/scientificamerican1023-66
ABOUT THE AUTHOR(S)
Sophie Bushwick is tech editor at Scientific American. She runs the daily technology news coverage for the website, writes about everything from artificial intelligence to jumping robots for both digital and print publication, records YouTube and TikTok videos and hosts the podcast 'Tech Quickly.' Sophie also makes frequent appearances on radio shows like Science Friday and television channels including CBS, MSNBC and National Geographic. She has more than a decade of experience as a science journalist based in New York City and previously worked at outlets such as Popular Science, Discover Magazine and Gizmodo. Follow Sophie Bushwick on Twitter Credit: Nick Higgins
Amanda Hobbs is a freelance research, writer and visual content editor specializing in storytelling via art and information graphics. Her work can be viewed at www.athcreative.com