OpenAI wants its new video generation tool to become a "world simulator."
AI-generated video is here to awe and mislead

A tiny fluffy monster kneels in wonder beside a lit candle. Two small pirate ships battle inside a churning cup of coffee. An octopus crawls along the sandy floor of the ocean. A Dalmatian puppy leaps from one windowsill to another. These are among a series of demo videos of OpenAI's Sora, revealed last week, which can turn a short text prompt into up to a minute of video. The artificial intelligence model is not yet open to the public, but OpenAI has released the videos, along with the prompts that generated them. This was quickly followed by headlines calling Sora "eye-popping" and "terrifying" and "jaw-dropping." OpenAI researchers Tim Brooks and Bill Peebles told the New York Times that they picked "sora," Japanese for sky, to emphasize the "idea of limitless creative potential."

There is another term, though, that OpenAI uses to describe Sora: a potential "world simulator," one that, over time, could create "highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them."

It's not there yet. While the available demo videos of Sora at work can feel uncanny and realistic, OpenAI's technical paper on the model notes its many "limitations." While Sora can sometimes accurately represent the changes on a canvas when a paint-laden brush sweeps across it, or create bite marks in a sandwich after showing a man taking a bite, Sora "does not accurately model the physics of many basic interactions," such as a glass breaking. People and objects can spontaneously appear and disappear, and like many AI models, Sora can "hallucinate." Some AI experts, like Gary Marcus, have raised doubts about whether a model like Sora could ever learn to faithfully represent the laws of physics. But just as DALL-E and ChatGPT improved over time, so could Sora.
And if its goal is to become a "world simulator," it's worth asking: What is the world that Sora thinks it's simulating?

Unknown worlds

OpenAI has made that question kind of tough to answer, as the company has not disclosed much about what data was used to train Sora. But there are a couple of things we can infer. First, though, let's look at how Sora works.

Sora is a "diffusion transformer," which is a fancy way of saying that it combines a couple of different AI methods. Like many AI image generators (think DALL-E or Midjourney), Sora creates order from chaos based on the text prompt it receives, gradually learning how to turn a bunch of visual noise into an image that represents that prompt. That's diffusion. The transformer bit has to do with how those still images relate to each other, creating the moving video.

And Sora, OpenAI says, is designed to be a video-generating generalist. To do this, Sora would need a lot of data to learn from, reflecting a wide variety of styles, topics, durations, qualities, and aspect ratios. OpenAI said in its technical paper that Sora's development "takes inspiration from large language models which acquire generalist capabilities by training on internet-scale data." While not saying so directly, it's probably safe to guess that Sora, too, learned from some training data taken from the internet. It's also possible, argued Nvidia AI researcher Jim Fan, that Sora was trained on a data set that incorporates a large amount of "synthetic" data from the latest version of Unreal Engine, a 3D graphics creation tool best known for powering the visuals in video games. OpenAI also has agreements with companies that could provide large amounts of data for training purposes, like Shutterstock. As for the data that OpenAI did not, in the past, use with the agreement of its creator or publisher, well, there are some pending copyright lawsuits.
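For readers curious what "creating order from chaos" means concretely, the diffusion half of that description can be sketched in a few lines. To be clear: this toy is not Sora, whose architecture OpenAI has only partially described; it is a generic, hypothetical denoising loop on a 1-D signal, in which an "oracle" function stands in for the trained noise-prediction network. It only illustrates the general idea of iterative refinement from noise toward a target.

```python
import math
import random

random.seed(0)

# Toy 1-D "image": 64 pixel values sampled from a smooth curve.
clean = [math.sin(2 * math.pi * i / 63) for i in range(64)]

T = 50  # number of diffusion steps
# Per-step noise schedule and its running product (alpha-bar),
# in the style of standard denoising diffusion models.
alphas = [0.999 - (0.999 - 0.95) * t / (T - 1) for t in range(T)]
alpha_bar = []
prod = 1.0
for a in alphas:
    prod *= a
    alpha_bar.append(prod)

def noised(x0, t):
    """Sample x_t: scaled clean signal plus scaled Gaussian noise."""
    return [
        math.sqrt(alpha_bar[t]) * v
        + math.sqrt(1 - alpha_bar[t]) * random.gauss(0, 1)
        for v in x0
    ]

def oracle_eps(x_t, t):
    """A trained network would *predict* the noise here; this oracle
    cheats by peeking at `clean`, purely to demonstrate the loop."""
    return [
        (v - math.sqrt(alpha_bar[t]) * c) / math.sqrt(1 - alpha_bar[t])
        for v, c in zip(x_t, clean)
    ]

x = noised(clean, T - 1)  # start from a heavily noised signal
for t in range(T - 1, -1, -1):
    eps_hat = oracle_eps(x, t)
    # Use the noise estimate to recover an estimate of the clean signal.
    x0_hat = [
        (v - math.sqrt(1 - alpha_bar[t]) * e) / math.sqrt(alpha_bar[t])
        for v, e in zip(x, eps_hat)
    ]
    # Re-noise to the previous (less noisy) step, or stop at t = 0.
    x = noised(x0_hat, t - 1) if t > 0 else x0_hat

print(max(abs(v - c) for v, c in zip(x, clean)))  # near zero: noise became signal
```

A real diffusion model replaces the oracle with a neural network trained to predict the injected noise, and conditions that prediction on the text prompt; Sora's "transformer" part additionally ties the denoised patches together across space and time to produce coherent video.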
Biased worlds

AI bias is not new, and as Vox has explained before, it can be tough to combat. It creeps into the training data and the algorithms that power AI models in a lot of different ways. Since we don't know what data Sora was trained on, and the tool is not available for the public to test, it's hard to speak in much detail about how biases might be reflected in the videos it creates.

Sam Altman, OpenAI's CEO, has said that he believes AI will eventually learn to rid itself of bias. "I'm optimistic that we will get to a world where these models can be a force to reduce bias in society, not reinforce it," he told Rest of World last year. "Even though the early systems before people figured out these techniques certainly reinforced bias, I think we can now explain that we want a model to be unbiased, and it's pretty good at that." AI bias and ethics experts like Timnit Gebru have argued that this is exactly what people should not trust AI companies to do; she told the Guardian last year that we shouldn't simply trust AI systems, or the people behind them, to self-regulate harms and bias.

Made-up worlds

A lot of the praise for Sora's demo videos stems from their realism. And that's exactly why disinformation experts are concerned. A new study indicates that AI-generated propaganda created by GPT-3 (not even the newest GPT model powering the current generation of AI tools) can be just as persuasive as human-written content and takes far less effort to produce. Now apply that to video. Even without being able to faithfully replicate Earth's physics, there are plenty of ways a tool like Sora could be used, right now, to hurt and mislead people. "This is definitely slick, but I see two main uses: 1) to sell people more stuff (via ads) 2) to make non-consensual/misleading content to manipulate or harass people online," wrote Sasha Luccioni, an AI research scientist at Hugging Face, on X.
"Genuine question - why is everyone so excited?"

OpenAI announced Sora a couple of weeks after a wave of explicit, nonconsensual deepfakes of Taylor Swift circulated on social media. The images, as 404 Media reported, were created with AI by exploiting loopholes in the systems that are designed to prevent exactly this from happening. To address potential biases and misuses of Sora, OpenAI is allowing only a small group of testers to evaluate its safety risks: "We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias – who are adversarially testing the model," the company said in a statement on X.

A world with podcasting AI dogs, I guess

Underneath all this are concerns about what Sora and other tools like it will do to the livelihoods of creative professionals, whose work has been used, often without payment, to train AI tools in order to approximate their jobs. Altman, on X, was taking follower suggestions for new Sora videos in order to show off glimpses of our glorious future, which will evidently include these AI-generated podcasting dogs.

A.W. Ohlheiser, senior technology writer