The future of filmmaking? OpenAI launches Sora, its powerful text-to-video generator



OpenAI has unveiled its text-to-video generator, Sora, and it’s… scary. Fascinatingly scary, that is. Select artists can already try it out, and we’ve seen some examples that look incredible.

[Related Reading: Google launches Lumiere, a ‘cute and fluffy’ text-to-video generator]

Sora can generate videos up to a minute long while maintaining visual quality and adherence to your prompt. “We’re teaching AI to understand and simulate the physical world in motion,” OpenAI writes, “with the goal of training models that help people solve problems that require real-world interaction.”

With Sora, you can bring scenes to life with multiple characters, precise motion, and painstaking attention to detail. Not only does Sora grasp what the user asks for in the prompt, but it also accounts for how those characters, objects, and movements would exist in the real world.

OpenAI claims that Sora boasts a deep understanding of language, which allows it to interpret prompts accurately and bring characters to life. Additionally, Sora can create multiple shots within a single video while keeping the characters and visual style consistent across them.

Sora’s drawbacks

OpenAI admits that the current model has its weaknesses. “It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect,” the company notes. “For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”

The model may also confuse the spatial details of a prompt. It can mix up left and right (just like me) and struggle with precise descriptions of events that unfold over time; for instance, it can’t accurately follow a specific camera trajectory.

Safety

You may wonder about the safety of these videos, especially considering the amount of false information that floods the internet every day. OpenAI says it is taking several important safety steps before making Sora publicly available in its products.

  • Expert evaluation: Before launch, OpenAI has enlisted specialists in misinformation, bias, and harmful content (“red teamers”) to rigorously test Sora’s potential for misuse.
  • Misleading content detection: The company is developing tools, such as a classifier that can detect when a video was generated by Sora, to flag potential misinformation. It also plans to include transparency metadata in future deployments.
  • Leveraging existing safeguards: OpenAI is applying safety methods from DALL-E 3, such as text filters that block harmful prompts and image filters that ensure videos comply with their usage policies.
  • Global collaboration: OpenAI plans to work with policymakers, educators, and artists worldwide to address concerns, explore positive applications, and learn from real-world use to continuously improve safety.

Research and availability

OpenAI also shares some information about the research techniques behind Sora. The model can generate entire videos all at once or extend already-generated videos to make them longer. It is given foresight of many frames at a time, which solves the challenging problem of keeping a subject consistent even when it temporarily goes out of view. You can read more details in OpenAI’s technical report.

As for availability, you and I will have to wait a bit. As of today, Sora is only available to red teamers, who are assessing critical areas for harm and risk. OpenAI has also granted access to select visual artists, designers, and filmmakers, so it can gather feedback on the model and how to advance it further. “We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon,” the company concludes.

As I mentioned, the examples of Sora’s AI-generated videos are amazing. Some of them are still a little laggy and weird, but many could easily pass as real videos. After all, with the amount of information we soak up daily and our ever-shrinking attention spans, it can be hard to tell. So, I urge you, as always, to stay critical of the content you see online, especially as technology becomes more advanced and humanity becomes more regressive.

On the other hand, AI-generated videos could contribute to the filmmaking industry, replacing complex special effects and hours of editing. And for the first time, I feel that we’ve actually come closer to that becoming a reality.


