With Sora, OpenAI highlights the mystery and clarity of its mission | The AI Beat – VentureBeat

Published on February 20, 2024

This post was added by Dr Simmons

Last Thursday, OpenAI released a demo of its new text-to-video model, Sora, which can generate videos up to a minute long while maintaining visual quality and adherence to the user's prompt.

Perhaps you've seen one, two or 20 examples of the video clips OpenAI provided, from the litter of golden retriever puppies popping their heads out of the snow to the couple walking through the bustling Tokyo street. Maybe your reaction was wonder and awe, or anger and disgust, or worry and concern, depending on your view of generative AI overall.

Personally, my reaction was a mix of amazement, uncertainty and good old-fashioned curiosity. Ultimately, I and many others want to know: what is the Sora release really about?

Here's my take: With Sora, OpenAI offers what I think is a perfect example of the company's pervasive air of mystery around its constant releases, particularly just three months after CEO Sam Altman's firing and quick comeback. That enigmatic aura feeds the hype around each of its announcements.

Of course, OpenAI is not open. It offers closed, proprietary models, which makes its offerings mysterious by design. But think about it: millions of us are now trying to parse every word around the Sora release, from Altman and many others. We wonder about, or opine on, how the black-box model really works, what data it was trained on, why it was suddenly released now, what it will really be used for, and the consequences of its future development for the industry, the global workforce, society at large, and the environment. All for a demo that will not be released as a product anytime soon. It's AI hype on steroids.

At the same time, Sora also exemplifies the very un-mysterious, transparent clarity OpenAI has around its mission to develop artificial general intelligence (AGI) and ensure that it benefits all of humanity.

After all, OpenAI said it is sharing Sora's research progress early "to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon." The title of the Sora technical report, "Video generation models as world simulators," shows that this is not a company looking to simply release a text-to-video model for creatives to work with. Instead, this is clearly AI researchers doing what AI researchers do: pushing against the edges of the frontier. In OpenAI's case, that push is towards AGI, even if there is no agreed-upon definition of what that means.

That strange duality (the mysterious alchemy of OpenAI's current efforts and the unwavering clarity of its long-term mission) often gets overlooked and under-analyzed, I believe, as more of the general public becomes aware of its technology and more businesses sign on to use its products.

The OpenAI researchers working on Sora are certainly concerned about the present impact and are being careful about deployment for creative use. For example, Aditya Ramesh, an OpenAI scientist who co-created DALL-E and is on the Sora team, told MIT Technology Review that OpenAI is worried about misuses of fake but photorealistic video. "We're being careful about deployment here and making sure we have all our bases covered before we put this in the hands of the general public," he said.

But Ramesh also considers Sora a stepping stone. "We're excited about making this step toward AI that can reason about the world like we do," he posted on X.

In January 2023, I spoke to Ramesh for a look back at the evolution of DALL-E on the second anniversary of the original DALL-E paper.

I dug up my transcript of that conversation and it turns out that Ramesh was already talking about video. When I asked him what interested him most about working on DALL-E, he said that the aspects of intelligence that are bespoke to vision and what can be done in vision were what he found the most interesting.

"Especially with video," he added. "You can imagine how a model that would be capable of generating a video could plan across long time horizons, think about cause and effect, and then reason about things that have happened in the past."

Ramesh also talked, I felt, from the heart about the OpenAI duality. On the one hand, he felt good about exposing more people to what DALL-E could do. "I hope that over time, more and more people get to learn about and explore what can be done with AI and that sort of open up this platform where people who want to do things with our technology can easily access it through our website and find ways to use it to build things that they'd like to see," he said.

On the other hand, he said that his main interest in DALL-E as a researcher was to push this as far as possible. That is, the team started the DALL-E research project because "we had success with GPT-2 and we knew that there was potential in applying the same technology to other modalities and we felt like text-to-image generation was interesting because we wanted to see if we trained a model to generate images from text well enough, whether it could do the same kinds of things that humans can in regard to extrapolation and so on."

In the short term, we can look at Sora as a potential creative tool with lots of problems to be solved. But don't be fooled: to OpenAI, Sora is not really about video at all.

Whether you think Sora is a data-driven physics engine that is a simulation of many worlds, real or fantastical, like Nvidia's Jim Fan, or you think modeling the world for action by generating pixels is as wasteful and doomed to failure as the largely abandoned idea of analysis by synthesis, like Yann LeCun, I think it's clear that looking at Sora simply as a jaw-dropping, powerful video application that plays into all the anger and fear and excitement around today's generative AI misses the duality of OpenAI.

OpenAI is certainly running the current generative AI playbook, with its consumer products, enterprise sales, and developer community-building. But it's also using all of that as a stepping stone towards developing power over whatever it believes AGI is, could be, or should be defined as.

So for everyone out there who wonders what Sora is good for, make sure you keep that duality in mind: OpenAI may currently be playing the video game, but it has its eye on a much bigger prize.
