monkey wrote: Text prompt to video. Another nail in humanity's coffin.
https://openai.com/sora
Here's the vid he's talking about.

If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.
I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!
Let's break down the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee."
- The simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space.
- The 3D objects are consistently animated as they sail and avoid each other's paths.
- Fluid dynamics of the coffee, even the foam that forms around the ships. Fluid simulation is an entire sub-field of computer graphics, which traditionally requires very complex algorithms and equations.
- Photorealism, almost as if rendered with ray tracing.
- The simulator takes into account the small size of the cup compared to an ocean, and applies tilt-shift photography to give a "minuscule" vibe.
- The semantics of the scene do not exist in the real world, but the engine still applies the correct physical rules that we expect.
Next up: add more modalities and conditioning, then we have a full data-driven UE that will replace all the hand-engineered graphics pipelines.
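For anyone curious what "some denoising and gradient maths" actually means, here is a toy sketch of the epsilon-prediction objective that diffusion models train on. This is a hedged illustration only: Sora's real architecture is unpublished, the linear "denoiser", the fixed noise level `alpha`, and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "world": clean data samples (stand-in for video latents).
x0 = rng.normal(size=(256, 8))

# Forward process: mix signal and Gaussian noise at a fixed level.
# (Real diffusion models use a schedule of many noise levels.)
alpha = 0.5  # fraction of signal kept

def add_noise(x0, eps):
    return np.sqrt(alpha) * x0 + np.sqrt(1.0 - alpha) * eps

# "Denoiser": a single linear map trained to predict the injected
# noise -- the same epsilon-prediction MSE objective as DDPM-style
# models, just with a trivially small network.
W = np.zeros((8, 8))
lr = 0.05
losses = []

for step in range(200):
    eps = rng.normal(size=x0.shape)   # fresh noise each step
    xt = add_noise(x0, eps)           # corrupted sample
    pred = xt @ W                     # predicted noise
    err = pred - eps
    losses.append(float(np.mean(err ** 2)))
    # Gradient of the MSE w.r.t. W, plain SGD update.
    grad = 2.0 * xt.T @ err / xt.shape[0]
    W -= lr * grad

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The loss drops as the model learns to separate noise from signal; scaled up by many orders of magnitude (a giant network, a full noise schedule, video latents instead of toy vectors), this is the entire training recipe being gestured at.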
GurtTractor wrote: LLMs approach the problem from a very different angle, all top down using statistics over massive amounts of pre-existing data. They don't have a realtime relationship with the world or themselves, so they struggle to 'wake up'. Indeed their output very much resembles a dream; they are themselves a simulacrum rather than a simulation of a mind.
With enough data and clever algorithms they can become extraordinarily capable agents, in many ways exceeding the capabilities of most of us, but they don't yet live in the same world as we do, so their output will always have that uncanny, hallucinatory strangeness to it.
These things are understood by those on the cutting edge of AI research. If the right series of feedback loops, and perhaps modelling of things like activation waves, can be implemented, then I don't see why human-level intelligence couldn't be achievable already. But perhaps HLI is something of a red herring; maybe we'll end up skipping to a step beyond it if we're not limited by biology.