“The ChatGPT moment for robotics is coming,” said Jensen Huang, founder and CEO of NVIDIA, in a statement. “Like large language models, world foundation models are fundamental to advancing robot and AV development, yet not all developers have the expertise and resources to train their own. We created Cosmos to democratize physical AI and put general robotics in reach of every developer.”
NVIDIA envisions a world of bots, and it wants that world badly enough to develop the tools for it and give them away. As NVIDIA announced last night at CES, the world’s largest annual tech show, it has developed an open-source platform of “generative world foundation models, advanced tokenizers, guardrails, and an accelerated data processing and curation pipeline built to accelerate the development of physical AI systems.”
World foundation models, in turn, are “neural networks that simulate real-world environments and predict accurate outcomes based on text, image, or video input.” Basically, they’re the essential roadmaps for how robots navigate the physical world. Navigating the real world is far more difficult for a robot than navigating a laboratory environment. Unpredictable moments call for a wide repository of experience so that a robot can react quickly to a person getting dangerously in its way or an obstacle falling into its path.
According to Huang, Cosmos was trained on 20 million hours of video. And because NVIDIA is shooting for the stars to extend its status as the maker of the must-have hardware for AI, it’s fitting that the company named it Cosmos: an entire, orderly universe.
buckle up, it’s a bumpy ride
Physical AI is, more or less, intelligent software that enables a robot to move about the world in a more coherent manner than the MyPillow guy. A robot, in layman’s terms, refers to the entire package of hardware and software, but a robot could bump around the world in a relatively clumsy manner and still be a robot.
If we’re going to get awfully granular (and we are), physical AI implies some level of advanced programming, the AI, that gives the robot advanced capabilities, although there’s no set threshold or agreed-upon, quantifiable bar that an AI’s capabilities must pass to be considered advanced. So it’s a soft, squishy definition.
Autonomous vehicles, such as taxis and warehouse tugs, would count. So would the humanoid robots that will someday be used to fill out the empty stands at Kauffman Stadium. And so would the surgical robots that will probably one day perform a medical miracle by growing hospitals’ profitability without lowering patients’ bills one iota.
NVIDIA’s roadmap to the cosmos
Before we get all warm and fuzzy about a major corporation releasing something so freely, let’s postulate why they would do so and what they seek from it.
Naturally, NVIDIA stands to gain financially from the world’s move into smart robots and, by extension, the AI to run them. Thanks to the overwhelming demand for their high-end parts, used to develop and run the computers that enable AI, NVIDIA has become a very, very rich company.
It and Apple have been trading the lead as the world’s most valuable company for the past few months as a result. If NVIDIA releases Cosmos to the masses as open-source software and makes it that much easier for developers to create AIs, users of those AIs will buy NVIDIA parts to run them. It’s like giving away free building blueprints to people so that you can sell them concrete and lumber.
Will it lead to Skynet or just beer-swigging, chain-smoking, compulsively gambling robots with smart mouths? Maybe both? All we can do is hope.
The post NVIDIA Shoots for the Stars With Cosmos, an Open-Source Platform to Make AI for Robots appeared first on VICE.