Nvidia today launched a blueprint for AI agents that can analyze video, announced as part of the CES 2025 opening keynote by CEO Jensen Huang.
The new Nvidia AI Blueprint powered by Metropolis lets organizations and individuals increase productivity and safety, and could even help Nvidia’s CEO improve his fastball pitch.
The next big moment in AI is in sight — literally.
Today, more than 1.5 billion enterprise-level cameras deployed worldwide are generating roughly 7 trillion hours of video per year. Yet only a fraction of that video gets analyzed.
It’s estimated that less than 1% of video from industrial cameras is watched live by humans, meaning critical operational incidents can go largely unnoticed.
This comes at a high cost. For example, manufacturers are losing trillions of dollars annually to poor product quality or defects that they could’ve spotted earlier, or even predicted, by using AI agents that can perceive, analyze and help humans take action.
Interactive AI agents with built-in visual perception capabilities can serve as always-on video analysts, helping factories run more efficiently, bolster worker safety, keep things running smoothly and even up an athlete’s game.
To accelerate the creation of such agents, Nvidia today announced early access to a new version of the Nvidia AI Blueprint for video search and summarization. Built on top of the Nvidia Metropolis platform, and now supercharged by Nvidia Cosmos Nemotron vision language models (VLMs), Nvidia Llama Nemotron large language models (LLMs) and Nvidia NeMo Retriever, the blueprint provides developers with the tools to build and deploy AI agents that can analyze large quantities of video and image content.
The blueprint integrates the Nvidia AI Enterprise software platform, which includes Nvidia NIM microservices for VLMs, LLMs and advanced AI frameworks for retrieval-augmented generation, to enable batch video processing that’s 30 times faster than watching it in real time.
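To put the 30x figure in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes the speedup applies uniformly to any amount of footage, which is a simplification; the function name `batch_processing_hours` is hypothetical and not part of the blueprint. Under that assumption, a full day of footage from one camera would take about 48 minutes to process.

```python
def batch_processing_hours(footage_hours: float, speedup: float = 30.0) -> float:
    """Hours needed to process footage at `speedup` times real time.

    Assumption for illustration: the speedup is constant regardless of
    footage length, resolution or hardware.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return footage_hours / speedup


# One camera recording for a full day.
day_of_footage = 24.0
hours_needed = batch_processing_hours(day_of_footage)
print(f"{day_of_footage:.0f} h of video -> {hours_needed * 60:.0f} min of processing")
```

Actual throughput would depend on the models, hardware and video resolution in use, so the result is best read as an order-of-magnitude estimate.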
The blueprint contains several agentic AI features — such as chain-of-thought reasoning, task planning and tool calling — that can help developers streamline the creation of powerful and diverse visual agents to solve a range of problems.
AI agents with video analysis abilities can be combined with other agents with different skill sets to enable even more sophisticated agentic AI services.
Enterprises have the flexibility to build and deploy their AI agents from the edge to the cloud.
How Video Analyst AI Agents Can Help Industrial Businesses
AI agents with visual perception and analysis skills can be fine-tuned to help businesses with industrial operations by:
● Increasing productivity and reducing waste: Agents can help ensure standard operating procedures are followed during complex industrial processes like product assembly. They can also be fine-tuned to carefully watch and understand nuanced actions, and the sequence in which they’re implemented.
● Boosting asset management efficiency through better space utilization: Agents can help optimize inventory storage in warehouses by performing 3D volume estimation and centralizing understanding across various camera streams.
● Improving safety through auto-generation of incident reports and summaries: Agents can process huge volumes of video and summarize it into contextually informative reports of accidents. They can also help ensure personal protective equipment compliance in factories, improving worker safety in industrial settings.
● Preventing accidents and production problems: AI agents can identify atypical activity to quickly mitigate operational and safety risks, whether in a warehouse, factory or airport, or at an intersection or other municipal setting.
● Learning from the past: Agents can search through operations video archives, find relevant information from the past and use it to solve problems or create new processes.
Video Analysts for Sports, Entertainment and More
Another industry where video analysis AI agents stand to make a mark is sports — a $500 billion market worldwide, with hundreds of billions in projected growth over the next several years.
Coaches, teams and leagues — whether professional or amateur — rely on video analytics to evaluate and enhance player performance, prioritize safety and boost fan engagement through player analytics platforms and data visualization. With visually perceptive AI agents, athletes now have unprecedented access to deeper insights and opportunities for improvement.
During his CES opening keynote, Nvidia’s Huang demonstrated an AI video analytics agent that assessed the fastball pitching skills of an amateur baseball player compared with a professional’s. Using video captured from the ceremonial first pitch that Huang threw for the San Francisco Giants baseball team, the video analytics AI agent was able to suggest areas for improvement.
The $3 trillion media and entertainment industry is also poised to benefit from video analyst AI agents. Through the Nvidia Media2 initiative, these agents will help drive the creation of smarter, more tailored and more impactful content that can adapt to individual viewer preferences.
Worldwide Adoption and Availability
Partners from around the world are integrating the blueprint for building video analysis AI agents into their own developer workflows, including Accenture, Infosys, Linker Vision, Pegatron, Tata Consultancy Services (TCS), Telit Cinterion and VAST.
The post Nvidia launches blueprint for AI agents that can analyze video appeared first on Venture Beat.