“It wasn’t that long ago that we were all testing and experimenting with chatbots, and now it seems like there’s something new every day,” he told an audience at AWS’ re:Invent conference in Las Vegas.
“The true value of AI has not yet been unlocked, but a lot of that is changing fast,” he said.
Innovation starts with infrastructure
Garman added that achieving this vision will require innovation at the infrastructure level.
“Getting to a future of billions of agents, where every organisation is getting real-world value and results from AI, is going to require us to push the limits of what’s possible with infrastructure,” he stated.
“We’re going to have to invent new building blocks for agentic systems and applications. We want to reimagine every single process in the way that all of us work.
“There are no shortcuts. When you think about AI infrastructure, one of the first things that comes to mind is GPUs. AWS is by far the best place to run Nvidia GPUs,” he added.
This comes as the technology giant has collaborated with Nvidia for over 15 years, during which it says it has learned to “operate GPUs at scale.”
“Other places just accept that as how it works. Not us. We investigate and root cause every single issue. Then we collaborate with our partners at Nvidia to make constant improvements. Nothing is too small for us to focus on. Those details really matter, and it’s why we lead the industry in GPU reliability.”
“It takes hard work and real engineering to make that happen, and we improve on new generations with every release,” he continued.
Accelerating the future of AI data centres
AWS also launched AWS AI Factories, a new offering that provides customers and governments with dedicated AWS AI infrastructure deployed in their own data centres.
“Effectively, AWS AI Factories operate like a private AWS region, letting customers leverage their own data centre space and power capacity that they’ve already acquired,” he said.
“We also give them access to leading AWS AI infrastructure and services, including the very latest Trainium UltraServers or Nvidia GPUs, and access to services like SageMaker and Bedrock.
“These AI Factories operate exclusively for each customer, and that separation helps them maintain the security and reliability that you get from AWS while also meeting stringent compliance and sovereignty requirements.”
On a separate note, HPE this week announced an expansion of the Nvidia AI Computing by HPE portfolio to simplify and accelerate the deployment of AI-ready data centres.
“HPE and Nvidia continue to provide the foundation for secure AI factories at any scale, with new innovations that deliver a greater range of performance for more diverse workloads than ever before,” said Antonio Neri, president and CEO of HPE.
“Together, HPE and Nvidia are showcasing our unique strengths to deliver true full-stack AI infrastructures that provide enterprises with a greater range of performance for more diverse workloads.”
Alongside this, AWS and Nvidia announced a new partnership with HUMAIN.
As part of this collaboration, AWS is creating Saudi Arabia’s first-ever “AI Zone” at a HUMAIN-designed data centre, equipped with up to 150,000 AI chips, including GB300 GPUs, alongside dedicated AWS AI infrastructure and services.
“The AI factory AWS is building in our new AI Zone represents the beginning of a multi-gigawatt journey for HUMAIN and AWS. From inception, this infrastructure has been engineered to serve both the accelerating local and global demand for AI compute,” Tareq Amin, CEO of HUMAIN said.
Partnerships and models powering AI infrastructure
Alongside this, AWS announced the addition of 18 new open-weight models to Amazon Bedrock, strengthening its goal of providing a wide range of fully managed AI models from top providers.
The update introduces two new model sets from Mistral AI, available first in Amazon Bedrock.
Mistral Large 3 is designed for long-context, multimodal tasks and reliable instruction following, while Ministral 3 offers compact, general-purpose, and multimodal AI capabilities.
The launch also includes popular models from other providers, such as Google’s Gemma 3, MiniMax’s M2, NVIDIA’s Nemotron, OpenAI’s GPT OSS Safeguard, and more, the company revealed.
“If you look at all the inference that’s running in Amazon Bedrock today, the majority is actually powered by Trainium already. The performance advantage of Trainium is really noticeable.
“If you’re using any of Claude’s latest generation models in Bedrock, all of that traffic is running on Trainium, delivering the best end-to-end response times compared to any other major provider. And that’s part of the reason why we’ve deployed over one million Trainium chips already to date.”
Looking ahead, AWS is scaling AI infrastructure with Trainium 3 and 4, designed to make AI workloads faster and more cost-effective.
“These UltraServers are our most advanced, containing the very first three-nanometer AI chip in the AWS Cloud. Trainium 3 offers the industry’s best price-performance for large-scale AI training and inference,” Garman said in his keynote.
“Our three largest UltraServers combine 144 total Trainium 3 chips acting together in a single scale-up domain connected by custom neuron switches.
“This delivers a massive 362 petaflops of compute, over 700TB per second of aggregate bandwidth, all in a single compute instance. Our custom-built EFA networks support scaling these out to clusters of hundreds of thousands of chips. No one else can deliver this for you,” he concluded.