This marks a critical leap forward for Qualcomm as it commits to enabling scalable, efficient and flexible Gen AI across multiple industries.
Offered as chip-based accelerator cards and full racks, the two new products build on Qualcomm's existing NPU technology leadership. Both are designed to deliver rack-scale performance and high memory capacity for fast generative AI (Gen AI) inference at strong performance per dollar per watt.
“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference. These innovative new AI infrastructure solutions empower customers to deploy Gen AI at unprecedented total cost of ownership (TCO), while maintaining the flexibility and security modern data centres demand,” said Durga Malladi, SVP & GM, technology planning, edge solutions & data center at Qualcomm Technologies, Inc.
The Qualcomm AI200 is a purpose-built, rack-level AI inference solution designed to deliver low TCO and optimised performance for large language models (LLMs), multimodal models (LMMs) and other AI workloads.
The AI250, meanwhile, will launch with an innovative memory architecture based on near-memory computing. Qualcomm said this aims to provide a generational leap in efficiency and performance for AI inference workloads, delivering higher effective memory bandwidth and lower power consumption.
Malladi added: “Our rich software stack and open ecosystem support make it easier than ever for developers and enterprises to integrate, manage, and scale already trained AI models on our optimised AI inference solutions. With seamless compatibility for leading AI frameworks and one-click model deployment, Qualcomm AI200 and AI250 are designed for frictionless adoption and rapid innovation.”
Qualcomm said via its announcement: “Both rack solutions feature direct liquid cooling for thermal efficiency, PCIe for scale up, Ethernet for scale out, confidential computing for secure AI workloads, and a rack-level power consumption of 160 kW.”
Its hyperscale-grade AI software stack, optimised end-to-end for inference, supports machine learning frameworks, inference engines, Gen AI frameworks and optimisation techniques such as disaggregated serving. The AI200 and AI250 are expected to be commercially available in 2026 and 2027 respectively.
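For readers unfamiliar with disaggregated serving, the idea is to split LLM inference into its two phases, the compute-heavy prompt "prefill" and the memory-bandwidth-bound token "decode", and run each on a separate pool of hardware that can be scaled independently. The toy Python sketch below illustrates the concept only; the class and function names are hypothetical and are not part of Qualcomm's software stack or any real serving framework.

```python
# Toy illustration of disaggregated serving: prefill and decode run on
# separate worker pools, linked by a handle to the KV cache produced by
# prefill. All names here are hypothetical, not Qualcomm's API.
from dataclasses import dataclass


@dataclass
class KVCacheHandle:
    """Reference to the KV cache built during prefill, passed to decode."""
    request_id: int
    prompt_tokens: list[str]


class PrefillWorker:
    """Processes the full prompt once (compute-bound phase)."""
    def run(self, request_id: int, prompt: str) -> KVCacheHandle:
        tokens = prompt.split()  # stand-in for real tokenisation
        return KVCacheHandle(request_id, tokens)


class DecodeWorker:
    """Generates output tokens step by step (memory-bandwidth-bound phase)."""
    def run(self, handle: KVCacheHandle, max_new_tokens: int = 4) -> list[str]:
        # Stand-in for autoregressive decoding: echo the last prompt tokens.
        return handle.prompt_tokens[-max_new_tokens:]


def serve(prompt: str, request_id: int = 0) -> list[str]:
    # In a real deployment the two pools would sit on different accelerators
    # or racks, so each phase can be scaled to its own bottleneck.
    handle = PrefillWorker().run(request_id, prompt)
    return DecodeWorker().run(handle)


if __name__ == "__main__":
    print(serve("explain disaggregated serving in one line"))
```

The design point the sketch captures is that the two phases have different resource profiles, which is why serving stacks increasingly place them on different hardware.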
The news comes shortly after Qualcomm announced its re-entry into the data centre CPU market in 2025 as it moves to diversify its offerings. The company has traditionally focused on processors and modems for smartphones, but after reportedly bringing on a team of ex-Apple chip designers, it has moved to expand its business into new markets.
The latest offerings signal Qualcomm's commitment to its forward-looking data centre roadmap. As AI accelerators, they enter a highly competitive market led by Nvidia and AMD, as well as companies such as Google, Amazon, Microsoft and OpenAI, which are all developing their own AI accelerators to support their cloud services.