
Blackwell GPU flaw '100% Nvidia's fault,' says CEO Jensen Huang: report

24 October 2024
Nvidia CEO Jensen Huang has finally addressed the design flaw that pushed back production of its next-generation GPUs, saying it was “100% Nvidia's fault.”
NVIDIA CEO Jensen Huang Presents the GB200 Grace Blackwell Superchip

Reuters reported that Huang said the Blackwell design was “functional” but that the flaw, later found to be related to the processor die, caused low manufacturing yields.



Manufacturer TSMC uncovered the design flaw late in production, which forced Nvidia to rework the design before the hardware could move on to mass production.

Nvidia resolved the Blackwell issue in late August, with CFO Colette Kress telling investors that a change to the design improved production yields.

Addressing the setback, Huang said seven different types of chips were “designed from scratch and had to be ramped into production at the same time” to get Blackwell to work.

In the wake of the design flaw’s discovery and Nvidia’s subsequent scramble to fix it to meet demand, which is well above supply, rumours have swirled that tensions have arisen between the company and its chief manufacturer.

Huang dismissed the suggestion of tensions with TSMC, describing it as “fake news.”

“What TSMC did was to help us recover from that yield difficulty and resume the manufacturing of Blackwell at an incredible pace,” the Nvidia CEO said.

The likes of Google, Microsoft and Meta are among the companies in line to purchase the new chips, which can run AI models at 25 times lower cost than Nvidia's previous H100 hardware.

Oracle has ordered some 131,000 Blackwell GPUs for its Zettascale supercomputing cluster, though it is unclear exactly when Oracle will get its hands on that many of the sought-after chips.

Data centre operator Nebius also plans to get hold of some Blackwell GPUs as part of its upgrade plans for its data centre site in Finland.

Although Blackwell shipments have been pushed back into 2025, Nvidia is helping TSMC speed up production with a suite of tools and algorithms designed to accelerate and enhance computational lithography processes. The manufacturer is using Nvidia’s cuLitho platform to shorten the time-consuming simulations of chip structures.
