If you’re a technology enthusiast or investor, Wednesday’s Nvidia news was impossible to avoid: The company reported blowout fiscal second-quarter results, exceeding its revenue forecast by $1.4 billion, fueled by the market-dominating chips and systems that run artificial intelligence infrastructure.
“You’re seeing the data centers around the world are taking that capital spend and focusing it on the two most important trends of computing today, accelerated computing and generative AI,” said CEO Jensen Huang on the conference call.
The company raised its guidance for the year, issuing an eye-popping forecast that fiscal third-quarter revenue will be about $16 billion, a mere $3.4 billion or so above the analyst consensus of $12.61 billion. That guidance indicates 170% growth from the year-earlier period.
In the media at large, the story has largely been pinned to Nvidia’s dominance in graphics processing units, the building blocks of generative AI. Nvidia enjoys a near monopoly on high-powered AI chips, including its A100 and H100, at least until leading competitor AMD starts shipping its rival chips later this year. For now, Nvidia has a huge backlog and can’t keep up with demand.
But there’s something larger going on here: Nvidia’s dominance across the AI stack — including software, memory, storage and networking. Its executives pointedly attributed the growth to selling entire systems — such as the HGX — that are built on Nvidia GPUs but are also integrated with powerful networking and software.
“Data center compute revenue nearly tripled year on year driven primarily by accelerating demand for cloud from cloud service providers and large consumer internet companies for our HGX platform, the engine of generative and large language models,” said Colette Kress, executive vice president and chief financial officer of Nvidia, on the conference call.
The Full Stack Makes Nvidia Stickier
It’s not just about GPUs. AI systems are sophisticated supercomputing platforms that must be networked together, optimized with software and use thousands of components. To optimize an AI system, engineers must take a “full stack” approach.
Nvidia has the lead not only in chips but also across the stack, including important networking technology from Mellanox, which it acquired in 2019, as well as key software optimization components.
Just to give you a sense of scale: A single HGX A100 system has 1.3 terabytes of GPU memory and 2 terabytes per second of memory bandwidth, its storage handles 492 SSDs, and its external networking capacity is 400 gigabits per second.
Don’t expect Nvidia to stop extending the AI stack. It has been steadily making acquisitions targeting AI systems to build out HGX. In 2022, it acquired Excelero for block storage and Bright Computing for high-performance compute clusters. In February, Nvidia acquired OmniML, an AI software company whose technology enables machine-learning models to run on any device.
While the larger world seems focused on Nvidia’s lead in GPUs, the story is really about the full stack, right down to the extensive software libraries that Nvidia executives pointed out on the conference call.
“So, this runtime called Nvidia AI Enterprise has something like 4,500 software packages, software libraries and has something like 10,000 dependencies among each other,” explained Huang on the call. “And that runtime is, as I mentioned, continuously updated and optimized for our install base for our stack. And that’s just one example of what it would take to get accelerated computing to work that the number of code combinations and type of application combinations is really quite insane.”
Network Is the Next Battle
It’s hard to see anybody putting together the full stack the way Nvidia has. The networking front has gotten more interesting lately, with important competitors such as Arista Networks getting an AI bump as hyperscalers increase their networking needs to connect AI servers.
Arista Networks has also been an AI investor favorite in 2023, deriving networking growth from the AI boom. Reporting its second-quarter fiscal 2023 earnings earlier this month, Arista posted 39% year-over-year growth, fueled by demand from hyperscalers such as Microsoft and Meta. One of its growth drivers is upgrades to higher-bandwidth systems for AI workloads, Arista executives said.
“The AI opportunity is exciting,” said Arista CEO Jayshree Ullal. “As our largest cloud customers review their classic cloud and AI networking plans, Arista is adapting to these changes, thereby doubling down on our investments in AI.”
Possibly feeling a little left out, Cisco Systems played up its AI wares on its earnings conference call a week ago. Cisco is releasing upgraded Ethernet switches with a new line of Cisco Silicon One ASICs designed to compete with AI networking systems based on Nvidia’s InfiniBand. Cisco, along with rival Arista, is part of the Ultra Ethernet Consortium.
But so far, investors haven’t viewed Cisco as much of an AI play. Cisco’s year-to-date gain is 17%, while Nvidia is up 225% and Arista is up 51%. And neither Arista nor Cisco has GPUs or full-stack integrations.
When Will Nvidia’s Dominance Be Challenged?
The bottom line is that Nvidia has built an AI systems business in the fastest-growing market in cloud. Generative AI companies that want to build processing facilities don’t have much choice at this point: They can either rent space in the public clouds, which have built their own infrastructure and in some cases their own AI chips, or they can buy Nvidia systems and processors. Nvidia has been ahead of this trend, but now everybody is chasing it.
With Arista and Cisco coming after Nvidia in AI networking, it’s clear Nvidia is going to be challenged on a number of fronts. But for now, the Nvidia frenzy you see is likely to remain for at least the rest of the year. AMD’s new AI chip, the MI300X, won’t be out for at least another quarter, so Nvidia will continue to cash in. The other closest competitors are the cloud companies themselves – Amazon and Google – which have designed their own chips for AI, having correctly foreseen the possible chip shortage. The networking competitors have pieces of the puzzle — but they don’t have the full system with software optimization.
What may be underestimated is the deep strategic thought and planning that Nvidia has put into its entire systems, from the networking interconnects to the software components. Competitors have a lot of work to do if they want to knock Nvidia off the top of the AI infrastructure hill.
©️ Forbes