Nvidia unveils next-gen chip to power larger AI models
The new chip configuration is specifically designed to speed up GenAI applications.
Nvidia has announced the H200, an upgrade to its flagship AI chip that is set to surpass the H100, currently its top model. The main upgrade is an expansion of high-bandwidth memory, one of the most costly parts of the chip, which determines how much data the chip can process quickly. The H200 has 141 gigabytes of high-bandwidth memory, up from 80 gigabytes in the H100. More memory means that a larger AI model can fit on a single graphics processing unit (GPU) rather than being split across several GPUs. This improves performance and makes it possible to run larger AI models.
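To make the memory argument concrete, here is a rough back-of-the-envelope sketch. The two memory figures come from the announcement; everything else is an assumption made for illustration: a hypothetical model with 70 billion parameters, 2 bytes per parameter (16-bit weights), and a count that covers the weights alone, ignoring the extra memory a running model needs for activations and caches. The function names are likewise made up for this example.

```python
import math

# Memory figures reported in the article, in gigabytes.
H100_MEMORY_GB = 80
H200_MEMORY_GB = 141


def weights_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory taken by model weights alone (1 GB = 1e9 bytes).

    Assumption: 2 bytes per parameter (16-bit weights) unless stated otherwise.
    """
    return num_params * bytes_per_param / 1e9


def gpus_needed(num_params: float, gpu_memory_gb: float, bytes_per_param: int = 2) -> int:
    """Minimum number of GPUs whose combined memory holds the weights.

    Weights-only estimate; real deployments need additional headroom.
    """
    return math.ceil(weights_size_gb(num_params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    hypothetical_params = 70e9  # illustrative model size, not from the article
    print("Weights (GB):", weights_size_gb(hypothetical_params))
    print("H100s needed:", gpus_needed(hypothetical_params, H100_MEMORY_GB))
    print("H200s needed:", gpus_needed(hypothetical_params, H200_MEMORY_GB))
```

Under these assumptions the weights take about 140 gigabytes, which would have to be split across two H100s but would just fit on one H200. In practice the margin is tighter than this suggests, since a running model needs memory beyond its weights.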
The upgrade will allow AI chatbots to respond faster to queries. The H200 chip is expected to ship in the second quarter of 2024, with Amazon, Google, and Oracle among the early adopters for their cloud services.
Why does it matter?
A dominant player in the market for AI chips, Nvidia powers many generative AI (GenAI) systems, including OpenAI’s ChatGPT services. The new chip configuration is specifically designed to speed up GenAI applications. The company’s new AI chip is expected to further strengthen its global position, despite intense competition from companies like AMD. Nvidia has not yet revealed its suppliers for the memory on the new chip. However, the company sources memory from Korea’s SK Hynix, and Micron has already stated that it is working to become an Nvidia supplier.
The announcement comes amid an intensifying tech rivalry between Beijing and Washington, including in semiconductors and the most advanced chip manufacturing equipment.
Despite strict US export controls, Chinese giant Huawei surprised analysts by unveiling a new smartphone powered by an advanced chip produced domestically by Semiconductor Manufacturing International Corporation (SMIC). This week, Yangtze Memory Technologies Co. (YMTC), China’s leading memory maker, filed a lawsuit in the US against Micron for alleged patent infringement.