Nvidia has introduced the "Hopper" GPU architecture, which is tailored for use in data centers. Hopper is said to be particularly suitable for computing neural networks, for high-performance computing (HPC), and for real-time speech processing. Compared with the previous "Ampere" architecture, Hopper delivers significantly more computing power and brings new functions aimed, among other things, at data security. The first chips are expected to be available in the third quarter of 2022; Nvidia is negotiating prices with data center operators.
Nvidia is having the H100 chip, which consists of 80 billion transistors, manufactured by the Taiwanese contract manufacturer TSMC with 4 nm structures (N4 process). For comparison: the Ampere-based A100, also manufactured by TSMC, consists of 54 billion transistors (7 nm). Thanks to the finer manufacturing process, Nvidia was able to increase Hopper's packing density. In terms of raw speed, the H100 is said to be between three times (FP16, FP64, TF32) and six times (FP8) as fast as an A100.
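To translate those factors into absolute numbers, here is a rough back-of-the-envelope sketch in Python. The baseline values are Nvidia's published A100 dense Tensor Core rates; since the A100 has no FP8 mode, the six-fold FP8 factor is assumed here to apply relative to A100 FP16. The results are ballpark estimates, not official H100 specifications.

```python
# Ballpark H100 throughput derived from the quoted speed-up factors.
# Baseline: Nvidia's published A100 dense Tensor Core rates in TFlops.
a100_tflops = {"FP16": 312, "TF32": 156, "FP64": 19.5}
speedups    = {"FP16": 3, "TF32": 3, "FP64": 3, "FP8": 6}

for fmt, factor in speedups.items():
    # FP8 is new with Hopper; its factor is assumed relative to A100 FP16.
    base = a100_tflops.get(fmt, a100_tflops["FP16"])
    print(f"{fmt}: ~{base * factor:g} TFlops")
```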
More memory and higher power consumption
The PCIe version of the H100 consumes 350 watts.
The H100 is the first HPC GPU to use HBM3 stacked memory, achieving a total data transfer rate of 3 TB/s (A100: 1.6 TB/s). Compared to the Ampere A100, Nvidia has doubled the memory capacity to a total of 80 GB. The data transfer rate of the PCIe connection has also been doubled: PCIe 5.0 achieves around 4 GB/s per lane instead of the just under 2 GB/s of PCIe 4.0; PCIe 5.0 support is currently another unique selling point of Hopper among GPUs.
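The per-lane figures follow directly from the signaling rates. A quick check, assuming the 128b/130b line encoding that both PCIe 4.0 (16 GT/s per lane) and PCIe 5.0 (32 GT/s per lane) use:

```python
# Per-lane PCIe throughput from transfer rate and line encoding.
def lane_gb_per_s(gigatransfers: float, payload_bits: int = 128, frame_bits: int = 130) -> float:
    # One bit per transfer; 128b/130b encoding carries 128 payload bits
    # per 130 transmitted bits; divide by 8 to convert bits to bytes.
    return gigatransfers * payload_bits / frame_bits / 8

print(f"PCIe 4.0: {lane_gb_per_s(16):.2f} GB/s per lane")   # ~1.97 GB/s
print(f"PCIe 5.0: {lane_gb_per_s(32):.2f} GB/s per lane")   # ~3.94 GB/s
print(f"PCIe 5.0 x16: {16 * lane_gb_per_s(32):.0f} GB/s")   # ~63 GB/s per direction
```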
According to the specifications, power consumption has increased significantly: while the A100 in its SXM version for servers gets by with 400 watts, Nvidia specifies a whopping 700 watts for the H100 SXM, 75 percent more. A version in the form of a PCIe plug-in card (add-in card, AIC) is also planned, which is said to draw up to 350 watts.
Hopper is the first HPC GPU with HBM3 stacked memory; it is said to achieve a transfer rate of 3 TB/s.
Accelerating real-time speech processing
With the "Transformer Engine", Hopper is said to be particularly effective at processing and translating natural language in real time using the popular Transformer deep-learning model developed by Google. To ensure both performance and accuracy, Nvidia dynamically combines 8-bit and 16-bit data formats. Unlike most earlier machine-translation systems, the Transformer does not use a recurrent approach: it processes all words of a sequence in parallel and, via a special attention mechanism, also takes into account the context of words further away within the sequence, which is why it can be parallelized so well.
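Nvidia has not disclosed the internals of the Transformer Engine, but the mechanism the article alludes to is the Transformer's self-attention, which relates all tokens of a sequence to each other in a single matrix product. A minimal NumPy sketch, using FP16 as a stand-in for the reduced-precision formats (NumPy has no FP8 type):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Every token attends to every other token in one matrix product,
    which is why the computation parallelizes so well on GPUs."""
    d_k = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d_k)                # (seq, seq) similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ v                               # context-weighted values

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings,
# held in FP16 as a stand-in for Hopper's low-precision formats.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float16)
print(scaled_dot_product_attention(x, x, x).shape)   # (4, 8)
```

Because the softmax weighting is computed for all token pairs at once, the whole step maps onto exactly the kind of dense matrix arithmetic that tensor cores accelerate.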
Nvidia wants to safeguard the data processed in real time via special Confidential Computing functions, a combination of hardware and software (secured VMs). This should also work separately for individual user instances: via Secure Multi-Instance, the H100 allows up to seven cloud tenants per GPU, each of which is said to roughly correspond to a T4 GPU. The A100 also allowed seven instances per GPU. Multiple H100 chips can communicate with each other at 900 GB/s via NVLink 4.0, an increase of 50 percent over the A100.
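The 50 percent figure can be cross-checked against the previous generation; backing out the baseline yields 600 GB/s, which matches Nvidia's published aggregate bandwidth for NVLink 3.0 on the A100:

```python
# Back out the A100 NVLink bandwidth implied by the stated 50 percent increase.
h100_nvlink_gb_s = 900
a100_nvlink_gb_s = h100_nvlink_gb_s / 1.5
print(f"Implied A100 NVLink bandwidth: {a100_nvlink_gb_s:.0f} GB/s")  # 600 GB/s
```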
DGX H100 with eight Hopper cards
Rendering of the Hopper system DGX H100.
Nvidia announced the DGX H100 as the first Hopper system. It contains eight H100 cards, which together are said to achieve a processing power of 32 Tensor PFlops in AI calculations (FP16) and 0.5 PFlops in FP64 calculations, a six-fold and a three-fold increase, respectively, compared to the DGX A100. In addition, Nvidia presented the DGX SuperPOD server system with 32 DGX H100 nodes; however, Nvidia did not announce prices or concrete availability dates for these systems.
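For orientation, the per-GPU share and the implied DGX A100 baseline follow from simple division; a short sanity check of the quoted system figures:

```python
gpus = 8
fp16_pflops, fp64_pflops = 32, 0.5                            # quoted DGX H100 totals

print(f"FP16 tensor per H100: {fp16_pflops / gpus} PFlops")           # 4.0 PFlops
print(f"FP64 per H100: {fp64_pflops / gpus * 1000} TFlops")           # 62.5 TFlops

# Implied DGX A100 baseline from the six-fold and three-fold factors:
print(f"DGX A100 FP16: ~{fp16_pflops / 6:.1f} PFlops")                # ~5.3 PFlops
print(f"DGX A100 FP64: ~{fp64_pflops / 3 * 1000:.0f} TFlops")         # ~167 TFlops
```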
According to Nvidia, the cloud service providers Alibaba Cloud, Amazon Web Services, Baidu AI Cloud, Google Cloud, Microsoft Azure, Oracle Cloud and Tencent Cloud plan to offer instances based on the H100. Servers with H100 accelerators are to come from Atos, Boxx Technologies, Cisco, Dell, Fujitsu, Gigabyte, H3C, Hewlett Packard Enterprise, Inspur, Lenovo, Nettrix and Supermicro, among others.