NVIDIA H100 Tensor Core GPU

Unprecedented performance, scalability, and security for every data center.

Take an order-of-magnitude leap in accelerated computing.

The NVIDIA H100 Tensor Core GPU delivers unprecedented performance, scalability, and security for every workload. With NVIDIA NVLink, two H100 PCIe GPUs can be connected to accelerate demanding compute workloads, while the dedicated Transformer Engine supports large parameter language models. H100 uses breakthrough innovations in the NVIDIA Hopper architecture to deliver industry-leading conversational AI, speeding up large language models by 30x over the previous generation.

Ready for enterprise AI?

NVIDIA H100 Tensor Core GPUs for mainstream servers come with a five-year subscription to the NVIDIA AI Enterprise software suite, including enterprise support, simplifying AI adoption with the highest performance. This ensures organizations have access to the AI frameworks and tools they need to build H100-accelerated AI workflows such as AI chatbots, recommendation engines, and vision AI. The NVIDIA AI Enterprise software subscription and related support benefits for the NVIDIA H100 are available through PNY.

Securely accelerate workloads from enterprise to exascale.

NVIDIA H100 GPUs feature fourth-generation Tensor Cores and the Transformer Engine with FP8 precision, extending NVIDIA's AI leadership with up to 9x faster training and up to 30x faster inference on large language models. For high-performance computing (HPC) applications, H100 triples the FP64 floating-point operations per second (FLOPS) of the previous generation and adds dynamic programming (DPX) instructions to deliver up to 7x higher performance. With second-generation Multi-Instance GPU (MIG), built-in NVIDIA Confidential Computing, and NVIDIA NVLink, H100 securely accelerates every workload for every data center, from enterprise to exascale.

Specifications

Feature                          Specification
PNY Part Number                  NVH100TCGPU-KIT
FP64                             26 TFLOPS
FP64 Tensor Core                 51 TFLOPS
FP32                             51 TFLOPS
TF32 Tensor Core                 756 TFLOPS*
BFLOAT16 Tensor Core             1,513 TFLOPS*
FP16 Tensor Core                 1,513 TFLOPS*
FP8 Tensor Core                  3,026 TFLOPS*
INT8 Tensor Core                 3,026 TOPS*
GPU memory                       80GB HBM2e
GPU memory bandwidth             2TB/s
Decoders                         7 NVDEC, 7 JPEG
Max thermal design power (TDP)   300-350W (configurable)
Multi-Instance GPU               Up to 7 MIGs at 10GB each
Form factor                      PCIe, dual-slot, passive fansink
Interconnect                     NVLink: 600GB/s; PCIe Gen5: 128GB/s
Server options                   NVIDIA-Certified Systems and partner systems with 1-8 GPUs
NVIDIA AI Enterprise             Included

* With sparsity; specifications are one-half lower without sparsity.
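As the footnote implies, the dense (non-sparse) figures can be recovered by halving each starred value. A quick sketch of that arithmetic, using the Tensor Core figures from the table above:

```python
# Starred figures in the table assume structured sparsity; per the footnote,
# dense throughput is one-half of each starred value.
sparse_specs = {
    "BFLOAT16 Tensor Core (TFLOPS)": 1513,
    "FP16 Tensor Core (TFLOPS)": 1513,
    "FP8 Tensor Core (TFLOPS)": 3026,
    "INT8 Tensor Core (TOPS)": 3026,
}

dense_specs = {name: value / 2 for name, value in sparse_specs.items()}

for name, dense in dense_specs.items():
    print(f"{name}: {dense:g} dense")
```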

Performance Benchmarks

Up to 9x Higher AI Training on Largest Models

Description of Training Chart: A bar chart comparing NVIDIA A100 and H100 GPUs for AI training on large models (Mixture of Experts, 395 billion parameters). Training time drops from 7 days on A100 to 20 hours on H100, a significant speedup.
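As a sanity check on the chart, converting both runtimes to hours gives the implied speedup (a rough calculation based only on the 7-day and 20-hour figures):

```python
# Implied speedup from the training chart: 7 days (A100) vs 20 hours (H100).
a100_hours = 7 * 24      # 168 hours on A100
h100_hours = 20          # 20 hours on H100
speedup = a100_hours / h100_hours
print(f"{a100_hours} h / {h100_hours} h = {speedup:.1f}x faster training")
```

The resulting ratio is consistent with the "up to 9x" headline above.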

Up to 30x Higher AI Inference Performance on Largest Models

Description of Inference Chart: A bar chart comparing NVIDIA A100 and H100 GPUs for AI inference on large models (Megatron Chatbot, 530 Billion Parameters). It illustrates latency reduction from 2 seconds (A100) to 1 second (H100) and a 30x throughput increase.

Up to 7x Higher Performance for HPC Applications

Description of HPC Chart: A bar chart comparing NVIDIA A100 and H100 GPUs for HPC applications. It demonstrates a 7x performance increase for 3D Fast Fourier Transform (FFT) and a 7x performance increase for Genome Sequencing.

Explore the technology breakthroughs of NVIDIA Hopper.

[GPU Icon] NVIDIA H100 Tensor Core GPU

Built with 80 billion transistors on a cutting-edge TSMC 4N process custom-tailored for NVIDIA's accelerated compute needs, H100 is the world's most advanced chip. It features major advances to accelerate AI, HPC, memory bandwidth, interconnect, and communication at data center scale.

[Transformer Icon] Transformer Engine

The Transformer Engine uses software and Hopper Tensor Core technology designed to accelerate training for models built from the world's most important AI model building block, the transformer. Hopper Tensor Cores can apply mixed FP8 and FP16 precisions to dramatically accelerate AI calculations for transformers.
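To make the FP8 trade-off concrete, the sketch below simulates rounding to the FP8 E4M3 format (1 sign bit, 4 exponent bits, 3 mantissa bits). This illustrates the number format only and is not the Transformer Engine API; subnormals and the NaN encoding are deliberately ignored.

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (1 sign, 4 exponent, 3 mantissa
    bits, exponent bias 7). Simplified simulation: subnormals and NaN are
    ignored, and magnitudes beyond the E4M3 maximum normal (448) saturate."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = min(abs(x), 448.0)                    # saturate at the E4M3 max normal
    exp = max(math.floor(math.log2(mag)), -6)   # clamp to the smallest normal
    step = 2.0 ** (exp - 3)                     # 3 mantissa bits: 8 steps per binade
    return sign * round(mag / step) * step

for v in [3.7, 0.1, 1000.0]:
    print(f"{v} -> {quantize_e4m3(v)}")
```

With only three mantissa bits, relative error is on the order of a few percent; in practice the Transformer Engine pairs FP8 with scaling factors to keep tensors inside this representable range, falling back to FP16 where the extra precision matters.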

[NVLink Icon] NVLink Switch System

NVLink scales dual H100 input/output (IO) to 600 gigabytes per second (GB/s) bidirectional per GPU, nearly 5x the bandwidth of PCIe Gen5. At data center scale, the NVLink Switch System delivers 9x higher bandwidth than InfiniBand HDR on NVIDIA Ampere architecture systems.

[Security Icon] NVIDIA Confidential Computing

NVIDIA Confidential Computing is a built-in security feature of Hopper that makes NVIDIA H100 the world's first accelerator with confidential computing capabilities. Users can protect the confidentiality and integrity of their data and applications in use while accessing the unsurpassed acceleration of H100 GPUs.

[MIG Icon] Second-Generation Multi-Instance GPU (MIG)

The Hopper architecture's second-generation MIG supports multi-tenant, multi-user configurations in virtualized environments, securely partitioning the GPU into isolated, right-sized instances to maximize quality of service (QoS) for 7x more secured tenants.

[DPX Icon] DPX Instructions

Hopper's DPX instructions accelerate dynamic programming algorithms by 40x compared to CPUs and 7x compared to NVIDIA Ampere architecture GPUs. This leads to dramatically faster times in disease diagnosis, real-time routing optimizations, and graph analytics.
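The genome-sequencing case is a good illustration of the algorithm class DPX targets. Below is a minimal pure-Python Smith-Waterman local alignment, the classic dynamic program used in sequence alignment; the scoring parameters (match +3, mismatch -3, gap -2) are illustrative defaults, and this is a CPU illustration of the recurrence, not the GPU implementation.

```python
def smith_waterman(a: str, b: str, match: int = 3,
                   mismatch: int = -3, gap: int = -2) -> int:
    """Best local-alignment score between sequences a and b.
    A classic O(len(a) * len(b)) dynamic program: each cell depends only on
    its three neighbors, the max-heavy access pattern that DPX-style
    instructions accelerate in hardware."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("GGTTGACTA", "TGTTACGG"))
```

Route planning and graph analytics use the same cell-by-cell max/add recurrences, which is why a single instruction family speeds up all three domains.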

Accelerate every workload, everywhere.

The NVIDIA H100 is an integral part of the NVIDIA data center platform. Built for AI, HPC, and data analytics, the platform accelerates over 3,000 applications, and is available everywhere from data center to edge, delivering both dramatic performance gains and cost-saving opportunities.

Deploy H100 with the NVIDIA AI platform.

NVIDIA AI is the end-to-end open platform for production AI built on NVIDIA H100 GPUs. It includes NVIDIA accelerated computing infrastructure, a software stack for infrastructure optimization and AI development and deployment, and application workflows to speed time to market. Experience NVIDIA AI and H100 on NVIDIA LaunchPad through free hands-on labs.

[Stack diagram: Application Workflows running on the NVIDIA AI Enterprise Stack, deployed on Accelerated Infrastructure]

Ready to Get Started?

To learn more about the NVIDIA H100 Tensor Core GPU, visit: www.pny.com/h100.

