NVIDIA Mission Control Manual

Revision: 2505667c1

Date: Wed Jun 18 2025

Introduction to NVIDIA Mission Control

This manual provides detailed information on the NVIDIA Mission Control features integrated within NVIDIA Base Command Manager (BCM) version 11. It is designed for cluster administrators to effectively install, configure, and manage these advanced capabilities on NVIDIA B200 and GB200 platforms.

NVIDIA Mission Control extends BCM with Building Management System (BMS) integration, leak detection, autonomous hardware recovery, NMX telemetry for NVLink monitoring, rack management for DGX GB200 systems, power management, and firmware updates.

For the latest documentation and support, NVIDIA recommends visiting NVIDIA Docs.

Key Features and Management

  • NMX Settings for NVLink Monitoring: Configure and monitor NMX telemetry services for NVLinks and NVLink switches.
  • Rack Management: Efficiently manage data center racks and their components, including nodes, switches, and power shelves, with commands like rackoverview and display.
  • BCM Power Shelf Integration: Manage power shelves, including networking, access configuration, settings, metrics, and firmware updates.
  • NVIDIA Autonomous Hardware Recovery: Automate hardware management to enhance cluster uptime.
  • DGX GB200 Measurables: Access detailed metrics for DGX GB200 systems, covering circuit information, leak detection, NVLink, power, cooling, GPU performance, and Redfish data.
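As a rough illustration of the rack-level view described in the Rack Management item above, the following minimal Python sketch groups a small device inventory by rack and counts nodes, switches, and power shelves. It does not use BCM or reproduce the output of rackoverview; the Device class, the rack_overview function, and all device and rack names are hypothetical and exist only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical inventory record; not a BCM data structure."""
    name: str
    kind: str  # "node", "switch", or "powershelf"
    rack: str


def rack_overview(devices):
    """Return {rack: {kind: count}} for the given devices."""
    overview = defaultdict(lambda: defaultdict(int))
    for dev in devices:
        overview[dev.rack][dev.kind] += 1
    # Convert nested defaultdicts to plain dicts for predictable output.
    return {rack: dict(kinds) for rack, kinds in overview.items()}


# Made-up inventory for illustration only.
inventory = [
    Device("node001", "node", "rack-a"),
    Device("node002", "node", "rack-a"),
    Device("nvsw001", "switch", "rack-a"),
    Device("pshelf001", "powershelf", "rack-a"),
    Device("node003", "node", "rack-b"),
    Device("pshelf002", "powershelf", "rack-b"),
]

summary = rack_overview(inventory)
for rack in sorted(summary):
    counts = ", ".join(f"{kind}={n}" for kind, n in sorted(summary[rack].items()))
    print(f"{rack}: {counts}")
```

In a real deployment this information would come from BCM itself rather than from a hand-written list; the sketch only shows the kind of per-rack grouping that a rack overview summarizes.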

Support and Services

For technical assistance, contact NVIDIA support through the NVIDIA Enterprise Support portal.

Professional services can be explored via the NVIDIA Enterprise Services page.


