Introduction
This manual is for developers who want to program NVIDIA Base Command Manager (BCM) to extend or modify its functionality. It focuses on the Python API, which can be used to automate cluster operations, collect metrics, and interact with the CMDaemon process. Readers are expected to be familiar with the Administrator Manual, particularly its coverage of CMDaemon. The manual provides detailed instructions for using the Python API in a range of cluster management tasks.
About the Manuals
NVIDIA Base Command Manager 11 offers a suite of manuals covering different aspects of the system. These include the Administrator Manual for general cluster management, the Installation Manual for setup, the User Manual for end-user job submission, and specialized manuals for Cloudbursting, Edge, Containerization, and Mission Control integration. All manuals are regularly updated and available online at https://docs.nvidia.com/base-command-manager.
Getting Support
Support for BCM subscriptions from version 10 onwards is available through the NVIDIA Enterprise Support page. For developer-specific questions or more extensive support needs, contact the BCM support team to arrange a support contract. Professional services are also available via the NVIDIA Enterprise Services page.
Python API Overview
The NVIDIA Base Command Manager Python API, overhauled in version 8.2, provides a pure Python connection to the cluster manager. Cluster operations can therefore be automated with Python from any operating system that supports Python 3.5 or higher. The API depends on several extra modules, including pyOpenSSL, ply, and lxml, among others. The manual explains how to get started with the Python API: setting up the environment, connecting to a cluster, inspecting and modifying settings, performing operations on entities, and monitoring cluster data.
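Before setting up a client machine, it can help to confirm that the interpreter and the extra modules named above are present. The following is a minimal sketch using only the Python standard library, not part of the BCM API. The function name `check_prerequisites` and the report layout are hypothetical, and the module list covers only the three modules named here, not the "others" that the API may also need.

```python
import importlib.util
import sys

# Package name -> import name. pyOpenSSL is imported as "OpenSSL".
# Only the modules named in this overview are listed; the full
# dependency set of the Python API may be larger.
REQUIRED_MODULES = {"pyOpenSSL": "OpenSSL", "ply": "ply", "lxml": "lxml"}
MIN_PYTHON = (3, 5)


def check_prerequisites(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a dict mapping each requirement name to True or False."""
    # Compare (major, minor) of the running interpreter with the minimum.
    report = {"python": sys.version_info[:2] >= tuple(min_python)}
    for package, import_name in modules.items():
        # find_spec() locates a top-level module without importing it.
        report[package] = importlib.util.find_spec(import_name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print("{:<10} {}".format(name, "ok" if ok else "MISSING"))
```

A `MISSING` entry only indicates that the module cannot be found by the current interpreter; how the modules should be installed depends on the environment and is covered by the manual itself.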
Examples and Resources
The manual includes a comprehensive list of examples, which are located in `/cm/local/examples/cmd/pythoncm` on the head node. These examples demonstrate tasks such as managing nodes, collecting monitoring data, and executing commands. Working through them is recommended as a practical way to learn to use the Python API efficiently.
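To see which examples are available on a given head node, the directory can be listed from Python. This is a small sketch using only the standard library. It assumes the examples are stored as `.py` files, possibly in subdirectories, which the text above does not state; the helper name `list_examples` is hypothetical.

```python
from pathlib import Path

# Example directory on the head node, as given in the text above.
EXAMPLES_DIR = Path("/cm/local/examples/cmd/pythoncm")


def list_examples(directory=EXAMPLES_DIR, suffix=".py"):
    """Return sorted relative paths of example files under `directory`.

    Returns an empty list if the directory does not exist, for instance
    when this is run on a machine that is not a head node.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(
        p.relative_to(directory).as_posix()
        for p in directory.rglob("*" + suffix)  # assumption: examples are .py files
        if p.is_file()
    )


if __name__ == "__main__":
    for name in list_examples():
        print(name)
```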