Lenovo Intelligent Computing Orchestration (LiCO)

Product Guide

Introduction

Lenovo Intelligent Computing Orchestration (LiCO) is a software solution that simplifies the use of clustered computing resources for Artificial Intelligence (AI) model development and training, and High-Performance Computing (HPC) workloads. LiCO interfaces with an open-source software orchestration stack, enabling the convergence of AI onto an HPC or Kubernetes-based cluster.

The unified platform simplifies interaction with the underlying compute resources, enabling customers to take advantage of popular open-source cluster tools while reducing the effort and complexity of using them for HPC and AI.

Did You Know?

LiCO enables a single cluster to be used for multiple AI workloads simultaneously, with multiple users accessing the available cluster resources at the same time. Running more workloads can increase utilization of cluster resources, driving more user productivity and value from the environment.

What's new in LiCO 6.3

Lenovo recently announced LiCO Version 6.3, which improves functionality for AI users, HPC users, and HPC administrators, including:

Part numbers

The following table lists the ordering information for LiCO.

Table 1. LiCO HPC/AI version ordering information

Description | LFO | Software CTO | Feature code
Lenovo HPC AI LICO Software 90 Day Evaluation License | 7S090004WW | 7S09CTO2WW | B1YC
Lenovo HPC AI LICO Software w/1 yr S&S | 7S090001WW | 7S09CTO1WW | B1Y9
Lenovo HPC AI LICO Software w/3 yr S&S | 7S090002WW | 7S09CTO1WW | B1YA
Lenovo HPC AI LICO Software w/5 yr S&S | 7S090003WW | 7S09CTO1WW | B1YB

Table 2. LiCO K8S/AI ordering information (Kubernetes)

Description | LFO | Software CTO | Feature code
Lenovo K8S AI LICO Software Evaluation License (90 days) | 7S090006WW | 7S09CTO3WW | S21M
Lenovo K8S AI LICO Software 4GPU w/1Yr S&S | 7S090007WW | 7S09CTO4WW | S21N
Lenovo K8S AI LICO Software 4GPU w/3Yr S&S | 7S090008WW | 7S09CTO4WW | S21P
Lenovo K8S AI LICO Software 4GPU w/5Yr S&S | 7S090009WW | 7S09CTO4WW | S21Q
Lenovo K8S AI LICO Software 16GPU upgrade w/1Yr S&S | 7S09000AWW | 7S09CTO4WW | S21R
Lenovo K8S AI LICO Software 16GPU upgrade w/3Yr S&S | 7S09000BWW | 7S09CTO4WW | S21S
Lenovo K8S AI LICO Software 16GPU upgrade w/5Yr S&S | 7S09000CWW | 7S09CTO4WW | S21T
Lenovo K8S AI LICO Software 64GPU upgrade w/1Yr S&S | 7S09000DWW | 7S09CTO4WW | S21U
Lenovo K8S AI LICO Software 64GPU upgrade w/3Yr S&S | 7S09000EWW | 7S09CTO4WW | S21V
Lenovo K8S AI LICO Software 64GPU upgrade w/5Yr S&S | 7S09000FWW | 7S09CTO4WW | S21W

Features for LiCO users

Topics in this section:

LiCO versions

Note: There are two distinct versions of LiCO, LiCO HPC/AI (Host) and LiCO K8S/AI, to allow clients a choice of which underlying orchestration stack is used, particularly when converging AI workloads onto an existing cluster. The user functionality is common across both versions, with minor environmental differences associated with the underlying orchestration being used.

A summary of the differences for user access is as follows:

LiCO K8S/AI version:

LiCO HPC/AI version:

Benefits to users

LiCO provides users the following benefits:

Features for users

Those designated as LiCO users have access to dashboards related primarily to AI development and training tasks. Users can submit jobs to the cluster, and monitor their results through the dashboards. The following menus are available to users:

LiCO and TensorBoard monitoring

LiCO also provides TensorBoard monitoring when running certain TensorFlow workloads, as shown in the following figure.

Features

Lenovo Accelerated AI

Lenovo Accelerated AI provides a set of templates that aim to make AI training and inference simpler, more accessible, and faster to implement. The Accelerated AI templates differ from the other templates in LiCO in that they do not require the user to input a program; rather, they simply require a workspace (with associated directories) and a labelled dataset.

The following use cases are supported with Lenovo Accelerated AI templates:

Each Lenovo Accelerated AI use-case is supported by both a training and inference template. The training templates provide parameter inputs such as batch size and learning rate. These parameter fields are pre-populated with default values, but are tunable by those with data science knowledge. The templates also provide visual analytics with TensorBoard; the TensorBoard graphs continually update in-flight as the job runs, and the final statistics are available after the job has completed.

In LiCO 6.3, the Image Classification and Object Detection templates introduce the ability to select a topology based on the characteristics of a target inference device, such as an IoT Device, Edge Server, or Data Center server.

The following figure displays the embedded TensorBoard interface for a job. TensorBoard provides visualizations for TensorFlow jobs running in LiCO, whether through Lenovo Accelerated AI templates or the standard TensorFlow AI templates.

Inference Templates

LiCO also provides inference templates, which allow users to run predictions on new data using models that have been trained with Lenovo Accelerated AI templates. For the inference templates, users only need to provide a workspace, an input directory (the location of the data on which inference will be performed), an output directory, and the location of the trained model. The job will run, and upon completion, the output directory will contain the analyzed data. For visual templates such as Object Detection, images can be previewed directly from within LiCO's Manage Files interface.

The following two figures display an input file to the Object Detection inference template, as well as the corresponding output.

Favorites

LiCO allows the user to mark frequently used job submission templates as “favorites” to simplify access. Selecting the star in a template box adds the template to the Favorites tab at the top of the Submit Job screen, which then becomes the default view of that screen. If no favorites have been selected, the Favorites tab does not appear. Users may add standard templates, Lenovo Accelerated AI templates, and custom-defined templates to this tab.

AI Studio

LiCO AI Studio provides an end-to-end workflow for Image Classification, Object Detection, and Instance Segmentation, with training based on Lenovo Accelerated AI pre-defined models. A user can import an unprocessed, unlabeled data set of images, label them, train multiple instances with a grid of parameter values, test the output models for validation, and publish to a git repository for use in an application environment. Additionally, users can initiate the steps in AI Studio from a REST API call to take advantage of LiCO as part of a DevOps toolchain.
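To illustrate how AI Studio might be driven from a DevOps toolchain, the following sketch shows a pipeline stage calling a LiCO REST endpoint to start a training run. The endpoint path, payload fields, and token handling are hypothetical placeholders rather than the documented LiCO API; consult the LiCO API documentation for the actual routes and schemas.

    # Hypothetical sketch of starting an AI Studio training run from a CI/CD stage.
    # The URL path, payload fields, and authentication header are illustrative
    # assumptions only; refer to the LiCO API documentation for the real routes.
    import requests

    LICO_BASE_URL = "https://lico.example.com/api"   # placeholder host
    API_TOKEN = "REPLACE_WITH_TOKEN"                 # placeholder credential

    def start_training(project, dataset, epochs=10):
        """Request a training run for a labeled dataset (hypothetical endpoint)."""
        resp = requests.post(
            f"{LICO_BASE_URL}/ai_studio/train",      # assumed route, for illustration
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            json={"project": project, "dataset": dataset, "epochs": epochs},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()                           # e.g. a job ID the pipeline can poll

A pipeline could call start_training() after a dataset is published, then poll the returned job until it completes before promoting the resulting model.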

Dev Tools

LiCO includes the capability to create and deploy Jupyter instances on the cluster. Users may create multiple instances, customized for different software environments and projects. When launching an instance, the user can define the compute resources required (CPU and GPU) to better match performance to the task and make efficient use of cluster resources.

Once a Jupyter instance is created, the user can deploy it to the cluster and use the environment directly from their browser in a new tab. The user can leverage the Jupyter interface to upload, download, and run code as they normally would, utilizing the shared storage space used by LiCO.

Workflow

LiCO provides the ability to combine multiple job submissions into a single execution action, called a Workflow. Steps execute job submissions in series, and within each step multiple job submissions may run in parallel. Workflows use LiCO job submission templates to define the jobs for each step, and any available template, including custom templates, can be used in a workflow.

LiCO workflows allow users to automate the deployment of multiple jobs that may be required for a project, so the user can execute and monitor as a single action. Workflows can be easily copied and edited, allowing users to quickly customize existing workflows for multiple projects.
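To make the step and job relationship concrete, the sketch below models a workflow as plain Python data: steps run one after another, while the jobs inside a step are submitted together. This is a conceptual illustration only, not LiCO's actual workflow definition format or API; the template and script names are invented for the example.

    # Conceptual illustration: steps execute in series, jobs within a step run in
    # parallel. Template and script names here are made up for the example.
    workflow = [
        # Step 1: prepare data (single job)
        [{"template": "Common Job", "name": "preprocess", "script": "prep.sh"}],
        # Step 2: two training jobs that can run at the same time
        [
            {"template": "TensorFlow", "name": "train_a", "script": "train_a.py"},
            {"template": "PyTorch", "name": "train_b", "script": "train_b.py"},
        ],
        # Step 3: evaluate once both training jobs have finished
        [{"template": "Common Job", "name": "evaluate", "script": "eval.sh"}],
    ]

    for step in workflow:                  # steps execute one after another
        for job in step:                   # jobs in a step are submitted together
            print(f"submit {job['name']} using template {job['template']}")
        # ...wait for all jobs in this step to complete before moving on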

Admin

The Admin tab for the user provides access to their storage space on the cluster. Through the Manage Files subtab, the user can upload, download, cut/copy/paste, preview, and edit files on the cluster storage space from within the LiCO portal. The text editor within LiCO allows syntax-aware display and editing based on the file extension. The Admin tab also enables users to publish a trained model to a git repository or as a Docker container image.

Additional features for LiCO HPC/AI Users

In addition to the user features above, the LiCO HPC/AI version contains a number of features that simplify HPC workload deployment with a minimal learning curve compared to console-based scripting and execution. HPC users can submit jobs easily through standard or custom templates, utilize containers, and pre-define runtime modules and environment variables for submission. With LiCO 6.3, they can also take advantage of advanced features such as Energy Aware Runtime and Intel oneAPI tools and optimizations, all from within the LiCO interface.

Topics in this section:

Energy Aware Runtime

Energy Aware Runtime (EAR) is a software technology designed to run MPI applications with higher energy efficiency. Developed in collaboration with the Barcelona Supercomputing Center as part of the BSC-Lenovo Cooperation project, EAR is supported for use with the SLURM scheduler through a SPANK plugin. LiCO exposes EAR deployment options within the standard MPI template, allowing users to take advantage of the capability for MPI workloads.

Once the workload has been profiled through a learning phase, EAR lowers the CPU frequency to reduce energy consumption while keeping performance within a set threshold. This is particularly helpful where MPI applications do not take significant advantage of higher clock frequencies, so the frequency can be reduced to save energy while maintaining expected performance.

Users can select EAR options at job submission in the standard MPI template, choosing either the default policy set by the administrator, minimum time to solution, or minimum energy. Administrators can set the policies and thresholds for EAR usage within the LiCO Administrator portal, as well as which users are authorized to use EAR.
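For readers familiar with Slurm, the sketch below indicates roughly what an EAR-enabled MPI launch looks like outside of LiCO. The --ear and --ear-policy flags and the policy names are assumptions that depend on the EAR version and the site's SPANK plugin configuration; within LiCO these choices are made in the standard MPI template rather than on the command line.

    # Rough sketch of an EAR-enabled MPI launch under Slurm, for illustration only.
    # The EAR flag names and policy values (--ear, --ear-policy=min_energy/min_time)
    # are assumptions that vary by EAR version and site SPANK configuration.
    import subprocess

    def submit_mpi_with_ear(executable, nodes=2, tasks_per_node=32, policy="min_energy"):
        """Launch an MPI binary under Slurm with an EAR policy (assumed flags)."""
        cmd = [
            "srun",
            f"--nodes={nodes}",
            f"--ntasks-per-node={tasks_per_node}",
            "--ear=on",                   # enable the EAR plugin (assumed flag)
            f"--ear-policy={policy}",     # administrator default, min_time, or min_energy
            executable,
        ]
        return subprocess.run(cmd, check=True)

    # Example: submit_mpi_with_ear("./my_mpi_app", nodes=4, policy="min_time")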

Intel oneAPI

Intel oneAPI is a cross-industry, open, standards-based unified programming model that delivers a common developer experience across accelerator architectures—for faster application performance, more productivity, and greater innovation. In addition to Intel MPI, Intel OpenMP and Intel MPITune, LiCO 6.3 features new templates based on oneAPI – Intel VTune, TensorFlow2, and PyTorch – that are optimized to run on Intel processors.

HPC Runtime Module Management

The LiCO HPC/AI version allows the user to pre-define modules and environment variables to load at the time of job execution through job submission templates. These user-defined modules eliminate the step of manually loading required modules before job submission, further simplifying the process of running HPC workloads on the cluster. Through the Runtime interface, users can choose from the modules available on the system, define their loading order, and specify environment variables for repeatable, reliable job deployment.

Container-based HPC workload deployment

Additional standard templates are provided to support deployment of containerized HPC workloads through Singularity or Charliecloud. These templates simplify deploying containers for HPC workloads by eliminating the need to create custom runtimes and custom templates unless more granularity is needed.

Singularity Container Image Management

The LiCO HPC/AI version provides both users and administrators with the ability to build, upload, and manage application environment images through Singularity containers. These images can support users with AI frameworks, HPC workloads, and other application environments. Singularity containers may be built from Docker containers, imported from NVIDIA GPU Cloud (NGC), or pulled from other image repositories such as the Intel Container Portal. Containers created by administrators are available to all users, and users can also create container images for their individual use. Users looking to deploy a custom image can also create a custom template that will deploy the container and run workloads in that environment.
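As a sketch of what image creation involves under the hood, the snippet below pulls a framework container from NVIDIA NGC and builds a Singularity image from it. The NGC image tag shown is a placeholder example; within LiCO, administrators and users would normally perform this through the container image management interface rather than a script.

    # Illustrative sketch: building a Singularity image from an NGC Docker image.
    # The image tag below is a placeholder; in LiCO this is normally done through
    # the container image management interface rather than by hand.
    import subprocess

    def build_singularity_image(output_sif, docker_ref):
        """Build a .sif image from a Docker/OCI reference using the singularity CLI."""
        subprocess.run(
            ["singularity", "build", output_sif, f"docker://{docker_ref}"],
            check=True,
        )

    # Example with a placeholder NGC tag:
    # build_singularity_image("tensorflow.sif", "nvcr.io/nvidia/tensorflow:23.10-tf2-py3")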

Expert mode

The LiCO HPC/AI version provides more experienced cluster users with console access to the user space on the LiCO management node, to execute Linux and SLURM commands directly. Expert mode gives users familiar with the underlying cluster orchestration a choice in how they work, using the command line, the GUI, or both in concert to facilitate their workflow.

Reports

The LiCO HPC/AI version provides expanded billing capabilities, giving the user access to monitor charges incurred over a date range via the Expense Reports subtab. Users can also download daily or monthly billing reports as .xlsx files from the Admin tab.

Features for LiCO Administrators

Topics in this section:

Features for LiCO K8S/AI version administrators

For administrators of a Kubernetes-based LiCO environment, LiCO provides the ability to monitor activity, create and manage users, monitor LiCO-initiated activity, generate job and operational reports, enable container access for LiCO users, and view the software license currently installed in LiCO. The LiCO K8S/AI version does not provide resource monitoring for the administrator; resources can be monitored at the Kubernetes level with a tool such as Kubernetes Dashboard. The following menus are available to administrators in LiCO K8S/AI:

Features for LiCO HPC/AI version administrators

For cluster administrators, LiCO provides a sophisticated monitoring solution, built on OpenHPC tooling. The following menus are available to administrators:

Features for LiCO Operators

For the purpose of monitoring clusters but not overseeing user access, LiCO provides the Operator designation. LiCO Operators have access to a subset of the dashboards provided to Administrators; namely, the dashboards contained in the Home, Monitor, and Reports menus:

Subscription and Support

LiCO HPC/AI is enabled through a per-CPU and per-GPU subscription and support entitlement model which, once entitled for all the processors contained within the cluster, gives the customer access to LiCO package updates and Lenovo support for the length of the acquired term.

LiCO K8S/AI is enabled through tiered subscription and support entitlement licensing based on the number of GPU accelerators being accessed by running LiCO workloads (tiers are up to 4 GPUs in use, up to 16 GPUs in use, and up to 64 GPUs in use). Additional licensing beyond 64 GPUs can be provided by contacting your Lenovo sales representative.

Lenovo will provide interoperability support for all software tools defined as validated with LiCO, and development support (Level 3) for specific Lenovo-supported tools only. Open source and supported-vendor bugs/issues will be logged and tracked with their respective communities or companies if desired, with no guarantee from Lenovo for bug fixes. Full support details are provided at the support links below for each respective version of LiCO. Additional support options may be available; please contact your Lenovo sales representative for more information.

LiCO can be acquired as part of a Lenovo Scalable Infrastructure (LeSI) solution or for “roll your own” (RYO) solutions outside of the LeSI framework, and LiCO software package updates are provided directly through the Lenovo Electronic Delivery system. More information on LeSI is available in the LeSI product guide, available from https://lenovopress.com/lp0900.

Validated software components

LiCO depends on a number of software components that must be installed before LiCO itself in order for it to function properly. Each LiCO software release is validated against a defined configuration of software tools and Lenovo systems, to make deployment more straightforward and enable support. Other management tools, hardware systems, and configurations outside the defined stack may be compatible with LiCO, though not formally supported; to determine compatibility with other solutions, please check with your Lenovo sales representative.

The following software components are validated by Lenovo as part of the overall LiCO software solution entitlement:

LiCO HPC/AI version support

The following software components are validated for compatibility with LiCO HPC/AI:

LiCO K8S/AI version support

Supported servers (LiCO HPC/AI version)

The following Lenovo servers are supported to run with LiCO HPC/AI. These servers must run one of the supported operating systems as well as the validated software stack, as described in the Validated software components section.

Additional Lenovo ThinkSystem and System x servers may be compatible with LiCO. Contact your Lenovo sales representative for more information.

LiCO Implementation services

Customers who do not have the cluster management software stack required to run with LiCO may engage Lenovo Professional Services to install LiCO and the necessary open-source software. Lenovo Professional Services can provide comprehensive installation and configuration of the software stack, including operation verification, as well as post-installation documentation for reference. Contact your Lenovo sales representative for more information.

Client PC requirements

A web browser is used to access LiCO's monitoring dashboards. To fully utilize LiCO's monitoring and visualization capabilities, the client PC should meet the following specifications:

Related links

For more information, see the following resources:

Related product families

Notices

Lenovo may not offer the products, services, or features discussed in this document in all countries. Consult your local Lenovo representative for information on the products and services currently available in your area. Any reference to a Lenovo product, program, or service is not intended to state or imply that only that Lenovo product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any Lenovo intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any other product, program, or service. Lenovo may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

Lenovo (United States), Inc.
8001 Development Drive
Morrisville, NC 27560
U.S.A.
Attention: Lenovo Director of Licensing

LENOVO PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. Lenovo may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

The products described in this document are not intended for use in implantation or other life support applications where malfunction may result in injury or death to persons. The information contained in this document does not affect or change Lenovo product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of Lenovo or third parties. All information contained in this document was obtained in specific environments and is presented as an illustration. The result obtained in other operating environments may vary. Lenovo may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any references in this publication to non-Lenovo Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this Lenovo product, and use of those Web sites is at your own risk. Any performance data contained herein was determined in a controlled environment. Therefore, the result obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Copyright Lenovo 2022. All rights reserved.

This document, LP0858, was created or updated on December 15, 2021.

Send us your comments in one of the following ways:

This document is available online at https://lenovopress.com/LP0858.

Trademarks

Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.

The following terms are trademarks of Lenovo in the United States, other countries, or both:

The following terms are trademarks of other companies:

