Lenovo Intelligent Computing Orchestration (LiCO)

Product Guide

Introduction

Lenovo Intelligent Computing Orchestration (LiCO) is a software solution that simplifies the use of clustered computing resources for Artificial Intelligence (AI) model development and training, and High-Performance Computing (HPC) workloads. LiCO interfaces with an open-source software orchestration stack, enabling the convergence of AI onto an HPC or Kubernetes-based cluster.

The unified platform simplifies interaction with the underlying compute resources, enabling customers to take advantage of popular open-source cluster tools while reducing the effort and complexity of using them for HPC and AI.

Did You Know?

LiCO enables a single cluster to be used for multiple AI workloads simultaneously, with multiple users accessing the available cluster resources at the same time. Running more workloads can increase utilization of cluster resources, driving more user productivity and value from the environment.

What's new in LiCO 6.3

Lenovo recently announced LiCO Version 6.3, which improves functionality for AI users, HPC users, and HPC administrators, including:

Part numbers

The following table lists the ordering information for LiCO.

Table 1. LiCO HPC/AI version ordering information

Description | LFO | Software CTO | Feature code
Lenovo HPC AI LICO Software 90 Day Evaluation License | 7S090004WW | 7S09CTO2WW | B1YC
Lenovo HPC AI LICO Software w/1 yr S&S | 7S090001WW | 7S09CTO1WW | B1Y9
Lenovo HPC AI LICO Software w/3 yr S&S | 7S090002WW | 7S09CTO1WW | B1YA
Lenovo HPC AI LICO Software w/5 yr S&S | 7S090003WW | 7S09CTO1WW | B1YB

Table 2. LiCO K8S/AI ordering information (Kubernetes)

Description | LFO | Software CTO | Feature code
Lenovo K8S AI LICO Software Evaluation License (90 days) | 7S090006WW | 7S09CTO3WW | S21M
Lenovo K8S AI LICO Software 4GPU w/1Yr S&S | 7S090007WW | 7S09CTO4WW | S21N
Lenovo K8S AI LICO Software 4GPU w/3Yr S&S | 7S090008WW | 7S09CTO4WW | S21P
Lenovo K8S AI LICO Software 4GPU w/5Yr S&S | 7S090009WW | 7S09CTO4WW | S21Q
Lenovo K8S AI LICO Software 16GPU upgrade w/1Yr S&S | 7S09000AWW | 7S09CTO4WW | S21R
Lenovo K8S AI LICO Software 16GPU upgrade w/3Yr S&S | 7S09000BWW | 7S09CTO4WW | S21S
Lenovo K8S AI LICO Software 16GPU upgrade w/5Yr S&S | 7S09000CWW | 7S09CTO4WW | S21T
Lenovo K8S AI LICO Software 64GPU upgrade w/1Yr S&S | 7S09000DWW | 7S09CTO4WW | S21U
Lenovo K8S AI LICO Software 64GPU upgrade w/3Yr S&S | 7S09000EWW | 7S09CTO4WW | S21V
Lenovo K8S AI LICO Software 64GPU upgrade w/5Yr S&S | 7S09000FWW | 7S09CTO4WW | S21W

Features for LiCO users

Topics in this section:

LiCO versions

Note: There are two distinct versions of LiCO, LiCO HPC/AI (Host) and LiCO K8S/AI, to allow clients a choice of which underlying orchestration stack is used, particularly when converging AI workloads onto an existing cluster. The user functionality is common across both versions, with minor environmental differences associated with the underlying orchestration being used.

A summary of the differences for user access is as follows:

LiCO K8S/AI version:

LiCO HPC/AI version:

Benefits to users

LiCO provides users the following benefits:

Features for users

Those designated as LiCO users have access to dashboards related primarily to AI development and training tasks. Users can submit jobs to the cluster, and monitor their results through the dashboards. The following menus are available to users:

LiCO and TensorBoard monitoring

LiCO also provides TensorBoard monitoring when running certain TensorFlow workloads, as shown in the following figure.

Features

Lenovo Accelerated AI

Lenovo Accelerated AI provides a set of templates that aim to make AI training and inference simpler, more accessible, and faster to implement. The Accelerated AI templates differ from the other templates in LiCO in that they do not require the user to input a program; rather, they simply require a workspace (with associated directories) and a labelled dataset.

The following use cases are supported with Lenovo Accelerated AI templates:

Each Lenovo Accelerated AI use-case is supported by both a training and inference template. The training templates provide parameter inputs such as batch size and learning rate. These parameter fields are pre-populated with default values, but are tunable by those with data science knowledge. The templates also provide visual analytics with TensorBoard; the TensorBoard graphs continually update in-flight as the job runs, and the final statistics are available after the job has completed.

In LiCO 6.3, the Image Classification and Object Detection templates introduce the ability to select a topology based on the characteristics of a target inference device, such as an IoT Device, Edge Server, or Data Center server.

The following figure displays the embedded TensorBoard interface for a job. TensorBoard provides visualizations for TensorFlow jobs running in LiCO, whether through Lenovo Accelerated AI templates or the standard TensorFlow AI templates.

Inference Templates

LiCO also provides inference templates, which allow users to run predictions on new data using models that have been trained with Lenovo Accelerated AI templates. For the inference templates, users only need to provide a workspace, an input directory (the location of the data on which inference will be performed), an output directory, and the location of the trained model. The job will run, and upon completion, the output directory will contain the analyzed data. For visual templates such as Object Detection, images can be previewed directly from within LiCO's Manage Files interface.

The following two figures display an input file to the Object Detection inference template, as well as the corresponding output.

Favorites

LiCO allows the user to mark frequently used job submission templates as “favorites” to simplify access. Selecting the star in a template box adds the template to the Favorites tab at the top of the Submit Job screen, which then becomes the default view of that screen. If no favorites have been selected, the Favorites tab does not appear. Users may add standard templates, Lenovo Accelerated AI templates, and custom-defined templates to this tab.

AI Studio

LiCO AI Studio provides an end-to-end workflow for Image Classification, Object Detection, and Instance Segmentation, with training based on Lenovo Accelerated AI pre-defined models. A user can import an unprocessed, unlabeled data set of images, label them, train multiple instances with a grid of parameter values, test the output models for validation, and publish to a git repository for use in an application environment. Additionally, users can initiate the steps in AI Studio from a REST API call to take advantage of LiCO as part of a DevOps toolchain.
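To illustrate how AI Studio might be driven from a DevOps toolchain, the following sketch shows a pipeline stage calling a LiCO REST endpoint to start a training run. The endpoint path, payload fields, and token handling are hypothetical placeholders rather than the documented LiCO API; consult the LiCO API documentation for the actual routes and schemas.

    # Hypothetical sketch of starting an AI Studio training run from a CI/CD stage.
    # The URL path, payload fields, and authentication header are illustrative
    # assumptions only; refer to the LiCO API documentation for the real routes.
    import requests

    LICO_BASE_URL = "https://lico.example.com/api"   # placeholder host
    API_TOKEN = "REPLACE_WITH_TOKEN"                 # placeholder credential

    def start_training(project, dataset, epochs=10):
        """Request a training run for a labeled dataset (hypothetical endpoint)."""
        resp = requests.post(
            f"{LICO_BASE_URL}/ai_studio/train",      # assumed route, for illustration
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            json={"project": project, "dataset": dataset, "epochs": epochs},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()                           # e.g. a job ID the pipeline can poll

A pipeline could call start_training() after a dataset is published, then poll the returned job until it completes before promoting the resulting model.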

Dev Tools

LiCO includes the capability to create and deploy Jupyter instances on the cluster. Users may create multiple instances, customized for different software environments and projects. When launching an instance, the user can define the compute resources required (CPU and GPU) to better match performance to the task and make efficient use of cluster resources.

Once a Jupyter instance is created, the user can deploy it to the cluster and use the environment directly from their browser in a new tab. The user can leverage the Jupyter interface to upload, download, and run code as they normally would, utilizing the shared storage space used by LiCO.

Workflow

LiCO provides the ability to combine multiple job submissions into a single execution action, called a Workflow. Steps execute job submissions in series, and within each step multiple job submissions may run in parallel. Workflows use LiCO job submission templates to define the jobs for each step, and any available template, including custom templates, can be used in a workflow.

LiCO workflows allow users to automate the deployment of multiple jobs that may be required for a project, so the user can execute and monitor as a single action. Workflows can be easily copied and edited, allowing users to quickly customize existing workflows for multiple projects.
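To make the step and job relationship concrete, the sketch below models a workflow as plain Python data: steps run one after another, while the jobs inside a step are submitted together. This is a conceptual illustration only, not LiCO's actual workflow definition format or API; the template and script names are invented for the example.

    # Conceptual illustration: steps execute in series, jobs within a step run in
    # parallel. Template and script names here are made up for the example.
    workflow = [
        # Step 1: prepare data (single job)
        [{"template": "Common Job", "name": "preprocess", "script": "prep.sh"}],
        # Step 2: two training jobs that can run at the same time
        [
            {"template": "TensorFlow", "name": "train_a", "script": "train_a.py"},
            {"template": "PyTorch", "name": "train_b", "script": "train_b.py"},
        ],
        # Step 3: evaluate once both training jobs have finished
        [{"template": "Common Job", "name": "evaluate", "script": "eval.sh"}],
    ]

    for step in workflow:                  # steps execute one after another
        for job in step:                   # jobs in a step are submitted together
            print(f"submit {job['name']} using template {job['template']}")
        # ...wait for all jobs in this step to complete before moving on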

Admin

The Admin tab for the user provides access to their storage space on the cluster. Through the Manage Files subtab, the user can upload, download, cut/copy/paste, preview, and edit files on the cluster storage space from within the LiCO portal. The text editor within LiCO allows syntax-aware display and editing based on the file extension. The Admin tab also enables users to publish a trained model to a git repository or as a Docker container image.

Additional features for LiCO HPC/AI Users

In addition to the user features above, the LiCO HPC/AI version contains a number of features that simplify HPC workload deployment with a minimal learning curve compared to console-based scripting and execution. HPC users can submit jobs easily through standard or custom templates, utilize containers, and pre-define runtime modules and environment variables for submission. With LiCO 6.3, they can also take advantage of advanced features such as Energy Aware Runtime and Intel oneAPI tools and optimizations, all from within the LiCO interface.

Topics in this section:

Energy Aware Runtime

Energy Aware Runtime (EAR) is a software technology designed to run MPI applications with higher energy efficiency. Developed in collaboration with the Barcelona Supercomputing Center as part of the BSC-Lenovo Cooperation project, EAR is supported for use with the SLURM scheduler through a SPANK plugin. LiCO exposes EAR deployment options within the standard MPI template, allowing users to take advantage of the capability for MPI workloads.

Once the workload has been profiled through a learning phase, EAR lowers the CPU frequency to reduce energy consumption while keeping performance within a set threshold. This is particularly helpful where MPI applications do not take significant advantage of higher clock frequencies, so the frequency can be reduced to save energy while maintaining expected performance.

Users can select EAR options at job submission in the standard MPI template, choosing either the default policy set by the administrator, minimum time to solution, or minimum energy. Administrators can set the policies and thresholds for EAR usage within the LiCO Administrator portal, as well as which users are authorized to use EAR.
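For readers familiar with Slurm, the sketch below indicates roughly what an EAR-enabled MPI launch looks like outside of LiCO. The --ear and --ear-policy flags and the policy names are assumptions that depend on the EAR version and the site's SPANK plugin configuration; within LiCO these choices are made in the standard MPI template rather than on the command line.

    # Rough sketch of an EAR-enabled MPI launch under Slurm, for illustration only.
    # The EAR flag names and policy values (--ear, --ear-policy=min_energy/min_time)
    # are assumptions that vary by EAR version and site SPANK configuration.
    import subprocess

    def submit_mpi_with_ear(executable, nodes=2, tasks_per_node=32, policy="min_energy"):
        """Launch an MPI binary under Slurm with an EAR policy (assumed flags)."""
        cmd = [
            "srun",
            f"--nodes={nodes}",
            f"--ntasks-per-node={tasks_per_node}",
            "--ear=on",                   # enable the EAR plugin (assumed flag)
            f"--ear-policy={policy}",     # administrator default, min_time, or min_energy
            executable,
        ]
        return subprocess.run(cmd, check=True)

    # Example: submit_mpi_with_ear("./my_mpi_app", nodes=4, policy="min_time")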

Intel oneAPI

Intel oneAPI is a cross-industry, open, standards-based unified programming model that delivers a common developer experience across accelerator architectures—for faster application performance, more productivity, and greater innovation. In addition to Intel MPI, Intel OpenMP and Intel MPITune, LiCO 6.3 features new templates based on oneAPI – Intel VTune, TensorFlow2, and PyTorch – that are optimized to run on Intel processors.

HPC Runtime Module Management

The LiCO HPC/AI version allows the user to pre-define modules and environment variables to load at the time of job execution through job submission templates. These user-defined modules eliminate the step of manually loading required modules before job submission, further simplifying the process of running HPC workloads on the cluster. Through the Runtime interface, users can choose from the modules available on the system, define their loading order, and specify environment variables for repeatable, reliable job deployment.

Container-based HPC workload deployment

Additional standard templates are provided to support deployment of containerized HPC workloads through Singularity or Charliecloud. These templates simplify deploying containers for HPC workloads by eliminating the need to create custom runtimes and custom templates unless more granularity is needed.

Singularity Container Image Management

The LiCO HPC/AI version provides both users and administrators with the ability to build, upload, and manage application environment images through Singularity containers. These images can support users with AI frameworks, HPC workloads, and other application environments. Singularity containers may be built from Docker containers, imported from NVIDIA GPU Cloud (NGC), or pulled from other image repositories such as the Intel Container Portal. Containers created by administrators are available to all users, and users can also create container images for their individual use. Users looking to deploy a custom image can also create a custom template that will deploy the container and run workloads in that environment.
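As a sketch of what image creation involves under the hood, the snippet below pulls a framework container from NVIDIA NGC and builds a Singularity image from it. The NGC image tag shown is a placeholder example; within LiCO, administrators and users would normally perform this through the container image management interface rather than a script.

    # Illustrative sketch: building a Singularity image from an NGC Docker image.
    # The image tag below is a placeholder; in LiCO this is normally done through
    # the container image management interface rather than by hand.
    import subprocess

    def build_singularity_image(output_sif, docker_ref):
        """Build a .sif image from a Docker/OCI reference using the singularity CLI."""
        subprocess.run(
            ["singularity", "build", output_sif, f"docker://{docker_ref}"],
            check=True,
        )

    # Example with a placeholder NGC tag:
    # build_singularity_image("tensorflow.sif", "nvcr.io/nvidia/tensorflow:23.10-tf2-py3")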

Expert mode

The LiCO HPC/AI version provides more experienced cluster users with console access to the user space on the LiCO management node, to execute Linux and SLURM commands directly. Expert mode gives users familiar with the underlying cluster orchestration a choice in how they work, using the command line, the GUI, or both in concert to facilitate their workflow.

Reports

The LiCO HPC/AI version provides expanded billing capabilities, giving the user access to monitor charges incurred over a date range via the Expense Reports subtab. Users can also download daily or monthly billing reports as .xlsx files from the Admin tab.

Features for LiCO Administrators

Topics in this section:

Features for LiCO K8S/AI version administrators

For administrators of a Kubernetes-based LiCO environment, LiCO provides the ability to monitor activity, create and manage users, monitor LiCO-initiated activity, generate job and operational reports, enable container access for LiCO users, and view the software license currently installed in LiCO. The LiCO K8S/AI version does not provide resource monitoring for the administrator; resources can be monitored at the Kubernetes level with a tool such as Kubernetes Dashboard. The following menus are available to administrators in LiCO K8S/AI:

Features for LiCO HPC/AI version administrators

For cluster administrators, LiCO provides a sophisticated monitoring solution, built on OpenHPC tooling. The following menus are available to administrators:

Features for LiCO Operators

For the purpose of monitoring clusters but not overseeing user access, LiCO provides the Operator designation. LiCO Operators have access to a subset of the dashboards provided to Administrators; namely, the dashboards contained in the Home, Monitor, and Reports menus:

Subscription and Support

LiCO HPC/AI is enabled through a per-CPU and per-GPU subscription and support entitlement model which, once entitled for all the processors contained within the cluster, gives the customer access to LiCO package updates and Lenovo support for the length of the acquired term.

LiCO K8S/AI is enabled through tiered subscription and support entitlement licensing based on the number of GPU accelerators being accessed by running LiCO workloads (tiers are up to 4 GPUs in use, up to 16 GPUs in use, and up to 64 GPUs in use). Additional licensing beyond 64 GPUs can be provided by contacting your Lenovo sales representative.

Lenovo will provide interoperability support for all software tools defined as validated with LiCO, and development support (Level 3) for specific Lenovo-supported tools only. Open source and supported-vendor bugs/issues will be logged and tracked with their respective communities or companies if desired, with no guarantee from Lenovo for bug fixes. Full support details are provided at the support links below for each respective version of LiCO. Additional support options may be available; please contact your Lenovo sales representative for more information.

LiCO can be acquired as part of a Lenovo Scalable Infrastructure (LeSI) solution or for “roll your own” (RYO) solutions outside of the LeSI framework, and LiCO software package updates are provided directly through the Lenovo Electronic Delivery system. More information on LeSI is available in the LeSI product guide, available from https://lenovopress.com/lp0900.

Validated software components

LiCO depends on a number of software components that must be installed before LiCO itself in order for it to function properly. Each LiCO software release is validated against a defined configuration of software tools and Lenovo systems, to make deployment more straightforward and enable support. Other management tools, hardware systems, and configurations outside the defined stack may be compatible with LiCO, though not formally supported; to determine compatibility with other solutions, please check with your Lenovo sales representative.

The following software components are validated by Lenovo as part of the overall LiCO software solution entitlement:

LiCO HPC/AI version support

The following software components are validated for compatibility with LiCO HPC/AI:

LiCO K8S/AI version support

Supported servers (LiCO HPC/AI version)

The following Lenovo servers are supported to run with LiCO HPC/AI. These servers must run one of the supported operating systems as well as the validated software stack, as described in the Validated software components section.

Additional Lenovo ThinkSystem and System x servers may be compatible with LiCO. Contact your Lenovo sales representative for more information.

LiCO Implementation services

Customers who do not have the cluster management software stack required to run with LiCO may engage Lenovo Professional Services to install LiCO and the necessary open-source software. Lenovo Professional Services can provide comprehensive installation and configuration of the software stack, including operation verification, as well as post-installation documentation for reference. Contact your Lenovo sales representative for more information.

Client PC requirements

A web browser is used to access LiCO's monitoring dashboards. To fully utilize LiCO's monitoring and visualization capabilities, the client PC should meet the following specifications:

Related links

For more information, see the following resources:

Related product families

Notices

Lenovo may not offer the products, services, or features discussed in this document in all countries. Consult your local Lenovo representative for information on the products and services currently available in your area. Any reference to a Lenovo product, program, or service is not intended to state or imply that only that Lenovo product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any Lenovo intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any other product, program, or service. Lenovo may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

Lenovo (United States), Inc.
8001 Development Drive
Morrisville, NC 27560
U.S.A.
Attention: Lenovo Director of Licensing

LENOVO PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. Lenovo may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

The products described in this document are not intended for use in implantation or other life support applications where malfunction may result in injury or death to persons. The information contained in this document does not affect or change Lenovo product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of Lenovo or third parties. All information contained in this document was obtained in specific environments and is presented as an illustration. The result obtained in other operating environments may vary. Lenovo may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any references in this publication to non-Lenovo Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this Lenovo product, and use of those Web sites is at your own risk. Any performance data contained herein was determined in a controlled environment. Therefore, the result obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Copyright Lenovo 2022. All rights reserved.

This document, LP0858, was created or updated on December 15, 2021.

Send us your comments in one of the following ways:

This document is available online at https://lenovopress.com/LP0858.

Trademarks

Lenovo and the Lenovo logo are trademarks or registered trademarks of Lenovo in the United States, other countries, or both. A current list of Lenovo trademarks is available on the Web at https://www.lenovo.com/us/en/legal/copytrade/.

The following terms are trademarks of Lenovo in the United States, other countries, or both:

The following terms are trademarks of other companies:

