GPU Card Installation
This chapter contains the following topics:
- Server Firmware Requirements
- GPU Card Configuration Rules
- Requirement For All GPUs: Memory-Mapped I/O Greater Than 4 GB
- Replacing a Single-Wide GPU Card
- Installing Drivers to Support the GPU Cards
Server Firmware Requirements
The following table lists the minimum server firmware versions for the supported GPU cards.
| GPU Card | Cisco IMC/BIOS Minimum Version Required |
|---|---|
| NVIDIA L4 PCIe, 72W, Gen 4 x8 (UCSC-GPU-L4) | 4.1(3) |
| Intel GPU Flex 140 PCIe, 75W, Gen 4 x8 (UCSC-GPU-FLEX140) | 4.1(3) |
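If you are not sure which BIOS level the server is currently running, you can read it from the host operating system before deciding whether a firmware update is needed. The following is a minimal sketch, assuming a Linux host with the dmidecode utility installed and root privileges; the exact version-string format varies by platform, so treat the output as informational and compare it manually against the table above. The running Cisco IMC and BIOS versions are also shown in the Cisco IMC web interface.

```python
# Minimal sketch: read the running BIOS version on a Linux host with dmidecode
# (part of most distributions; requires root). The version-string format varies
# by platform, so this only prints the values for manual comparison against the
# minimums in the table above.
import subprocess

MINIMUM_REQUIRED = "4.1(3)"  # minimum Cisco IMC/BIOS version from the table above

def bios_version() -> str:
    """Return the SMBIOS BIOS version string reported by the platform firmware."""
    result = subprocess.run(
        ["dmidecode", "-s", "bios-version"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(f"Running BIOS version:   {bios_version()}")
    print(f"Minimum version needed: {MINIMUM_REQUIRED}")
```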
GPU Card Configuration Rules
Note the following rules when populating a server with GPU cards.
- The server supports the following GPUs:
- NVIDIA L4 70W 24GB PCIe GPU (UCSC-GPU-L4), which is a half-height, half-length (HHHL), single-wide GPU card. This GPU can be installed in either PCIe Gen 4 or PCIe Gen 5 half-height or full-height risers. Each server supports a maximum of 3 GPUs of the same type with half-height (HH) risers or a maximum of 2 GPUs of the same type with full-height (FH) risers.
- Intel GPU Flex 140 Gen 4 x8 75W PCIe GPU (UCSC-GPU-FLEX140), which is a half-height, half-length (HHHL), single-wide GPU card. This GPU can be installed in either PCIe Gen 4 or PCIe Gen 5 half-height risers. Each server supports a maximum of 3 GPUs of the same type with half-height (HH) risers or a maximum of 2 GPUs of the same type with full-height (FH) risers.
Requirement For All GPUs: Memory-Mapped I/O Greater Than 4 GB
All supported GPU cards require enablement of the BIOS setting that allows greater than 4 GB of memory-mapped I/O (MMIO).
- Standalone Server: If the server is used in standalone mode, this BIOS setting is enabled by default: Advanced > PCI Configuration > Memory Mapped I/O Above 4 GB [Enabled]. If you need to change this setting, enter the BIOS Setup Utility by pressing F2 when prompted during bootup.
- Integrated Server: If the server is integrated with Cisco UCS Manager and is controlled by a service profile, this setting is enabled by default in the service profile when a GPU is present. To change this setting manually, use the following procedure.
Procedure
- Refer to the Cisco UCS Manager configuration guide (GUI or CLI) for your release for instructions on configuring service profiles: Cisco UCS Manager Configuration Guides
- Refer to the chapter on Configuring Server-Related Policies > Configuring BIOS Settings.
- In the section of your profile for PCI Configuration BIOS Settings, set Memory Mapped IO Above 4GB Config to one of the following:
- Disabled: Does not map 64-bit PCI devices to 64 GB or greater address space.
- Enabled: Maps I/O of 64-bit PCI devices to 64 GB or greater address space.
- Platform Default: The policy uses the value for this attribute contained in the BIOS defaults for the server. Use this only if you know that the server BIOS is set to use the default enabled setting for this item.
- Reboot the server. After the reboot, you can optionally confirm the setting from the host operating system, as shown in the sketch below.
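As an optional sanity check after the reboot, you can confirm from a Linux host OS that PCI memory windows are being mapped above the 4 GB boundary. The following is a minimal sketch that reads /proc/iomem (run it as root so the addresses are not masked); it is an OS-level check only and does not change the BIOS or service-profile setting itself.

```python
# Minimal sketch: report PCI memory regions in /proc/iomem whose start address
# is at or above the 4 GB boundary. Run as root; on recent kernels, non-root
# users see zeroed addresses. The presence of such regions indicates that
# memory-mapped I/O above 4 GB is in effect.
FOUR_GB = 1 << 32

def pci_regions_above_4gb(path: str = "/proc/iomem"):
    """Yield (address_range, description) pairs for PCI regions mapped above 4 GB."""
    with open(path) as iomem:
        for line in iomem:
            addr_range, _, description = line.strip().partition(" : ")
            if "pci" not in description.lower():
                continue
            start_hex = addr_range.split("-", 1)[0]
            if int(start_hex, 16) >= FOUR_GB:
                yield addr_range, description

if __name__ == "__main__":
    for addr_range, description in pci_regions_above_4gb():
        print(f"{addr_range}  {description}")
```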
Replacing a Single-Wide GPU Card
A GPU kit (UCSC-GPURKIT-C220) is available from Cisco. The kit contains a GPU mounting bracket and the following risers (risers 1 and 2):
- One x16 PCIe Gen4 riser, standard PCIe, supports Cisco VIC, full-height, 3/4 length
- One x16 PCIe Gen4 riser, standard PCIe, full-height, 3/4 length
Procedure
- Remove an existing GPU card from the PCIe riser:
- Shut down and remove power from the server as described in Shutting Down and Removing Power From the Server.
- Slide the server out the front of the rack far enough so that you can remove the top cover. You might have to detach cables from the rear panel to provide clearance.
- Caution: If you cannot safely view and access the component, remove the server from the rack.
- Remove the top cover from the server as described in Removing Top Cover.
- Using a #2 Phillips screwdriver, loosen the captive screws.
- Pull evenly on both ends of the GPU card to disconnect the card from the socket.
- If the riser has no card, remove the blanking panel from the rear opening of the riser.
- Holding the GPU level, slide it out of the socket on the PCIe riser.
- Install a new GPU card:
Note: The Intel Flex 140 and NVIDIA L4 are half-height, half-length cards. If one is installed in full-height PCIe slot 1, it requires a full-height rear-panel tab attached to the card.
a) Align the new GPU card with the empty socket on the PCIe riser and slide each end into the retaining clip.
b) Push evenly on both ends of the card until it is fully seated in the socket.
c) Ensure that the card's rear-panel tab sits flat against the riser rear-panel opening.
Figure 1: PCIe Riser Assembly, 3 HHHL
Note: For easy identification, riser numbers are stamped into the sheet metal on the top of each riser cage.
| Callout | Description |
|---|---|
| 1 | Captive screw for PCIe slot 1 (alignment feature) |
| 2 | Captive screw for PCIe slot 2 (alignment feature) |
| 3 | Captive screw for PCIe slot 3 (alignment feature) |
| 4 | Handle for PCIe slot 1 riser |
| 5 | Handle for PCIe slot 2 riser |
| 6 | Handle for PCIe slot 3 riser |
| 7 | Rear-panel opening for PCIe slot 1 |
| 8 | Rear-panel opening for PCIe slot 2 |
| 9 | Rear-panel opening for PCIe slot 3 |
Figure 2: PCIe Riser Assembly, 2 FHFL
| Callout | Description |
|---|---|
| 1 | Captive screw for PCIe slot 1 |
| 2 | Captive screw for PCIe slot 2 |
| 3 | Handle for PCIe slot 1 riser |
| 4 | Handle for PCIe slot 2 riser |
| 5 | Rear-panel opening for PCIe slot 1 |
| 6 | Rear-panel opening for PCIe slot 2 |
d) Position the PCIe riser over its sockets on the motherboard and over the chassis alignment channels.
For a server with 3 HHHL risers, 3 sockets and 3 alignment features are available, as shown below.
Figure 3: PCIe Riser Alignment Features
For a server with 2 FHFL risers, 2 sockets and 2 alignment features are available, as shown below.
Figure 4: PCIe Riser Alignment Features
e) Carefully push down on both ends of the PCIe riser to fully engage its two connectors with the two sockets on the motherboard.
f) When the riser is level and fully seated, use a #2 Phillips screwdriver to secure the riser to the server chassis.
g) Replace the top cover to the server.
h) Replace the server in the rack, replace cables, and then fully power on the server by pressing the Power button.
Optional: Continue with Installing Drivers to Support the GPU Cards.
Installing Drivers to Support the GPU Cards
After you install the hardware, you must update the server BIOS to the correct level and then install the GPU drivers and other software, in the following order:
- Update the server BIOS.
- Update the GPU drivers.
1. Updating the Server BIOS
Install the latest server BIOS by using the Host Upgrade Utility for your server model.
Note: You must do this procedure before you update the GPU drivers.
Procedure
- Navigate to the following URL: http://www.cisco.com/cisco/software/navigator.html
- Click Servers-Unified Computing in the middle column.
- Click Cisco UCS C-Series Rack-Mount Standalone Server Software in the right-hand column.
- Click the name of your model of server in the right-hand column.
- Click Unified Computing System (UCS) Server Firmware.
- Click the release number.
- Click Download Now to download the ucs-<server platform>-huu-<version number>.iso file.
- Verify the information on the next page, and then click Proceed With Download.
- Continue through the subsequent screens to accept the license agreement, and then browse to a location where you want to save the file. If you want to confirm that the ISO downloaded intact, see the checksum sketch after this procedure.
- Use the Host Upgrade Utility to update the server BIOS.
The user guides for the Host Upgrade Utility are at Utility User Guides.
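Before booting the Host Upgrade Utility, it is good practice to confirm that the ISO downloaded intact. The following is a minimal sketch that computes a SHA-512 checksum for the file so you can compare it against the value published alongside the image on the download page (Cisco download pages typically list checksum details for each image). The default file name in the sketch is a placeholder; use the name of the ISO you actually saved.

```python
# Minimal sketch: compute the SHA-512 checksum of the downloaded HUU ISO so it
# can be compared with the value published alongside the image on the download
# page. The default file name below is a placeholder; pass the path of the ISO
# you saved as the first argument.
import hashlib
import sys

def sha512_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-512 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as iso:
        for chunk in iter(lambda: iso.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    iso_path = sys.argv[1] if len(sys.argv) > 1 else "ucs-huu-download.iso"  # placeholder name
    print(f"SHA-512({iso_path}) = {sha512_of(iso_path)}")
```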
2. Updating the GPU Card Drivers
After you update the server BIOS, you can install the GPU drivers in your hypervisor virtual machine.
Procedure
- Install your hypervisor software on a computer. Refer to your hypervisor documentation for the installation instructions.
- Create a virtual machine in your hypervisor. Refer to your hypervisor documentation for instructions.
- Install the GPU drivers to the virtual machine. Download the drivers from either:
- NVIDIA Enterprise Portal for GRID hypervisor downloads (requires NVIDIA login): https://nvidia.flexnetoperations.com/
- NVIDIA public driver area: http://www.nvidia.com/Download/index.aspx
- AMD: http://support.amd.com/en-us/download
- Restart the server.
- Check that the virtual machine is able to recognize the GPU card. In Windows, use the Device Manager and look under Display Adapters. On a Linux virtual machine, you can run a similar check from the command line, as shown in the sketch below.
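The following is a minimal sketch for a Linux virtual machine, assuming the pciutils package (which provides lspci) is installed and, for NVIDIA cards, that nvidia-smi is on the PATH after the driver installation; change the vendor keyword to match an Intel or AMD card.

```python
# Minimal sketch: confirm from inside a Linux VM that the GPU is visible on the
# PCI bus and, for NVIDIA cards, that the driver responds. Assumes lspci
# (pciutils) is installed; adjust the vendor keyword for Intel or AMD GPUs.
import shutil
import subprocess

def visible_gpus(vendor_keyword: str = "nvidia") -> list:
    """Return lspci output lines that mention the given vendor keyword."""
    output = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    ).stdout
    return [line for line in output.splitlines() if vendor_keyword.lower() in line.lower()]

if __name__ == "__main__":
    gpus = visible_gpus()
    print("\n".join(gpus) if gpus else "No matching GPU found on the PCI bus.")
    # If the NVIDIA driver installed correctly, nvidia-smi lists the GPU,
    # driver version, and utilization.
    if gpus and shutil.which("nvidia-smi"):
        subprocess.run(["nvidia-smi"], check=False)
```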