Cisco Performance Tuning for UCS M8 Platforms
Purpose and scope of the document
The Basic Input/Output System (BIOS) tests and initializes the hardware components of a system and boots the operating system from a storage device. A typical computational system has several BIOS settings that control the system's behavior. Some of these settings are directly related to the performance of the system.
This document explains the BIOS settings that are valid for the Cisco Unified Computing System™ (Cisco UCS®) M8 servers with AMD EPYC™ 4th Gen and 5th Gen processors. It describes how to optimize the BIOS settings to meet requirements for best performance and energy efficiency for the Cisco UCS X215c M8 Compute Nodes, Cisco UCS C245 M8 Rack Servers, and Cisco UCS C225 M8 Rack Servers.
This document also discusses the BIOS settings that can be selected for various workload types on Cisco UCS M8 servers with AMD EPYC 4th Gen and 5th Gen processors. Understanding the BIOS options will help you select appropriate values to achieve optimal system performance.
This document does not discuss the BIOS options for specific firmware releases of Cisco UCS M8 servers based on AMD EPYC 4th Gen and 5th Gen processors. The settings demonstrated here are generic.
What you will learn
The process of setting performance options in your system BIOS can be daunting and confusing, and some of the options you can choose are obscure. For some options, you must choose between optimizing a server for power savings or for performance. This document provides general guidelines and suggestions to help you achieve optimal performance from your Cisco UCS M8 servers that use 4th Gen and 5th Gen AMD EPYC family CPUs.
AMD EPYC 9004 Series processors
The AMD EPYC 9004 Series processors are built with innovative Zen 4 cores and AMD Infinity architecture. AMD EPYC 9004 Series processors incorporate compute cores, memory controllers, I/O controllers, Reliability, Availability, and Serviceability (RAS), and security features into an integrated System on a Chip (SoC). The AMD EPYC 9004 Series Processor retains the proven Multi-Chip Module (MCM) Chiplet architecture of prior successful AMD EPYC processors while making further improvements to the SoC components. The SoC includes the Core Complex Dies (CCDs), which contain Core Complexes (CCXs), which contain the Zen 4–based cores.
AMD EPYC 9004 Series processors are based on the new Zen 4 compute core. The Zen 4 core is manufactured using a 5nm process and is designed to provide an Instructions per Cycle (IPC) uplift and frequency improvements over prior-generation Zen cores. Each core has a larger L2 cache and improved cache effectiveness over the prior generation.
Each core supports Simultaneous Multithreading (SMT), which enables two separate hardware threads to run independently, sharing the corresponding core’s L2 cache.
The Core Complex (CCX) is where up to eight Zen 4–based cores share an L3 or Last Level Cache (LLC). Enabling Simultaneous Multithreading (SMT) allows a single CCX to support up to 16 concurrent hardware threads.
AMD EPYC 9004 Series processors include AMD 3D V-Cache die-stacking technology, which allows the 9700 Series processors to achieve more efficient chiplet integration. The AMD 3D Chiplet architecture stacks L3 cache tiles vertically to provide up to 96 MB of L3 cache per die (and up to 1 GB of L3 cache per socket) while still providing socket compatibility with all AMD EPYC 9004 Series processor models.
AMD EPYC 9004 Series processors with AMD 3D V-Cache technology use industry-leading logic stacking based on a copper-to-copper hybrid bonding "bumpless" chip-on-wafer process to enable more than 200X the interconnect density of current 2D technologies (and more than 15X the interconnect density of other technologies), along with lower latency, higher bandwidth, and greater power and thermal efficiency.
The CCDs connect to memory, I/O, and each other through an updated I/O Die (IOD). This central AMD Infinity Fabric provides the data path and control support to interconnect CCXs, memory, and I/O. Each CCD connects to the IOD via a dedicated high-speed Global Memory Interconnect (GMI) link. The IOD helps maintain cache coherency and additionally provides the interface to extend the data fabric to a potential second processor via its xGMI, or G-links. AMD EPYC 9004 Series processors support up to 4 xGMI (or G-links) with speeds up to 32Gbps.
The IOD exposes DDR5 memory channels, PCIe Gen5, CXL 1.1+, and Infinity Fabric links. The IOD provides twelve Unified Memory Controllers (UMCs) that support DDR5 memory.
Each UMC can support up to 2 Dual In-line Memory Modules (DIMMs) per channel (DPC), for a maximum of 24 DIMMs per socket. 4th Gen AMD EPYC processors can support up to 6 TB of DDR5 memory per socket. Having additional and faster memory channels compared to previous generations of AMD EPYC processors provides additional memory bandwidth to feed high-core-count processors. Memory interleaving across 2, 4, 6, 8, 10, and 12 channels helps optimize for a variety of workloads and memory configurations.
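As a quick check of the figures above, the following Python sketch works out the DIMM count per socket and the per-DIMM capacity implied by the 6 TB ceiling. It also estimates a theoretical peak bandwidth under an assumption that this document does not state: that each DDR5 channel moves 8 bytes per transfer. The 4800 MT/s value is the maximum memory speed listed for 4th Gen processors, and all names are illustrative.

```python
# Illustrative arithmetic only; the constant names are not from any vendor tool.
UMCS_PER_SOCKET = 12        # twelve UMCs, one memory channel each
DIMMS_PER_CHANNEL = 2       # up to 2 DPC
MAX_CAPACITY_TB = 6         # maximum DDR5 capacity per socket
MEMORY_SPEED_MTS = 4800     # 4th Gen maximum memory speed (MT/s)
BYTES_PER_TRANSFER = 8      # ASSUMPTION: 64-bit data path per channel

dimms_per_socket = UMCS_PER_SOCKET * DIMMS_PER_CHANNEL
gb_per_dimm = MAX_CAPACITY_TB * 1024 // dimms_per_socket
peak_gb_per_s = UMCS_PER_SOCKET * MEMORY_SPEED_MTS * BYTES_PER_TRANSFER / 1000

print(dimms_per_socket)  # 24
print(gb_per_dimm)       # 256
print(peak_gb_per_s)     # 460.8
```

The bandwidth value is an upper bound under the stated assumption; measured bandwidth depends on DIMM population, speed, and workload.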
Each processor can have a set of 4 P-links and 4 G-links. An OEM motherboard design can use a G-link either to connect to a second 4th Gen AMD EPYC processor or to provide additional PCIe Gen5 lanes. 4th Gen AMD EPYC processors support up to eight sets of x16-bit I/O lanes, that is, 128 lanes of high-speed PCIe Gen5 in single-socket platforms and up to 160 lanes in dual-socket platforms.
4th Gen AMD EPYC 9004 Series processors are built with the specifications listed in Table 1.
Table 1. AMD EPYC 9004 Series 4th Gen processor specifications
Item | Specification |
Core process technology | 5-nanometer (nm) Zen 4 |
Maximum number of cores | 128 |
Maximum memory speed | 4800 Mega-Transfers per second (MT/s) |
Maximum memory channels | 12 per socket |
Maximum memory capacity | 6 TB per socket |
PCI | 128 lanes (maximum) for 1-socket; 160 lanes (maximum) for 2-socket; PCIe Gen 5 |
For more information about the AMD EPYC 9004 Series processors’ microarchitecture, see Overview of AMD EPYC 9004 Series Processors Microarchitecture.
AMD EPYC 9005 Series processors
Systems based on 5th Gen AMD EPYC processors can support IT initiatives ranging from data-center consolidation and modernization to increasingly demanding enterprise application needs. These systems can enable the expansion of AI within the enterprise while supporting business imperatives to improve energy efficiency and contain data-center sprawl through high-density support for virtualization and cloud environments. Modernizing IT infrastructure is key to freeing up space and energy to accommodate AI and other innovative business initiatives within existing data-center footprints.
AMD EPYC processors have consistently achieved double-digit gains in Instructions-Per-clock-Cycle (IPC) performance with each new generation, and the latest Zen 5 core in 5th Gen AMD EPYC processors delivers significant uplifts for ML, HPC, and enterprise workloads. Our efficiency-optimized Zen 5c core powers the CPUs with the highest core count of any x86-architecture processor, delivering the highest core density for virtualized and cloud workloads.
5th Gen AMD EPYC processors allow you to deploy and address continually evolving workload demands. The hybrid, multichip architecture lets us decouple innovation paths and deliver consistently innovative, high-performance products. The Zen 5 and Zen 5c cores represent another significant advancement over the previous generation, with new support for highly complex machine-learning and inferencing applications.
In 5th Gen AMD EPYC processors, we use two different cores to address a range of workload needs by varying the type and number of cores and how we package them.
Zen 5 core
This core is optimized for high performance. Up to eight cores are combined to create a core complex (CCX) that includes a 32-MB shared L3 cache. This core complex is fabricated onto a die (CCD), up to 16 of which can be configured into an EPYC 9005 processor for up to 128 cores in the SP5 form factor. Compared to the previous generation, 5th Gen AMD EPYC processors, powered by the advanced Zen 5 core, along with faster memory and other key CPU improvements, provide 20 percent greater integer and 34 percent higher floating-point performance in 64-core processors operating within the same 360W TDP range (9xx5-070, 9xx5-073).
Zen 5c core
This core is optimized for density and efficiency. It has the same register-transfer logic as the Zen 5 core, but its physical layout takes less space and is designed to deliver more performance per watt. The Zen 5c core complex includes up to 16 cores and a shared 32-MB L3 cache. Up to 12 of these CCDs can be combined with an I/O die to deliver CPUs with up to 192 cores in the SP5 form factor.
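The core counts quoted for the two core types follow directly from the cores per CCD and the maximum number of CCDs. The Python sketch below reproduces that arithmetic; it assumes one CCX per CCD and a 32-MB shared L3 cache for both core types, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreType:
    """Hypothetical container for the per-CCD figures quoted in the text."""
    name: str
    cores_per_ccd: int
    l3_mb_per_ccd: int  # ASSUMPTION: one 32-MB L3 per CCD for both core types
    max_ccds: int

    def max_cores(self) -> int:
        return self.cores_per_ccd * self.max_ccds

    def max_l3_mb(self) -> int:
        return self.l3_mb_per_ccd * self.max_ccds


ZEN5 = CoreType("Zen 5", cores_per_ccd=8, l3_mb_per_ccd=32, max_ccds=16)
ZEN5C = CoreType("Zen 5c", cores_per_ccd=16, l3_mb_per_ccd=32, max_ccds=12)

for core in (ZEN5, ZEN5C):
    print(core.name, core.max_cores(), "cores,", core.max_l3_mb(), "MB L3")
```

Under these assumptions, the Zen 5 configuration gives 128 cores with 512 MB of L3 cache, and the Zen 5c configuration gives 192 cores with 384 MB.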
5th Gen AMD EPYC 9005 Series processors are built with the specifications listed in Table 2.
Table 2. AMD EPYC 9005 Series 5th Gen processor specifications
Item | Specification |
Core process technology | 4-nanometer (nm) Zen 5 and 3-nanometer Zen 5c |
Maximum number of cores | 192 |
Maximum L3 cache | 512 MB |
Maximum memory speed | 6000 Mega-Transfers per second (MT/s) |
Maximum memory channels | 12 per socket |
Maximum memory capacity | 6 TB per socket |
PCI | 128 lanes (maximum) for 1-socket; 160 lanes (maximum) for 2-socket; PCIe Gen 5 |
Note: Cisco UCS M8 platforms support Zen 5c processors only up to 160 cores and 400W TDP.
For more information about the AMD EPYC 9005 Series 5th Gen processors’ microarchitecture, see Overview of AMD EPYC 9005 Series Processors Microarchitecture.
Non-Uniform Memory Access (NUMA) topology
AMD EPYC 9004 and 9005 Series processors use a Non-Uniform Memory Access (NUMA) architecture in which latency varies with the proximity of a processor core to memory and I/O controllers. Using resources within the same NUMA node provides uniformly good performance, while using resources in different nodes increases latency.
You can adjust the system NUMA Nodes Per Socket (NPS) BIOS setting to optimize this NUMA topology for your specific operating environment and workload. For example, setting NPS=4 divides the processor into quadrants, where each quadrant has 3 CCDs, 3 UMCs, and 1 I/O hub. The closest processor-memory-I/O distance is between the cores, memory, and I/O peripherals within the same quadrant. The farthest distance is between a core and a memory controller or I/O hub in cross-diagonal quadrants (or the other processor in a 2P configuration). The locality of cores, memory, and I/O hubs and devices in a NUMA-based system is an important factor when tuning for performance.
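To see how a given NPS setting is presented to the operating system, you can inspect the NUMA nodes that the OS reports. The following Python sketch assumes a Linux host that exposes the standard sysfs directory /sys/devices/system/node; the function names are illustrative, and on other operating systems a different mechanism is needed.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as '0-3,8-11' into CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node number to its CPUs; empty if sysfs is not present."""
    nodes: dict[int, list[int]] = {}
    for node_dir in Path(root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            nodes[int(node_dir.name[4:])] = parse_cpulist(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for node, cpus in sorted(numa_nodes().items()):
        print(f"node{node}: {len(cpus)} logical CPUs")
```

On a two-socket server set to NPS=4, this would be expected to list eight nodes, which gives a simple way to confirm that a BIOS change took effect.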
In 4th Gen EPYC processors, optimizations to the Infinity Fabric interconnect further reduced latency differences. With EPYC 9004 Series processors, for applications that need to squeeze the last one or two percent of latency out of memory references, creating an affinity between memory ranges and CPU dies (Zen 4 or Zen 4c) can improve performance. Figure 1 illustrates how this works. If you divide the I/O die into four quadrants for an NPS=4 configuration, you will see that six DIMMs feed into three memory controllers, which are closely connected through Infinity Fabric (GMI) to a set of up to three Zen 4 CPU dies, or up to 24 CPU cores.
Figure 1. AMD EPYC 4th Gen processor block diagram with NUMA domains
In 5th Gen EPYC processors, improvements to the AMD Infinity Fabric interconnect have reduced latency differences even further. With EPYC 9005 Series processors, for applications that need to squeeze the last one or two percent of latency out of memory references, creating an affinity between memory ranges and CPU dies (Zen 5 or Zen 5c) can improve performance. Figure 2 illustrates how this works. If you divide the I/O die into four quadrants for an NPS=4 configuration, you will see that six DIMMs feed into three memory controllers, which are closely connected through Infinity Fabric (GMI) to a set of up to four Zen 5 CPU dies or up to three Zen 5c CPU dies.
Figure 2. AMD EPYC 5th Gen processor block diagram with NUMA domains
NPS1
A setting of NPS=1 indicates a single NUMA node per socket. This setting configures all memory channels on the processor into a single NUMA node. All processor cores, all attached memory, and all PCIe devices connected to the SoC are in that one NUMA node. Memory is interleaved across all memory channels on the processor into a single address space.
NPS2
A setting of NPS=2 configures each processor into two NUMA domains, grouping half of the cores and half of the memory channels into one NUMA domain and the remaining cores and memory channels into a second NUMA domain. Memory is interleaved across the six memory channels in each NUMA domain. PCIe devices will be local to one of the two NUMA nodes, depending on the half that has the PCIe root complex for that device.
NPS4
A setting of NPS=4 partitions the processor into four NUMA nodes per socket, with each logical quadrant configured as its own NUMA domain. Memory is interleaved across the memory channels associated with each quadrant. PCIe devices will be local to one of the four processor NUMA domains, depending on the IOD quadrant that has the corresponding PCIe root complex for that device. Every pair of memory channels is interleaved. This setting is recommended for HPC and other highly parallel workloads. You must use NPS4 when booting Windows systems with CPU SMT enabled on AMD EPYC processors with more than 64 cores, because Windows limits the size of a CPU group to a maximum of 64 logical cores.
Note: For Windows systems, verify that the number of logical processors per NUMA node is less than or equal to 64 by using either NPS2 or NPS4 instead of the default NPS1.
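The Windows guidance above reduces to a small calculation: divide the logical processors in a socket by the NPS value and compare the result with 64. The Python sketch below illustrates this; it assumes that cores are split evenly across NUMA nodes, and the function names are hypothetical.

```python
from typing import Optional


def logical_per_numa_node(cores_per_socket: int, smt: bool, nps: int) -> int:
    """Logical processors per NUMA node, assuming an even split across nodes."""
    threads = cores_per_socket * (2 if smt else 1)
    return threads // nps


def smallest_nps_within_limit(cores_per_socket: int, smt: bool,
                              limit: int = 64) -> Optional[int]:
    """Return the smallest of NPS1, NPS2, NPS4 that stays within the limit."""
    for nps in (1, 2, 4):
        if logical_per_numa_node(cores_per_socket, smt, nps) <= limit:
            return nps
    return None  # no NPS value satisfies the limit with this configuration


print(smallest_nps_within_limit(32, smt=True))   # 1
print(smallest_nps_within_limit(64, smt=True))   # 2
print(smallest_nps_within_limit(96, smt=True))   # 4
```

A result of None means that no NPS value alone brings the count to 64 or below, so another change, such as disabling CPU SMT, would have to be considered.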
NPS0 (not recommended)
A setting of NPS=0 indicates a single NUMA domain for the entire system (across both sockets in a two-socket configuration). This setting configures all memory channels in the system into a single NUMA node. Memory is interleaved across all memory channels in the system into a single address space. All processor cores across all sockets, all attached memory, and all PCIe devices connected to either processor are in that single NUMA domain.
Layer 3 cache as NUMA Domain
In addition to the NPS settings, one more BIOS option for changing the NUMA configuration is available. With the Layer 3 Cache as NUMA (L3CAN) option, each Layer 3 cache (one per CCD) is exposed as its own NUMA node. For example, a single processor with 8 CCDs would have 8 NUMA nodes: one for each CCD. In this case, a two-socket system would have a total of 16 NUMA nodes.
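The node counts described for the NPS and L3CAN options can be summarized in one small function. This is a minimal sketch with a hypothetical function name; it assumes that enabling L3CAN determines the reported node count regardless of the NPS value.

```python
def numa_node_count(sockets: int, nps: int, ccds_per_socket: int,
                    l3_as_numa: bool = False) -> int:
    """NUMA nodes reported to the OS for a given BIOS configuration."""
    if l3_as_numa:
        # Each Layer 3 cache (one per CCD) is exposed as its own node.
        return sockets * ccds_per_socket
    if nps == 0:
        # NPS0 reports a single NUMA domain for the whole system.
        return 1
    return sockets * nps


print(numa_node_count(sockets=2, nps=1, ccds_per_socket=8, l3_as_numa=True))  # 16
print(numa_node_count(sockets=2, nps=4, ccds_per_socket=8))                   # 8
print(numa_node_count(sockets=2, nps=0, ccds_per_socket=8))                   # 1
```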
Processor settings
This section describes the processor options that you can configure.
CPU SMT Mode
You can set the CPU Simultaneous Multithreading (CPU SMT) option to enable or disable logical processor cores on processors that support the AMD CPU SMT mode option. When the CPU SMT mode is set to Auto (enabled), each physical processor core operates as two logical processor cores and allows multithreaded software applications to process threads in parallel within each processor.
Some workloads, including many HPC workloads, see a performance-neutral or even performance-negative result when CPU SMT is enabled. Some applications are licensed by the number of enabled hardware threads, not just by physical cores. For those reasons, disabling CPU SMT on your EPYC 9004 Series processor may be desirable. In addition, some operating systems do not support the x2APIC within the EPYC 9004 Series processor, which is required to support more than 255 threads. If you are running an operating system that does not support AMD's x2APIC implementation, and you have two 64-core processors installed, you must disable CPU SMT. Table 3 summarizes the settings.
You should test the CPU SMT (hyperthreading) option both enabled and disabled in your specific environment. If you are running a single-threaded application, you should disable SMT.
Table 3. CPU SMT settings
Setting | Options |
CPU SMT control | ● Auto: uses two hardware threads per core ● Disable: uses a single hardware thread per core ● Enable: uses two hardware threads per core |
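The x2APIC constraint described in this section is a thread-count check: two 64-core processors with SMT enabled present 256 hardware threads, which exceeds the 255-thread limit. The following Python sketch expresses that check; the function names are illustrative, and the sketch does not cover configurations that exceed the limit even with SMT disabled.

```python
MAX_THREADS_WITHOUT_X2APIC = 255  # limit quoted in the text


def total_threads(sockets: int, cores_per_socket: int, smt: bool) -> int:
    """Hardware threads in the system for the given SMT state."""
    return sockets * cores_per_socket * (2 if smt else 1)


def smt_must_be_disabled(sockets: int, cores_per_socket: int,
                         os_supports_x2apic: bool) -> bool:
    """True when enabling SMT would exceed the limit on an OS without x2APIC."""
    if os_supports_x2apic:
        return False
    return total_threads(sockets, cores_per_socket, smt=True) > MAX_THREADS_WITHOUT_X2APIC


print(smt_must_be_disabled(2, 64, os_supports_x2apic=False))  # True
print(smt_must_be_disabled(2, 48, os_supports_x2apic=False))  # False
print(smt_must_be_disabled(2, 64, os_supports_x2apic=True))   # False
```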
Secure Virtual Machine (SVM) mode
The Secure Virtual Machine (SVM) mode enables processor virtualization features and allows a platform to run multiple operating systems and applications in independent partitions. The AMD SVM mode can be set to either of the following values:
- Disabled: the processor does not permit virtualization.
- Enabled: the processor allows multiple operating systems in independent partitions.
If your application scenario does not require virtualization, disable AMD virtualization technology. With virtualization disabled, also disable the AMD IOMMU option, which can cause differences in latency for memory access. Table 4 summarizes the settings.
Table 4. Virtualization option settings
Setting | Options |
SVM | ● Enabled ● Disabled |
DF C-states
Much like CPU cores, the AMD Infinity Fabric can enter lower power states while idle. However, there is a delay when it changes back to full-power mode, which causes some latency jitter. For a low-latency workload, or one with bursty I/O, you can disable the Data Fabric (DF) C-states feature to achieve more performance, with the tradeoff of higher power consumption. Table 5 summarizes the settings.
Table 5. DF C-states
Setting | Options |
DF C-states | ● Auto/Enabled: allows the AMD Infinity Fabric to enter a low-power state ● Disabled: prevents the AMD Infinity Fabric from entering a low-power state |
ACPI SRAT L3 Cache as NUMA Domain
When the ACPI SRAT L3 Cache as NUMA Domain setting is enabled, each Layer-3 cache is exposed as a NUMA node. With the Layer 3 Cache as NUMA Domain (L3CAN) setting, each Layer-3 cache (one per CCD) is exposed as its own NUMA node. For example, a single processor with 8 CCDs would have 8 NUMA nodes: one for each CCD. A dual-processor system would have a total of 16 NUMA nodes.
This setting can improve performance for highly NUMA-optimized workloads if workloads or components of workloads can be pinned to cores in a CCX and if they can benefit from sharing a Layer-3 cache. When this setting is disabled, NUMA domains are identified according to the NUMA NPS parameter setting.
Some operating systems and hypervisors do not perform Layer 3–aware scheduling, and some workloads benefit from having Layer 3 declared as a NUMA domain. Table 6 summarizes the settings.
Table 6. ACPI SRAT Layer 3 Cache as NUMA Domain settings
Setting | Options |
ACPI SRAT L3 Cache as NUMA Domain | ● Auto (disabled) ● Disable: does not report each Layer-3 cache as a NUMA domain to the OS ● Enable: reports each Layer-3 cache as a NUMA domain to the OS |
Algorithm Performance Boost Disable (APBDIS)
This option allows you to select the Algorithm Performance Boost (APB) disable value for the SMU. In the default state, the AMD Infinity Fabric selects between a full-power and low-power fabric clock and memory clock, based on fabric and memory use. However, in certain scenarios involving low bandwidth but latency-sensitive traffic (and memory latency checkers), the transition from low power to full power can adversely affect latency. Setting APBDIS to 1 (to disable Algorithm Performance Boost [APB]) and specifying a fixed Infinity Fabric P-state of 0 will force the Infinity Fabric and memory controllers into full-power mode, eliminating any such latency jitter. With certain processors and memory population options, setting a fixed Infinity Fabric P-state of 1 will reduce memory latency at the expense of memory bandwidth. This setting may benefit applications known to be sensitive to memory latency. Table 7 summarizes the settings.
Table 7. APBDIS setting
Setting | Options |
APBDIS | ● Auto (0): sets an auto APBDIS for the SMU. This is the default option. ● 0: dynamically switches Infinity Fabric P-state based on link use ● 1: enables fixed Infinity Fabric P-state control |
Fixed SOC P-State SP5F 19h
This option forces the P-state to be either independent or dependent, as reported by the ACPI _PSD object. It changes the SOC P-State if APBDIS is enabled. Here, F refers to the processor family.
Setting | Options |
Fixed SOC P-State SP5F 19h | ● P0: highest-performing Infinity Fabric P-state ● P1: next-highest-performing Infinity Fabric P-state ● P2: next-highest-performing Infinity Fabric P-state after P1 |
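The interaction between APBDIS and the fixed SOC P-state can be modeled as a simple rule: the fixed P-state applies only when APBDIS is set to 1, and otherwise the fabric switches P-states dynamically. The Python sketch below is a hypothetical model of that rule for planning purposes; it does not read or write any BIOS setting, and the function name and return values are illustrative.

```python
APBDIS_VALUES = {"Auto", "0", "1"}
FIXED_PSTATES = {"P0", "P1", "P2"}


def effective_fabric_pstate(apbdis: str, fixed_soc_pstate: str = "P0") -> str:
    """Return 'dynamic' or the fixed Infinity Fabric P-state that would apply."""
    if apbdis not in APBDIS_VALUES:
        raise ValueError(f"unknown APBDIS value: {apbdis!r}")
    if apbdis == "1":
        if fixed_soc_pstate not in FIXED_PSTATES:
            raise ValueError(f"unknown P-state: {fixed_soc_pstate!r}")
        return fixed_soc_pstate
    # Auto (0) and 0: P-state switches dynamically based on link use.
    return "dynamic"


print(effective_fabric_pstate("Auto"))      # dynamic
print(effective_fabric_pstate("1", "P0"))   # P0
```

For a latency-sensitive workload, the combination described in the APBDIS section corresponds to effective_fabric_pstate("1", "P0").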
xGMI settings: connection between sockets
In a two-socket system, the processors are interconnected through socket-to-socket xGMI links, part of the Infinity Fabric that connects all the components of the SoC together.
NUMA-unaware workloads may need maximum xGMI bandwidth because of extensive cross-socket communication. NUMA-aware workloads may want to minimize xGMI power because they do not have much cross-socket traffic and would rather use the power for increased CPU boost. The xGMI lane width can be reduced from x16 to x8 or x2, or an xGMI link can be disabled if power consumption is too high.
xGMI link configuration and 4-link xGMI max speed (Cisco xGMI Max Speed)
You can set the number of xGMI links and the maximum speed for the xGMI links. Setting this value to a lower speed can save uncore power that can be used to increase core frequency or reduce overall power. It also decreases cross-socket bandwidth and increases cross-socket latency. The Cisco UCS C245 M8 Rack Server supports four xGMI links with a maximum speed of 32 Gbps.
The Cisco xGMI Max Speed setting allows you to configure the xGMI Link Configuration and the 4-Link/3-Link xGMI Max Speed. Enabling Cisco xGMI Max Speed sets the xGMI Link Configuration to 4 and the 4-Link xGMI Max Speed to 32 Gbps. Disabling the Cisco xGMI Max Speed setting applies the default values.
Table 8 summarizes the settings.
Table 8. xGMI link settings
Setting | Options |
Cisco xGMI Max Speed | ● Disabled (default) ● Enabled |
xGMI Link Configuration | ● Auto ● 1 ● 2 ● 3 ● 4 |
4-Link xGMI Max Speed | ● Auto (25 Gbps) ● 20 Gbps ● 25 Gbps ● 32 Gbps |
3-Link xGMI Max Speed | ● Auto (25 Gbps) ● 20 Gbps ● 25 Gbps ● 32 Gbps |
Note: This BIOS feature is applicable only to Cisco UCS X215c M8 Compute Nodes and Cisco UCS C245 M8 Rack Servers with 2-socket configurations.
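The effect of the Cisco xGMI Max Speed option on the two dependent settings can be written as a short lookup. The Python sketch below is illustrative only: the function name is hypothetical, the dictionary keys mirror the setting names in Table 8, and the disabled case assumes the defaults are the Auto values listed in that table.

```python
def resolve_xgmi_settings(cisco_xgmi_max_speed_enabled: bool) -> dict[str, str]:
    """Values the dependent xGMI options take for each state of the Cisco option."""
    if cisco_xgmi_max_speed_enabled:
        return {
            "xGMI Link Configuration": "4",
            "4-Link xGMI Max Speed": "32 Gbps",
        }
    # ASSUMPTION: "default values" means the Auto entries from Table 8.
    return {
        "xGMI Link Configuration": "Auto",
        "4-Link xGMI Max Speed": "Auto (25 Gbps)",
    }


print(resolve_xgmi_settings(True))
print(resolve_xgmi_settings(False))
```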
Enhanced CPU performance
This BIOS option lets you modify the enhanced CPU performance settings. When it is enabled, this option adjusts the processor settings and allows the processor to run aggressively, which can improve overall CPU performance but may result in higher power consumption. Values for this BIOS option can be Auto or Disabled. By default, the enhanced CPU performance option is disabled.
Note: This BIOS feature is applicable only to Cisco UCS X215c M8 Compute Nodes and Cisco UCS C245 M8 Rack Servers. When this option is enabled, we highly recommend setting the fan policy at maximum power.
By default, this BIOS setting is disabled.
Memory settings
You can configure the memory settings described in this section.
NUMA Nodes Per Socket (NPS)
This setting lets you specify the number of desired NUMA Nodes Per Socket (NPS) and enables a tradeoff between reducing local memory latency for NUMA-aware or highly parallelizable workloads and increasing per-core memory bandwidth for non-NUMA-friendly workloads. Socket interleave (NPS0) will attempt to interleave the two sockets together into one NUMA node. 4th Gen AMD EPYC processors support a varying number of NUMA NPS values depending on the internal NUMA topology of the processor. NPS2 and NPS4 may not be options on certain processors or with certain memory populations.
In one-socket servers, the number of NUMA nodes per socket can be 1, 2, or 4, although not all values are supported by every processor. Performance for highly NUMA-optimized applications can be improved by setting the number of NUMA nodes per socket to a supported value greater than 1.
The default configuration (one NUMA Domain per socket) is recommended for most workloads. NPS4 is recommended for High-Performance Computing (HPC) and other highly parallel workloads. When using 200-Gbps network adapters, NPS2 may be preferred to provide a compromise between memory latency and memory bandwidth for the Network Interface Card (NIC).
This setting is independent of the Advanced Configuration and Power Interface (ACPI) Static Resource Affinity Table (SRAT) Layer-3 (L3) Cache as NUMA Domain setting. When ACPI SRAT L3 Cache as NUMA Domain is enabled, this setting then determines the memory interleaving granularity. With NPS1, all eight memory channels are interleaved. With NPS2, every four channels are interleaved with each other. With NPS4, every pair of channels is interleaved. Table 9 summarizes the settings.
Table 9. NUMA NPS settings
Setting | Options |
NUMA Nodes per Socket | ● Auto (NPS1) ● NPS0: interleave memory accesses across all channels in both sockets (not recommended). ● NPS1: interleave memory accesses across all eight channels in each socket; reports one NUMA node per socket (unless L3 Cache as NUMA is enabled). ● NPS2: interleave memory accesses across groups of four channels (ABCD and EFGH) in each socket; reports two NUMA nodes per socket (unless L3 Cache as NUMA is enabled). ● NPS4: interleave memory accesses across pairs of channels (AB, CD, EF, and GH) in each socket; reports four NUMA nodes per socket (unless L3 Cache as NUMA is enabled). |
I/O Memory Management Unit (IOMMU)
The I/O Memory Management Unit (IOMMU) provides several benefits and is required when using x2 programmable interrupt controller (x2APIC). Enabling the IOMMU allows devices (such as the EPYC integrated SATA controller) to present separate interrupt requests (IRQs) for each attached device instead of one IRQ for the subsystem. The IOMMU also allows operating systems to provide additional protection for Direct Memory Access (DMA)–capable I/O devices. IOMMU also helps filter and remap interrupts from peripheral devices. Table 10 summarizes the settings.
Table 10. IOMMU settings
Setting | Options |
IOMMU | ● Auto (enabled) ● Disabled: disable IOMMU support ● Enabled: enable IOMMU support |
Memory interleaving
Memory interleaving is a technique that CPUs use to increase the memory bandwidth available to an application. Without interleaving, consecutive memory blocks, often cache lines, are read from the same memory bank. Software that reads consecutive memory therefore has to wait for one memory transfer operation to complete before starting the next memory access. With memory interleaving enabled, consecutive memory blocks are in different banks, so all of them can contribute to the overall memory bandwidth that a program can achieve.
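The difference between the two layouts can be illustrated with a simplified address-to-channel mapping. The Python sketch below assumes a 64-byte interleave granularity and a plain modulo mapping; real memory controllers typically use more elaborate address hashing, so treat the function and its parameters as illustrative rather than as the hardware's actual scheme.

```python
CACHE_LINE_BYTES = 64  # ASSUMPTION: interleave at cache-line granularity


def channel_for_address(addr: int, channels: int, interleaved: bool,
                        bytes_per_channel: int = 1 << 30) -> int:
    """Return the memory channel that would serve a physical address."""
    if interleaved:
        # Consecutive cache lines rotate across all channels.
        return (addr // CACHE_LINE_BYTES) % channels
    # Without interleaving, each channel owns one contiguous address range.
    return (addr // bytes_per_channel) % channels


addresses = [0, 64, 128, 192]  # four consecutive cache lines
print([channel_for_address(a, 4, interleaved=True) for a in addresses])   # [0, 1, 2, 3]
print([channel_for_address(a, 4, interleaved=False) for a in addresses])  # [0, 0, 0, 0]
```

In the interleaved case, the four consecutive reads land on four different channels and can proceed in parallel; in the non-interleaved case, they all queue on the same channel.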
AMD recommends that all eight memory channels per CPU socket be populated, with all channels having equal capacity. This approach enables the memory subsystem to operate in eight-way interleaving mode, which should provide the best performance in most cases. Table 11 summarizes the settings.
Table 11. Memory interleaving settings
Setting | Options |
Memory interleaving | ● Enabled: interleaving is enabled with supported memory DIMM configuration. ● Disable: no interleaving is performed. |
Power settings
You can configure the power state settings described in this section.
Core performance boost
The core performance boost feature allows the processor to transition to a higher frequency than the CPU's base frequency, based on the availability of power, thermal headroom, and the number of active cores in the system. Core performance boost can cause jitter due to frequency transitions of the processor cores.
Some workloads do not need to run at the maximum core frequency to achieve acceptable levels of performance. To obtain better power efficiency, you can set a maximum core boost frequency. This setting does not allow you to set a fixed frequency; it only limits the maximum boost frequency. Actual boost performance depends on many factors and on other settings mentioned in this document. Table 12 summarizes the settings.
Table 12. Core performance boost settings
Setting | Options |
Core performance boost | ● Auto (enabled): allows the processor to transition to a higher frequency (turbo frequency) than the CPU’s base frequency ● Disabled: disables the CPU core boost frequency |
Global C-state control
C-states are a processor’s CPU core inactive power states. C0 is the operational state in which instructions are processed, and higher-numbered C-states (C1, C2, etc.) are low-power states in which the core is idle. The Global C-state setting can be used to enable and disable C-states on the server. By default, the global C-state control is set to Auto, which enables cores to enter lower power states; this can cause jitter due to frequency transitions of the processor cores. When this setting is disabled, the CPU cores will operate at the C0 and C1 states. Table 13 summarizes the settings.
C-states are exposed through ACPI objects and can be dynamically requested by software. Software can request a C-state change either by executing a HALT instruction or by reading from a particular I/O address. The actions taken by the processor when entering the low-power C-state can also be configured by software. The 4th Gen AMD EPYC processor’s core is designed to support as many as three AMD-specified C-states: I/O-based C0, C1, and C2.
Table 13. Global C-state settings
Setting | Options |
Global C-state control | ● Auto (enabled): enables I/O-based C-states ● Disabled: disables I/O-based C-states |
Layer-1 and Layer-2 stream hardware prefetchers
Most workloads benefit from the use of the Layer-1 and Layer-2 stream hardware prefetchers (L1 Stream HW Prefetcher and L2 Stream HW Prefetcher) to gather data and keep the core pipeline busy. However, some workloads are very random in nature and will actually achieve better overall performance if one or both of the prefetchers are disabled. By default, both prefetchers are enabled. Table 14 summarizes the settings.
Table 14. Layer-1 and Layer-2 stream hardware prefetcher settings
Setting | Options |
L1 Stream HW Prefetcher | ● Auto (Enabled) ● Disable: disables prefetcher ● Enable: enables prefetcher |
L2 Stream HW Prefetcher | ● Auto (Enabled) ● Disable: disables prefetcher ● Enable: enables prefetcher |
Determinism slider
The Determinism slider lets you choose between two behaviors. The Performance setting provides uniform performance across identically configured systems in a data center. The Power setting provides the maximum performance of each individual system, but with performance that varies across the data center. When the Determinism slider is set to Performance, be sure that the configurable Thermal Design Power (cTDP) and Package Power Limit (PPL) are set to the same value. The default (Auto) setting for most processors is the Performance determinism mode, which allows the processor to operate at a lower power level with consistent performance. For maximum performance, set the Determinism slider to Power. Table 15 summarizes the settings.
Table 15. Determinism slider settings
Setting | Options |
Determinism slider | ● Auto: this setting is equal to the Performance option. ● Power: ensures maximum performance levels for each CPU in a large population of identically configured CPUs by throttling CPUs only when they reach the same cTDP ● Performance: ensures consistent performance levels across a large population of identically configured CPUs by throttling some CPUs to operate at a lower power level |
CPPC: Collaborative Processor Performance Control
Collaborative Processor Performance Control (CPPC) was introduced with ACPI 5.0 as a mode to communicate performance between an operating system and the hardware. This mode can be used to allow the OS to control when and how much turbo boost can be applied in an effort to maintain energy efficiency. Not all operating systems support CPPC, but Microsoft began support with Microsoft Windows 2016 and later.
Table 16 summarizes the settings.
Table 16. CPPC settings
Setting | Options |
CPPC | ● Auto ● Disabled: disabled ● Enabled: allows the OS to make performance and power optimization requests using ACPI CPPC |
Power profile selection F19h
The DF P-state selection in the profile policy is overridden by the DF P-state range BIOS option or by the APB_DIS BIOS option. In the option name, F refers to the processor family and M refers to the model.
Setting | Options |
Power profile selection F19h | ● Efficiency mode ● High-performance mode ● Maximum I/O performance mode ● Balanced memory performance mode ● Balanced core performance mode ● Balanced core memory performance mode ● Auto |
Fan control policy
Fan policies let you control the fan speed to reduce server power consumption and noise levels. Before fan policies were available, the fan speed increased automatically when the temperature of any server component exceeded the set threshold. To keep fan speeds low, the component threshold temperatures were usually set to high values. Although this behavior suited most server configurations, it did not address the following situations:
- Maximum CPU performance: For high performance, certain CPUs must be cooled substantially below the set threshold temperature. This cooling requires very high fan speeds, which increase power consumption and noise levels.
- Low power consumption: To help ensure the lowest power consumption, fans must run very slowly and, in some cases, stop completely on servers that allow this behavior. But slow fan speeds can cause servers to overheat. To avoid this situation, you need to run the fans at a speed that is moderately faster than the lowest possible speed.
You can choose the following fan policies:
- Balanced: This is the default policy. This setting can cool almost any server configuration, but it may not be suitable for servers with PCIe cards, because these cards overheat easily.
- Low Power: This setting is well suited for minimal-configuration servers that do not contain any PCIe cards.
- High Power: This setting can be used for server configurations that require fan speeds ranging from 60 to 85 percent. This policy is well suited for servers that contain PCIe cards that easily overheat and have high temperatures. The minimum fan speed set with this policy varies for each server platform, but it is approximately in the range of 60 to 85 percent.
- Maximum Power: This setting can be used for server configurations that require extremely high fan speeds ranging between 70 and 100 percent. This policy is well suited for servers that contain PCIe cards that easily overheat and have extremely high temperatures. The minimum fan speed set with this policy varies for each server platform, but it is approximately in the range of 70 to 100 percent.
- Acoustic: The fan speed is reduced to reduce noise levels in acoustic-sensitive environments. Rather than regulating energy consumption and preventing component throttling as in other modes, the Acoustic option could result in short-term throttling to achieve a lowered noise level. Applying this fan control policy may result in short-duration transient performance impacts.
Note: This policy is configurable for standalone Cisco UCS C-Series M8 servers using the Cisco Integrated Management Controller (IMC) console and the Cisco IMC supervisor. From the Cisco IMC web console, choose Compute > Power Policies > Configured Fan Policy > Fan Policy.
For Cisco Intersight®–managed C-Series M8 servers, this policy is configurable using fan policies.
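The policy descriptions above can be read as a rough decision rule. The following sketch is one possible reading, not Cisco guidance: the function name, its parameters, and the `card_heat` labels ("normal", "high", "extreme") are hypothetical. Only the policy names and the approximate fan-speed ranges come from the text.

```python
from typing import Dict, Tuple

# Approximate minimum fan-speed ranges (percent) stated in the text.
# The other policies have no published range in this document.
APPROX_FAN_SPEED_PERCENT: Dict[str, Tuple[int, int]] = {
    "High Power": (60, 85),
    "Maximum Power": (70, 100),
}


def suggest_fan_policy(has_pcie_cards: bool,
                       card_heat: str = "normal",
                       minimal_config: bool = False,
                       noise_sensitive: bool = False) -> str:
    """Return a starting-point fan policy name (illustrative heuristic only)."""
    if noise_sensitive:
        # Acoustic trades possible short-term throttling for lower noise.
        return "Acoustic"
    if has_pcie_cards and card_heat == "extreme":
        return "Maximum Power"
    if has_pcie_cards and card_heat == "high":
        return "High Power"
    if minimal_config and not has_pcie_cards:
        return "Low Power"
    # Default policy; the text notes it may not suit servers with PCIe cards.
    return "Balanced"


if __name__ == "__main__":
    policy = suggest_fan_policy(has_pcie_cards=True, card_heat="high")
    print(policy, APPROX_FAN_SPEED_PERCENT.get(policy))
```

The result is only a starting point; the actual minimum fan speed varies by server platform, as noted above.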
BIOS settings for Cisco UCS X215c M8 Compute Nodes, Cisco UCS C245 M8 Rack Servers, and Cisco UCS C225 M8 Rack Servers
Table 17 lists the BIOS token names, defaults, and supported values for Cisco UCS M8 servers with the AMD EPYC 4th Gen and 5th Gen processor families.
Table 17. BIOS token names and values
BIOS token name | Default value | Supported values |
Processor | ||
CPU SMT mode | Auto (Enabled) | Auto, Enabled, Disabled |
SVM mode | Enabled | Enabled, Disabled |
DF C-states | Auto (Enabled) | Auto, Enabled, Disabled |
ACPI SRAT L3 Cache as NUMA Domain | Auto (Disabled) | Auto, Enabled, Disabled |
APBDIS | Auto (0) | Auto, 0, 1 |
Fixed SOC P-State SP5F 19h | P0 | P0, P1, P2 |
4-link xGMI max speed* | Auto (32Gbps) | Auto, 20Gbps, 25Gbps, 32Gbps |
Enhanced CPU performance* | Disabled | Auto, Disabled |
Memory | ||
NUMA nodes per socket | Auto (NPS1) | Auto, NPS0, NPS1, NPS2, NPS4 |
IOMMU | Auto (Enabled) | Auto, Enabled, Disabled |
Memory interleaving | Auto (Enabled) | Auto, Enabled, Disabled |
Power/performance | ||
Core performance boost | Auto (Enabled) | Auto, Disabled |
Global C-state control | Disabled | Auto, Enabled, Disabled |
L1 Stream HW Prefetcher | Auto (Enabled) | Auto, Enabled, Disabled |
L2 Stream HW Prefetcher | Auto (Enabled) | Auto, Enabled, Disabled |
Determinism slider | Auto (Power) | Auto, Power, Performance |
CPPC | Auto (Disabled) | Auto, Disabled, Enabled |
Power profile selection F19h | High-performance mode | Balanced memory performance mode, Efficiency mode, High-performance mode, Maximum I/O performance mode, Balanced core performance mode, Balanced core memory performance mode |
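A table of supported values lends itself to a simple sanity check before a BIOS policy is pushed. The sketch below transcribes the supported values from Table 17 into a dictionary. The English token names are the labels used in this document and may not match the exact token identifiers in Cisco IMC or Intersight, and `invalid_settings` is a hypothetical helper.

```python
from typing import Dict

_AED = {"Auto", "Enabled", "Disabled"}

# Supported values per BIOS token, transcribed from Table 17.
SUPPORTED: Dict[str, set] = {
    "CPU SMT mode": _AED,
    "SVM mode": {"Enabled", "Disabled"},
    "DF C-states": _AED,
    "ACPI SRAT L3 Cache as NUMA Domain": _AED,
    "APBDIS": {"Auto", "0", "1"},
    "Fixed SOC P-State SP5F 19h": {"P0", "P1", "P2"},
    "4-link xGMI max speed": {"Auto", "20Gbps", "25Gbps", "32Gbps"},
    "Enhanced CPU performance": {"Auto", "Disabled"},
    "NUMA nodes per socket": {"Auto", "NPS0", "NPS1", "NPS2", "NPS4"},
    "IOMMU": _AED,
    "Memory interleaving": _AED,
    "Core performance boost": {"Auto", "Disabled"},
    "Global C-state control": _AED,
    "L1 Stream HW Prefetcher": _AED,
    "L2 Stream HW Prefetcher": _AED,
    "Determinism slider": {"Auto", "Power", "Performance"},
    "CPPC": _AED,
    "Power profile selection F19h": {
        "Balanced memory performance mode", "Efficiency mode",
        "High-performance mode", "Maximum I/O performance mode",
        "Balanced core performance mode",
        "Balanced core memory performance mode",
    },
}


def invalid_settings(settings: Dict[str, str]) -> Dict[str, str]:
    """Return the entries whose token or value is not listed in Table 17."""
    return {
        token: value
        for token, value in settings.items()
        if value not in SUPPORTED.get(token, set())
    }


if __name__ == "__main__":
    proposed = {"APBDIS": "1", "Core performance boost": "Enabled"}
    print(invalid_settings(proposed))
```

In this example, "Enabled" is flagged for Core performance boost because Table 17 lists only Auto and Disabled for that token.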
BIOS recommendations for various general-purpose workloads
This section summarizes the BIOS settings recommended to optimize general-purpose workloads:
- CPU-intensive
- I/O-intensive
- Energy efficiency
- Low latency
The following sections describe each workload.
CPU-intensive workloads
For CPU-intensive workloads, the goal is to distribute the work for a single job across multiple CPUs to reduce the processing time as much as possible. To do this, you need to run portions of the job in parallel. Each process, or thread, handles a portion of the work and performs its computations concurrently. The CPUs typically need to exchange information rapidly, which requires specialized communication hardware.
CPU-intensive workloads generally benefit from processors or memory that achieves the maximum turbo frequency for any individual core at any time. Processor power management settings can be applied to help ensure that any component frequency increase can be readily achieved. CPU intensive workloads are general-purpose workloads, so optimizations are performed generically to increase processor core and memory speed, and performance tunings that typically benefit from faster computing time are used.
I/O-intensive workloads
I/O-intensive optimizations are configurations that depend on maximum throughput between I/O and memory. Processor utilization–based power management features that affect performance on the links between I/O and memory are disabled.
Energy-efficient workloads
Energy-efficient optimizations are the most common balanced performance settings. They benefit most application workloads while also enabling power management settings that have little impact on overall performance. The settings that are applied for energy-efficient workloads increase general application performance rather than power efficiency. Processor power management settings can affect performance when virtualization operating systems are used. Hence, these settings are recommended for customers who do not typically tune the BIOS for their workloads.
Low-latency workloads
Workloads that require low latency, such as financial trading and real-time processing, require servers to provide a consistent system response. Low-latency workloads are for customers who demand the least amount of computational latency for their workloads. Maximum speed and throughput are often sacrificed to lower overall computational latency. Processor power management and other management features that might introduce computational latency are disabled.
To achieve low latency, you need to understand the hardware configuration of the system under test. Important factors affecting response times include the number of cores, the processing threads per core, the number of NUMA nodes, the CPU and memory arrangements in the NUMA topology, and the cache topology in a NUMA node. BIOS options are generally independent of the OS, and a properly tuned low-latency operating system is also required to achieve deterministic performance.
Summary of BIOS settings optimized for general-purpose workloads
Table 18 summarizes the BIOS settings optimized for general-purpose workloads.
Table 18. BIOS recommendations for CPU-intensive, I/O-intensive, energy-efficiency, and low-latency workloads
BIOS option | BIOS values (platform default) | CPU intensive | I/O intensive | Energy efficiency | Low latency |
Processor | |||||
CPU SMT mode | Auto (Enabled) | Auto | Auto | Auto | Disabled |
SVM mode | Enabled | Enabled | Enabled | Enabled | Disabled |
DF C-states | Auto (Enabled) | Auto | Disabled | Auto | Disabled |
ACPI SRAT L3 Cache as NUMA Domain | Auto (Disabled) | Enabled | Auto | Auto | Auto |
APBDIS | Auto (0) | 1 | 1 | Auto | Auto |
Fixed SOC P-State SP5F 19h | P0 | P0 | P0 | P2 | P0 |
4-link xGMI max speed | Auto (32Gbps) | Auto | Auto | Auto | Auto |
Enhanced CPU performance | Disabled | Auto | Disabled | Disabled | Disabled |
Memory | |||||
NUMA nodes per socket | Auto (NPS1) | NPS4 | NPS4 | Auto | Auto |
IOMMU | Auto (Enabled) | Auto* | Auto | Auto | Disabled* |
Memory interleaving | Auto (Enabled) | Auto* | Auto | Auto | Disabled* |
Power/performance | |||||
Core performance boost | Auto (Enabled) | Auto | Auto | Auto | Disabled |
Global C-state control | Disabled | Disabled | Enabled | Enabled | Disabled |
L1 Stream HW Prefetcher | Auto (Enabled) | Auto | Auto | Disabled | Auto |
L2 Stream HW Prefetcher | Auto (Enabled) | Auto | Auto | Disabled | Auto |
Determinism slider | Auto (Power) | Auto | Auto | Auto | Performance |
CPPC | Auto (Disabled) | Auto | Auto | Enabled | Auto |
Power profile selection F19h | High-performance mode | High-performance mode | Maximum I/O performance mode | Efficiency mode | High-performance mode |
Note: BIOS tokens marked with an asterisk (*) are applicable only to Cisco UCS X215c M8 Compute Nodes and Cisco UCS C245 M8 Rack Servers.
If your application scenario does not require virtualization, then disable AMD virtualization technology. With virtualization disabled, also disable the AMD IOMMU option, because it can cause differences in latency for memory access. See the AMD performance tuning guide for more information.
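Most cells in Table 18 are "Auto", so in practice only a few tokens change per workload. The sketch below transcribes Table 18 and lists, for a given workload column, the tokens whose recommended value differs from the platform default. The column keys ("cpu", "io", "energy", "latency") and the `overrides` helper are hypothetical, and the English token names are the labels used in this document.

```python
from typing import Dict

# token: (platform default, CPU intensive, I/O intensive,
#         energy efficiency, low latency), transcribed from Table 18.
TABLE_18 = {
    "CPU SMT mode": ("Auto (Enabled)", "Auto", "Auto", "Auto", "Disabled"),
    "SVM mode": ("Enabled", "Enabled", "Enabled", "Enabled", "Disabled"),
    "DF C-states": ("Auto (Enabled)", "Auto", "Disabled", "Auto", "Disabled"),
    "ACPI SRAT L3 Cache as NUMA Domain":
        ("Auto (Disabled)", "Enabled", "Auto", "Auto", "Auto"),
    "APBDIS": ("Auto (0)", "1", "1", "Auto", "Auto"),
    "Fixed SOC P-State SP5F 19h": ("P0", "P0", "P0", "P2", "P0"),
    "4-link xGMI max speed": ("Auto (32Gbps)", "Auto", "Auto", "Auto", "Auto"),
    "Enhanced CPU performance":
        ("Disabled", "Auto", "Disabled", "Disabled", "Disabled"),
    "NUMA nodes per socket": ("Auto (NPS1)", "NPS4", "NPS4", "Auto", "Auto"),
    "IOMMU": ("Auto (Enabled)", "Auto*", "Auto", "Auto", "Disabled*"),
    "Memory interleaving":
        ("Auto (Enabled)", "Auto*", "Auto", "Auto", "Disabled*"),
    "Core performance boost":
        ("Auto (Enabled)", "Auto", "Auto", "Auto", "Disabled"),
    "Global C-state control":
        ("Disabled", "Disabled", "Enabled", "Enabled", "Disabled"),
    "L1 Stream HW Prefetcher":
        ("Auto (Enabled)", "Auto", "Auto", "Disabled", "Auto"),
    "L2 Stream HW Prefetcher":
        ("Auto (Enabled)", "Auto", "Auto", "Disabled", "Auto"),
    "Determinism slider": ("Auto (Power)", "Auto", "Auto", "Auto", "Performance"),
    "CPPC": ("Auto (Disabled)", "Auto", "Auto", "Enabled", "Auto"),
    "Power profile selection F19h":
        ("High-performance mode", "High-performance mode",
         "Maximum I/O performance mode", "Efficiency mode",
         "High-performance mode"),
}

WORKLOADS = ("cpu", "io", "energy", "latency")  # hypothetical column keys


def overrides(workload: str) -> Dict[str, str]:
    """Return only the tokens that differ from the platform default."""
    column = WORKLOADS.index(workload) + 1
    changed = {}
    for token, row in TABLE_18.items():
        default = row[0]
        value = row[column].rstrip("*").strip()  # drop the platform footnote mark
        unchanged = value == default or (
            value == "Auto" and default.startswith("Auto"))
        if not unchanged:
            changed[token] = value
    return changed


if __name__ == "__main__":
    for token, value in overrides("latency").items():
        print(f"{token}: {value}")
```

Working from the difference rather than the full table keeps a BIOS policy short and makes the intent of each change easier to review.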
Additional BIOS recommendations for enterprise workloads
This section summarizes the optimal BIOS settings for enterprise workloads:
- Virtualization
- Containers
- Relational Database (RDBMS)
- Analytical Database (Big Data)
- HPC workloads
The following sections describe each enterprise workload.
Virtualization workloads
AMD Virtualization Technology provides manageability, security, and flexibility in IT environments that use software-based virtualization solutions. With this technology, a single server can be partitioned and can be projected as several independent servers, allowing the server to run different applications on the operating system simultaneously. It is important to enable AMD Virtualization Technology in the BIOS to support virtualization workloads.
The CPUs that support hardware virtualization enable the processor to run multiple operating systems in virtual machines. This feature involves some overhead because the performance of a virtual operating system is comparatively slower than that of the native OS.
For more information, see AMD’s VMware vSphere Tuning Guide.
Container workloads
Containerizing an application platform and its associated dependencies abstracts the underlying infrastructure and OS differences for efficiency. Each container is bundled into one package that contains an entire runtime environment, including the application with all its dependencies, libraries and other binaries, and the configuration files needed to run that application. Containers that run applications in a production environment need management to ensure consistent uptime. If a container goes down, another container needs to start automatically.
Workloads that scale and perform well on bare metal should see a similar scaling curve in a container environment with minimal performance overhead. Some containerized workloads can even see close to 0% performance variance compared to bare metal. Large overhead generally means that application settings and/or container configuration are not optimally set. These topics are beyond the scope of this tuning guide. However, the CPU load balancing behavior of Kubernetes or other container orchestration platform schedulers may assign or load balance containerized applications differently than in a bare metal environment.
For more information, see AMD’s Kubernetes Container Tuning Guide.
Relational Database workloads
Integrating an RDBMS such as Oracle, MySQL, PostgreSQL, or Microsoft SQL Server with AMD EPYC processors can lead to better database performance, especially in environments that require high concurrency, fast query processing, and efficient resource utilization. The architecture of AMD EPYC processors allows databases to use multiple cores and threads effectively, which is especially beneficial for transactional workloads, analytics, and large-scale data processing.
In summary, using AMD EPYC processors in RDBMS environments can lead to significant improvements in performance, scalability, and cost efficiency, making them a strong choice for enterprise database solutions.
4th Gen AMD EPYC processors deliver high Input/Output Operations Per Second (IOPS) and throughput for all databases. Selecting the right CPU is important for achieving optimal database application performance.
For more information, see AMD’s RDBMS Tuning Guide.
Big Data Analytics workloads
Big Data Analytics involves the examination of vast amounts of data to uncover hidden patterns, correlations, and other insights that can be used to make better decisions. This requires significant computational power, memory capacity, and I/O bandwidth—areas where AMD EPYC processors excel.
AMD EPYC processors provide a robust platform for Big Data Analytics, offering the computational power, memory capacity, and I/O bandwidth needed to handle the demands of large-scale data processing. Their scalability, cost efficiency, and energy efficiency make them a compelling choice for organizations looking to build or upgrade their Big Data Analytics infrastructure.
High-Performance Computing (HPC) workloads
HPC refers to cluster-based computing that uses multiple individual nodes that are connected and that work in parallel to reduce the amount of time required to process large data sets that would otherwise take exponentially longer to run on any one system. HPC workloads are computation-intensive and typically also network-I/O intensive. HPC workloads require high-quality CPU components and high-speed, low-latency network fabrics for their Message Passing Interface (MPI) connections.
A compute cluster includes a head node that provides a single point for administering, deploying, monitoring, and managing the cluster. Clusters also have an internal workload management component, known as the scheduler, that manages all incoming work items (referred to as jobs). Typically, HPC workloads require large numbers of nodes with nonblocking MPI networks so that they can scale. Scalability of nodes is the single most important factor in determining the achievable usable performance of a cluster.
HPC requires a high-bandwidth I/O network. When you enable Direct Cache Access (DCA) support, network packets go directly into the Layer 3 processor cache instead of main memory. This approach reduces the number of I/O cycles generated by HPC workloads when certain Ethernet adapters are used, which in turn increases system performance.
For more information, see AMD’s High-Performance Computing (HPC) Tuning Guide.
Summary of BIOS settings recommended for enterprise workloads
Table 19 summarizes the BIOS tokens and recommended settings for various enterprise workloads.
Table 19. BIOS recommendations for virtualization, containers, RDBMS, big-data analytics, and HPC enterprise workloads
BIOS option | BIOS values (platform default) | Virtualization/container | RDBMS | Big-data analytics | HPC |
Processor | |||||
CPU SMT mode | Enabled | Enabled | Enabled | Disabled | Disabled |
SVM mode | Enabled | Enabled | Enabled | Enabled | Enabled |
DF C-states | Auto (Enabled) | Auto | Disabled | Auto | Auto |
ACPI SRAT L3 Cache as NUMA Domain | Auto (Disabled) | Auto | Auto | Auto | Auto |
APBDIS | Auto (0) | Auto | 1 | 1 | 1 |
Fixed SOC P-State SP5F 19h | P0 | P0 | P0 | P0 | P0 |
4-link xGMI max speed* | Auto (32Gbps) | Auto | Auto | Auto | Auto |
Enhanced CPU performance* | Disabled | Disabled | Disabled | Disabled | Auto |
Memory | |||||
NUMA nodes per socket | Auto (NPS1) | Auto | NPS4 | Auto | NPS4 |
IOMMU | Auto (Enabled) | Auto | Auto | Auto | Auto |
Memory interleaving | Auto (Enabled) | Auto | Auto | Auto | Auto |
Power/performance | |||||
Core performance boost | Auto (Enabled) | Auto | Auto | Auto | Auto |
Global C-state control | Disabled | Enabled | Enabled | Enabled | Enabled |
L1 Stream HW Prefetcher | Auto (Enabled) | Auto | Auto | Auto | Auto |
L2 Stream HW Prefetcher | Auto (Enabled) | Auto | Auto | Auto | Auto |
Determinism slider | Auto (Power) | Auto | Auto | Auto | Auto |
CPPC | Auto (Disabled) | Enabled | Auto | Enabled | Auto |
Power profile selection F19h | High-performance mode | High-performance mode | Maximum I/O performance mode | High-performance mode | High-performance mode |
Note: BIOS tokens marked with an asterisk (*) are not applicable to single-socket optimized platforms such as the Cisco UCS C225 M8 1U Rack Server.
- If your workloads have few vCPUs per virtual machine (that is, less than a quarter of the number of cores per socket), then the following settings tend to provide the best performance:
- NUMA NPS (nodes per socket) = 4
- LLC As NUMA turned on
- If your workload virtual machines have a large number of vCPUs (that is, greater than half the number of cores per socket), then the following settings tend to provide the best performance:
- NUMA NPS (nodes per socket) = 1
- LLC As NUMA turned off
For more information, see the VMware vSphere Tuning Guide.
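The two vCPU rules of thumb above can be written as a small helper. This is a minimal sketch: `numa_hint` is a hypothetical name, and the function deliberately returns nothing for virtual machine sizes between the two thresholds, because the text gives no guidance for that range.

```python
from typing import Dict, Optional


def numa_hint(vcpus_per_vm: int, cores_per_socket: int) -> Optional[Dict[str, str]]:
    """Map VM size to the NPS / LLC-as-NUMA starting point described above."""
    if vcpus_per_vm < cores_per_socket / 4:
        # Few vCPUs per VM: less than a quarter of the cores per socket.
        return {"NUMA nodes per socket": "NPS4", "LLC as NUMA": "on"}
    if vcpus_per_vm > cores_per_socket / 2:
        # Many vCPUs per VM: more than half of the cores per socket.
        return {"NUMA nodes per socket": "NPS1", "LLC as NUMA": "off"}
    return None  # in-between sizes are not covered by the guidance


if __name__ == "__main__":
    # Example with an assumed 64-core socket.
    for vcpus in (8, 24, 48):
        print(vcpus, numa_hint(vcpus, cores_per_socket=64))
```

For VM sizes that fall between the thresholds, measuring both configurations with the actual workload is the safer approach.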
Operating system tuning guidance for high performance
The Microsoft Windows, VMware ESXi, Red Hat Enterprise Linux, and SUSE Linux operating systems come with many new power management features that are enabled by default. Hence, you must tune the operating system to achieve the best performance.
For additional performance documentation, see the AMD EPYC performance tuning guides.
Linux (Red Hat and SUSE)
The CPUfreq governor defines the power characteristics of the system CPU, which in turn affects CPU performance. Each governor has its own unique behavior, purpose, and suitability in terms of workload.
The performance governor forces the CPU to use the highest possible clock frequency. This frequency is statically set and does not change. Therefore, this particular governor offers no power-savings benefit. It is suitable only for hours of heavy workload, and even then only during times in which the CPU is rarely (or never) idle. The default setting is "on-demand," which allows the CPU to reach the maximum clock frequency when the system load is high and the minimum clock frequency when the system is idle. Although this setting allows the system to adjust power consumption according to system load, it does so at the expense of latency from frequency switching.
The performance governor can be set using the cpupower command: cpupower frequency-set -g performance
For additional information, see the following links:
- Red Hat Enterprise Linux: Set the performance CPUfreq governor.
- SUSE Enterprise Linux Server: Set the performance CPUfreq governor.
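After setting the governor, it can be useful to confirm that every CPU actually picked it up. The sketch below reads the standard Linux sysfs file `cpufreq/scaling_governor` for each CPU. The helper names are hypothetical, and the sysfs root is a parameter so the function can be pointed at a test directory. On systems without a cpufreq driver the files do not exist and the result is empty.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def governor_counts(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[str, int]:
    """Count how many CPUs use each CPUfreq governor."""
    counts: Counter = Counter()
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for governor_file in Path(sysfs_root).glob(pattern):
        counts[governor_file.read_text().strip()] += 1
    return dict(counts)


def all_performance(sysfs_root: str = "/sys/devices/system/cpu") -> bool:
    """True only if at least one CPU was found and all use 'performance'."""
    counts = governor_counts(sysfs_root)
    return bool(counts) and set(counts) == {"performance"}


if __name__ == "__main__":
    print(governor_counts())
    print("all CPUs on performance governor:", all_performance())
```

A mixed result usually means the governor was changed for only some CPUs or was reset by a power management service.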
Microsoft Windows Server 2019 and 2022
For Microsoft Windows Server 2019, the Balanced (recommended) power plan is used by default. This setting enables energy conservation, but it can cause increased latency (slower response time for some tasks), and it can cause performance problems for CPU-intensive applications. For maximum performance, set the power plan to High Performance.
For additional information, see the following link:
Microsoft Windows and Hyper-V: Set the power policy to High Performance.
VMware ESXi
In VMware ESXi, host power management is designed to reduce the power consumption of ESXi hosts while they are powered on. Set the power policy to High Performance to achieve maximum performance.
For additional information, see the following link:
VMware ESXi: Set the power policy to High Performance.
Conclusion
When tuning system BIOS settings for performance, you need to consider a number of processor and memory options. If the best performance is your goal, choose options that optimize performance in preference to power savings. Also experiment with other options, such as memory interleaving and CPU simultaneous multithreading (SMT). Most important, assess the impact of any setting on the performance that your applications need.
For more information
For more information about Cisco UCS M8 servers with AMD EPYC 4th Gen and 5th Gen processors, see the following resources:
- IMM BIOS token guide:
/b_IMM_Server_BIOS_Tokens_Guide.pdf
- Cisco UCS X215c M8 Compute Node:
- Cisco UCS C245 M8 Rack Server:
- Cisco UCS C225 M8 Rack Server:
- AMD EPYC tuning guides:
- https://developer.amd.com/resources/epyc-resources/epyc-tuning-guides/
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58015-epyc-9004-tg-architecture-overview.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/white-papers/58649_amd-epyc-tg-low-latency.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/57996-epyc-9004-tg-rdbms.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58002_amd-epyc-9004-tg-hpc.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58013-epyc-9004-tg-hadoop.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58007-epyc-9004-tg-mssql-server.pdf
- https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58001_amd-epyc-9004-tg-vdi.pdf
Americas Headquarters
Cisco Systems, Inc.
San Jose, CA
Asia Pacific Headquarters
Cisco Systems (USA) Pte. Ltd.
Singapore
Europe Headquarters
Cisco Systems International BV
Amsterdam, The Netherlands
Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website at https://www.cisco.com/go/offices. Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: https://www.cisco.com/go/trademarks. Third-party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (1110R)
Printed in USA
Cll-4692101-03
07/25
© 2025 Cisco and/or its affiliates. All rights reserved.