IBM Power10 Performance Quick Start Guides

(Power10 QSGs)

November 2021

E1080 Memory Performance

Minimum Memory

DDIMM Plug Rules

Memory Performance

Memory Bandwidth

DDIMM Capacity | Theoretical Max Bandwidth
32 GB, 64 GB (DDR4 @ 3200 Mbps) | 409 GB/s
128 GB, 256 GB (DDR4 @ 2933 Mbps) | 375 GB/s
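
The two bandwidth figures are consistent with a simple product of data rate, transfer width, and channel count. The sketch below reproduces them under two assumptions that are not stated in this guide: 16 DDIMM channels contribute to the quoted figure, and each channel moves 8 bytes per transfer. The function and constant names are illustrative only.

```python
# Assumptions (not stated in this guide): 16 DDIMM channels contribute to the
# quoted figure, and each channel moves 8 bytes per transfer.
ASSUMED_CHANNELS = 16
ASSUMED_BYTES_PER_TRANSFER = 8


def theoretical_bandwidth_gbs(data_rate_mtps: int) -> float:
    """Return peak bandwidth in decimal GB/s for a data rate in mega-transfers/s."""
    bytes_per_second = (
        data_rate_mtps * 1_000_000 * ASSUMED_BYTES_PER_TRANSFER * ASSUMED_CHANNELS
    )
    return bytes_per_second / 1e9


if __name__ == "__main__":
    for rate in (3200, 2933):
        print(f"DDR4 @ {rate}: {theoretical_bandwidth_gbs(rate):.1f} GB/s")
```

Under these assumptions the results are about 409.6 GB/s and 375.4 GB/s, in line with the 409 GB/s and 375 GB/s listed above. The ratio between the two rows depends only on the data rates, so it holds regardless of the assumed channel count.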

Summary

P10 MMA Performance Guide

P10 Compute & MMA Architecture

P10 MMA Applications & Workload Integration

PowerPC Matrix-Multiply Assist Built-in Functions

https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Matrix-Multiply-Assist-Built-in-Functions.html

Matrix-Multiply Assist Best Practices Guide

https://www.redbooks.ibm.com/Redbooks.nsf/RedpieceAbstracts/redp5612.html?Open

PowerVM Best Practices

Virtual Processors

Processor Compatibility Mode

Processor Folding Considerations

LPAR Page Table Size Considerations

Reference

AIX Best Practices

Ensure OS level is current

Fix Central provides the latest updates for AIX, IBM i, VIOS, Linux, HMC, and firmware. In addition, the FLRT tool provides the recommended levels for each hardware model. Use these tools to keep your system up to date. If you cannot move up to the recommended level, refer to the Known Issue section of the Hints & Tips for Migrating Workload to IBM POWER10 Processor-Based Systems document.

AIX CPU utilization

On POWER10, AIX is optimized for best raw throughput at higher CPU usage when running with dedicated processors. When running with shared processors, AIX is optimized to reduce CPU usage (pc). To reduce CPU usage (pc) further, use the schedo tunable vpm_throughput_mode to tune the workload and evaluate the trade-off between raw throughput and CPU usage.

NX GZIP

To take advantage of NX GZIP acceleration on POWER10 systems, the LPAR must be in POWER9 compatibility mode (not POWER9_base mode) or POWER10 compatibility mode.

IBM i Performance Tips

IBM i

Ensure the IBM i operating system level is current. Fix Central provides the latest updates for IBM i, VIOS, HMC and firmware. https://www.ibm.com/support/fixcentral/

Firmware

Ensure the system firmware level is current. Fix Central provides the latest updates for IBM i, VIOS, HMC and firmware. https://www.ibm.com/support/fixcentral/

Memory DIMMs

Follow the memory plug rules. If possible, fully populate the memory DIMM slots and use similarly sized memory DIMMs.

Processor SMT level

To take full advantage of Power10 CPU performance, use the IBM i default processor multitasking settings, which maximize the SMT level for the LPAR configuration.

Partition Placement

Current firmware levels ensure optimal placement of partitions. However, if DLPAR operations are run frequently on partitions on the CEC, use the Dynamic Platform Optimizer (DPO) to optimize placement.

Virtual Processors – shared vs dedicated processors

Utilize dedicated processors for optimal partition level performance.

EnergyScale

For the best processor speed, ensure that Maximum Performance is set (the default for IBM Power E1080). This setting is configurable in the ASMI.

Storage and Networking I/O

VIOS provides flexible storage and networking functionality. For the best possible performance, utilize native IBM i interfaces for I/O.

More comprehensive information

Refer to link: IBM i on Power - Performance FAQ https://www.ibm.com/downloads/cas/QWXA9XKN

Enterprise Linux for Power

The enterprise Linux operating system (OS) is a solid foundation for your hybrid cloud infrastructure and for scale-up enterprise software solutions. Recent releases are optimized for best-in-class Power10 Enterprise systems.

Power10

Linux + PowerVM

Supported distros

LPM Support

Power Specific Packages

Best practices

More reads

Network IO Considerations

Higher speed adapter considerations

Changing queue settings in AIX

To change the number of receive/transmit queues in AIX:

Changing queue size in AIX

Changing queue settings in Linux

To change the number of queues in Linux:

Changing queue size in Linux

Virtualization

Storage IO Considerations

Tuning guidelines

Please refer to the IBM Knowledge Center for AIX and Linux guidelines.

PCIe3 12 GB Cache RAID + SAS Adapter Quad-port 6 Gb x8 Adapter

Linux:

AIX:

IBM i:

PCIe3 x8 2-port Fibre Channel (32 Gb/s) Adapter

Additional AIX tuning for performance

NVMe Adapter AIX tuning for performance

IBM Open XL Compilers 17.1.0 for AIX

IBM Open XL compilers are IBM's next-generation C/C++/Fortran compilers, which combine IBM's advanced optimizations with the open-source LLVM infrastructure.

LLVM

IBM optimizations

Availability

Full Power10 architecture exploitation with Open XL 17.1.0

Note: Applications compiled with earlier versions of XL Compilers (e.g., XL 16.1.0) to run on previous Power processors will run compatibly on Power10.

Recommended performance tuning options

Optimization Level | Usage recommendations
-O2 and -O3 | Typical starting point
Link time optimization: -flto (C/C++), -qlto (Fortran) | For workloads with lots of small function calls
Profile guided optimization: -fprofile-generate, -fprofile-use (C/C++); -qprofile-generate, -qprofile-use (Fortran) | For workloads with lots of branching and function calls

Binary Compatibility on AIX

Note: XL C/C++ for AIX 16.1.0 already introduced a new invocation, 'xlclang++', which leverages the Clang front end from the LLVM project.

For more information, visit: https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.0 https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.0

GNU Compiler Collection (GCC)

Availability

IBM Advance Toolchain

Languages

Compatibility and New Features on Power10

IBM Recommended and Supported Compiler Flags

Flag | Description
-O3 or -Ofast | Aggressive optimization. -Ofast is essentially equivalent to -O3 + -ffast-math, which also relaxes restrictions on IEEE floating-point arithmetic.
-mcpu=powern | Compile using instructions supported by the Powern processor, where n is the processor generation. For example, to use instructions available only on Power10, select -mcpu=power10.
-flto | Optional. Perform “link-time” optimization. This optimizes code across function calls where the caller and called functions exist in different compilation units, and can often provide a significant performance boost.
-funroll-loops | Optional. Perform more aggressive duplication of loop bodies than the compiler normally would. Generally you should omit this, but on some codes it can provide better performance.

Note: Although -mcpu=power10 is supported as early as GCC 10.3, GCC 11.2 is preferred because earlier compilers don't support every feature implemented in the Power10 processors. Also, objects created using -mcpu=power10 will not run on POWER9 or earlier processors! However, there are ways to create code that is optimized for different processor versions [7].
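
As a minimal sketch of how a build script might apply the flag table and the version note above, the helpers below classify a GCC version and assemble a flag list. The function names and the three status labels are hypothetical; only the flags and the version thresholds (10.3 and 11.2) come from this guide.

```python
from typing import List, Tuple


def power10_support(gcc_version: Tuple[int, int]) -> str:
    """Classify a (major, minor) GCC version against the note above."""
    if gcc_version < (10, 3):
        return "unsupported"  # -mcpu=power10 is not accepted
    if gcc_version < (11, 2):
        return "partial"      # accepted, but not every Power10 feature is covered
    return "preferred"


def gcc_flags(cpu: str = "power10", lto: bool = False, unroll: bool = False) -> List[str]:
    """Build a flag list from the table: -O3 and -mcpu always, the rest optional."""
    flags = ["-O3", f"-mcpu={cpu}"]
    if lto:
        flags.append("-flto")
    if unroll:
        flags.append("-funroll-loops")
    return flags


if __name__ == "__main__":
    print(power10_support((11, 2)), " ".join(gcc_flags(lto=True)))
```

Passing cpu="power9" instead yields objects that can also run on POWER9, at the cost of not using Power10-only instructions; this follows from the compatibility warning in the note.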

[1] Red Hat: Using GCC Toolset. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/developing_c_and_cpp_applications_in_rhel_8/gcc-toolset_toolsets

[2] SUSE: Understanding the Development Tools Module. https://www.suse.com/c/suse-linux-essentials-where-are-the-compilers-understanding-the-development-tools-module/

[3] Advance Toolchain for Linux on IBM Power Systems. https://www.ibm.com/support/pages/advance-toolchain-linux-power

[4] Go Language. https://golang.org

[5] Matrix-Multiply Assist Best Practices Guide. http://www.redbooks.ibm.com/redpapers/pdfs/redp5612.pdf

[6] Using the GNU Compiler Collection. https://gcc.gnu.org/onlinedocs/gcc.pdf

[7] Target-Specific Optimization with the GNU Indirect Function Mechanism. https://developer.ibm.com/tutorials/optimized-libraries-for-linux-on-power/#target-specific-optimization-with-the-gnu-indirect-function-mechanism

Java Applications

Java applications can seamlessly take advantage of new P10 ISA features on operating systems running in P10 mode by using the Java runtime versions listed below or newer:

Java 8

Java 11

Java 17 (drivers may not be available yet)

Performance tuning references

IBM WebSphere Application Server Performance Cookbook

Oracle Database w/ AIX Best Practices

Page Size

The general recommendation for most Oracle databases on AIX is to use a 64 KB page size rather than a 16 MB page size for the SGA. Typically, 64 KB pages yield nearly the same performance benefit as 16 MB pages without special management.
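
For a rough sense of scale, the sketch below counts how many pages are needed to back an SGA at each page size. The 64 GB SGA is a hypothetical value chosen only for illustration, and page count is at best a loose proxy for address-translation overhead, not a performance measurement.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

PAGE_SIZES = {"4 KB": 4 * KB, "64 KB": 64 * KB, "16 MB": 16 * MB}


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to back a region, rounding up."""
    return -(-region_bytes // page_bytes)


if __name__ == "__main__":
    sga_bytes = 64 * GB  # hypothetical SGA size, for illustration only
    for label, size in PAGE_SIZES.items():
        print(f"{label:>5} pages: {pages_needed(sga_bytes, size):,}")
```

For this hypothetical SGA, moving from 4 KB to 64 KB pages cuts the page count by a factor of 16, from 16,777,216 to 1,048,576, while 16 MB pages would need 4,096.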

TNS Listener

Oracle Database 12.1 and later releases use 64 KB pages for text, data, and stack by default. The TNS listener, however, still uses 4 KB pages for text, data, and stack. To enable 64 KB pages for the listener, use the export command before starting the listener process. Note that in an ASM-based environment, the listener runs out of GRID_HOME and not ORACLE_HOME.

The documentation for the “srvctl setenv” command changed in 12.1 or later releases. The -t or -T was removed in favor of -env or -envs. In the Oracle Listener environment, set and export:

Shared SYMTAB

The LDR_CNTRL=SHARED_SYMTAB=Y setting does not need to be specifically set in 11.2.0.4 or later releases. The compiler linker options take care of this setting and it is no longer needed. It is not recommended to have LDR_CNTRL=SHARED_SYMTAB=Y specifically set in 12c or later releases.

Virtual Processor Folding

This is a critical setting in a RAC environment when using LPARs with processor folding enabled. If this setting is not adjusted, there is a high risk of RAC node evictions under light database workload conditions. Use: schedo -p -o vpm_xvcpus=2

VIOS & RAC Interconnect

A dedicated 10G connection (i.e., a 10G Ethernet adapter) is recommended as a minimum to provide sufficient bandwidth for timing-sensitive cluster traffic. RAC interconnect traffic should be dedicated and not shared. Sharing the interconnect can cause timing delays that lead to node hang or eviction issues.

Network Performance

Set the TCP tunable rfc1323=1. This is a long-standing network tuning suggestion for Oracle on AIX, although the default remains 0.
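
The reason this setting matters can be shown with a short calculation. Without the RFC 1323 window scale option, the TCP window field is 16 bits wide, so a single connection can have at most 65,535 bytes in flight per round trip. The sketch below computes the resulting throughput ceiling; the 1 ms round-trip time is a hypothetical value, and the function names are illustrative.

```python
MAX_UNSCALED_WINDOW = 65_535  # bytes; the TCP window field is 16 bits wide
MAX_WINDOW_SHIFT = 14         # largest window scale shift allowed by RFC 1323


def max_throughput_gbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one connection in Gbit/s: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e9


def max_scaled_window() -> int:
    """Largest window in bytes that can be advertised with window scaling."""
    return MAX_UNSCALED_WINDOW << MAX_WINDOW_SHIFT


if __name__ == "__main__":
    rtt = 0.001  # hypothetical 1 ms round-trip time
    print(f"unscaled: {max_throughput_gbps(MAX_UNSCALED_WINDOW, rtt):.2f} Gbit/s")
    print(f"scaled:   {max_throughput_gbps(max_scaled_window(), rtt):.0f} Gbit/s")
```

At the assumed 1 ms round trip, the unscaled ceiling is roughly 0.52 Gbit/s per connection, far below a 10G link. With scaling enabled the protocol limit is no longer the bottleneck, although the window actually used still depends on the configured socket buffer sizes.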

More comprehensive information

Refer to link: Managing the Stability and Performance of current Oracle Database versions running AIX on Power Systems including POWER9 https://www.ibm.com/support/pages/node/6355543

Recommendations for Db2

General

Db2 Warehouse

CP4D

Db2 Best Practices

https://www.ibm.com/docs/en/db2/11.5?topic=overviews-db2-best-practices

OCP Performance

Network

Operating system

Deployment
