AN5212
Application note

How to use STM32 cache to optimize performance and power efficiency for STM32 MCUs

Introduction

This application note describes the instruction cache (ICACHE) and the data cache (DCACHE), the first caches developed by STMicroelectronics. The ICACHE and DCACHE, introduced on the AHB bus of the Arm® Cortex®-M33 processor, are embedded in the STM32 microcontrollers (MCUs) listed in the table below. These caches improve application performance and reduce power consumption when fetching instructions and data from both internal and external memories, or for data traffic from external memories. This document gives typical examples to highlight the ICACHE and DCACHE features and facilitate their configuration.

Table 1. Applicable products
Type | Product series
Microcontrollers | STM32H5 series, STM32L5 series, STM32U5 series

AN5212 - Rev 5 - March 2024
For further information contact your local STMicroelectronics sales office.
www.st.com

1 General information

This application note applies to the STM32 series microcontrollers that are Arm® Cortex® core-based devices.

Note: Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

2 ICACHE and DCACHE overview

This section provides an overview of the ICACHE and DCACHE interfaces embedded in the STM32 Arm® Cortex® core-based microcontrollers. It details the ICACHE and DCACHE block diagrams and their integration in the system architecture.

2.1 STM32L5 series smart architecture

This architecture is based on a bus matrix allowing multiple masters (Cortex-M33, ICACHE, DMA1/2, and SDMMC1) to access multiple slaves (such as flash memory, SRAM1/2, OCTOSPI1, or FSMC).

The figure below describes the STM32L5 series smart architecture.

Figure 1. STM32L5 series smart architecture
(Block diagram: Cortex-M33 with Arm TrustZone® and FPU, with the 8-Kbyte ICACHE on the C-bus and the S-bus connected to BusMatrix-S; other masters DMA1, DMA2, and SDMMC1; slaves flash memory, SRAM1, SRAM2, AHB1 and AHB2 peripherals, OCTOSPI1 with OTFDEC, and FSMC; MPCBBx block-based and MPCWMx watermarked memory protection controllers; ICACHE fast bus and slow bus.)

The Cortex-M33 performance is improved by the 8-Kbyte ICACHE introduced on its C-AHB bus, when fetching code or data from the internal memories (flash memory, SRAM1, or SRAM2) through the fast bus, and also from the external memories (OCTOSPI1 or FSMC) through the slow bus.

2.2 STM32U5 series smart architecture

This architecture is based on a bus matrix allowing multiple masters (Cortex-M33, ICACHE, DCACHE, GPDMA, DMA2D, SDMMCs, OTG_HS, LTDC, GPU2D, and GFXMMU) to access multiple slaves (such as flash memory, SRAMs, BKPSRAM, HSPI/OCTOSPI, or FSMC).

The figure below describes the STM32U5 series smart architecture.

Figure 2. STM32U5 series smart architecture
(Block diagram: Cortex-M33 with TrustZone mainline and FPU, with the ICACHE (8/32-Kbyte) on the C-bus and DCACHE1 (4/16-Kbyte) on the S-bus; GPU2D M0 port through DCACHE2 (16-Kbyte); other masters GPDMA1, DMA2D, SDMMC1, SDMMC2, OTG_HS, LTDC, and GFXMMU; 32-bit bus matrix to the flash memory (512-Kbyte/2/4-Mbyte), SRAM1 to SRAM6, BKPSRAM, OCTOSPI1/2 with OTFDEC1/2, HSPI1, FSMC, and the AHB and APB peripherals; 128-bit cache refill; MPCBBx block-based and MPCWMx watermark-based memory protection controllers. Peripheral availability depends on the product line.)

The Cortex-M33 and the GPU2D interfaces both benefit from using the caches:
· ICACHE improves the performance of the Cortex-M33 when fetching code or data from the internal memories through the fast bus (flash memory, SRAMs) and from the external memories through the slow bus (OCTOSPI1/2, HSPI1, or FSMC).
· DCACHE1 improves the performance when fetching data from internal or external memories through the S-bus (GFXMMU, OCTOSPI1/2, HSPI1, or FSMC).
· DCACHE2 improves the performance of the GPU2D when fetching data from internal and external memories (GFXMMU, flash memory, SRAMs, OCTOSPI1/2, HSPI1, or FSMC) through the M0 port bus.

2.3 STM32H5 series smart architecture

STM32H523/H533, STM32H563/H573 and STM32H562 smart architecture

This architecture is based on a bus matrix allowing multiple masters (Cortex-M33, ICACHE, DCACHE, GPDMAs, Ethernet, and SDMMCs) to access multiple slaves (such as flash memory, SRAMs, BKPSRAM, OCTOSPI, and FMC).
The figure below describes the STM32H5 series smart architecture.

Figure 3. STM32H563/H573 and STM32H562 series smart architecture
(Block diagram: Cortex-M33 with TrustZone mainline and FPU, with the ICACHE (8-Kbyte) on the C-bus and the DCACHE (4-Kbyte) on the S-bus; other masters GPDMA1, GPDMA2, ETHERNET MAC (1), SDMMC1, and SDMMC2 (1); 32-bit bus matrix to the flash memory, SRAM1, SRAM2, SRAM3, BKPSRAM, OCTOSPI with OTFDEC, FMC, and the AHB1 to AHB4 peripherals; fast bus and slow bus.)
1. Not available on STM32H523/H533.

The Cortex-M33 benefits from using the caches:
· ICACHE improves the performance of the Cortex-M33 when fetching code or data from the internal memories through the fast bus (flash memory, SRAMs) and from the external memories through the slow bus (OCTOSPI and FMC).
· DCACHE improves the performance when fetching data from the external memories through the slow bus (OCTOSPI and FMC).

STM32H503 smart architecture

This architecture is based on a bus matrix allowing multiple masters (Cortex-M33, ICACHE, and GPDMAs) to access multiple slaves (such as flash memory, SRAMs, and BKPSRAM).

The figure below describes the STM32H503 smart architecture.

Figure 4. STM32H503 series smart architecture
(Block diagram: Cortex-M33 with FPU, with the ICACHE (8-Kbyte) on the C-bus and the S-bus connected to the 32-bit bus matrix; other masters GPDMA1 and GPDMA2; slaves flash memory (128 Kbytes), SRAM1, SRAM2, BKPSRAM, and the AHB1 to AHB3 and APB1/APB2 peripherals; 128-bit cache refill on the fast bus; MPCBBx block-based and MPCWM watermark-based memory protection controllers.)

The Cortex-M33 benefits from using the cache:
· ICACHE improves the performance of the Cortex-M33 when fetching code or data from the internal memories through the fast bus (flash memory, SRAMs).
2.4 ICACHE block diagram

The ICACHE block diagram is given in the figure below.

Figure 5. ICACHE block diagram
(Block diagram: execution port on the C-bus of the Cortex-M33 with TrustZone and FPU; configuration slave port (AHB1) for ICACHE register access; cache control logic with cache FSM, pLRU-t, and REMAP; 2-way cache TAG memories and 2-way cache data memories; AHB master1 port on the fast bus and AHB master2 port on the slow bus toward BusMatrix-S; ICACHE interrupt.)

The ICACHE memory includes:
· the TAG memory with:
  - the address tags that indicate which data are contained in the cache data memory
  - the validity bits
· the data memory, which contains the cached data

2.5 DCACHE block diagram

The DCACHE block diagram is given in the figure below.

Figure 6. DCACHE block diagram
(Block diagram: slave port interface (input port) fed by the Cortex-M33 S-AHB or by the GPU2D M0 port; AHB configuration slave port; control and status registers with the command range start and end addresses; read-hit, read-miss, write-hit, and write-miss monitors; cache control logic with cache FSM, pLRU-t, and maintenance operations; n-way cache TAG memories and n-way cache data memories; AHB master port interface toward the main AHB; dcache_it interrupt.)

The DCACHE memory includes:
· the TAG memory with:
  - the address tags that indicate which data are contained in the cache data memory
  - the validity bits
  - the privilege bits
  - the dirty bits
· the data memory, which contains the cached data

3 ICACHE and DCACHE features

3.1 ICACHE features

3.1.1 Dual masters

The ICACHE accesses the AHB bus matrix over either:
· one AHB master port: master1 (fast bus)
· two AHB master ports: master1 (fast bus) and master2 (slow bus)

This feature allows the traffic to be decoupled when accessing different memory regions (such as internal flash memory, internal SRAM, and external memories), in order to reduce the CPU stalls on cache misses.

The following table summarizes the memory regions and their addresses.

Table 2. Memory regions and their addresses

Internal memories. Cacheable access uses the ICACHE fast bus; not cacheable access uses the S-bus (N/A for the flash memory). Addresses are given as nonsecure region starting address / secure, nonsecure callable region starting address.

Peripheral | Product | Region size | Cacheable memory access | Not cacheable memory access
FLASH | STM32H503 | 128 KB | 0x0800 0000 / N/A | N/A
FLASH | STM32L5 series, STM32U535/545, STM32H523/533 | 512 KB | 0x0800 0000 / 0x0C00 0000 | N/A
FLASH | STM32U575/585, STM32H563/573/562 | 2 MB | 0x0800 0000 / 0x0C00 0000 | N/A
FLASH | STM32U59x/5Ax/5Fx/5Gx | 4 MB | 0x0800 0000 / 0x0C00 0000 | N/A
SRAM1 | STM32H503 | 16 KB | 0x0A00 0000 / N/A | 0x2000 0000 / N/A
SRAM1 | STM32L5 series, STM32U535/545/575/585 | 192 KB | 0x0A00 0000 / 0x0E00 0000 | 0x2000 0000 / 0x3000 0000
SRAM1 | STM32H523/533 | 128 KB | 0x0A00 0000 / 0x0E00 0000 | 0x2000 0000 / 0x3000 0000
SRAM1 | STM32H563/573/562 | 256 KB | 0x0A00 0000 / 0x0E00 0000 | 0x2000 0000 / 0x3000 0000
SRAM1 | STM32U59x/5Ax/5Fx/5Gx | 768 KB | 0x0A00 0000 / 0x0E00 0000 | 0x2000 0000 / 0x3000 0000
SRAM2 | STM32H503 series | 16 KB | 0x0A00 4000 / N/A | 0x2000 4000 / N/A
SRAM2 | STM32L5 series, STM32U535/545/575/585 | 64 KB | 0x0A03 0000 / 0x0E03 0000 | 0x2003 0000 / 0x3003 0000
SRAM2 | STM32H523/533 | 64 KB | 0x0A04 0000 / 0x0E04 0000 | 0x2004 0000 / 0x3004 0000
SRAM2 | STM32H563/573/562 | 80 KB | 0x0A04 0000 / 0x0E04 0000 | 0x2004 0000 / 0x3004 0000
SRAM2 | STM32U59x/5Ax/5Fx/5Gx | 64 KB | 0x0A0C 0000 / 0x0E0C 0000 | 0x200C 0000 / 0x300C 0000
SRAM3 | STM32U575/585 | 512 KB | 0x0A04 0000 / 0x0E04 0000 | 0x2004 0000 / 0x3004 0000
SRAM3 | STM32H523/533 | 64 KB | 0x0A05 0000 / 0x0E05 0000 | 0x2005 0000 / 0x3005 0000
SRAM3 | STM32H563/573/562 | 320 KB | 0x0A05 0000 / 0x0E05 0000 | 0x2005 0000 / 0x3005 0000
SRAM3 | STM32U59x/5Ax/5Fx/5Gx | 832 KB | 0x0A0D 0000 / 0x0E0D 0000 | 0x200D 0000 / 0x300D 0000
SRAM5 | STM32U59x/5Ax/5Fx/5Gx | 832 KB | 0x0A1A 0000 / 0x0E1A 0000 | 0x201A 0000 / 0x301A 0000
SRAM6 | STM32U5Fx/5Gx | 512 KB | 0x0A27 0000 / 0x0E27 0000 | 0x2027 0000

External memories (region size 256 MB). Cacheable access uses the ICACHE slow bus (1), through an alias address in the range [0x0000 0000 to 0x07FF FFFF] or [0x1000 0000 to 0x1FFF FFFF] defined by means of the remapping feature (secure address: N/A). Not cacheable access uses the S-bus at the nonsecure region starting address below (secure address: N/A).

Peripheral | Product | Not cacheable nonsecure region starting address
HSPI1 | STM32U59x/5Ax/5Fx/5Gx | 0xA000 0000
FMC SDRAM | STM32H563/573/562 | 0xC000 0000
OCTOSPI1 bank nonsecure | STM32L5/U5 series, STM32H563/573/562 | 0x9000 0000
FMC bank 3 nonsecure | STM32L5/U5 series, STM32H563/573/562 | 0x8000 0000
OCTOSPI2 bank nonsecure | STM32U575/585/59x/5Ax/5Fx/5Gx | 0x7000 0000
FMC bank 1 nonsecure | STM32L5/U5 series, STM32H563/573/562 | 0x6000 0000

1. To be selected when remapping such regions.

3.1.2 1-way versus 2-way ICACHE

By default, the ICACHE is configured in associative operating mode (two ways enabled), but it can be configured in direct mapped mode (one way enabled) for applications requiring very low power consumption.

The ICACHE configuration is done with the WAYSEL bit in ICACHE_CR as follows:
· WAYSEL = 0: direct mapped operating mode (1-way)
· WAYSEL = 1 (default): associative operating mode (2-way)

Table 3. 1-way versus 2-way ICACHE
Parameter | 1-way ICACHE | 2-way ICACHE
Cache size (Kbytes) | 8 (1) / 32 (2) | 8 (1) / 32 (2)
Cache number of ways | 1 | 2
Cache line size | 128 bits (16 bytes) | 128 bits (16 bytes)
Number of cache lines | 512 (1) / 2048 (2) | 256 (1) / 1024 (2) per way

1. For STM32L5 series, STM32H5 series, and STM32U535/545/575/585.
2. For STM32U59x/5Ax/5Fx/5Gx.

3.1.3 Burst type

Some Octo-SPI memories support the WRAP burst, which brings the performance benefit of the critical-word-first feature.

The ICACHE burst type of the AHB memory transaction for remapped regions is configurable. It implements the incremental burst or the WRAP burst, selected with the HBURST bit in the ICACHE_CRRx register.
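As a concrete illustration of the two burst types, the sketch below lists the beat addresses of one 128-bit cache line refill for a given requested address. It is a host-side model only, not ICACHE driver code: the function names are hypothetical, and the 32-bit data bus (four beats per line) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES 16u                       /* 128-bit cache line (from the text) */
#define BEAT_BYTES 4u                        /* assumption: 32-bit AHB data bus */
#define BEATS      (LINE_BYTES / BEAT_BYTES) /* four beats per line refill */

/* Incremental burst: starts at the address aligned on the cache line
 * boundary and runs upward through the line. */
void incr_burst(uint32_t addr, uint32_t out[BEATS])
{
    uint32_t base = addr & ~(LINE_BYTES - 1u);
    for (uint32_t i = 0; i < BEATS; i++) {
        out[i] = base + i * BEAT_BYTES;
    }
}

/* WRAP burst: starts at the word requested by the CPU (critical word first)
 * and wraps around at the cache line boundary. */
void wrap_burst(uint32_t addr, uint32_t out[BEATS])
{
    uint32_t base  = addr & ~(LINE_BYTES - 1u);
    uint32_t first = addr & (LINE_BYTES - 1u) & ~(BEAT_BYTES - 1u);
    for (uint32_t i = 0; i < BEATS; i++) {
        out[i] = base + ((first + i * BEAT_BYTES) % LINE_BYTES);
    }
}
```

For a request at offset 0x8 of a line, the incremental burst fetches offsets 0x0, 0x4, 0x8, 0xC, while the WRAP burst fetches 0x8, 0xC, 0x0, 0x4, so the requested word arrives on the first beat.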
The differences between the WRAP and the incremental bursts are given below (see also the figure):
· WRAP burst:
  - cache line size = 128 bits
  - burst starting address = word address of the first data requested by the CPU
· Incremental burst:
  - cache line size = 128 bits
  - burst starting address = address aligned on the boundary of the cache line containing the requested word

Figure 7. Incremental versus WRAP burst
(Diagram: byte offsets 0x0 to 0xF of a 128-bit (16-byte) group with its alignment boundaries, shown for the incremental burst and for the WRAP burst.)

3.1.4 Cacheable regions and remapping feature

The ICACHE is connected to the Cortex-M33 through the C-AHB bus, and caches the code region at addresses [0x0000 0000 to 0x1FFF FFFF]. Since the external memories are mapped at addresses in the range [0x6000 0000 to 0xAFFF FFFF], the ICACHE supports a remap feature that allows any external memory region to be remapped at an address in the range [0x0000 0000 to 0x07FF FFFF] or [0x1000 0000 to 0x1FFF FFFF], and so to become accessible through the C-AHB bus.

Up to four external memory regions can be remapped with this feature. Once a region is remapped, the remap operation occurs even if the ICACHE is disabled or if the transaction is not cacheable.

The cacheable memory regions can be defined and programmed by the user in the memory protection unit (MPU).

The table below summarizes the configurations of the STM32L5 and STM32U5 series memories.

Table 4. Configuration of STM32L5 and STM32U5 series memories
Product memory | Cacheable (MPU programming) | Remapped in ICACHE (ICACHE_CRRx programming)
Flash memory | Yes or No | Not required
SRAM | Not recommended | Not required
External memories (HSPI/OCTOSPI or FSMC) | Yes or No | Required if the user wants external code fetching on the C-AHB bus (else on the S-AHB bus)

3.1.5 Benefit of ICACHE external memory remapping

The example in the figure below shows how to benefit from the ICACHE enhanced performance during code execution or data read when accessing an 8-Mbyte external Octo-SPI memory (such as an external flash memory or RAM).

Figure 8. Octo-SPI memory remap example
(Memory map: the not cacheable 8-Mbyte external memory code or data at [0x9000 0000 to 0x9080 0000], inside the OCTOSPI1 memory-mapped region located between FSMC bank 3 and 0xA000 0000, is remapped to a cacheable 8-Mbyte alias at [0x1000 0000 to 0x1080 0000] in the code region, below SRAM1 (nonsecure) at 0x2000 0000.)

The following steps are needed to remap this external memory:

1. OCTOSPI configuration for the external memory
   Configure the OCTOSPI interface in order to access the external memory in memory-mapped mode (the external memory is seen as an internal memory mapped in the [0x9000 0000 to 0x9FFF FFFF] region).
   Since the external memory size is 8 Mbytes, it is seen in the [0x9000 0000 to 0x907F FFFF] region. The external memory in this region is accessed through the S-bus and is not cacheable. The next step shows the ICACHE configuration needed to remap this region.
   Note: For the OCTOSPI configuration in memory-mapped mode, refer to the application note OctoSPI interface on STM32 microcontrollers (AN5050).

2. ICACHE configuration to remap the external memory-mapped region
   The 8 Mbytes placed in the [0x9000 0000 to 0x907F FFFF] region are remapped to the [0x1000 0000 to 0x107F FFFF] region.
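The address arithmetic behind this remap can be checked with a small host-side sketch. The structure and function names below are hypothetical (they are not part of any ST driver), and the code models only the alias-to-physical translation and the address bit slices programmed in the BASEADDR[28:21] and REMAPADDR[31:21] fields, not the register layout itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one remapped region (illustration only). */
typedef struct {
    uint32_t alias_base; /* address seen on the C-AHB bus, e.g. 0x10000000 */
    uint32_t phys_base;  /* memory-mapped external address, e.g. 0x90000000 */
    uint32_t size;       /* region size in bytes, must be a power of two */
} remap_region_t;

/* Translate an alias address into the external memory address.
 * Returns false when the address is outside the remapped region. */
bool remap_translate(const remap_region_t *r, uint32_t alias, uint32_t *phys)
{
    assert(r->size != 0u && (r->size & (r->size - 1u)) == 0u);
    uint32_t offset = alias - r->alias_base; /* wraps to a large value if alias < base */
    if (offset >= r->size) {
        return false;
    }
    *phys = r->phys_base + offset;
    return true;
}

/* Address bits [28:21] of the alias base (value of the BASEADDR field). */
uint32_t baseaddr_field(uint32_t alias_base)
{
    return (alias_base >> 21) & 0xFFu;
}

/* Address bits [31:21] of the external base (value of the REMAPADDR field). */
uint32_t remapaddr_field(uint32_t phys_base)
{
    return (phys_base >> 21) & 0x7FFu;
}
```

With the 8-Mbyte example, the alias base 0x1000 0000 gives a BASEADDR value of 0x80 and the external base 0x9000 0000 gives a REMAPADDR value of 0x480, which are the values used in the register configuration of this example.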
   The remapped 8 Mbytes can then be accessed through the slow bus (ICACHE master2 bus).

   ICACHE_CR register configuration:
   a. Disable the ICACHE with EN = 0.
   b. Select 1-way or 2-way operation (depending on the application needs) with WAYSEL = 0 or 1, respectively.

   ICACHE_CRRx register configuration (up to four regions, x = 0 to 3):
   a. Select the 0x1000 0000 base address (remap address) with BASEADDR[28:21] = 0x80.
   b. Select the 8-Mbyte region size to remap with RSIZE[2:0] = 0x3.
   c. Select the 0x9000 0000 remapped address with REMAPADDR[31:21] = 0x480.
   d. Select the ICACHE AHB master2 port for external memories with MSTSEL = 1.
   e. Select the WRAP burst type with HBURST = 0.
   f. Enable the remapping for region x with REN = 1.

   The following figure shows how the memory regions are seen with IAR after enabling the remap.

   Figure 9. Memory regions remapping example

   The 8-Mbyte external memory is now remapped and can be accessed over the [0x1000 0000 to 0x107F FFFF] region.

3. ICACHE enable
   ICACHE_CR register configuration: enable the ICACHE with EN = 1.

3.1.6 Hit and miss monitors

The ICACHE provides two monitors for performance analysis: a 32-bit hit monitor and a 16-bit miss monitor.
· The hit monitor counts the cacheable AHB transactions on the slave cache port that hit the ICACHE content (fetched data already available in the cache). The hit monitor counter is available in the ICACHE_HMONR register.
· The miss monitor counts the cacheable AHB transactions on the slave cache port that miss the ICACHE content (fetched data not already available in the cache). The miss monitor counter is available in the ICACHE_MMONR register.

These two monitors do not wrap over when reaching their maximum values.
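Because the counters stop at their maximum values instead of wrapping, a hit ratio computed from a saturated counter is misleading. The sketch below shows one way to turn the two counter values into a hit ratio while rejecting saturated samples. It is a host-side illustration: the function name is hypothetical, and the arguments stand for values already read from ICACHE_HMONR and ICACHE_MMONR.

```c
#include <assert.h>
#include <stdint.h>

#define HIT_MONITOR_MAX  UINT32_MAX /* 32-bit hit monitor */
#define MISS_MONITOR_MAX UINT16_MAX /* 16-bit miss monitor */

/* Returns the hit ratio in percent, or -1.0 when the sample is unusable:
 * no access counted yet, or one of the counters has reached its maximum. */
double icache_hit_ratio_pct(uint32_t hits, uint16_t misses)
{
    if (hits == HIT_MONITOR_MAX || misses == MISS_MONITOR_MAX) {
        return -1.0; /* saturated counter: the ratio would be biased */
    }
    uint64_t total = (uint64_t)hits + (uint64_t)misses;
    if (total == 0u) {
        return -1.0; /* nothing measured */
    }
    return 100.0 * (double)hits / (double)total;
}
```

Since the miss monitor is only 16 bits wide, it saturates first on long runs, so measuring over short windows and resetting both monitors between windows keeps the samples valid.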
These monitors are managed from the following bits in the ICACHE_CR register:
· HITMEN bit (respectively MISSMEN bit) to enable/stop the hit (respectively miss) monitor
· HITMRST bit (respectively MISSMRST bit) to reset the hit (respectively miss) monitor

By default, these monitors are disabled in order to reduce power consumption.

3.1.7 ICACHE maintenance

The software can invalidate the ICACHE by setting the CACHEINV bit in the ICACHE_CR register. This action invalidates the whole cache, making it empty. If some remapped regions are enabled, the remap feature remains active, even when the ICACHE is disabled.

As the ICACHE manages only read transactions and not write transactions, it does not ensure coherency in case of writes. Consequently, the software must invalidate the ICACHE after programming a region.

3.1.8 ICACHE security

The ICACHE is a securable peripheral that can be configured as secure through the GTZC TZSC secure configuration register. When it is configured as secure, only secure accesses are allowed to the ICACHE registers.

The ICACHE can also be configured as privileged through the GTZC TZSC privilege configuration register. When the ICACHE is configured as privileged, only privileged accesses are allowed to the ICACHE registers.

By default, the ICACHE is nonsecure and non-privileged through the GTZC TZSC.

3.1.9 Event and interrupt management

The ICACHE manages the functional errors when detected, by setting the ERRF flag in ICACHE_SR. An interrupt can also be generated if the ERRIE bit is set in ICACHE_IER.

In case of ICACHE invalidation, when the cache busy state is finished, the BSYENDF flag is set in ICACHE_SR. An interrupt can also be generated if the BSYENDIE bit is set in ICACHE_IER.

The table below lists the ICACHE interrupt and event flags.

Table 5. ICACHE interrupt and event management bits
Register | Bit name | Bit description | Bit access type
ICACHE_SR | BUSYF | Cache executing a full invalidate operation | Read-only
ICACHE_SR | BSYENDF | Cache invalidation operation finished | Read-only
ICACHE_SR | ERRF | An error occurred during caching operation | Read-only
ICACHE_IER | ERRIE | Enable interrupt for cache error | Read/write
ICACHE_IER | BSYENDIE | Enable interrupt in case of invalidation operation finished | Read/write
ICACHE_FCR | CERRF | Clears ERRF in ICACHE_SR | Write-only
ICACHE_FCR | CBSYENDF | Clears BSYENDF in ICACHE_SR | Write-only

3.2 DCACHE features

The purpose of the data cache is to cache the data loads and data stores to external memories, coming from the processor or from another bus master peripheral. The DCACHE manages both read and write transactions.

3.2.1 DCACHE cacheability traffic

The DCACHE caches the external memories from the master port interface through the AHB bus. An incoming memory request is defined as cacheable according to the memory lookup attribute of its AHB transaction. The DCACHE write policy is defined as write-through or write-back depending on the memory attribute configured by the MPU.

When a region is configured as non-cacheable, the DCACHE is bypassed.

Table 6. DCACHE cacheability for AHB transaction
AHB lookup attribute | AHB bufferable attribute | Cacheability
0 | X | Read and write: non cacheable
1 | 0 | Read: cacheable. Write: (cacheable) write-through
1 | 1 | Read: cacheable. Write: (cacheable) write-back

3.2.2 DCACHE cacheable regions

For the STM32U5 series, the DCACHE1 slave interface is connected to the Cortex-M33 through the S-AHB bus, and caches the GFXMMU, FMC, and HSPI/OCTOSPIs. The DCACHE2 slave interface is connected to the GPU2D through the M0 port bus, and caches all the internal and external memories (except SRAM4 and BKPSRAM).

For the STM32H5 series, the DCACHE slave interface is connected to the Cortex-M33 through the S-AHB bus, and caches the external memories accessed through the FMC and OCTOSPI.

Table 7. DCACHE cacheable regions and interfaces
Cacheable memory address region | DCACHE1 cacheable interfaces | DCACHE2 cacheable interfaces
GFXMMU | X | X
SRAM1 | N/A | X
SRAM2 | N/A | X
SRAM3 | N/A | X
SRAM5 | N/A | X
SRAM6 | N/A | X
HSPI1 | X | X
OCTOSPI1 | X | X
FMC banks | X | X
OCTOSPI2 | X | X

Note: Some interfaces are not supported in certain products. Refer to the architecture figures in Section 2 or to the specific product reference manual.

3.2.3 Burst type

Like the ICACHE, the DCACHE supports incremental and WRAP bursts (see Section 3.1.3). For the DCACHE, the burst type is configured through the HBURST bit in DCACHE_CR.

3.2.4 DCACHE configuration

During boot, the DCACHE is disabled by default, so the slave memory requests are forwarded directly to the master port. To enable the DCACHE, the EN bit must be set in the DCACHE_CR register.

3.2.5 Hit and miss monitors

The DCACHE implements four monitors for cache performance analysis:
· Two 32-bit (R/W) hit monitors: they count the number of times the CPU reads or writes data in the cache memory without generating a transaction on the DCACHE master ports (data already available in the cache). The (R/W) hit monitor counters are available in the DCACHE_RHMONR and DCACHE_WHMONR registers, respectively.
· Two 16-bit (R/W) miss monitors: they count the number of times the CPU reads or writes data in the cache memory and generates a transaction on the DCACHE master ports, in order to load the data from the memory region (fetched data not already available in the cache). The (R/W) miss monitor counters are available in the DCACHE_RMMONR and DCACHE_WMMONR registers, respectively.

These four monitors do not wrap over when reaching their maximum values.
These monitors are managed from the following bits in the DCACHE_CR register:
· WHITMEN bit (respectively WMISSMEN bit) to enable/stop the write hit (respectively miss) monitor
· RHITMEN bit (respectively RMISSMEN bit) to enable/stop the read hit (respectively miss) monitor
· WHITMRST bit (respectively WMISSMRST bit) to reset the write hit (respectively miss) monitor
· RHITMRST bit (respectively RMISSMRST bit) to reset the read hit (respectively miss) monitor

By default, these monitors are disabled in order to reduce power consumption.

3.2.6 DCACHE maintenance

The DCACHE offers multiple maintenance operations that can be configured through CACHECMD[2:0] in DCACHE_CR:
· 000: no operation (default)
· 001: clean range. Clean a certain range in the cache.
· 010: invalidate range. Invalidate a certain range in the cache.
· 011: clean and invalidate range. Clean and invalidate a certain range in the cache.

The selected range is configured through:
· CMDSTARTADDR register: command starting address
· CMDENDADDR register: command ending address

Note: These registers must be set before CACHECMD is written.

The cache maintenance command starts when the STARTCMD bit is set in the DCACHE_CR register.

The DCACHE also supports a full cache invalidation, triggered by setting the CACHEINV bit in the DCACHE_CR register.

3.2.7 DCACHE security

The DCACHE is a securable peripheral that can be configured as secure through the GTZC TZSC secure configuration register. When it is configured as secure, only secure accesses are allowed to the DCACHE registers.

The DCACHE can also be configured as privileged through the GTZC TZSC privilege configuration register. When the DCACHE is configured as privileged, only privileged accesses are allowed to the DCACHE registers.

By default, the DCACHE is nonsecure and non-privileged through the GTZC TZSC.

3.2.8 Event and interrupt management

The DCACHE manages the functional errors when detected, by setting the ERRF flag in DCACHE_SR.
An interrupt can also be generated if the ERRIE bit is set in DCACHE_IER.

In case of DCACHE invalidation, when the cache busy state is finished, the BSYENDF flag is set in DCACHE_SR. An interrupt can also be generated if the BSYENDIE bit is set in DCACHE_IER.

The DCACHE command status can be checked through the CMDENDF and BUSYCMDF flags in DCACHE_SR. An interrupt can also be generated if the CMDENDIE bit is set in DCACHE_IER.

The table below lists the DCACHE interrupt and event flags.

Table 8. DCACHE interrupt and event management bits
Register | Bit name | Bit description | Bit access type
DCACHE_SR | BUSYF | Cache executing a full invalidate operation | Read-only
DCACHE_SR | BSYENDF | Cache full invalidate operation ended | Read-only
DCACHE_SR | BUSYCMDF | Cache executing a range command | Read-only
DCACHE_SR | CMDENDF | A range command ended | Read-only
DCACHE_SR | ERRF | An error occurred during caching operation | Read-only
DCACHE_IER | ERRIE | Enable interrupt for cache error | Read/write
DCACHE_IER | CMDENDIE | Enable interrupt on range command end | Read/write
DCACHE_IER | BSYENDIE | Enable interrupt on full invalidate operation end | Read/write
DCACHE_FCR | CERRF | Clears ERRF in DCACHE_SR | Write-only
DCACHE_FCR | CCMDENDF | Clears CMDENDF in DCACHE_SR | Write-only
DCACHE_FCR | CBSYENDF | Clears BSYENDF in DCACHE_SR | Write-only

4 ICACHE and DCACHE performance and power consumption

Using the ICACHE and DCACHE improves the application performance when accessing external memories. The following table shows the impact of the ICACHE and DCACHE on CoreMark® execution when accessing external memories.

Table 9. ICACHE and DCACHE performance on CoreMark execution with external memories (1)
CoreMark code | CoreMark data | ICACHE configuration | DCACHE configuration | CoreMark score/MHz
Internal flash memory | Internal SRAM | Enabled (2-way) | Disabled | 3.89
Internal flash memory | External Octo-SPI PSRAM (S-bus) | Enabled (2-way) | Enabled | 3.89
Internal flash memory | External Octo-SPI PSRAM (S-bus) | Enabled (2-way) | Disabled | 0.48
External Octo-SPI flash (C-bus) | Internal SRAM | Enabled (2-way) | Disabled | 3.86
External Octo-SPI flash (C-bus) | Internal SRAM | Disabled | Disabled | 0.24
Internal flash memory | Internal SRAM | Disabled | Disabled | 2.69

1. Test conditions:
· Applicable product: STM32U575/585
· System frequency: 160 MHz
· External Octo-SPI PSRAM memory: 80 MHz (DTR mode)
· External Octo-SPI flash memory: 80 MHz (STR mode)
· Compiler: IAR V8.50.4
· Internal flash PREFETCH: ON

Using the ICACHE and DCACHE reduces the power consumption when accessing internal and external memories. The following table shows the impact of the ICACHE on power consumption during CoreMark execution.

Table 10. CoreMark execution ICACHE impact on power consumption (1)
ICACHE configuration | MCU power consumption (mA)
Enabled (2-way) | 7.60
Enabled (1-way) | 7.13
Disabled | 8.89

1. Test conditions:
· Applicable product: STM32U575/585
· CoreMark code: internal flash memory
· CoreMark data: internal SRAM
· Internal flash memory PREFETCH: ON
· System frequency: 160 MHz
· Compiler: IAR V8.32.2
· Voltage range: 1
· SMPS: ON

Note: The 2-way set associative configuration performs better than the 1-way (direct mapped) configuration for code that cannot be fully loaded in the cache. Meanwhile, the 1-way configuration is almost always more power efficient than the 2-way set associative configuration. Each application code must be evaluated in both associativity configurations in order to select the best trade-off between performance and power consumption. The selection depends on the user priority.
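To put the figures of Table 9 and Table 10 in perspective, the short sketch below derives the speedup factors and the current savings from the published values. It is plain arithmetic on the table data; the helper names are hypothetical.

```c
#include <assert.h>

/* Ratio of two CoreMark/MHz scores measured with and without a cache. */
double speedup(double score_with_cache, double score_without_cache)
{
    assert(score_without_cache > 0.0);
    return score_with_cache / score_without_cache;
}

/* Relative current reduction, in percent, against a baseline measurement. */
double saving_pct(double baseline_ma, double measured_ma)
{
    assert(baseline_ma > 0.0);
    return 100.0 * (baseline_ma - measured_ma) / baseline_ma;
}

/* Values taken from Table 9:
 *   speedup(3.89, 0.48): DCACHE effect, data in external PSRAM, about 8.1x
 *   speedup(3.86, 0.24): ICACHE effect, code in external flash, about 16.1x
 *   speedup(3.89, 2.69): ICACHE effect, code in internal flash, about 1.45x
 * Values taken from Table 10:
 *   saving_pct(8.89, 7.60): 2-way ICACHE versus disabled, about 14.5 %
 *   saving_pct(8.89, 7.13): 1-way ICACHE versus disabled, about 19.8 %
 */
```

Under the stated test conditions, the largest gain comes from caching code fetched from the external Octo-SPI flash memory, and the 1-way configuration saves roughly five percentage points more current than the 2-way configuration.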
5 Conclusion

The first caches developed by STMicroelectronics, the ICACHE and DCACHE, are able to cache internal and external memories, offering performance enhancement for data traffic and instruction fetches.

This document shows the different features supported by the ICACHE and DCACHE. Their configuration simplicity and flexibility allow lower development cost and faster time to market.

Revision history

Table 11. Document revision history
Date | Version | Changes
10-Oct-2019 | 1 | Initial release.
27-Feb-2020 | 2 | Updated: Table 2. Memory regions and their addresses; Section 2.1.7 ICACHE maintenance; Section 2.1.8 ICACHE security.
7-Dec-2021 | 3 | Updated: Document title; Introduction; Section 1 ICACHE and DCACHE overview; Section 4 Conclusion. Added: Section 2 ICACHE and DCACHE features; Section 3 ICACHE and DCACHE performance and power consumption.
15-Feb-2023 | 4 | Updated: Section 2.2: STM32U5 series smart architecture; Section 2.5: DCACHE block diagram; Section 3.1.1: Dual masters; Section 3.1.2: 1-way versus 2-way ICACHE; Section 3.1.4: Cacheable regions and remapping feature; Section 3.2: DCACHE features; Section 3.2.2: DCACHE cacheable regions; Section 4: ICACHE and DCACHE performance and power consumption. Added: Section 1: General information.
11-Mar-2024 | 5 | Updated: Section 2.3: STM32H5 series smart architecture; Section 3.1.1: Dual masters.

Contents

1 General information
2 ICACHE and DCACHE overview
  2.1 STM32L5 series smart architecture
  2.2 STM32U5 series smart architecture
  2.3 STM32H5 series smart architecture
  2.4 ICACHE block diagram
  2.5 DCACHE block diagram
3 ICACHE and DCACHE features
  3.1 ICACHE features
    3.1.1 Dual masters
    3.1.2 1-way versus 2-way ICACHE
    3.1.3 Burst type
    3.1.4 Cacheable regions and remapping feature
    3.1.5 Benefit of ICACHE external memory remapping
    3.1.6 Hit and miss monitors
    3.1.7 ICACHE maintenance
    3.1.8 ICACHE security
    3.1.9 Event and interrupt management
  3.2 DCACHE features
    3.2.1 DCACHE cacheability traffic
    3.2.2 DCACHE cacheable regions
    3.2.3 Burst type
    3.2.4 DCACHE configuration
    3.2.5 Hit and miss monitors
    3.2.6 DCACHE maintenance
    3.2.7 DCACHE security
    3.2.8 Event and interrupt management
4 ICACHE and DCACHE performance and power consumption
5 Conclusion
Revision history
List of tables
List of figures

List of tables

Table 1. Table 2. Table 3. Table 4. Table 5. Table 6. Table 7.
Table 8. Table 9. Table 10. Table 11. Applicable products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Memory regions and their addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1-way versus 2-way ICACHE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Configuration of STM32L5 and STM32U5 series memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 ICACHE interrupt and event management bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 DCACHE cacheability for AHB transaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 DCACHE cacheable regions and interfaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 DCACHE Interrupt and events management bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 ICACHE and DCACHE performance on CoreMark execution with external memories . . . . . . . . . . . . . . . . . . . . 18 CoreMark execution ICACHE impact on power consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Document revision history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 AN5212 - Rev 5 page 22/24 AN5212 List of figures List of figures Figure 1. Figure 2. Figure 3. Figure 4. Figure 5. Figure 6. Figure 7. Figure 8. Figure 9. STM32L5 series smart architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 STM32U5 series smart architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . 4 STM32H563/H573 and STM32H562 series smart architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 STM32H503 series smart architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 ICACHE block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 DCACHE block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Incremental versus WRAP burst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Octo-SPI memory remap example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Memory regions remapping example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 AN5212 - Rev 5 page 23/24 AN5212 IMPORTANT NOTICE READ CAREFULLY STMicroelectronics NV and its subsidiaries ("ST") reserve the right to make changes, corrections, enhancements, modifications, and improvements to ST products and/or to this document at any time without notice. Purchasers should obtain the latest relevant information on ST products before placing orders. ST products are sold pursuant to ST's terms and conditions of sale in place at the time of order acknowledgment. Purchasers are solely responsible for the choice, selection, and use of ST products and ST assumes no liability for application assistance or the design of purchasers' products. No license, express or implied, to any intellectual property right is granted by ST herein. Resale of ST products with provisions different from the information set forth herein shall void any warranty granted by ST for such product. ST and the ST logo are trademarks of ST. 
For additional information about ST trademarks, refer to www.st.com/trademarks. All other product or service names are the property of their respective owners.
Information in this document supersedes and replaces information previously supplied in any prior versions of this document.
© 2024 STMicroelectronics. All rights reserved.