Instruction Manual for ST models including: Cortex-M0, Cortex-M0 Plus, Cortex-M23, Cortex-M33/M35P, Cortex-M55, Cortex-M85 Microcontrollers
1 Aug 2024
File Info : application/pdf, 11 Pages, 554.07KB
STM32U0 - ARM® Cortex®-M0+ Core

Hello, and welcome to this presentation of the ARM® Cortex®-M0+ core, which is embedded in all products of the STM32U0 microcontroller family.

Cortex-M0+ processor overview
· ARMv6-M architecture
· Von Neumann architecture, 2-stage pipeline
· Single-issue architecture
· Multiply in 1 cycle
· Memory Protection Unit (MPU)
· Single-cycle I/O port
· Ultra low power design: low power consumption and high energy efficiency
· Very compact code: except for control instructions and branch and link, all instructions are 16 bits long
Block diagram: CPU (ARMv6-M), nested vectored interrupt controller, Memory Protection Unit, AHB-Lite, Fast I/O port, data watchpoint, breakpoint unit, Serial Wire Debug.

The Cortex®-M0+ core is part of the ARM Cortex-M group of 32-bit RISC cores. It implements the ARMv6-M architecture and features a 2-stage pipeline. The Cortex®-M0+ has a single AHB-Lite master port, but it supports a concurrent instruction fetch and data access when the data access targets the Fast I/O Port address range.

Cortex-M processors compatibility
· Seamless architecture across all applications
· Cortex-M0/M0+: ultra low power, small footprint
· Cortex-M3: exceptional 32-bit performance with low power consumption
· Cortex-M4: control and performance for mixed-signal devices
· Cortex-M7: highest-performance Cortex-M processor
· Cortex-M23, Cortex-M33/M35P, Cortex-M55, Cortex-M85: TrustZone
· Binary and tool compatible

STM32U0 microcontrollers integrate an ARM® Cortex®-M0+ core in order to benefit from its high performance-per-milliwatt ratio. All Cortex®-M CPUs have a 32-bit architecture. The Cortex®-M3 was the first Cortex®-M CPU released by ARM. ARM then decided to distinguish two product lines, high performance and low power, while maintaining compatibility between them. The Cortex®-M0+ belongs to the low-power product line. It is designed for battery-powered devices, which are very sensitive to power consumption.
Core architecture overview
Block diagram: ARM® Cortex®-M0+ with processor core (2-stage pipeline), bus interface unit, MPU, NVIC (32 interrupt requests) and debug; the AHB-Lite port connects to the AHB-Lite bus matrix, internal memories and internal peripherals.

The Cortex®-M0+ core delivers more performance than the Cortex®-M0 core thanks to its 2-stage instruction pipeline. Let's start our description of the CPU with the processor core, which is in charge of fetching and executing instructions.

ARM Cortex-M0+ 2-stage pipeline
Stages: fetch and pre-decode, then decode and execute.
· Clock 1 (two 16-bit instruction fetches): fetch InstN and InstN+1, pre-decode InstN
· Clock 2 (no instruction fetch, possible data access): pre-decode InstN+1; decode and execute InstN
· Clock 3 (two 16-bit instruction fetches): fetch InstN+2 and InstN+3, pre-decode InstN+2; decode and execute InstN+1

Most ARMv6-M instructions are 16 bits long. There are only six 32-bit instructions, and most of them are rarely used control instructions. However, the branch and link instruction, which is used to call a subroutine, is also 32 bits long in order to support a large offset between this instruction and the label pointing to the next instruction to be executed. Ideally, one 32-bit access loads two 16-bit instructions, which results in fewer fetches per instruction. During clock number 2, no instruction fetch occurs, so the AHB-Lite port is available to perform a data access when instruction N is a load/store instruction.

Branch performance
· Cortex®-M0+ core
· Maximum two 16-bit branch shadow instructions

        ...
        Inst0
        B Label      ; Branch to Label
        Inst1        ; Branch shadow instruction
        Inst2        ; Branch shadow instruction
        ...
Label:  InstN
        InstN+1

· Clock 1: fetch and pre-decode Inst0, B Label
· Clock 2: decode and execute Inst0
· Clock 3: fetch and pre-decode Inst1, Inst2; decode and execute B Label
· Clock 4: fetch and pre-decode InstN, InstN+1

On a given branch, fewer pre-fetched instructions are wasted thanks to the 2-stage pipeline. In clock number 1, the processor fetches Inst0 and an unconditional branch instruction. In clock number 2, it executes Inst0.
In clock number 3, it executes the branch instruction while fetching the next two sequential instructions, Inst1 and Inst2, which are called branch shadow instructions. In clock number 4, the processor discards Inst1 and Inst2 and fetches InstN and InstN+1. The Cortex-M0, M3 and M4 implement a 3-stage pipeline: Fetch, Decode and Execute. On those cores the number of branch shadow instructions is larger: up to four 16-bit instructions.

Core architecture overview
Block diagram: ARM® Cortex®-M0+ with processor core (2-stage pipeline), bus interface unit, MPU, NVIC (32 interrupt requests) and debug; the single-cycle I/O port connects to the GPIO ports, and the AHB-Lite port connects to the AHB-Lite bus matrix, internal memories and internal peripherals.
In the STM32U0, the Single-cycle I/O Port is not used to access GPIO port registers. GPIO ports are mapped to AHB instead, so that they can also be accessed by DMA.

The Cortex®-M0+ has neither an embedded cache nor internal RAM. Consequently, every instruction fetch is steered to the AHB-Lite interface, and every data access is steered either to the AHB-Lite interface or to the Single-cycle I/O port. Note that the STM32U0 implements a SoC-level instruction cache, external to the CPU, located in the embedded flash controller. The AHB-Lite master port is connected to a bus matrix, enabling the CPU to access memories and peripherals. Since transactions are pipelined on AHB-Lite, the best throughput is 32 bits of data or instructions per clock, with a minimum latency of 2 clocks. The Cortex®-M0+ also features a Single-cycle I/O Port, enabling the CPU to access data with a 1-clock latency. External decoding logic determines the address range in which data accesses are steered to this port.

Memory protection unit
· MPU attribute settings define access permissions
· 8 independent memory regions
  · Can execute code?
  · Can write data?
  · Unprivileged mode access?
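The three attribute questions on the slide above (can the region execute code, can it be written, can unprivileged code access it) can be illustrated with a small host-side model. This is a minimal sketch in Python, not register-level STM32U0 code: the Region class, the check_access function and the two example regions (addresses and sizes) are hypothetical and chosen only for illustration. It assumes the ARMv6-M AP and XN attribute encodings, that the highest-numbered matching region wins when regions overlap, and that an access matching no region is rejected (no default background map).

```python
from dataclasses import dataclass

# AP field encodings assumed from the ARMv6-M protected memory model:
# value -> (privileged rights, unprivileged rights)
AP_RIGHTS = {
    0b000: ("", ""),      # no access
    0b001: ("rw", ""),    # privileged access only
    0b010: ("rw", "r"),   # unprivileged code may only read
    0b011: ("rw", "rw"),  # full access
    0b101: ("r", ""),     # privileged read-only
    0b110: ("r", "r"),    # read-only for everyone
    0b111: ("r", "r"),    # read-only for everyone
}


@dataclass(frozen=True)
class Region:
    """Hypothetical model of one MPU region (not a real register layout)."""
    number: int  # region index, 0..7
    base: int    # base address, aligned to the region size
    size: int    # power of two, at least 256 bytes
    ap: int      # access permission field
    xn: bool     # execute-never: True forbids instruction fetches

    def __post_init__(self):
        if not 0 <= self.number < 8:
            raise ValueError("only 8 regions are available")
        if self.size < 256 or self.size & (self.size - 1):
            raise ValueError("size must be a power of two >= 256 bytes")
        if self.base % self.size:
            raise ValueError("base must be aligned to the region size")
        if self.ap not in AP_RIGHTS:
            raise ValueError("reserved AP encoding")

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


def check_access(regions, addr, kind, privileged):
    """Return True if the access is allowed; kind is 'r', 'w' or 'x'."""
    hits = [r for r in regions if r.contains(addr)]
    if not hits:
        return False  # assumption: no background region is enabled
    region = max(hits, key=lambda r: r.number)  # highest number wins
    priv, unpriv = AP_RIGHTS[region.ap]
    rights = priv if privileged else unpriv
    if kind == "x":
        # An instruction fetch needs read permission and XN cleared.
        return "r" in rights and not region.xn
    return kind in rights


# Illustrative regions only: read-only executable code, and
# read/write data memory that must never be executed.
REGIONS = [
    Region(number=0, base=0x08000000, size=0x10000, ap=0b110, xn=False),
    Region(number=1, base=0x20000000, size=0x2000, ap=0b011, xn=True),
]
```

With these example regions, check_access(REGIONS, 0x20000010, "x", True) is rejected because the data region is marked execute-never, even for privileged code, while an unprivileged write to the same address is allowed.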
The MPU in the STM32U0 microcontroller supports eight independent memory regions, each with independently configurable attributes for:
· access permission: whether reads and writes are allowed in privileged and unprivileged mode,
· execution permission: whether the region is executable or prohibited for instruction fetches.

References
For more details, please refer to the following documentation:
· STM32G0 Series Cortex®-M0+ processor programming manual (PM0223)
· Managing memory protection unit (MPU) in STM32 MCUs (AN4838)
· ARM website at the following link: http://www.arm.com/products/processors/cortex-m/cortex-m0+-processor.php

For more details, please refer to these application notes and to the Cortex®-M0+ programming manual, available on the www.st.com website. Also visit the ARM website, where you will find more information about the Cortex®-M0+ core.

Thank you
© STMicroelectronics - All rights reserved. ST logo is a trademark or a registered trademark of STMicroelectronics International NV or its affiliates in the EU and/or other countries. For additional information about ST trademarks, please refer to www.st.com/trademarks. All other product or service names are the property of their respective owners.

Thanks for attending this presentation!