May, 2010

# Migrating from PowerQUICC II Pro to QorIQ P1 / P2 and H/W Design Guide

Haim Cohen

## **Session Objectives**

### This session will:

- Provide an overview of the new features introduced with the QorIQ™ P1 and P2 family of processors
- Provide detail and guidance on migrating designs from PowerQUICC II Pro<sup>™</sup> to the QorIQ<sup>™</sup> family
- Provide an overview of the H/W design guide for the QorIQ™ P2020

## **Focus Freescale Market Segments**

# 20 Years of Communications Processing Evolution

## **SOI CMOS Technology Roadmap**

#### 90nm Dual Core Processor

#### 45nm Dual Core

Same functionality and performance delivered today in C90SOI is 50% lower power in 45SOI

#### 32nm Dual Core

Same functionality and performance in 45SOI is 50% lower power in 32SOI

## **QorIQ Platform Levels**

| Platform | Description | Products | Application examples |
|---|---|---|---|
| QorIQ P5 | Highest-performing embedded processors | To be announced | Service Provider Routers, Network Admission Control, Storage Networks |
| QorIQ P4 | Tap the full potential of multicore with this "many-core" platform | P4080, P4040 | Metro Carrier Edge Router, IMS Controller, Radio Network Control, Serving Node Router (GSN) |
| QorIQ P3 | Your first step into true multicore performance | To be announced | Converged Media Gateway, Access Gateway, SSL / IPSec Firewall |
| QorIQ P2 | Unprecedented performance per watt in this highly integrated platform | P2020, P2010 | Unified Threat Management, VoIP Media Gateway, Carrier-Class Basestation, Wireless Media Gateway |
| QorIQ P1 | A highly integrated, cost-effective, low power platform | P1020, P1011, P1021, P1012, P1022, P1013 | Home Media Hub, Network Attached Storage, Integrated Services Router |

# **PowerQUICC Migration to QorIQ™ Platforms**

## Introducing Freescale's QorIQ platforms

Designed to enable the development of the next era of networking applications by delivering:

- Improved Processing Performance: high-performance Power® Architecture based multicore solutions
- Power-Efficiency: 45nm process technology for an industry-leading power-to-performance solution
- Programmability: programming tools, ecosystem partners, common software and <u>pin compatible processors</u>

Teams validating part at component and board level

Alpha samples launched in February 2009

# QorIQ P2020/10 Block Diagram

Legend: Red = New, Green = Enhanced, Blue = Same (compared to PQ2 Pro)

- Single/Dual e500 Power Architecture™ core, 800 - 1200 MHz
- 512KB Frontside L2 cache w/ECC, HW cache coherent
- 36 bit physical addressing, DP-FPU

#### System Unit

- 64/32b DDR2/DDR3 with ECC
- Integrated SEC 3.1 Security Engine
- Open-PIC Interrupt Controller, Perf Mon, 2x I2C, Timers, 16 GPIOs, DUART
- 16-bit Enhanced Local Bus supports booting from NAND Flash
- One USB 2.0 Host Controller with ULPI interface
- eSPI controller supporting booting from SPI serial Flash
- SD/MMC card controller supporting booting from Flash cards
- Three 10/100/1000 Ethernet Controllers (eTSEC) w/ Jumbo Frame support, SGMII interface
  - Enhanced features: Parser/Filer, QOS, IP-Checksum Offload, Lossless Flow Control
  - IEEE 1588v2 support
- Two Serial RapidIO Controllers with integrated message unit operating up to 3.125 Gbaud
- Three PCI Express 1.0a Controllers operating at 2.5 Gbaud

#### Process & Package

- 45nm SOI, 1.05V +/- 50mV, 0C to 125C Tj, with -40C to 125C Tj option
- 689-pin TePBGAII, 31x31mm
- 8.0 W (Est), Dual Core at 1.2GHz

## QorIQ P1020 / P1011 Block Diagram

- Single / Dual e500 Power Architecture® core; 533 - 800 MHz
- 256KB Frontside L2 cache w/ECC, HW cache coherent
- 36 bit physical addressing, DP-FPU

#### System Unit

- 32-bit DDR2/DDR3, 667 MHz data rate w/ECC
- Integrated SEC 3.3 Security Engine
- Open-PIC Interrupt Controller, Perf Mon, 2x I2C, Timers, 16 GPIOs, DUART
- 16-bit Enhanced Local Bus supports booting from NAND Flash
- Two USB 2.0 Controllers, Host/Device support
- SPI controller supporting booting from SPI serial Flash
- SD/MMC card controller supporting booting from Flash cards
- TDM interface
- Three 10/100/1000 Ethernet Controllers (eTSEC) w/ Jumbo Frame support, SGMII interface
  - Enhanced features: Parser/Filer, QOS, IP-Checksum Offload, Lossless Flow Control
  - IEEE 1588v2 support
- Two PCI Express 1.0a Controllers operating up to 2.5 Gbaud
- Power Management

#### Process & Package

- 45nm SOI, 0.95V +/- 50mV, -40C to 125C Tj
- 689-pin TePBGAII
- 5.0 W (Est), Dual Core @ 800MHz

# MPC834x/831x to QorIQ P1 Migration

| Feature | MPC8343E | MPC8313E | MPC8314E / MPC8315E | P1020E / P1011E |
|---|---|---|---|---|
| Core | e300 | e300 | e300 | Dual / Single e500v2 |
| CPU Speed | Up to 400 MHz | Up to 400 MHz | Up to 400 MHz | 533 MHz - 800 MHz |
| L1 I/D Cache | 32K I/D | 16K I/D | 16K I/D | 32K I/D |
| L2 Cache | - | - | - | 256K I/D |
| Memory Controller | 32/64 bit DDR2 up to 333MHz | 16/32 bit DDR/2 up to 333MHz | 16/32 bit DDR/2 up to 266MHz | 32 bit DDR2/3, up to 667MHz |
| Ethernet | 2-10/100/1000 (MII, RGMII & RTBI only) | 2-10/100/1000 with SGMII | 2-10/100/1000 with SGMII | 3-10/100/1000 with SGMII |
| Local Bus | 8/16/32-bit ROM boot | 8/16-bit/66MHz w/ NAND & NOR Boot | 8/16-bit/66MHz w/ NAND & NOR Boot | 8/16/32-bit 83MHz w/ NOR, 8/16-bit NAND Boot |
| PCIE | - | - | x2 supported | 2 Controllers, up to x4 lanes |
| SATA | - | - | x2 (MPC8315E only) | - |
| USB | Hi-Speed Host or Device | 1 x 2.0 Host or Device w/PHY | 1 x 2.0 Host or Device w/PHY | 2 x 2.0 Host or Device |
| Security | E version only | SEC 2.2 | SEC 3.3 | SEC 3.3.1 |
| Other | 1x 32Bit / 66MHz PCI, Dual I2C, DUART, SPI, Interrupt Controller | 1x 32Bit / 66MHz PCI, Dual I2C, DUART, SPI, Interrupt Controller, 2 SERDES Lanes | 1x 32Bit / 66MHz PCI, Dual I2C, DUART, SPI, Interrupt Controller, 2 SERDES Lanes | SD/MMC Support, TDM, DUART, Dual I2C, SPI, Interrupt Controller, 4 SERDES Lanes |
| Package | 620 PBGA | 516 TePBGA | 620 TePBGA | 689 TePBGA II |
| Power | <5W @ 667MHz | ~2W @ 333MHz | ~2W | ~5 W (typ) @ 800MHz Dual Core |
| Samples | Now | Now | Now | May 2010 |
| Production | Now | Now | Feb 2009 | Q4 2010 |
| Process | 130nm | 90nm | 90nm | 45nm |

# MPC837x to QorIQ P1 / P2 Migration

| Feature | MPC8379E | MPC8378E | MPC8377E | P1020 / P1011 | P2020 / P2010 |
|---|---|---|---|---|---|
| Core | e300 | e300 | e300 | Dual e500v2 / Single e500v2 | Dual e500v2 / Single e500v2 |
| CPU Speed | Up to 800 MHz | Up to 800 MHz | Up to 800 MHz | 533 MHz - 800 MHz | 800 MHz / 1200 MHz |
| L1 I/D Cache | 32K I / 32K D | 32K I / 32K D | 32K I / 32K D | 32K I/D | 32K I/D |
| L2 Cache | - | - | - | 256K I/D | 512K I/D |
| Memory Controller | 32/64 bit DDR/2 up to 400MHz | 32/64 bit DDR/2 up to 400MHz | 32/64 bit DDR/2 up to 400MHz | 32 bit DDR2/3 up to 667MHz | 32/64 bit DDR2/3 up to 800MHz |
| Local bus | 32 bit w/NAND boot support | 32 bit w/NAND boot support | 32 bit w/NAND boot support | 8/16/32-bit 83MHz w/ NOR, 8/16-bit NAND Boot Support | 8/16-bit / 150MHz w/ NAND & NOR Boot |
| PCI | 1-32 bit up to 66MHz (2.3) | 1-32 bit up to 66MHz (2.3) | 1-32 bit up to 66MHz (2.3) | - | - |
| PCI Express | - | 2-x1 | 2-x1 | 2 PCIE (1.0a) Controllers, up to x4 lanes | 3 PCIE (1.0a) Controllers, up to x4 lanes |
| SATA | 4x1 SATA 2.0 w/PHY | - | 2x1 SATA 2.0 w/PHY | - | - |
| Ethernet | 2-10/100/1000 (RGMII, RTBI, RMII, MII) | 2-10/100/1000 (SGMII, RGMII, RTBI, RMII, MII) | 2-10/100/1000 (RGMII, RTBI, RMII, MII) | 3-10/100/1000 with SGMII support | 3-10/100/1000 with SGMII support |
| USB | 1 x 2.0 Host or Device | 1 x 2.0 Host or Device | 1 x 2.0 Host or Device | 2 x 2.0 Host or Device | 1 x 2.0 Host or Device |
| Security | SEC 3.0 | SEC 3.0 | SEC 3.0 | SEC 3.3.1 | SEC 3.1 |
| Other | Dual UART, Dual I2C, Interrupt Controller | Dual UART, Dual I2C, Interrupt Controller | Dual UART, Dual I2C, Interrupt Controller | SD/MMC, TDM, Dual UART, Dual I2C, Interrupt Controller | SD/MMC, Dual UART, Dual I2C, Interrupt Controller |
| Package | TePBGA | TePBGA | TePBGA | 689 TePBGA II | 689 TePBGA II |
| General Samples | Oct 2007 | Oct 2007 | Oct 2007 | May 2010 | Now |
| Production | Now | Now | Now | Q4 2010 | July 2010 |

# **QorIQ™ P1 Series Comparison**

| Feature | P1011 | P1020 | P1012 | P1021 | P1013 | P1022 |
|---|---|---|---|---|---|---|
| CPU | e500, up to 800MHz, 32K I/D | e500, up to 800MHz, 32K I/D | e500, up to 800MHz, 32K I/D | e500, up to 800MHz, 32K I/D | e500, up to 1000MHz, 32K I/D | e500, up to 1000MHz, 32K I/D |
| L2 Cache | 256KB | 256KB | 256KB | 256KB | 256KB | 256KB |
| DDR I/F Type/Width | DDR2/3 32-bit | DDR2/3 32-bit | DDR2/3 32-bit | DDR2/3 32-bit | DDR2/3 32/64-bit | DDR2/3 32/64-bit |
| 10/100/1000 Ethernet (with IEEE1588v2) | 3 w/(2) SGMII | 3 w/(2) SGMII | 3 w/(2) SGMII | 3 w/(2) SGMII | 2 w/(2) SGMII | 2 w/(2) SGMII |
| TDM/I2S | Yes | Yes | Yes | Yes | Yes | Yes |
| PCI Express 1.0a | 2 controllers w/ 4 SERDES | 2 controllers w/ 4 SERDES | 2 controllers w/ 4 SERDES | 2 controllers w/ 4 SERDES | 3 controllers w/ 4 SERDES | 3 controllers w/ 4 SERDES |
| USB 2.0 | 2 | 2 | 1 | 1 | 2 | 2 |
| Memory Card | SD/MMC | SD/MMC | SD/MMC | SD/MMC | SD/MMC | SD/MMC |
| GPIO | 16 | 16 | 16 | 16 | 87 | 87 |
| SATA | - | - | - | - | Yes | Yes |
| LCD Controller | - | - | - | - | Yes | Yes |
| Other | SPI, 2x I2C, DUART | SPI, 2x I2C, DUART | SPI, 2x I2C, DUART | SPI, 2x I2C, DUART | SPI, 2x I2C, DUART | SPI, 2x I2C, DUART |
| QUICC Engine | - | - | Yes | Yes | - | - |
| Accelerators | SEC3.3 | SEC3.3 | SEC3.3 | SEC3.3 | SEC3.1 | SEC3.1 |
| Power Management | Doze, Nap, Sleep | Doze, Nap, Sleep | Doze, Nap, Sleep | Doze, Nap, Sleep | Doze, Nap, Sleep, Jog, Packet-lossless Deepsleep | Doze, Nap, Sleep, Jog, Packet-lossless Deepsleep |
| Typical Power | 3W | 3.5W | 3.2W | 3.7W | 3W | <4W |
| Package | 689 TePBGAII | 689 TePBGAII | 689 TePBGAII | 689 TePBGAII | 689 TePBGAII | 689 TePBGAII |

# **Challenges of Migration**

Before we move on, let's consider some of the challenges of design migration.

## Hardware design

- Power supply design
- What level of actual re-design is required?
- Can I use existing interfaces?
- What are the requirements for new interfaces?
- Power management and thermal performance
- Future-proofing and design re-use for other platforms

## Software design

- Code set migration: instruction set, library and function support
- Operating system and tool support
- Software portability and level of re-coding for new features
- Migrating to dual and multi-core

## **QorIQ P1 / P2 Device Features**

## e500v2 Core Architecture

- Up to 1.2 GHz
- L1: 32KB, 8-way set associative, parity
- L2: 512KB, front side, 8-way set associative, ECC
- Cache line locking supported
- MESI cache coherence
- Peak IPC: 2 instructions plus 1 branch
- Out-of-order execution
- Multiple Book E APUs
- MMU: 16-entry TLB with variable page sizes, 512-entry TLB with 4K pages
- 36-bit physical address

## **L2 Cache Controller**

Shared 256KB / 512KB unified front-side L2 cache

Eight-way set associative (each way: 64 KB for the 512 KB configuration)

#### **Assignment Granularity:**

- One, two, four, or all eight "ways" of the cache can be assigned as the following:
  - SRAM
  - Stash-Only
  - CPU0 L2 Only
  - CPU1 L2 Only
  - Both CPU0 and CPU1 L2

#### Stash-Only regions can now be defined

- Prevents stash data from polluting processor data and vice-versa
- One, two or four "ways" of the cache can be dedicated as Stash-Only

#### Stash Allocate Disable mode added

- Allows update of all resident cache lines without allocation of new lines

## e500 Internal Busses

The e300 uses an internal coherent system bus (CSB) to interconnect cores and interfaces.

## Core Complex Bus (CCB) is an evolution of the CSB

- Offers extended addressing and enhanced data flow performance
- Used to connect the e500 cores and caches to the rest of the device via the ECM module

## e500 Coherency Module (ECM)

- Module used to ensure coherency between the e500 caches and external interfaces on the board
- Functions include queuing and buffering for the CCB, arbitration across CCB and I/O masters, and transaction processing
- Low-latency path between DDR controllers and cores / caches

## On-Chip Network (OCeaN) Switch Fabric

- A multi-port, on-chip, non-blocking crossbar switch fabric designed for high-speed interconnects
- 2.7 GByte/s peak bandwidth per port

## **DDR Controller**

#### QorIQ P1 / P2 platforms offer DDR2 & DDR3 support

No DDR1 support, a change from PowerQUICC II Pro

#### QorIQ P1020 / P1011

- Supports 32-bit data bus configuration with ECC
- Supports up to 667 MHz DDR data rate
- Supports up to 4Gb devices, x8, x16, x32 configurations
- Max memory support of 8GB, with 2 memory banks supported

#### QorIQ P2020 / P2010

- Supports 64-bit and 32-bit data bus configuration with ECC
- Supports up to 800 MHz DDR data rate
- Supports up to 4Gb devices, x8, x16, x32 configurations
- Max memory support of 16GB, with 4 memory banks supported

#### Other common features

- Supports self-refresh mode
- Battery backup
- Initialization bypass
- Chip-select interleaving
- Automatic DRAM initialization
- Error injection
- On-die termination

## **DDR Controller**

#### DDR3 - New feature

- About 25% lower power compared to DDR2 (source: JEDEC)
- Supply voltage reduced from 1.8V to 1.5V
- Support for "fly-by" routing
  - Results in fewer stubs and improved signal integrity for faster clock speeds
  - Additional registers introduced for write-levelling control for DDR3
- Asynchronous reset pin for cold or warm reset of memories
- Separate voltage reference pins for address and data signals for noise reduction
- Improved pin-out for signal integrity and reduced skew
- Dynamic ODT also aids signal integrity

DDR clocking: the DDR controller can be clocked asynchronously from the platform clock, to support higher speed memory devices

# Enhanced Three-Speed Ethernet (eTSEC) Controllers

eTSEC MAC controllers support 10Mbps, 100Mbps and 1Gbps Ethernet / IEEE 802.3 interfaces

Similar in specification to MPC831x & MPC837x devices

Backwards compatible with TSEC controllers used on MPC834x

#### Support following PHY interfaces

- GMII, RGMII, MII, TBI, RMII, RTBI, SGMII, & 8/16-bit FIFO mode
- SGMII interfaces share available SERDES lanes

#### Advanced functions part of enhanced controller:

- TCP/IP acceleration
- QOS support for up to 8 queues
- MAC address recognition
- CRC generation & checking
- Extraction and allocation of data to L2 cache
- Remote monitoring statistics support

#### IEEE® 1588 Timer support

#### Interrupt virtualisation added to P1 devices

- eTSEC controllers and interrupts can be grouped, to be assigned to a particular core by software
- Further information is available in the software section

## **Security**

Embedded security block (SEC) used for off-loading computationally intensive security functions

Existing SEC architecture migrated to QorIQ P1 / P2 devices.
SEC block contains specialised execution units for different encryption algorithms:

- PKE: Public Key Encryption Unit
- AES: Advanced Encryption Standard Unit
- DES: Data Encryption Standard Execution Unit
- CRC: Cyclic Redundancy Check Unit
- MDE: Message Digest Execution Unit
- RNG: Random Number Generator
- SNOW3G & KASUMI: 2G/GSM & 3G encryption

Execution unit support varies depending on target silicon application

Backwards compatible driver support across various SEC engine versions

4 crypto-channels are initiators and data fetchers for the SEC block

## **SERDES Interfaces**

#### SERDES interface is consistent with MPC831x / MPC837x devices

- Additional SERDES support added to QorIQ P1 & P2 devices
- 4 SERDES lanes shared across PCIe, SGMII, & SRIO controllers

### PCI Express v1.0a

- Up to x4 lane support on P1 & P2 devices, supports 8Gbps (half-duplex) max data rate as per PowerQUICC II Pro
- Up to 3 PCIe controllers available on P2 devices, for multiple configurations

## Serial RapidIO (P2 devices only)

- New high-speed switched fabric for embedded systems
- Supports multiple lane configurations & speeds (1.25 / 2.5 / 3.125 Gbaud)
- Max data rate of 10Gbps (half-duplex, 4x)

SGMII: up to 2 SGMII interfaces with PCIe / SRIO support

## **PCI Express® Interface**

- PCI Express 1.0a compatible
- Supports x1, x2, and x4 link widths @ 2.5 Gbaud, 2.0 Gb/s
- Auto-detection of number of connected lanes
- Selectable as root complex or endpoint at initialization
- 32- and 64-bit addressing into PCI Express address space
- Root complex inbound support for MSI and INTx
- Endpoint support for outbound MSI
- Reads/writes carried across ports, but not a switch
- 256 byte maximum payload size
- One virtual channel
- Strong and relaxed ordering rules
- 8 non-posted, 6 posted transactions

#### 3 inbound + 1 configuration window

- Translates upper 52b of PCI addr to upper 24b of local addr
- Window sizes of 4 KB to 64 GB
- Settings: read/write type, prefetchable, and target
- 1 MB config window maps to CCSR region

#### 4 outbound + 1 default window

- Translates upper 24b of local addr to upper 52b of PCI addr
- Select I/O or memory for reads and writes
- Window sizes of 4 KB to 64 GB

## **Enhanced Secure Digital Host Controller (eSDHC)**

- Memory card interface available on QorIQ platforms
- Similar in specification to the eSDHC on MPC837x
- Provides high-speed, flexible data storage capability to a system
- Designed to work with various SD & MMC card formats: SD, SDHC, miniSD, SD Combo, MMC, MMCplus & RS-MMC cards
- Supports capacities of up to 32GB, with different speeds
- Boot interface support (new feature): on-chip ROM is used to load the device driver prior to loading of boot data from the card

## **Additional Features**

## Programmable Interrupt Controller (PIC)

- Interrupts added between cores and for SRIO
- Interrupts based on OpenPIC architecture
- 16 interrupt priority levels (15 is highest)
- Priority ordering is reversed relative to PowerQUICC II Pro
- Critical interrupts

## Performance monitoring

- Core performance monitoring available, as on PowerQUICC II Pro
- Device performance monitoring available as well on QorIQ / PowerQUICC III devices
- Debug and monitor events on DDR, DMA, ECM bus, PCIe & other interfaces

## **Additional Interfaces (Continued)**

#### **DMA Controller**

- 4 channel DMA controller located on OCeaN switched fabric
- Used for internal data movement and external movement to/from the device
- Accessible by all cores, all internal interfaces and external masters
- 2 DMA controllers available on P2020 for additional I/O performance

#### USB

- High-speed USB 2.0 interface, with host or device support
- USB On-The-Go capability supported as well
- Supports ULPI interface to an external PHY

#### eSPI controller

- Interface consistent with PowerQUICC II Pro device family

#### Local Bus - Enhanced local bus controller

- Supports 8-bit / 16-bit interface with NAND / NOR Flash support

# **Hardware Design Considerations**

# Hardware Design Electrical Specifications

| Description | Symbols | MPC831x | MPC837x | QorIQ P1 | QorIQ P2 |
|---|---|---|---|---|---|
| Core | VDD | 1.0V | 1.0V | 0.95V | 1.05V |
| PLL supply | AVDD | 1.0V | 1.0V | 0.95V | 1.05V |
| SERDES core supply | SVDD / XCOREVDD | 1.0V | 1.0V / 1.05V\*\*\* | 0.95V | 1.05V |
| SERDES pad supply | XVDD / XPADVDD | 1.0V | 1.0V / 1.05V\*\*\* | 0.95V | 1.05V |
| DDR2 / DDR3 I/O | GVDD | 1.8V / - | 1.8V / - | 1.8V / 1.5V | 1.8V / 1.5V |
| Ethernet I/O | LVDD | 3.3V / 2.5V | 3.3V / 2.5V | 3.3V / 2.5V | 3.3V / 2.5V |
| DUART, system control, I<sup>2</sup>C, GPIO, JTAG | OVDD / NVDD | 3.3V | 3.3V | 3.3V | 3.3V |
| Local Bus | BVDD / NVDD | 3.3V | 3.3V / 2.5V / 1.8V | 3.3V / 2.5V / 1.8V | 3.3V / 2.5V / 1.8V |
| USB, eSPI, eSDHC\* | CVDD / USB\_VDD | 3.3V | 3.3V / 2.5V / 1.8V | 3.3V / 2.5V / 1.8V | 3.3V / 2.5V / 1.8V |

<sup>\*</sup> eSDHC not available on MPC831x

<sup>\*\*</sup> SPI interface shares same supply as I2C & JTAG on MPC831x / MPC837x

<sup>\*\*\*</sup> SERDES supply specs for 600MHz / 800MHz devices respectively

## **Power Sequencing**

Power sequencing is different from PowerQUICC II Pro.

P1/P2 devices require power rails to be applied in a specific sequence in order to ensure proper device operation. The requirements for power-up are as follows:

1. VDD, AVDD\_n, BVDD, LVDD, OVDD, CVDD, SVDD\_SRDS and XVDD\_SRDS
2. GVDD

- NOTE: Items on the same line have no ordering requirement with respect to one another.
- NOTE: If any of the I/O power supplies ramp prior to the VDD core supplies, the associated I/O supply may drive a logic one or zero during power-up, causing excessive current to be drawn by the device.

All supplies must be at their stable values within 50 ms.

## **POR Configuration Inputs**

#### P1/P2 platforms have power-on reset (POR) signals used for device configuration

Signals are muxed with existing I/O pins

The settings of the following POR pins determine CPU boot enable, PLL ratios and boot device selection:

- LA[29:31]: cfg\_sys\_pll[0:2]
- LBCTL, LALE, LGPL2: cfg\_core0\_pll[0:2]
- LWE0, UART\_SOUT1, READY\_P1: cfg\_core1\_pll[0:2]
- TSEC1\_TXD[6:4], TSEC1\_TX\_ER: cfg\_rom\_loc[0:3]
- LA[27]: cfg\_cpu0\_boot
- LA[16]: cfg\_cpu1\_boot
- LGPL3, LGPL5: cfg\_boot\_seq[0:1]

Further configuration signals may be used in the future to control functionality. It is advised that boards are built with the ability to pull these pins up or down:

- LA[20]: cfg\_eng\_use[00]
- LA[21]: cfg\_eng\_use[01]
- LA[22]: cfg\_eng\_use[02]
- UART\_SOUT[00]: cfg\_eng\_use[03]
- TRIG\_OUT: cfg\_eng\_use[04]
- MSRCID[01]: cfg\_eng\_use[05]
- MSRCID[04]: cfg\_eng\_use[06]
- DMA1\_DDONE[00]: cfg\_eng\_use[07]

POR signals are sampled on HRESET negation

## **POR Configuration Pins Termination Requirements**

The following pins must NOT be pulled down during power-on reset, otherwise the internal test mode may be triggered:

- DMA1\_DACK[00]
- USB1\_STP
- HRESET\_REQ
- MSRCID[2:3]
- MDVAL
- ASLEEP

## **DDR Interface Design**

DDR2 interface requirements are consistent with the PowerQUICC II Pro family. Refer to Application Note AN2910: "Hardware and Layout Design Considerations for DDR2 SDRAM Memory Interfaces"

DDR3 guidelines are available from AN108 "Designing for DDR3 Memory on Freescale Microprocessors"

## **DDRCLK** input

- Input is only required when the DDR controller is running in asynchronous mode.
- Not required if the DDR controller is selected to work in synchronous mode, via POR setting cfg\_ddr\_pll[0:2]=111.
- It is recommended to tie it off to GND when the DDR controller is running in synchronous mode.
- DDR3 is only supported in asynchronous mode

## **eTSEC** Pin Termination

Additional termination requirements for eTSECs compared to PowerQUICC II Pro:

- When eTSEC1 and eTSEC2 are used as parallel interfaces, pins TSEC1\_TX\_EN and TSEC2\_TX\_EN require an external 4.7-kΩ pull-down resistor to prevent the PHY from seeing a valid Transmit Enable before it is actively driven.
- TSEC2\_TXD[01] is used as cfg\_dram\_type. It must be valid at power-up, even before HRESET assertion.

## Unused eTSEC pin termination

- For I/Os, tie signals high or low through a resistor. Recommended resistor values are 2-10 kΩ.
- For inputs, tie signals to their inactive state through a resistor; clock inputs may be tied high or low. Recommended resistor values are 2-10 kΩ.

## **Local Bus Termination**

- Termination is not needed on output signals.
- For bidirectional I/Os, tie signals high or low through a resistor. Recommended resistor values are 2-10 kΩ.
- For inputs, tie signals to their inactive state through a resistor. Recommended resistor values are 2-10 kΩ.

# **Reset Configuration and Clocking**

# **POR Configuration**

QorIQ P1 / P2 devices use power-on reset (POR) configuration pins, which are sampled during the assertion of HRESET\_B.

PowerQUICC II Pro devices load reset configuration words to initialize various device functions.

All POR configuration pins are typically multiplexed with output signals.

All POR configuration pins have an internal pull-up resistor (~20 kΩ); these resistors are active only during POR configuration.

- POR pins can be pulled high or low by external resistors for configuration.
- Signals can also be driven by a CPLD / FPGA device

During HRESET, all other signal drivers connected to these POR configuration signals must be in the high-impedance state.
Reason: If other devices also drive a POR pin during HRESET, the device may sample the wrong POR configuration information from that pin.

### **Power-On Reset**

# Power-On reset sequence is significantly different from PowerQUICC II Pro family

- HRESET#: now input only, replaces the PORESET# input
- HRESET\_REQ#: output, used to request reset
- SRESET#: input only
- READY\_P0 / TRIG\_OUT: Core 0 ready output / external trigger output
- READY\_P1: Core 1 ready output (if available)

### Signals no longer available

- CFG\_RESET\_SOURCE[0:3]: used to load configuration words. Function replaced by POR configuration pins
- CFG\_CLKIN\_DIV#: clock division selection is carried out by the POR PLL config pins

# **Clocking Quick Reference**

| Functional Block | Clocked by | Restrictions |
|---|---|---|
| Local Bus | CCB_clk / [4, 8, 16] | LBIU PLL bypass mode is recommended when the LBIU frequency is at or below 83 MHz; when the LBIU operates above 83 MHz, enabling the LBIU PLL is recommended |
| PCI Express® and Serial RapidIO® digital logic | CCB_clk / 2 | |
| SerDes for PCIe, sRIO and SGMII | SD_REF_CLK / SD_REF_CLK_B | 100 MHz ref clk for PCIe 1.25 Gbps and 2.5 Gbps<br>100 MHz ref clk for SRIO 1.25 Gbps and 2.5 Gbps<br>125 MHz ref clk for SRIO 3.125 Gbps<br>100 MHz ref clk for SGMII 1.25 Gbps<br>For PCIe, CCB_clk > (527 MHz × PCI Express link width) / 8<br>For sRIO, CCB_clk > 2 × (0.80) × (Serial RapidIO interface frequency) × (Serial RapidIO link width) / 64 |
| I<sup>2</sup>C | CCB_clk / (2 × I2CFDR[FDR] ratio) | |
| Real Time Clock (RTC) | External source or CCB_clk | The minimum pulse width of the RTC signal should be greater than 2x the period of the CCB clock; the minimum RTC frequency is zero. |

# **Clocking Quick Reference (continued)**

| Functional Block | Clocked by | Restrictions |
|---|---|---|
| e500 core and L1 cache | Core-complex-bus clock (CCB_clk) times a multiplier | 533 MHz <= core freq <= 1.2 GHz |
| L2 cache and ECM | CCB_clk | 266 MHz <= CCB_clk (platform clock) <= 600 MHz |
| SYSCLK | External clock source | SYSCLK min = 66.66 MHz, max = 100 MHz |
| DDR | CCB_clk / 2 | 400 MHz <= DDR data rate <= 800 MHz; for DDR3 the minimum data rate is 667 MHz. Note: for asynchronous operation the DDR clock speed is derived from the DDR_CLK input |
| eTSECs | eTSEC logic layer is clocked by CCB_clk / 2; MAC layer is clocked by 125 MHz from PHY or external source | EC_GTX_CLK125 is used to generate the GTX clock for the eTSEC transmitter with 2% degradation. EC_GTX_CLK125 duty cycle can be loosened from 47/53% as long as the PHY device can tolerate the duty cycle generated by the eTSEC GTX_CLK. |
| | eTSEC FIFO mode | TSECn_RX_CLK and TSECn_TX_CLK | For FIFO GMII mode: FIFO TX/RX clock frequency <= platform clock frequency / 4.2 For FIFO encoded mode: FIFO TX/RX clock frequency <= platform clock frequency / 3.2 | ### e500 Clock Control Architecture and Low Power States | | Processor<br>Clocks | Snoops<br>Respond to | Interrupts<br>Respond to | Comments | |-------|----------------------|----------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Run | On | Yes | Yes | All units operating normally. Dynamic Power Management (DPM) may be enabled | | Doze | On | Yes | Yes | The core has halted instruction fetching, but all other functional blocks in the core and device are running. | | Nap | Off except time base | No | Yes | The core has halted instruction fetching. Snooping of the L1 caches is disabled. All of the core's functional units except the timer are shut down. All functional blocks in the device are running. | | Sleep | Off | No | Yes | Instruction fetching is halted. snooping of L1 caches is disabled. Most functional blocks are shut down in both the e500 cores and the system logic, except to the interrupt controller (PIC) unit and eTSEC | ### **Boot Modes** ### Possible Boot interfaces - Local bus - DDR2 / DDR3 memory controller - PCI Express interface - Serial RapidIO (P2020) - eSPI interface - eSDHC interface New boot interfaces supported on P1 & P2 Boot sequencer – Migrated from PowerQUICC II Pro Boot hold-off – registers used to suspend core booting - Used when booting two cores. CPU0 boots while CPU1 waits - Used when external master boots device over eg. 
RapidIO or PCIe Reset - e500 begins execution from fixed location of 0xFFFF\_FFFC ### **Runtime Power Management** #### Dynamic power management • Withholds clocks to e500's unused execution units, MMUs, caches, and other blocks without performance impact #### Programmable power mode - Programmable transition (per core) between e500 modes: full power, doze, nap, and sleep - Three external pins track power mode of cores - POWMGTCR puts device into sleep or doze #### Memory Controller - · Dynamic power management - Doesn't clock DRAM when no transactions - Sleep/doze mode. DRAM put into self-refresh mode, controller goes into sleep mode #### I/O power management eTSEC's Magic Packet support: specially defined Ethernet packet received on eTSEC wakes chip from sleep # **Configuration Power Management** - Multiplier flexibility to optimize core, internal bus, and DDR performance and power - Disable unused blocks through DEVDISR register - e500 (each individually) - PCI Express (all three individually) - Local bus - Security block - USB - eSDHC - SPI - DMAs (both individually) - eTSECs (each individually) - DDR controller - I2C (both together) - DUART - Timers (both sets individually) # **Software Design Considerations** # **Memory Map** ### e500v2 based QorlQ devices deploy a 36-bit local address space PowerQUICC II Pro, e300 based, devices deploy a 32-bit local addressing scheme ### Local Access Windows (LAWs) - Support multiple access windows like e300 devices - Maximum window size increased from 2GBytes to 32GBytes ### ATMU - Address Translation & Mapping Unit - Used for translating between local & external address spaces - Further ATMUs added on QorlQ platforms for additional PCle controllers & SRIO ### CCSR - Command, Configuration & Status Registers - Replacement to IMMR space registers on PowerQUICC II Pro - All registers of the SoC are contained within a 1MByte address region - Offers additional flexibility can be relocated, and provides easier external access # 
### eTSEC Updates - Memory mapping

New eTSEC features introduced to P1 devices

- P2 and earlier devices have eTSEC 1.x
- P1 devices have eTSEC 2.x

eTSEC 1.x has one common 4k block of registers, which can be used by any core

In eTSEC 2.x, the Ethernet traffic can be categorized into two groups. A separate 4k block of status and control registers is provided for each group, and these 4k blocks can be associated with each of the cores. Another 4k block, common to both cores, is available for management

# eTSEC Updates - Interrupt steering

In eTSEC 1.x, all the Rx and Tx interrupts are routed to any one core through the PIC

In eTSEC 2.x, the hardware queues can be mapped to either of the two groups. The interrupt controller gets two Rx and two Tx interrupts from each eTSEC. These interrupts can be routed to either of the cores.

# e300 to e500v2 Migration - Privileges: Programming Model

Power Architecture cores operate in either of the following two modes

- Supervisor mode: This is the highest privilege mode, where the entire programming model is available to software. Operating systems and boot loaders operate in this mode.
- User mode: Resources that can affect the whole system are not available in this mode. User-level applications and untrusted programs operate in this mode.

# e300 to e500v2 Migration - User Level Registers

### e300 to e500v2 Migration - Supervisor Level Registers

# e300 to e500v2 Migration - Exceptions

While exceptions had fixed vector addresses in e300, they are programmable in e500v2, except for the reset vector.

In e300, reset vectors to 0x0000\_0100, while in e500v2 it vectors to 0xFFFF\_FFFC

e500v2 has a new category of exception.
Machine check

# e300 to e500v2 Migration - Exception Handling

A new category of Machine check interrupt has been added in e500v2

# e300 to e500v2 Migration - MMU

- e300 supports BAT and page translations, while e500v2 has 512 entries of fixed 4k pages and 16 entries of variable-size pages
- e300 endian mode is controlled on a system basis, while e500v2 allows endian configuration on a per-page basis
- Unlike e300, the MMU is always on in e500v2
- After reset, a default entry in the MMU maps logical address 0xffff\_f000 to physical address 0xffff\_f000

### Cache

- e300 has 16/32k L1 I/D caches, while e500v2 has 32k L1 I/D caches and a 256k (P1) or 512k (P2) L2 cache
- e300 supports way-locking, while e500v2 supports line-locking
- e500v2 has non-blocking caches (cache access is allowed even after a miss)
- e500v2 supports stashing on L2

### Software Migration Considerations - Operating System Migration

e500v2 imposes numerous changes to classic PowerPC operating systems

Areas affected include:

- Instruction set use
- Context switching
- Exception handling
- MMU operation
- Reset

Necessary changes have already been made by PowerPC OS vendors

Linux BSPs with gcc-based toolchain available from Freescale: http://www.freescale.com/linux

# Software Migration Considerations - Migrating to Dual Core

Dual-core systems support AMP and SMP architectures

SMP – Symmetric Multi-Processing architecture: a configuration where two or more identical cores connect to a single shared memory space, with equal access to memory and resources.

AMP – Asymmetric Multi-Processing architecture: a configuration where the processing elements in a multi-core processor can be used to run separate tasks. Processing elements are treated separately, each with its own resources and memory space.
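One way to picture the AMP model is as a static partition of physical memory between the two cores. The sketch below is illustrative only: the region names, base addresses and sizes are made-up values rather than a P2020 memory map, and `Region`, `overlaps` and `check_partition` are hypothetical helpers. It checks that no two regions overlap, so each core keeps its own memory space and anything shared is an explicit, separate region.

```python
from dataclasses import dataclass
from itertools import combinations

MIB = 1 << 20


@dataclass(frozen=True)
class Region:
    name: str
    owner: str  # "core0", "core1" or "shared" (illustrative labels)
    base: int
    size: int

    @property
    def end(self) -> int:
        """First address past the region (exclusive upper bound)."""
        return self.base + self.size


def overlaps(a: Region, b: Region) -> bool:
    """True when the two half-open address ranges intersect."""
    return a.base < b.end and b.base < a.end


def check_partition(regions):
    """Return the (name, name) pairs of regions that overlap."""
    return [(a.name, b.name) for a, b in combinations(regions, 2) if overlaps(a, b)]


# Made-up AMP layout: one private region per core plus a small shared mailbox.
amp_layout = [
    Region("core0_os", "core0", 0x0000_0000, 256 * MIB),
    Region("core1_os", "core1", 0x1000_0000, 256 * MIB),
    Region("mailbox", "shared", 0x2000_0000, 1 * MIB),
]

if __name__ == "__main__":
    print("conflicts:", check_partition(amp_layout))
```

In an SMP configuration this check is unnecessary, because a single OS instance owns all of memory and schedules work across both cores.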
AMP & SMP BSPs available from Freescale for P2020

Selection of architecture depends upon the application

# Symmetric Multi-processing (SMP)

**SMP** is a <u>multi-processor homogeneous computer architecture</u> where two or more identical processors are connected to a globally shared main memory...

- Processors could be separate devices, all on one device, or a mix
- Typically all CPUs share memory and I/O and run one OS instance
- Each processor can run independent processes and threads
- Any idle processor can be assigned any task

Issues / challenges

- Challenges of migrating legacy code to a multithreaded architecture
- Additional overhead with scheduling and managing cores

Performance speed-up depends on the application

### **Asymmetric Multi-processing (AMP)**

A <u>multi-processing usage model</u> in which individual processors are dedicated to particular tasks, such as running the operating system or performing user requests

- Each core has its own dedicated resources; no scheduling is required with other devices
- Software will run as it does in a single-core environment

### Issues / Challenges:

- System design: deciding how to partition resources between cores
- Cannot take advantage of idle time on another CPU

### **Development System & Ecosystem**

# **Application Case Studies**

### Networking (switches and routers)

- Line card controller
  - Mid-range line card control plane
  - Low-end line card combined control and data plane
- Shelf controller
- Business gateway
- Multiservice router
- Wireless access points

#### Telecom

- AMC card
- Controller on ATCA Carrier Card
- Channel and control card for NodeB, BTS, WCDMA, 4G LTE, WiMax
- General-purpose compute blade

#### Industrial

- Robotics
- Test/measurement Networking/telecom
- Multifunction printer
- Single board computers
- Industrial applications

### Key Advantages

- High single-threaded performance
- Enables performance without the complexity of partitioning across multiple cores or threads
- Highly suitable for control plane
applications whose sequential nature means efficiency is lost when scaling to many cores
- Pin compatibility over a 4.5x frequency range enables BOM-based product differentiation
- Low power

### **NAS Storage Application – PowerQUICC II Pro**

- e300-based platform, up to 800MHz, 1.92 DMIPS/MHz, <5W @ 800MHz
- Integrated SATA controller used to connect up to 2x HDD
- Gigabit Ethernet LAN and WAN interface
- Wireless capability available through PCI Express card
- USB interface accessible through external PHY

# NAS Storage Application – QorIQ P1 / P2

Performance improvement with e500 core

- Up to 1200MHz per CPU
- 2.4 DMIPS/MHz per CPU

L2 cache available to improve Ethernet packet processing

#### Storage capability:

- SSD on PCI Express
- eSDHC interface

Gigabit Ethernet LAN and WAN interface maintained

Potential to add additional services using extra CPU processing capability

Target CPU power

- P2020 8W @ 1200MHz
- P1020 5W @ 800MHz

# **Application Migration - Hardware**

### New board design required for QorIQ device

| Issue | Solution with QorIQ P1 / P2 |
|---|---|
| Power supply | Same voltage architecture. Adjustable supplies will ease migration from 1.1V on PowerQUICC II Pro to 0.95V for QorIQ P1 |
| Existing interfaces | Re-use design for existing interfaces such as Ethernet, DDR2, eSPI, JTAG and USB |
| New interfaces | Use application notes, evaluation boards and other resources from http://www.freescale.com. |
| Power performance | Best-in-class power performance enables fan-less "green" design. Advanced power management. |
| Simulation | IBIS models available for all devices |
| Future expansion | Pin compatibility between P1011, P1020, P2020 and P2010 devices allows simple migration to a higher-performance or lower-power application |

# **Application Migration – Software**

### Migration from e300 to e500 based platform

| Issue | Solution with QorIQ P1 / P2 |
|---|---|
| Code base migration | General code compatibility with exceptions in certain areas – floating point, interrupts, supervisor instructions. See resources at <a href="https://www.freescale.com">www.freescale.com</a> |
| Operating systems & tools | Established ecosystem of operating system BSPs, tools and drivers for e500-based platforms, reducing development time |
| Software compatibility for reused interfaces | Code used for existing interfaces such as Ethernet and security block will migrate |
| New interfaces and features | Driver-level source code available for Freescale evaluation platforms from www.freescale.com |

# **Application – Migrating to Dual Core**

### Consider why?

- Achieve increased performance of existing application
- Accommodate new services and functions
- Integration of external functions to achieve lower cost

### Do you choose AMP or SMP?

- How will the application be partitioned?
- Consider memory and resource access on device

### Is the existing application single threaded or multi-threaded?

- Single threaded
  - AMP: run existing code as is on a single core
  - SMP: run multiple versions of existing code
- Multi-threaded
  - SMP: run a single instance of the unmodified application

Profile existing code and recode critical spots; make use of parallel processing to optimise performance

# **QorIQ™ P1 and P2 Series Summary**

| Features | Benefits |
|---|---|
| Best-in-class ecosystem | Faster time to market |
| Migration path | Improved performance/watt/\$ migrating from PowerQUICC II, PowerQUICC II Pro, and PowerQUICC III |
| High-performance e500 2.4 DMIPS/MHz Power Architecture™ core | High-efficiency, high-frequency cores mean fewer cores to get the job done |
| Best-in-class power | Enables fan-less, "green" and low-cost designs, improves reliability |
| Integrated Ethernet, TDM, USB, SD Flash controller, IEEE1588, PCI-Express, Serial Rapid IO | Flexibility to address a wide range of applications and reduced system cost |
| 4.5x performance range in a single package (533MHz to 2x1200MHz) | Common hardware platform to enable a wide range of system performance |
| Dual and single cores | Move to dual core at your own pace without hardware changes |
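The headline figures quoted in the case studies and the summary table can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a benchmark: it multiplies the quoted DMIPS/MHz ratings by the quoted clock rates and derives the 4.5x range from its stated endpoints (533MHz to 2x1200MHz). The helper names are hypothetical, and real throughput depends on the workload and on how well it scales across two cores.

```python
def dmips(dmips_per_mhz: float, mhz: float, cores: int = 1) -> float:
    """Peak Dhrystone rating: per-MHz figure times clock, times core count.

    Multiplying by the core count assumes perfect scaling, which is an
    upper bound rather than an expected result.
    """
    return dmips_per_mhz * mhz * cores


def frequency_range(low_mhz: float, high_mhz: float, high_cores: int = 1) -> float:
    """Ratio between the top aggregate clock rate and the bottom clock rate."""
    return (high_mhz * high_cores) / low_mhz


if __name__ == "__main__":
    # Figures taken from the NAS case study and summary table above.
    print("e300 @ 800MHz:        ", round(dmips(1.92, 800)), "DMIPS")
    print("e500 @ 1200MHz, 1 CPU:", round(dmips(2.4, 1200)), "DMIPS")
    print("e500 @ 1200MHz, 2 CPU:", round(dmips(2.4, 1200, cores=2)), "DMIPS (upper bound)")
    print("performance range:    ", round(frequency_range(533, 1200, 2), 1), "x")
```

On these numbers a single e500 core at 1200MHz is rated at a little under twice the e300 at 800MHz, before any gain from the second core.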