Features
• 3000 Dhrystone 2.1 MIPS at 1.3 GHz
• Selectable Bus Clock (30 CPU Bus Dividers up to 28x)
• 13 Selectable Core-to-L3 Frequency Divisors
• Selectable MPx/60x Interface Voltage (1.8V, 2.5V)
• Selectable L3 Interface of 1.8V or 2.5V
• PD Typical 12.6W at 1 GHz at VDD = 1.3V; 8.3W at 1 GHz at VDD = 1.1V, Full Operating Conditions
• Nap, Doze and Sleep Modes for Power Saving
• Superscalar (Four Instructions Fetched Per Clock Cycle)
• 4 GB Direct Addressing Range
• Virtual Memory: 4 Petabytes (2^52 bytes)
• 64-bit Data and 32-bit Address Bus Interface
• Integrated L1: 32 KB Instruction and 32 KB Data Cache
• Integrated L2: 512 KB
• 11 Independent Execution Units and Three Register Files
• Write-back and Write-through Operations
• fINT Max = 1 GHz (1.2 GHz to be Confirmed)
• fBUS Max = 133 MHz/166 MHz
PowerPC 7457 RISC Microprocessor PC7457/47 Preliminary Specification α-site
Description
This document is primarily concerned with the PowerPC™ PC7457; however, unless otherwise noted, all information here also applies to the PC7447. The PC7457 and PC7447 are implementations of the PowerPC microprocessor family of reduced instruction set computer (RISC) microprocessors. This document describes pertinent electrical and physical characteristics of the PC7457. The PC7457 is the fourth implementation of the fourth generation (G4) microprocessors from Motorola. The PC7457 implements the full PowerPC 32-bit architecture and is targeted at networking and computing systems applications. The PC7457 consists of a processor core, a 512 Kbyte L2, and an internal L3 tag and controller which support a glueless backside L3 cache through a dedicated high-bandwidth interface. The PC7447 is identical to the PC7457 except it does not support the L3 cache interface. The core is a high-performance superscalar design supporting a double-precision floating-point unit and a SIMD multimedia unit. The memory storage subsystem supports the MPX bus interface to main memory and other system resources. The L3 interface supports 1, 2, or 4M bytes of external SRAM for L3 cache and/or private memory data. For systems implementing 4M bytes of SRAM, a maximum of 2M bytes may be used as cache; the remaining 2M bytes must be private memory. Note that the PC7457 is a footprint-compatible, drop-in replacement in a PC7455 application if the core power supply is 1.3V.
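The L3 SRAM partitioning rules above can be enumerated in a few lines. The following is only a sketch of one reading of those rules: cache space is limited to 2 MB, and private memory is assumed to be either absent, half of the SRAM (1 MB minimum), or all of it. The function name `l3_configurations` is hypothetical and not part of any device software.

```python
def l3_configurations(sram_mb):
    """Return allowed (cache_mb, private_mb) splits for a given L3 SRAM size.

    Assumptions (one reading of the description, not a device API):
    - SRAM size is 1, 2, or 4 MB
    - at most 2 MB may be used as cache
    - private memory is none, half (1 MB minimum), or all of the SRAM
    """
    if sram_mb not in (1, 2, 4):
        raise ValueError("L3 SRAM size must be 1, 2, or 4 MB")

    private_options = {0, sram_mb}
    half = sram_mb // 2
    if half >= 1:  # "half" is only an option when it is at least 1 MB
        private_options.add(half)

    splits = []
    for private_mb in sorted(private_options):
        cache_mb = sram_mb - private_mb
        if cache_mb <= 2:  # cache space is limited to 2 MB
            splits.append((cache_mb, private_mb))
    return splits


for size in (1, 2, 4):
    print(size, "MB SRAM:", l3_configurations(size))
```

Under these assumptions a 4 MB system has only two options: 2 MB of cache plus 2 MB of private memory, or 4 MB of private memory.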
Rev. 5345B–HIREL–02/04
Screening
• CBGA Upscreenings Based on Atmel Standards
• Full Military Temperature Range (Tj = -55°C, +125°C), Industrial Temperature Range (Tj = -40°C, +110°C)
• CBGA Package, HiTCE Package for the 7447 TBC
CBGA 483
G suffix CBGA 360 Ceramic Ball Grid Array
GH suffix HITCE 360 Ceramic Ball Grid Array (TBC)
Block Diagram
Figure 1. PC7457 Microprocessor Block Diagram
[Figure: block diagram. Blocks shown:
Instruction unit: fetcher, branch processing unit (128-entry BTIC, 2048-entry BHT, CTR, LR), 12-word instruction queue
Instruction MMU: SRs (shadow), IBAT array, 128-entry ITLB; 32-Kbyte I cache with tags; 128-bit (4 instructions) fetch path
Dispatch unit: 96-bit (3 instructions) path; VR issue (4-entry/2-issue), GPR issue (6-entry/3-issue), FPR issue (2-entry/1-issue)
Data MMU: SRs (original), DBAT array, 128-entry DTLB; 32-Kbyte D cache with tags
Register files: VR file, GPR file, and FPR file, each with 16 rename buffers
Execution units (each fed by reservation stations): vector permute unit, vector integer unit 1, vector integer unit 2, vector FPU, integer unit 1 (3), integer unit 2, floating-point unit (with FPSCR), load/store unit (vector touch queue, vector touch engine, EA calculation, finished stores, completed stores, L1 castout, L1 push)
Completion unit: 16-entry completion queue; completes up to three instructions per clock
Memory subsystem: L1 service queues (L1 store queue (LSQ), L1 load queue (LLQ), L1 load miss (5), instruction fetch (2), L2 prefetch (3), cacheable store request (1)), L2 store queue (L2SQ), snoop push/interventions (4), L1 castouts
512-Kbyte unified L2 cache controller: line with block 0 (32-byte) and block 1 (32-byte), tags, status
L3 cache controller(1): L3CR, line block 0/1, tags, status; bus accumulator to external SRAM (1, 2, or 4 Mbytes) with 19-bit address and 64-bit data (8-bit parity)
System bus interface: load queue (11), bus store queue, castout queue (9)/push queue (10)(2), bus accumulator, 36-bit address bus, 64-bit data bus
Additional features: time base counter/decrementer, clock multiplier, JTAG/COP interface, thermal/power management, performance monitor]
Notes:
1. The L3 cache interface is not implemented on the PC7447.
2. The Castout Queue and Push Queue share resources for a combined total of 10 entries. The Castout Queue itself is limited to 9 entries, ensuring 1 entry will be available for a push.
General Parameters
Table 1 provides a summary of the general parameters of the PC7457.
Table 1. Device Parameters
Parameter            Description
Technology           0.13 µm CMOS, nine-layer metal
Die size             9.1 mm × 10.8 mm
Transistor count     58 million
Logic design         Fully-static
Packages             PC7447: surface mount 360 ceramic ball grid array (CBGA)
                     PC7457: surface mount 483 ceramic ball grid array (CBGA)
Core power supply    1.3V ±50 mV DC nominal or 1.1V ±50 mV nominal (see Table 3 on page 12 for recommended operating conditions)
I/O power supply     1.8V ±5% DC, or 2.5V ±5%
Features
This section summarizes features of the PC7457 implementation of the PowerPC architecture. Major features of the PC7457 are as follows:
• High-performance, superscalar microprocessor
  – As many as 4 instructions can be fetched from the instruction cache at a time
  – As many as 3 instructions can be dispatched to the issue queues at a time
  – As many as 12 instructions can be in the instruction queue (IQ)
  – As many as 16 instructions can be at some stage of execution simultaneously
  – Single-cycle execution for most instructions
  – One instruction per clock cycle throughput for most instructions
  – Seven-stage pipeline control
• Eleven independent execution units and three register files
  – Branch processing unit (BPU) features static and dynamic branch prediction
    • 128-entry (32-set, four-way set-associative) branch target instruction cache (BTIC), a cache of branch instructions that have been encountered in branch/loop code sequences. If a target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner than it can be made available from the instruction cache. Typically, a fetch that hits the BTIC provides the first four instructions in the target stream
    • 2048-entry branch history table (BHT) with two bits per entry for four levels of prediction: not-taken, strongly not-taken, taken, and strongly taken
    • Up to three outstanding speculative branches
    • Branch instructions that don't update the count register (CTR) or link register (LR) are often removed from the instruction stream
    • Eight-entry link register stack to predict the target address of Branch Conditional to Link Register (BCLR) instructions
  – Four integer units (IUs) that share 32 GPRs for integer operands
    • Three identical IUs (IU1a, IU1b, and IU1c) can execute all integer instructions except multiply, divide, and move to/from special-purpose register instructions
    • IU2 executes miscellaneous instructions including the CR logical operations, integer multiplication and division instructions, and move to/from special-purpose register instructions
  – Five-stage FPU and a 32-entry FPR file
    • Fully IEEE 754-1985-compliant FPU for both single- and double-precision operations
    • Supports non-IEEE mode for time-critical operations
    • Hardware support for denormalized numbers
    • Thirty-two 64-bit FPRs for single- or double-precision operands
  – Four vector units and 32-entry vector register file (VRs)
    • Vector permute unit (VPU)
    • Vector integer unit 1 (VIU1) handles short-latency AltiVec™ integer instructions, such as vector add instructions (vaddsbs, vaddshs, and vaddsws, for example)
    • Vector integer unit 2 (VIU2) handles longer-latency AltiVec integer instructions, such as vector multiply add instructions (vmhaddshs, vmhraddshs, and vmladduhm, for example)
    • Vector floating-point unit (VFPU)
  – Three-stage load/store unit (LSU)
    • Supports integer, floating-point, and vector instruction load/store traffic
    • Four-entry vector touch queue (VTQ) supports all four architected AltiVec data stream operations
    • Three-cycle GPR and AltiVec load latency (byte, half-word, word, vector) with one-cycle throughput
    • Four-cycle FPR load latency (single, double) with one-cycle throughput
    • No additional delay for misaligned access within double-word boundary
    • Dedicated adder calculates effective addresses (EAs)
    • Supports store gathering
    • Performs alignment, normalization, and precision conversion for floating-point data
    • Executes cache control and TLB instructions
    • Performs alignment, zero padding, and sign extension for integer data
    • Supports hits under misses (multiple outstanding misses)
    • Supports both big- and little-endian modes, including misaligned little-endian accesses
• Three issue queues FIQ, VIQ, and GIQ can accept as many as one, two, and three instructions, respectively, in a cycle. Instruction dispatch requires the following:
  – Instructions can be dispatched only from the three lowest IQ entries: IQ0, IQ1, and IQ2
  – A maximum of three instructions can be dispatched to the issue queues per clock cycle
  – Space must be available in the CQ for an instruction to dispatch (this includes instructions that are assigned a space in the CQ but not in an issue queue)
• Rename buffers
  – 16 GPR rename buffers
  – 16 FPR rename buffers
  – 16 VR rename buffers
• Dispatch unit
  – Decode/dispatch stage fully decodes each instruction
• Completion unit
  – The completion unit retires an instruction from the 16-entry completion queue (CQ) when all instructions ahead of it have been completed, the instruction has finished execution, and no exceptions are pending
  – Guarantees sequential programming model (precise exception model)
  – Monitors all dispatched instructions and retires them in order
  – Tracks unresolved branches and flushes instructions after a mispredicted branch
  – Retires as many as three instructions per clock cycle
• Separate on-chip L1 instruction and data caches (Harvard architecture)
  – 32 Kbyte, eight-way set-associative instruction and data caches
  – Pseudo least-recently-used (PLRU) replacement algorithm
  – 32-byte (eight-word) L1 cache block
  – Physically indexed/physical tags
  – Cache write-back or write-through operation programmable on a per-page or per-block basis
  – Instruction cache can provide four instructions per clock cycle; data cache can provide four words per clock cycle
  – Caches can be disabled in software
  – Caches can be locked in software
  – MESI data cache coherency maintained in hardware
  – Separate copy of data cache tags for efficient snooping
  – Parity support on cache and tags
  – No snooping of instruction cache except for icbi instruction
  – Data cache supports AltiVec LRU and transient instructions
  – Critical double- and/or quad-word forwarding is performed as needed. Critical quad-word forwarding is used for AltiVec loads and instruction fetches. Other accesses use critical double-word forwarding
• Level 2 (L2) cache interface
  – On-chip, 512 Kbyte, eight-way set-associative unified instruction and data cache
  – Fully pipelined to provide 32 bytes per clock cycle to the L1 caches
  – A total nine-cycle load latency for an L1 data cache miss that hits in L2
  – PLRU replacement algorithm
  – Cache write-back or write-through operation programmable on a per-page or per-block basis
  – 64-byte, two-sectored line size
  – Parity support on cache
• Level 3 (L3) cache interface (not implemented on PC7447)
  – Provides critical double-word forwarding to the requesting unit
  – Internal L3 cache controller and tags
  – External data SRAMs
  – Support for 1, 2, and 4M bytes (MB) total SRAM space
  – Support for 1 or 2 MB of cache space
  – Cache write-back or write-through operation programmable on a per-page or per-block basis
  – 64-byte (1 MB) or 128-byte (2 MB) sectored line size
  – Private memory capability for half (1 MB minimum) or all of the L3 SRAM space for a total of 1, 2, or 4 MB of private memory
  – Supports MSUG2 dual data rate (DDR) synchronous Burst SRAMs, PB2 pipelined synchronous Burst SRAMs, and pipelined (register-register) Late Write synchronous Burst SRAMs
  – Supports parity on cache and tags
  – Configurable core-to-L3 frequency divisors
  – 64-bit external L3 data bus sustains 64 bits per L3 clock cycle
• Separate memory management units (MMUs) for instructions and data
  – 52-bit virtual address; 32- or 36-bit physical address
  – Address translation for 4 Kbyte pages, variable-sized blocks, and 256M byte segments
  – Memory programmable as write-back/write-through, caching-inhibited/caching-allowed, and memory coherency enforced/memory coherency not enforced on a page or block basis
  – Separate IBATs and DBATs (eight each) also defined as SPRs
  – Separate instruction and data translation lookaside buffers (TLBs)
    • Both TLBs are 128-entry, two-way set-associative, and use LRU replacement algorithm
    • TLBs are hardware- or software-reloadable (that is, on a TLB miss a page table search is performed in hardware or by system software)
• Efficient data flow
  – Although the VR/LSU interface is 128 bits, the L1/L2/L3 bus interface allows up to 256 bits
  – The L1 data cache is fully pipelined to provide 128 bits/cycle to or from the VRs
  – L2 cache is fully pipelined to provide 256 bits per processor clock cycle to the L1 cache
  – As many as eight outstanding, out-of-order cache misses are allowed between the L1 data cache and L2/L3 bus
  – As many as 16 out-of-order transactions can be present on the MPX bus
  – Store merging for multiple store misses to the same line. Only coherency action taken (address-only) for store misses merged to all 32 bytes of a cache block (no data tenure needed)
  – Three-entry finished store queue and five-entry completed store queue between the LSU and the L1 data cache
  – Separate additional queues for efficient buffering of outbound data (such as castouts and write-through stores) from the L1 data cache and L2 cache
• Multiprocessing support features include the following:
  – Hardware-enforced, MESI cache coherency protocols for data cache
  – Load/store with reservation instruction pair for atomic memory references, semaphores, and other multiprocessor operations
• Power and thermal management
  – 1.3V or 1.1V processor core (see Table 3 on page 12)
  – The following three power-saving modes are available to the system:
    • Nap: Instruction fetching is halted. Only those clocks for the time base, decrementer, and JTAG logic remain running. The part goes into the doze state to snoop memory operations on the bus and then back to nap using a QREQ/QACK processor-system handshake protocol
    • Sleep: Power consumption is further reduced by disabling bus snooping, leaving only the PLL in a locked and running state. All internal functional units are disabled
    • Deep sleep: When the part is in the sleep state, the system can disable the PLL. The system can then disable the SYSCLK source for greater system power savings. Power-on reset procedures for restarting and relocking the PLL must be followed on exiting the deep sleep state
  – Thermal management facility provides software-controllable thermal management. Thermal management is performed through the use of three supervisor-level registers and a PC7457-specific thermal management exception
  – Instruction cache throttling provides control of instruction fetching to limit power consumption
• Performance monitor can be used to help debug system designs and improve software efficiency
• In-system testability and debugging features through JTAG boundary-scan capability
• Testability
  – LSSD scan design
  – IEEE 1149.1 JTAG interface
  – Array built-in self test (ABIST): factory test only
• Reliability and serviceability
  – Parity checking on system bus and L3 cache bus
  – Parity checking on the L2 and L3 cache tag arrays
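The cache and TLB organizations listed above imply a set count for each structure: capacity divided by the product of associativity and line size. The sketch below works this out in Python. The helper name `num_sets` is hypothetical, and the derived set counts other than the BTIC's 32 sets are computed here rather than quoted from the specification.

```python
def num_sets(capacity, ways, unit=1):
    """Number of sets in a set-associative structure.

    capacity: size in bytes (or in entries when unit == 1)
    ways:     associativity
    unit:     line/block size in bytes (1 for entry-based structures)
    """
    sets, remainder = divmod(capacity, ways * unit)
    if remainder:
        raise ValueError("capacity is not a whole number of sets")
    return sets


KB = 1024
print("L1 cache sets:", num_sets(32 * KB, ways=8, unit=32))    # 32-byte blocks
print("L2 cache sets:", num_sets(512 * KB, ways=8, unit=64))   # 64-byte lines
print("BTIC sets:    ", num_sets(128, ways=4))                 # entries, 4-way
print("TLB sets:     ", num_sets(128, ways=2))                 # entries, 2-way
```

The BTIC result agrees with the "32-set, four-way" figure given in the feature list. Each way of an L1 cache holds 32 KB / 8 = 4 KB, the same as the 4 Kbyte page size.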
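The 2048-entry, two-bit branch history table in the feature list can be illustrated with a conventional two-bit saturating counter scheme. This is a generic sketch, not the PC7457's actual logic: the index function (word address modulo the table size), the initial counter state, and the class name `BranchHistoryTable` are all assumptions made for illustration.

```python
class BranchHistoryTable:
    """Two-bit saturating-counter predictor (generic textbook scheme)."""

    # Counter values 0..3 map onto the four prediction levels.
    STATES = ("strongly not-taken", "not-taken", "taken", "strongly taken")

    def __init__(self, entries=2048, initial=1):
        self.entries = entries
        self.counters = [initial] * entries  # assumed reset state: not-taken

    def _index(self, address):
        # Assumption: index by word address; the real hashing is not documented here.
        return (address >> 2) % self.entries

    def state(self, address):
        return self.STATES[self.counters[self._index(address)]]

    def predict_taken(self, address):
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)  # saturate at 3
        else:
            self.counters[i] = max(0, self.counters[i] - 1)  # saturate at 0


bht = BranchHistoryTable()
branch = 0x1000
for outcome in (True, True, True, False):
    bht.update(branch, outcome)
print(bht.state(branch), bht.predict_taken(branch))
```

With two bits per entry, a single mispredicted iteration (for example a loop exit) moves a "strongly taken" entry only to "taken", so the next prediction for that branch is still taken.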
Signal Description
Figure 2. PC7457 Microprocessor Signal Groups
[Figure: signal group diagram. Signals shown, with signal counts where they follow from the signal names:
Address arbitration: BR (1), BG (1)
Address transfer: A[0:35] (36), AP[0:4] (5)
Address transfer attributes: TS (1), TT[0:4] (5), TBST (1), TSIZ[0:2] (3), GBL (1), WT (1), CI (1)
Address transfer termination: AACK (1), ARTRY (1), SHD0/SHD1 (2), HIT (1)
Data arbitration: DBG (1), DTI[0:3] (4), DRDY (1)
Data transfer: D[0:63] (64), DP[0:7] (8)
Data transfer termination: TA (1), TEA (1)
L3 cache address/data: L3_ADDR[17:0](1) (18), L3_DATA[0:63] (64), L3_DP[0:7] (8)
L3 cache clock/control: L3_VSEL (1), L3_CLK[0:1] (2), L3_ECHO_CLK[0:3] (4), L3_CNTL[0:1] (2)
Interrupts/resets: INT, SMI, MCP, SRESET, HRESET
Processor status/control: CKSTP_IN, CKSTP_OUT, TBEN, QREQ, QACK, BVSEL, BMODE[0:1], PMON_IN, PMON_OUT
Clock control: SYSCLK, PLL_CFG[0:3](2), PLL_EXT, EXT_QUAL, CLK_OUT
Test interface (JTAG): TCK, TDI, TDO, TMS, TRST
Power: VDD, OVDD, GVDD, AVDD, GND
Note: L3 cache interface is not supported in the PC7441, PC7445, or the PC7447]
Notes:
1. For the PC7457, there are 19 L3_ADDR signals (L3_ADDR[0:18]).
2. For the PC7447 and PC7457, there are 5 PLL_CFG signals (PLL_CFG[0:4]).
Detailed Specification
Scope
This specification describes the specific requirements for the microprocessor PC7457 in compliance with Atmel standard screening.
Applicable Documents
1. MIL-STD-883: Test methods and procedures for electronics
2. MIL-PRF-38535: Appendix A: General specifications for microcircuits
Requirements
General
The microcircuits are in accordance with the applicable documents and as specified herein.
Design and Construction
Terminal Connections
Depending on the package, the terminal connections are as shown in Table 16, Table 3 and Figure 2.
Absolute Maximum Ratings
Table 2. Absolute Maximum Ratings(1)
Symbol        Characteristic                                           Maximum Value          Unit
VDD(2)        Core supply voltage                                      -0.3 to 1.60           V
AVDD(2)       PLL supply voltage                                       -0.3 to 1.60           V
OVDD(3)(4)    Processor bus supply voltage, BVSEL = 0                  -0.3 to 1.95           V
OVDD(3)(5)    Processor bus supply voltage, BVSEL = HRESET or OVDD     -0.3 to 2.7            V
GVDD(3)(6)    L3 bus supply voltage, L3VSEL = ¬HRESET                  -0.3 to 1.65           V
GVDD(3)(7)    L3 bus supply voltage, L3VSEL = 0                        -0.3 to 1.95           V
GVDD(3)(8)    L3 bus supply voltage, L3VSEL = HRESET or GVDD           -0.3 to 2.7            V
VIN(9)(10)    Input voltage, processor bus                             -0.3 to OVDD + 0.3     V
VIN(9)(10)    Input voltage, L3 bus                                    -0.3 to GVDD + 0.3     V
VIN           Input voltage, JTAG signals                              -0.3 to OVDD + 0.3     V
TSTG          Storage temperature range                                -55 to 150             °C
Notes:
1. Functional and tested operating conditions are given in Table 3 on page 12. Absolute maximum ratings are stress ratings only, and functional operation at the maximums is not guaranteed. Stresses beyond those listed may affect device reliability or cause permanent damage to the device.
2. Caution: VDD/AVDD must not exceed OVDD/GVDD by more than 1V during normal operation; this limit may be exceeded for a maximum of 20 ms during power-on reset and power-down sequences.
3. Caution: OVDD/GVDD must not exceed VDD/AVDD by more than 2V during normal operation; this limit may be exceeded for a maximum of 20 ms during power-on reset and power-down sequences.
4. BVSEL must be set to 0, such that the bus is in 1.8V mode.
5. BVSEL must be set to HRESET or 1, such that the bus is in 2.5V mode.
6. L3VSEL must be set to ¬HRESET (inverse of HRESET), such that the bus is in 1.5V mode.
7. L3VSEL must be set to 0, such that the bus is in 1.8V mode.
8. L3VSEL must be set to HRESET or 1, such that the bus is in 2.5V mode.
9. Caution: VIN must not exceed OVDD or GVDD by more than 0.3V at any time including during power-on reset.
10. VIN may overshoot/undershoot to a voltage and for a maximum duration as shown in Figure 3.
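The supply-differential cautions in notes 2 and 3 of Table 2 reduce to two inequalities that can be checked for any pair of core and I/O rail voltages. The following is a minimal sketch of that check; the function name `supply_differential_ok` is hypothetical, and the 20 ms power-sequencing allowance is deliberately not modeled.

```python
def supply_differential_ok(core_v, io_v):
    """Steady-state check of Table 2 notes 2 and 3.

    core_v: VDD (or AVDD) in volts
    io_v:   OVDD (or GVDD) in volts
    The 20 ms exception during power-on/power-down is not modeled here.
    """
    core_over_io = core_v - io_v   # note 2: must not exceed 1 V
    io_over_core = io_v - core_v   # note 3: must not exceed 2 V
    return core_over_io <= 1.0 and io_over_core <= 2.0


print(supply_differential_ok(1.3, 2.5))   # nominal core with 2.5V I/O
print(supply_differential_ok(1.3, 0.0))   # core up, I/O rail still at 0V
print(supply_differential_ok(0.0, 2.5))   # I/O up, core rail still at 0V
```

The second and third cases show why rail sequencing matters: either rail coming up alone violates a limit, which the notes tolerate only for up to 20 ms.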
Recommended Operating Conditions
Table 3. Recommended Operating Conditions(1)
Symbol      Characteristic                                           Recommended Value (Min / Max)      Unit
VDD         Core supply voltage                                      1.3V ±50 mV or 1.1V ±50 mV         V
AVDD(2)     PLL supply voltage                                       1.3V ±50 mV or 1.1V ±50 mV         V
OVDD        Processor bus supply voltage, BVSEL = 0                  1.8V ±5%                           V
OVDD        Processor bus supply voltage, BVSEL = HRESET or OVDD     2.5V ±5%                           V
GVDD        L3 bus supply voltage, L3VSEL = 0                        1.8V ±5%                           V
GVDD        L3 bus supply voltage, L3VSEL = HRESET or GVDD           2.5V ±5%                           V
GVDD(3)     L3 bus supply voltage, L3VSEL = ¬HRESET                  1.5V ±5%                           V
VIN         Input voltage, processor bus                             GND / OVDD                         V
VIN         Input voltage, L3 bus                                    GND / GVDD                         V
VIN         Input voltage, JTAG signals                              GND / OVDD                         V
Tj          Die-junction temperature                                 -55 / 125                          °C
Notes:
1. These are the recommended and tested operating conditions. Proper device operation outside of these conditions is not guaranteed.
2. This voltage is the input to the filter discussed in Section “PLL Power Supply Filtering” on page 54 and not necessarily the voltage at the AVDD pin which may be reduced from VDD by the filter.
3. ¬HRESET is the inverse of HRESET.
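The supply entries in Table 3 are given as a nominal value with either an absolute (±50 mV) or a relative (±5%) tolerance. The short sketch below converts them into explicit minimum and maximum voltages. The helper name `window` is hypothetical, and the printed limits are derived arithmetic, not additional specification values.

```python
def window(nominal, tolerance, relative=False):
    """Return (min, max) for a nominal voltage.

    tolerance is in volts, or a fraction of nominal when relative=True.
    """
    delta = nominal * tolerance if relative else tolerance
    return (nominal - delta, nominal + delta)


rails = {
    "VDD 1.3V +/-50 mV": window(1.3, 0.050),
    "VDD 1.1V +/-50 mV": window(1.1, 0.050),
    "OVDD/GVDD 1.8V +/-5%": window(1.8, 0.05, relative=True),
    "OVDD/GVDD 2.5V +/-5%": window(2.5, 0.05, relative=True),
    "GVDD 1.5V +/-5%": window(1.5, 0.05, relative=True),
}
for name, (lo, hi) in rails.items():
    print(f"{name}: {lo:.3f} V to {hi:.3f} V")
```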
Figure 3. Overshoot/Undershoot Voltage
[Figure: input waveform showing the levels OVDD/GVDD + 20%, OVDD/GVDD + 5%, OVDD/GVDD, VIH, VIL, GND, GND – 0.3V, and GND – 0.7V. Overshoot/undershoot duration is not to exceed 10% of tSYSCLK.]
The PC7457 provides several I/O voltages to support both compatibility with existing systems and migration to future systems. The PC7457 core voltage must always be provided at nominal 1.3V (see Table 3 for actual recommended core voltage). Voltage to the L3 I/Os and processor interface I/Os is provided through separate sets of supply pins and may be provided at the voltages shown in Table 4. The input voltage threshold for each bus is selected by sampling the state of the voltage select pins at the negation of the signal HRESET. The output voltage will swing from GND to the maximum voltage applied to the OVDD or GVDD power pins.
Table 4. Input Threshold Voltage Setting
BVSEL Signal    Processor Bus Input Threshold is Relative to:    L3VSEL Signal(1)    L3 Bus Input Threshold is Relative to:    Notes
0               1.8V                                             0                   1.8V                                      (2)(3)(5)
¬HRESET         Not available                                    ¬HRESET             1.5V                                      (2)(4)(5)
HRESET          2.5V                                             HRESET              2.5V                                      (2)
1               2.5V                                             1                   2.5V                                      (2)
Notes:
1. Not implemented on PC7447.
2. Caution: The input threshold selection must agree with the OVDD/GVDD voltages supplied. See notes in Table 2.
3. If used, pull-down resistors should be less than 250Ω.
4. Applicable to L3 bus interface only. ¬HRESET is the inverse of HRESET.
5. 1.8V I/O mode and 1.5V I/O mode are not supported in N spec at VDD = 1.1V.
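Note 2 of Table 4 requires the strapped threshold to agree with the rail actually supplied. As a rough illustration, the sketch below encodes Table 4 as lookup tables and checks a supplied voltage against the selected nominal using the ±5% tolerance from Table 3. The strap labels (for example "NOT_HRESET" for ¬HRESET), the dictionary names, and the function name are hypothetical, and the restriction in note 5 is not modeled.

```python
# Table 4 as lookup tables; None marks the "Not available" setting.
PROCESSOR_BUS_THRESHOLD = {"0": 1.8, "NOT_HRESET": None, "HRESET": 2.5, "1": 2.5}
L3_BUS_THRESHOLD = {"0": 1.8, "NOT_HRESET": 1.5, "HRESET": 2.5, "1": 2.5}


def supply_matches_selection(table, strap, supply_v, tolerance=0.05):
    """True if supply_v lies within +/-tolerance of the strapped nominal."""
    nominal = table[strap]
    if nominal is None:          # setting not available on this bus
        return False
    return abs(supply_v - nominal) <= nominal * tolerance


print(supply_matches_selection(PROCESSOR_BUS_THRESHOLD, "0", 1.8))
print(supply_matches_selection(PROCESSOR_BUS_THRESHOLD, "HRESET", 1.8))
print(supply_matches_selection(L3_BUS_THRESHOLD, "NOT_HRESET", 1.5))
```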
Thermal Characteristics
Package Characteristics
Table 5. Package Thermal Characteristics(1)
Symbol          Characteristic                                                                              PC7447    PC7457    Unit
RθJA(2)(3)      Junction-to-ambient thermal resistance, natural convection                                  22        20        °C/W
RθJMA(2)(4)     Junction-to-ambient thermal resistance, natural convection, four-layer (2s2p) board         14        14        °C/W
RθJMA(2)(4)     Junction-to-ambient thermal resistance, 200 ft./min. airflow, single-layer (1s) board       16        15        °C/W
RθJMA(2)(4)     Junction-to-ambient thermal resistance, 200 ft./min. airflow, four-layer (2s2p) board       11        11        °C/W
RθJB(5)         Junction-to-board thermal resistance                                                        6         6         °C/W
RθJC(6)         Junction-to-case thermal resistance                                                         < 0.1     < 0.1     °C/W
Notes:
1. See “Thermal Management Information” on page 15 for more details about thermal management.
2. Junction temperature is a function of on-chip power dissipation, package thermal resistance, mounting site (board) temperature, ambient temperature, airflow, power dissipation of other components on the board, and board thermal resistance.
3. Per SEMI G38-87 and JEDEC JESD51-2 with the single-layer board horizontal.
4. Per JEDEC JESD51-6 with the board horizontal.
5. Thermal resistance between the die and the printed-circuit board per JEDEC JESD51-8. Board temperature is measured on the top surface of the board near the package.
6. Thermal resistance between the die and the case top surface as measured by the cold plate method (MIL SPEC-883 Method 1012.1) with the calculated case temperature. The actual value of RθJC for the part is less than 0.1°C/W.
Internal Package Conduction Resistance
For the exposed-die packaging technology, shown in Table 5 on page 13, the intrinsic conduction thermal resistance paths are as follows:
• The die junction-to-case (actually top-of-die since silicon die is exposed) thermal resistance
• The die junction-to-ball thermal resistance
Figure 4 depicts the primary heat transfer path for a package with an attached heat sink mounted to a printed-circuit board.
Figure 4. C4 Package with Heat Sink Mounted to a Printed-Circuit Board
[Figure: cross-section showing, from top to bottom, external resistance (radiation and convection), heat sink, thermal interface material, internal resistance (die/package, die junction, package/leads), printed-circuit board, and external resistance (radiation and convection).]
Note the internal versus external package resistance. Heat generated on the active side of the chip is conducted through the silicon, then through the heat sink attach material (or thermal interface material), and finally to the heat sink where it is removed by forced-air convection. Because the silicon thermal resistance is quite small, for a first-order analysis, the temperature drop in the silicon may be neglected. Thus, the thermal interface material and the heat sink conduction/convective thermal resistances are the dominant terms.
Thermal Management Information
This section provides thermal management information for the ceramic ball grid array (CBGA) package for air-cooled applications. Proper thermal control design is primarily dependent on the system-level design: the heat sink, airflow, and thermal interface material. To reduce the die-junction temperature, heat sinks may be attached to the package by several methods: spring clip to holes in the printed-circuit board or package, and mounting clip and screw assembly (see Figure 5). However, due to the potentially large mass of the heat sink, attachment through the printed-circuit board is suggested. If a spring clip is used, the spring force should not exceed 10 pounds.
Figure 5. Package Exploded Cross-sectional View with Several Heat Sink Options
[Figure: exploded cross-section showing heat sink, heat sink clip, thermal interface material, CBGA package, and printed-circuit board.]
Thermal Interface Materials
A thermal interface material is recommended at the package lid-to-heat sink interface to minimize the thermal contact resistance. For those applications where the heat sink is attached by spring clip mechanism, Figure 6 shows the thermal performance of three thin-sheet thermal-interface materials (silicone, graphite/oil, fluoroether oil), a bare joint, and a joint with thermal grease as a function of contact pressure. The use of thermal grease significantly reduces the interface thermal resistance: the bare joint results in a thermal resistance approximately seven times greater than the thermal grease joint. Often, heat sinks are attached to the package by means of a spring clip to holes in the printed-circuit board (see Figure 5). Given the resulting low interface pressure, the synthetic grease offers the best thermal performance and is recommended due to the high power dissipation of the PC7457. Of course, the selection of any thermal interface material depends on many factors: thermal performance requirements, manufacturability, service temperature, dielectric properties, cost, etc.
Figure 6. Thermal Performance of Select Thermal Interface Material
[Figure: plot of specific thermal resistance (K-in.2/W, 0 to 2) versus contact pressure (psi, 0 to 80) for silicone sheet (0.006 in.), bare joint, fluoroether oil sheet (0.007 in.), graphite/oil sheet (0.005 in.), and synthetic grease.]
Heat Sink Selection Example
For preliminary heat sink sizing, the die-junction temperature can be expressed as follows:
Tj = Ta + Tr + (RθJC + Rθint + Rθsa) × Pd
where:
Tj is the die-junction temperature
Ta is the inlet cabinet ambient temperature
Tr is the air temperature rise within the computer cabinet
RθJC is the junction-to-case thermal resistance
Rθint is the adhesive or interface material thermal resistance
Rθsa is the heat sink base-to-ambient thermal resistance
Pd is the power dissipated by the device
During operation, the die-junction temperature (Tj) should be maintained below the value specified in Table 3 on page 12. The temperature of the air cooling the component depends greatly on the ambient inlet air temperature and the air temperature rise within the electronic cabinet. An electronic cabinet inlet-air temperature (Ta) may range from 30° to 40°C. The air temperature rise within a cabinet (Tr) may be in the range of 5° to 10°C. The thermal resistance of the thermal interface material (Rθint) is typically about 1.5°C/W. For example, assuming a Ta of 30°C, a Tr of 5°C, a CBGA package RθJC = 0.1, and a typical power consumption (Pd) of 18.7W, the following expression for Tj is obtained:
Die-junction temperature: Tj = 30°C + 5°C + (0.1°C/W + 1.5°C/W + Rθsa) × 18.7W
For this example, a Rθsa value of 2.1°C/W or less is required to maintain the die junction temperature below the maximum value of Table 3 on page 12.
Though the die junction-to-ambient and the heat sink-to-ambient thermal resistances are a common figure-of-merit used for comparing the thermal performance of various microelectronic packaging technologies, one should exercise caution when using only this metric in determining thermal management, because no single parameter can adequately describe three-dimensional heat flow. The final die-junction operating temperature is a function not only of the component-level thermal resistance, but also of the system-level design and its operating conditions.
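The heat sink sizing expression above can be rearranged to give the largest acceptable Rθsa for a chosen junction-temperature limit. The sketch below does this with the example inputs from the text. The function names are hypothetical, and the junction limits passed in (105°C and 125°C) are inputs chosen for illustration: the text does not say which limit produces the quoted 2.1°C/W.

```python
def junction_temperature(t_a, t_r, r_jc, r_int, r_sa, p_d):
    """Tj = Ta + Tr + (RθJC + Rθint + Rθsa) × Pd, temperatures in °C."""
    return t_a + t_r + (r_jc + r_int + r_sa) * p_d


def max_heat_sink_resistance(t_j_max, t_a, t_r, r_jc, r_int, p_d):
    """Largest Rθsa (°C/W) that keeps Tj at or below t_j_max."""
    return (t_j_max - t_a - t_r) / p_d - r_jc - r_int


# Example inputs from the text: Ta = 30°C, Tr = 5°C, RθJC = 0.1, Rθint = 1.5, Pd = 18.7W
example = dict(t_a=30.0, t_r=5.0, r_jc=0.1, r_int=1.5, p_d=18.7)

for limit in (105.0, 125.0):  # assumed junction limits, for illustration only
    r_sa = max_heat_sink_resistance(limit, **example)
    print(f"Tj limit {limit:.0f}°C -> Rθsa <= {r_sa:.2f} °C/W")

print(f"Tj with Rθsa = 2.1: {junction_temperature(r_sa=2.1, **example):.2f} °C")
```

With these inputs, the 2.1°C/W figure corresponds to a junction limit of roughly 105°C; applying the 125°C maximum listed in Table 3 would permit a heat sink of about 3.2°C/W.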
In addition to the component's power consumption, a number of factors affect the final operating die-junction temperature: airflow, board population (local heat flux of adjacent components), heat sink efficiency, heat sink attach, heat sink placement, next-level interconnect technology, system air temperature rise, altitude, etc. Due to the complexity and the many variations of system-level boundary conditions for today's microelectronic equipment, the combined effects of the heat transfer mechanisms (radiation, convection, and conduction) may vary widely. For these reasons, we recommend using conjugate heat transfer models for the board, as well as system-level designs.
For system thermal modeling, the PC7447 and PC7457 thermal model is shown in Figure 7. Four volumes are used to represent this device. Two of the volumes, solder ball and air, and substrate, are modeled using the package outline size of the package. The other two, die, and bump and underfill, have the same size as the die. The silicon die should be modeled as 9.64 × 11 × 0.74 mm with the heat source applied as a uniform source at the bottom of the volume. The bump and underfill layer is modeled as 9.64 × 11 × 0.69 mm (or as a collapsed volume) with orthotropic material properties: 0.6W/(m × K) in the xy-plane and 2W/(m × K) in the direction of the z-axis. The substrate volume is 25 × 25 × 1.2 mm (PC7447) or 29 × 29 × 1.2 mm (PC7457), and this volume has 18W/(m × K) isotropic conductivity. The solder ball and air layer is modeled with the same horizontal dimensions as the substrate and is 0.9 mm thick. It can also be modeled as a collapsed volume using orthotropic material properties: 0.034W/(m × K) in the xy-plane direction and 3.8W/(m × K) in the direction of the z-axis.
Figure 7. Recommended Thermal Model of PC7447 and PC7457
[Figure: side views of the model (not to scale) showing the stacked volumes die, bump and underfill, substrate, and solder ball and air, with x, y, and z axes.]
Conductivity            Value     Unit
Bump and Underfill
  kx                    0.6       W/(m × K)
  ky                    0.6       W/(m × K)
  kz                    2         W/(m × K)
Substrate
  k                     18        W/(m × K)
Solder Ball and Air
  kx                    0.034     W/(m × K)
  ky                    0.034     W/(m × K)
  kz                    3.8       W/(m × K)
Power Consumption

Table 6. Power Consumption for PC7457

                                   Processor (CPU) Frequency
                                   600 MHz    1000 MHz    1000 MHz    1200 MHz    Unit
Full-Power Mode
  Core Power Supply                1.1        1.1         1.3         1.3         V
  Typical(1)(2)                    5.3        8.3         15.8        17.5        W
  Maximum(1)(3)                    7.9        11.5        22.0        24.2        W
Nap Mode
  Typical(1)(2)                    1.3        1.3         5.2         5.2         W
Sleep Mode
  Typical(1)(2)                    1.2        1.2         5.1         5.1         W
Deep Sleep Mode (PLL Disabled)
  Typical(1)(2)                    1.1        1.1         5.0         5.0         W

Notes:
1. These values apply for all valid processor bus and L3 bus ratios. The values do not include I/O supply power (OVDD and GVDD) or PLL supply power (AVDD). OVDD and GVDD power is system dependent, but is typically < 5% of VDD power. Worst-case power consumption for AVDD < 3 mW.
2. Typical power is an average value measured at the nominal recommended VDD (see Table 3 on page 12) and 65°C while running the Dhrystone 2.1 benchmark and achieving 2.3 Dhrystone MIPs/MHz.
3. Maximum power is the average measured at nominal VDD and maximum operating junction temperature (see Table 3 on page 12) while running an entirely cache-resident, contrived sequence of instructions which keep all the execution units maximally busy.
4. Doze mode is not a user-definable state; it is an intermediate state between full-power and either nap or sleep mode. As a result, power consumption for this mode is not tested.
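The full-power figures and note 1 can be combined into a first-pass supply budget. The Python sketch below is only an illustration under stated assumptions: the function name `supply_budget` is hypothetical, the "typically < 5%" I/O figure from note 1 is applied to the maximum core power as if it were an upper bound, and the core current is the simple average P / V at nominal VDD.

```python
# Full-power values from Table 6: (frequency in MHz, VDD in V) -> (typical W, maximum W)
FULL_POWER_W = {
    (600, 1.1): (5.3, 7.9),
    (1000, 1.1): (8.3, 11.5),
    (1000, 1.3): (15.8, 22.0),
    (1200, 1.3): (17.5, 24.2),
}

IO_FRACTION = 0.05   # note 1: OVDD and GVDD power is typically < 5% of VDD power
AVDD_MAX_W = 0.003   # note 1: worst-case AVDD power < 3 mW


def supply_budget(freq_mhz: int, vdd: float) -> dict:
    """Rough budget for one Table 6 operating point (assumption-laden)."""
    typical_w, maximum_w = FULL_POWER_W[(freq_mhz, vdd)]
    return {
        # Average core supply current at maximum power, I = P / V.
        "core_current_a": maximum_w / vdd,
        # Core power plus assumed I/O share plus PLL supply.
        "total_w": maximum_w * (1.0 + IO_FRACTION) + AVDD_MAX_W,
        "typical_core_w": typical_w,
    }


if __name__ == "__main__":
    budget = supply_budget(1200, 1.3)
    print(f"core current: {budget['core_current_a']:.1f} A")
    print(f"total power:  {budget['total_w']:.1f} W")
```

For the 1200 MHz, 1.3V point this gives roughly 18.6 A of core current and about 25.4 W in total; actual I/O power is system dependent, as note 1 states.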
Electrical Characteristics
Static Characteristics
Table 7 provides the DC electrical characteristics for the PC7457.

Table 7. DC Electrical Specifications (see Table 3 on page 12 for Recommended Operating Conditions)

Characteristic / Nominal Bus Voltage(1)    Symbol            Min                    Max                    Unit

Input high voltage (all inputs including SYSCLK)
  1.5                                      VIH(2)            GVDD × 0.65            GVDD + 0.3             V
  1.8                                      VIH               OVDD/GVDD × 0.65       OVDD/GVDD + 0.3        V
  2.5                                      VIH               1.7                    OVDD/GVDD + 0.3        V

Input low voltage (all inputs including SYSCLK)
  1.5                                      VIL(2)(6)         -0.3                   GVDD × 0.35            V
  1.8                                      VIL               -0.3                   OVDD/GVDD × 0.35       V
  2.5                                      VIL               -0.3                   0.7                    V

Input leakage current, VIN = GVDD/OVDD
  –                                        IIN(2)(3)         –                      30                     µA

High-impedance (off-state) leakage current, VIN = GVDD/OVDD
  –                                        ITSI(2)(3)(4)     –                      30                     µA

Output high voltage, IOH = -5 mA
  1.5                                      VOH(6)            OVDD/GVDD – 0.45       –                      V
  1.8                                      VOH               OVDD/GVDD – 0.45       –                      V
  2.5                                      VOH               1.8                    –                      V

Output low voltage, IOL = 5 mA
  1.5                                      VOL(6)            –                      0.45                   V
  1.8                                      VOL               –                      0.45                   V
  2.5                                      VOL               –                      0.6                    V

Capacitance, VIN = 0V, f = 1 MHz
  L3 interface(5)                          CIN               –                      9.5                    pF
  All other inputs(5)                      CIN               –                      8.0                    pF

Notes:
1. Nominal voltages; see Table 3 on page 12 for recommended operating conditions.
2. For processor bus signals, the reference is OVDD while GVDD is the reference for the L3 bus signals.
3. Excludes test signals and IEEE 1149.1 boundary scan (JTAG) signals.
4. The leakage is measured for nominal OVDD/GVDD and VDD, or both OVDD/GVDD and VDD must vary in the same direction (for example, both OVDD and VDD vary by either +5% or -5%).
5. Capacitance is periodically sampled rather than 100% tested.
6. Applicable to L3 bus interface only.
Dynamic Characteristics
This section provides the AC electrical characteristics for the PC7457. After fabrication, functional parts are sorted by maximum processor core frequency as shown in section “Clock AC Specifications” and tested for conformance to the AC specifications for that frequency. The processor core frequency is determined by the bus (SYSCLK) frequency and the settings of the PLL_CFG[0:4] signals. Parts are sold by maximum processor core frequency; see “Ordering Information” on page 59.

Table 8 provides the clock AC timing specifications as defined in Figure 8 and represents the tested operating frequencies of the devices. The maximum system bus frequency, fSYSCLK, given in Table 8 is considered a practical maximum in a typical single-processor system. The actual maximum SYSCLK frequency for any application of the PC7457 will be a function of the AC timings of the PC7457, the AC timings for the system controller, bus loading, printed-circuit board topology, trace lengths, and so forth, and may be less than the value given in Table 8.
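Since the core frequency follows from the SYSCLK frequency and the PLL setting, a candidate configuration can be checked against the Table 8 limits before it is committed to hardware. The Python sketch below is only an illustration: `check_clocking` is a hypothetical name, the `multiplier` argument stands in for whatever core-to-bus ratio a given PLL_CFG[0:4] setting selects (the encodings are not covered in this section), and the VCO limits are not checked because the VCO-to-core relationship depends on that setting.

```python
# Core frequency limits from Table 8: (speed grade in MHz, VDD in V) -> (min MHz, max MHz)
CORE_LIMITS_MHZ = {
    (600, 1.1): (500, 600),
    (867, 1.1): (500, 867),
    (1000, 1.1): (500, 1000),
    (867, 1.3): (600, 867),
    (1000, 1.3): (600, 1000),
    (1200, 1.3): (600, 1200),
    (1267, 1.3): (600, 1267),
}

SYSCLK_LIMITS_MHZ = (33, 167)  # fSYSCLK min and max from Table 8


def check_clocking(sysclk_mhz: float, multiplier: float, grade_mhz: int, vdd: float):
    """Return (core frequency in MHz, list of limit violations)."""
    core_mhz = sysclk_mhz * multiplier
    core_min, core_max = CORE_LIMITS_MHZ[(grade_mhz, vdd)]
    problems = []
    if not SYSCLK_LIMITS_MHZ[0] <= sysclk_mhz <= SYSCLK_LIMITS_MHZ[1]:
        problems.append("SYSCLK frequency outside 33 to 167 MHz")
    if core_mhz < core_min:
        problems.append("core frequency below minimum for this grade")
    if core_mhz > core_max:
        problems.append("core frequency above maximum for this grade")
    return core_mhz, problems


if __name__ == "__main__":
    print(check_clocking(133, 7.5, 1000, 1.3))
    print(check_clocking(167, 8, 1200, 1.3))
```

A 133 MHz bus with a 7.5x ratio yields 997.5 MHz, which fits the 1000 MHz grade at 1.3V, while 167 MHz with an 8x ratio exceeds the 1200 MHz grade. Passing this check does not guarantee the bus frequency is achievable in a given system, for the board-level reasons listed above.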
Clock AC Specifications

Table 8. Clock AC Timing Specifications (See Table 3 on page 12 for Recommended Operating Conditions)

Maximum Processor Core Frequency, VDD = 1.1V

                                                            600 MHz         867 MHz         1000 MHz
Characteristic                          Symbol              Min     Max     Min     Max     Min     Max     Unit
Processor frequency                     fCORE(1)            500     600     500     867     500     1000    MHz
VCO frequency                           fVCO(1)             1000    1200    1000    1733    1000    2000    MHz
SYSCLK frequency                        fSYSCLK(1)(2)       33      167     33      167     33      167     MHz
SYSCLK cycle time                       tSYSCLK(2)          6       30      6       30      6       30      ns
SYSCLK rise and fall time               tKR, tKF(3)         –       1       –       1       –       1       ns
SYSCLK duty cycle measured at OVDD/2    tKHKL/tSYSCLK(4)    40      60      40      60      –       –       %
SYSCLK jitter(5)(6)                     –                   –       ±150    –       ±150    –       –       ps
Internal PLL relock time(7)             –                   –       100     –       100     –       –       µs

Maximum Processor Core Frequency, VDD = 1.3V

                                                            867 MHz         1000 MHz        1200 MHz        1267 MHz
Characteristic                          Symbol              Min     Max     Min     Max     Min     Max     Min     Max     Unit
Processor frequency                     fCORE(1)            600     867     600     1000    600     1200    600     1267    MHz
VCO frequency                           fVCO(1)             1200    1733    1200    2000    1200    2400    1200    2534    MHz
SYSCLK frequency                        fSYSCLK(1)(2)       33      167     33      167     33      167     33      167     MHz
SYSCLK cycle time                       tSYSCLK(2)          6       30      6       30      6       30      6       30      ns
SYSCLK rise and fall time               tKR, tKF(3)         –       1       –       1       –       1       –       1       ns
SYSCLK duty cycle measured at OVDD/2    tKHKL/tSYSCLK(4)    40      60      40      60      40      60      40      60      %
SYSCLK jitter(5)(6)                     –                   –       ±150    –       ±150    –       ±150    –       ±150    ps
Internal PLL relock time(7)             –                   –       100     –       100     –       100     –       100     µs
Notes:
1. Caution: The SYSCLK frequency and PLL_CFG[0:4] settings must be chosen such that the resulting SYSCLK (bus) frequency, CPU (core) frequency and PLL (VCO) frequency do not exceed their respective maximum or minimum operating frequencies. Refer to the PLL_CFG[0:4] signal description in “PLL Configuration” on page 51 for valid PLL_CFG[0:4] settings.
2. Assumes lightly-loaded, single-processor system.
3. Rise and fall times for the SYSCLK input measured from 0.4V to 1.4V.
4. Timing is guaranteed by design and characterization.
5. This represents total input jitter, short-term and long-term combined, and is guaranteed by design.
6. The SYSCLK driver’s closed loop jitter bandwidth should be