AT697E

  • Manufacturer:

    ATMEL

  • Package:

  • Description:

    AT697E - Rad-Hard 32 bit SPARC V8 Processor - ATMEL Corporation

AT697E Datasheet
Features
• SPARC V8 High Performance Low-power 32-bit Architecture
  – LEON2-FT 1.0.13 compliant
  – 8 Register Windows
• Advanced Architecture:
  – On-chip Amba Bus
  – 5 Stage Pipeline
  – 16 kbyte Multi-sets Data Cache
  – 32 kbyte Multi-sets Instruction Cache
• On-chip Peripherals:
  – Memory Interface
      PROM Controller
      SRAM Controller
      SDRAM Controller
  – Timers
      Two 24-bit Timers
      Watchdog Timer
  – Two 8-bit UARTs
  – Interrupt Controller with 4 External Programmable Inputs
  – 32 Parallel I/O Interface
  – 33MHz PCI Interface Compliant with 2.2 PCI Specification
• Integrated 32/64-bit IEEE 754 Floating-point Unit
• Fault Tolerance by Design
  – Full Triple Modular Redundancy (TMR)
  – EDAC Protection
  – Parity Protection
• Debug and Test Facilities
  – Debug Support Unit (DSU) for Trace and Debug
  – IEEE 1149.1 JTAG Interface
  – Four Hardware Watchpoints
• Speed Optimized Code RAM Interface
• 8, 16 and 40-bit boot-PROM (Flash) Interface Possibilities
• Operating range
  – Voltages
      3.3V +/- 0.30V for I/O
      1.8V +/- 0.15V for Core
  – Temperature -55°C to 125°C
• Clock: 0MHz up to 100MHz
• Power consumption: 1W at 100MHz
• Performance:
  – 86MIPS (Dhrystone 2.1)
  – 23MFLOPS (Whetstone)
• Radiation Performance
  – Total dose radiation capability (parametric & functional): 60Krads (Si)
  – SEU error rate better than 1 E-5 error/device/day
  – No Single Event Latchup below a LET threshold of 70 MeV.cm²/mg
• Package MCGA 349
• Mass: 9g
• Development Kit Including
  – AT697 Evaluation Board
  – AT697 Sample
  – GRMON Development Tool

Rad-Hard 32 bit SPARC V8 Processor
AT697E
Rev. 4226G–AERO–05/09

Description

The AT697 is a highly integrated, high-performance 32-bit RISC embedded processor based on the SPARC V8 architecture. The implementation is based on the European Space Agency (ESA) LEON2 fault tolerant model. By executing powerful instructions in a single clock cycle, the AT697 achieves throughputs approaching 1MIPS per MHz, allowing the system designer to optimize power consumption versus processing speed.
The AT697 is designed to be used as a building block in computers for on-board embedded real-time applications. It brings up-to-date functionality and performance to space applications. The AT697 only requires memory and application-specific peripherals to be added to form a complete on-board computer.

The AT697 contains an on-chip Integer Unit (IU), a Floating Point Unit (FPU), separate instruction and data caches, hardware multiplier and divider, interrupt controller, debug support unit with trace buffer, two 24-bit timers, Parallel and Serial interfaces, a Watchdog, a PCI Interface and a flexible Memory Controller.

The design is highly testable with the support of a Debug Support Unit (DSU) and a boundary scan through the JTAG interface.

An Idle mode holds the processor pipeline and allows the Timer/Counter, Serial ports and Interrupt system to continue functioning.

The processor is manufactured using the Atmel 0.18 µm CMOS process. It has been especially designed for space, by implementing on-chip concurrent transient and permanent error detection and correction.

The AT697E is the first version of the AT697 processor. A second version, the AT697F, is under development. The AT697F will have improved radiation capabilities (up to 100Krad) and will correct all the bugs described in the AT697E errata sheet. The AT697F will be pinout compatible with the AT697E.

Figure 1. AT697 Block Diagram
(Diagram not reproduced. Blocks shown: Integer Unit (SPARC V8), I-Cache, D-Cache, FPU, Memory Controller for Flash, SRAM and SDRAM, JTAG, DSU, AMBA Controller, AMBA bridge, PCI/AMBA bridge, PCI, Clock Generator, Interrupt Controller, PIO, RS232, Watchdog and Timers, connected through the AHB and APB buses.)

Pin Configuration

MCGA349 package

Table 1. AT697E MCGA349 pinout
A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 VDD18 VSS18 N.C.
PIO[13] CB[1] CB[6] D[3] D[8] D[12] D[17] D[21] D[25] D[30] VSS18 VDD18 VSS18 VDD18 VDD18 N.C. PIO[10] VSS33 CB[4] N.C. D[5] VSS33 D[18] D[23] N.C. N.C. VSS18 VDD18 VSS18 B C VDD18 VDD18 VSS18 PIO[9] PIO[11] VCC33 N.C. D[2] D[1] VCC33 VCC33 D[11] VCC33 D[22] D[26] D[28] VSS18 VDD18 VDD18 D VSS18 PIO[0] VCC33 N.C. N.C. Reserved PIO[15] VCC33 VSS33 VSS33 D[13] VSS33 VCC33 D[27] D[29] VCC33 D[31] VCC33 VSS18 E PIO[6] N.C. PIO[2] PIO[5] N.C. CB[0] VSS33 CB[7] D[6] Reserved D[7] D[14] VSS33 N.C. N.C. N.C. N.C. A[0] A[2] F PIO[1] PIO[4] N.C. PIO[3] VSS33 N.C. PIO[12] CB[2] VCC33 D[10] D[15] D[16] VSS33 VSS33 N.C. N.C. A[7] A[4] VSS33 G RAMS[1] RAMS[2] RAMOE[3] RAMS[4] RAMOE[1] VSS33 PIO[7] PIO[8] CB[3] D[4] N.C. D[19] A[1] A[3] A[12] A[6] VSS33 A[8] A[9] Table 2. AT697E MCGA349 pinout (suite) H 1 2 3 4 5 6 7 8 9 10 RAMOE[0] RAMOE[2] VCC33 RAMOE[4] RWE[1] RWE[3] RAMS[0] RAMS[3] CB[5] D[9] j VSS33 ROMS[1] ROMS[0] RWE[0] WRITE RWE[2] N.C. VCC33 PIO[14] D[0] k READ TCK TDI TDO VSS33 IOS TRST OE VSS33 N.C. l DSUACT DSURX DSUTX DSUEN TMS VSS33 SDDQM[0] BRDY SDRAS A/D[14] m BEXC SDCLK DSUBRE SDDQM[2] N.C. VSS33 VSS33 VCC33 A/D[22] VSS33 n VCC33 VSS33 SDDQM[1] N.C. SDDQM[3] GNT VCC33 A/D[21] A/D[16] PERR p SDWE PCI_CLK VSS33 SDCS[0] SDCAS A/D[24] A/D[30] A/D[18] A/D[17] IRDY 4 AT697E 4226G–AERO–05/09 AT697 H 11 12 13 14 15 16 17 18 19 D[20] D[24] N.C. A[10] N.C. A[11] A[19] A[13] A[15] j A[5] A[14] VCC33 VCC33 VSS33 VSS33 A[17] A[18] A[20] k A[16] A[26] A[21] A[27] VCC33 A[23] VSS33 A[22] A[25] l N.C. VDD_PLL N.C. LOCK A[24] RESET VCC33 VSS33 ERROR m A/D[12] AGNT[3] N.C. SKEW[1] Reserved LFT WDOG VSS_PLL SKEW[0] n A/D[9] A/D[1] VSS33 A/D[0] BYPASS AREQ[2] N.C. AREQ[3] VCC33 p A/D[15] A/D[8] A/D[5] AGNT[1] CLK VSS33 VSS33 N.C. AREQ[1] Table 3. AT697E MCGA349 pinout (suite 2) r 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 REQ N.C. PCI_RST N.C. N.C. N.C. SYSEN VSS33 TRDY PCI_LOCK VSS33 N.C. VCC33 VCC33 N.C. N.C. VCC33 N.C. AREQ[0] t VSS18 SDCS[1] A/D[31] A/D[29] N.C. 
A/D[27] VSS33 VSS33 VCC33 DEVSEL VCC33 A/D[11] A/D[7] VSS33 A/D[2] VCC33 AGNT[0] AGNT[2] VSS18 u VDD18 VDD18 VSS18 VCC33 A/D[26] IDSEL VCC33 FRAME N.C. STOP VSS33 PAR A/D[10] C/BE[0] VCC33 N.C. VSS18 VDD18 VDD18 VSS18 VDD18 VSS18 N.C. VSS33 C/BE[3] A/D[20] C/BE[2] VCC33 C/BE[1] VSS33 VSS33 A/D[4] N.C. VDD18 VDD18 VSS18 VDD18 VSS18 A/D[28] A/D[25] A/D[23] A/D[19] VSS33 VCC33 SERR A/D[13] VSS33 A/D[6] A/D[3] VSS18 VDD18 v w Notes: 1. ‘Reserved’ pins shall not be driven to any voltage 2. N.C. refers to unconnected pins 5 4226G–AERO–05/09 QFP256 Package Table 4. AT697E MQFP256 pinout pin number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 pin name VCC33 PCI_REQ PCI_GNT PCI_CLK PCI_RST SDCS0 VSS VDD18 SDCS1 SDWE SDRAS VSS VSS SDCAS VCC33 SDDQM0 SDDQM1 SDDQM2 SDDQM3 SDCLK BRDY BEXC VSS VSS DSUEN DSUTX DSURX DSUBRE DSUACT TRST pin number 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 pin name TCK TMS VSS TDI TDO WRITE READ OE IOS VCC33 ROMS0 ROMS1 RWE0 RWE1 RWE2 RWE3 RAMOE0 RAMOE1 RAMOE2 RAMOE3 RAMOE4 RAMS0 VCC33 RAMS1 RAMS2 RAMS3 VSS VDD18 RAMS4 PIO0 pin number 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 pin name PIO1 PIO2 PIO3 PIO4 PIO5 PIO6 VCC33 PIO7 PIO8 PIO9 VSS VDD18 PIO10 PIO11 Reserved PIO12 PIO13 PIO14 PIO15 VCC33 CB0 CB1 CB2 CB3 VCC33 CB4 CB5 CB6 CB7 D0 6 AT697E 4226G–AERO–05/09 AT697 Table 5. AT697E MQFP256 pinout (suite) pin number 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 pin name VCC33 D1 D2 D3 D4 D5 D6 Reserved VCC33 D7 D8 D9 D10 D11 D12 VCC33 D13 D14 D15 D16 D17 VSS D18 VCC33 D19 D20 D21 D22 D23 D24 VSS VDD18 VCC33 pin number 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 pin name D25 D26 D27 D28 D29 D30 VCC33 D31 N.C. 
A0 A1 VSS VDD18 A2 A3 A4 VCC33 A5 A6 A7 A8 A9 A10 VCC33 A11 A12 A13 A14 A15 A16 VCC33 A17 A18 pin number 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 pin name A19 A20 A21 A22 VSS VCC33 A23 A24 A25 A26 A27 WDOG ERROR VCC33 RESET Reserved LOCK SKEW1 SKEW0 BYPASS VSS_PLL FLT VDD_PLL CLK VCC33 PCI_AREQ3 PCI_AGNT3 PCI_AREQ2 VSS VDD18 PCI_AGNT2 PCI_AREQ1 VCC33 7 4226G–AERO–05/09 Table 6. AT697E MQFP256 pinout (suite 2) pin number 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 pin name PCI_AGNT1 PCI_AREQ0 PCI_AGNT0 A/D0 VCC33 A/D1 A/D2 A/D3 A/D4 VSS VDD18 VCC33 A/D5 A/D6 A/D7 C/BE0 VSS VCC33 A/D8 A/D9 A/D10 A/D11 VCC33 pin number 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 pin name A/D12 A/D13 A/D14 A/D15 VCC33 C/BE1 PAR SERR PERR VCC33 PCI_LOCK STOP DEVSEL TRDY VCC33 IRDY FRAME VSS C/BE2 A/D16 VCC33 A/D17 A/D18 pin number 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 pin name A/D19 SYSEN A/D20 VCC33 A/D21 A/D22 A/D23 IDSEL C/BE3 VCC33 A/D24 A/D25 A/D26 VSS VDD18 A/D27 VCC33 A/D28 A/D29 A/D30 A/D31 Notes: 1. ‘Reserved’ pins shall not be driven to any voltage 2. N.C. refers to unconnected pins 8 AT697E 4226G–AERO–05/09 AT697 Pin Description IU and FPU Signals A[27:0] - Address bus (output) A[27:0] bus carries the addresses during accesses to external memory. When access to cache memory is performed, the address of the last external memory access remains driven on the address bus. D[31:0] - Data bus (bi-directional) D[31:0] bus carries the data during accesses to memory. The processor automatically configures the bus as output and drive the lines during write cycles. During accesses to 8-bit areas, only D[31:24] are used. During accesses to 16-bit areas, only D[31:16] are used. 
CB[7:0] - Check bits (bi-directional)
CB[6:0] bus carries the EDAC checkbits during memory accesses. CB[7](1) takes the value of tcb[7] in the error control register. The processor only drives CB[7:0] during write cycles to areas programmed to be EDAC protected.
Note: 1. CB[7] is implemented to enable programming of flash memories. While only 7 bits are needed for EDAC protection, 8 are needed for programming.

Memory Interface Signals

General management

OE* - Output enable (output)
This active low output is asserted during read cycles on the memory bus.

BRDY* - Bus ready (input)
When driven low, this input indicates to the processor that the current memory access can be terminated on the next rising clock edge. When driven high, this input indicates to the processor that it must wait and not end the current access.

READ - Read cycle (output)
This active high output is asserted during read cycles on the memory bus.

WRITE* - Write enable (output)
This active low output provides a write strobe during write cycles on the memory bus.

PROM

ROMS*[1:0] - PROM chip-select (output)
These active low outputs provide the chip-select signal for the PROM area. ROMS*[0] is asserted when the lower half of the PROM area is accessed (0 - 0x10000000), while ROMS*[1] is asserted for the upper half.

SRAM

RAMOE*[4:0] - RAM output enable (output)
These active low signals provide an individual output enable for each RAM bank.

RAMS*[4:0] - RAM chip-select (output)
These active low outputs provide the chip-select signals for each RAM bank.

RWE*[3:0] - RAM write enable (output)
These active low outputs provide individual write strobes for each byte. RWE*[0] controls D[31:24], RWE*[1] controls D[23:16], etc.

I/O

IOS* - I/O select (output)
This active low output is the chip-select signal for the memory mapped I/O area.

SDRAM Interface

SDCLK - SDRAM clock (output)
SDRAM clock provides the SDRAM interface clock reference.
SDCAS* - SDRAM column address strobe (output)
This active low signal provides a common CAS for all SDRAM devices.

SDCS*[1:0] - SDRAM chip select (output)
These active low outputs provide the chip select signals for the two SDRAM banks.

SDDQM[3:0] - SDRAM data mask (output)
These active low outputs provide the DQM signals for both SDRAM banks.

SDRAS* - SDRAM row address strobe (output)
This active low signal provides a common RAS for all SDRAM devices.

SDWE* - SDRAM write strobe (output)
This active low signal provides a common write strobe for all SDRAM devices.

System Signals

CLK - Processor clock (input)
The CLK input provides the main processor clock reference.

RESET* - Processor reset (input)
When asserted, this active low input will reset the processor and all on-chip peripherals.

WDOG* - Watchdog time-out (open-drain output)
This active low output is asserted when the watchdog expires.

BEXC* - Bus exception (input)
This active low input is sampled simultaneously with the data during accesses on the memory bus. If asserted, a memory error will be generated.

ERROR* - Processor error (open-drain output)
This active low output is asserted when the processor has entered error state and is halted. This happens when traps are disabled and a synchronous (un-maskable) trap occurs.

PIO[15:0] - Parallel I/O port (bi-directional)
These bi-directional signals can be used as inputs or outputs to control external devices.

BYPASS - PLL bypass (input)
When driven to VCC, this active high input sets the PLL in bypass mode. The device is then directly clocked by the external clock. When grounded, the device is clocked through the PLL.

SKEW[1:0] - Clock tree skew (input)
These input signals configure the programmable skew on the triplicated clock trees.

LOCK - PLL lock (output)
This active high output is asserted when the PLL output (internal node) is locked at the frequency corresponding to four times the input command.
LFT - PLL passive low pass filter (input)
This input is used to connect the PLL passive low pass filter.

DSU Signals

DSUACT - DSU active (output)
This active high output is asserted when the processor is in debug mode and controlled by the DSU.

DSUBRE - DSU break enable (input)
A low-to-high transition on this active high input will generate a break condition and put the processor in debug mode.

DSUEN - DSU enable (input)
This active high input enables the DSU. If de-asserted, the DSU trace buffer will continue to operate but the processor will not enter debug mode.

DSURX - DSU receiver (input)
This active high input provides the data to the DSU communication link receiver.

DSUTX - DSU transmitter (output)
This active high output provides the output from the DSU communication link transmitter.

JTAG

TCK - Test Clock (input)
Used to clock serial data into boundary scan latches and control the sequence of the test state machine. TCK can be asynchronous with CLK.

TMS - Test Mode select (input)
Primary control signal for the state machine. Synchronous with TCK. A sequence of values on TMS adjusts the current state of the TAP.

TDI - Test data input (input)
Serial input data to the boundary scan latches. Synchronous with TCK.

TDO - Test data output (output)
Serial output data from the boundary scan latches. Synchronous with TCK.

TRST - Test Reset (input)
Resets the test state machine. Can be asynchronous with TCK. Shall be grounded for end application.

PCI Arbiter

AREQ*[3:0] - PCI bus request (input)
When asserted, these active low inputs indicate that a PCI agent is requesting the bus.

AGNT*[3:0] - PCI bus grant (output)
When asserted, these active low outputs indicate that a PCI agent is granted the PCI bus.

PCI interface

A/D[31:0] - PCI Address Data (bi-directional)
Address and Data are multiplexed on the same PCI pins. During the address phase, A/D[31::00] contain a physical address (32 bits).
For I/O, this is a byte address; for configuration and memory, it is a DWORD address. During data phases, A/D[07::00] contain the least significant byte and A/D[31::24] contain the most significant byte.

C/BE[3:0]* - PCI Bus Command and Byte Enables (bi-directional)
During the address phase of a transaction, C/BE[3::0]* define the bus command. During the data phase, C/BE[3::0]* are used as Byte Enables. The Byte Enables are valid for the entire data phase.

PAR - Parity (bi-directional)
The number of "1"s on A/D[31::00], C/BE[3::0]*, and PAR equals an even number.

FRAME* - Cycle Frame (bi-directional)
It is driven by the current master to indicate the beginning and duration of an access. FRAME* is asserted to indicate a bus transaction is beginning. While FRAME* is asserted, data transfers continue. When FRAME* is deasserted, the transaction is in the final data phase or has completed.

IRDY* - Initiator Ready (bi-directional)
IRDY* indicates the initiating agent's ability to complete the current data phase of the transaction. IRDY* is used in conjunction with TRDY*. During a write, IRDY* indicates that valid data is present on A/D[31::00]. During a read, it indicates the master is prepared to accept data.

TRDY* - Target Ready (bi-directional)
TRDY* indicates the target agent's (selected device's) ability to complete the current data phase of the transaction. TRDY* is used in conjunction with IRDY*. During a read, TRDY* indicates that valid data is present on A/D[31::00]. During a write, it indicates the target is prepared to accept data.

STOP* - Stop (bi-directional)
STOP* indicates the current target is requesting the master to stop the current transaction.

PCI_LOCK* - Lock (bi-directional)
PCI_LOCK* indicates an atomic operation to a bridge that may require multiple transactions to complete.

IDSEL - Initialization Device Select (input)
Initialization Device Select is used as a chip select during configuration read and write transactions.
DEVSEL* - Device Select (bi-directional)
When actively driven, indicates the driving device has decoded its address as the target of the current access. As an input, DEVSEL* indicates whether any device on the bus has been selected.

REQ* - PCI bus request (output)
REQ* indicates to the arbiter that this agent desires use of the bus. This is a point-to-point signal. Every master has its own REQ* which must be tri-stated while PCI_RST* is asserted.

GNT* - PCI Bus Grant (input)
GNT* indicates to the agent that access to the bus has been granted. This is a point-to-point signal. Every master has its own GNT* which must be ignored while PCI_RST* is asserted.

PCI_CLK - PCI clock (input)
PCI_CLK provides timing for all transactions on PCI. All other PCI signals, except PCI_RST*, are sampled on the rising edge of PCI_CLK and all other timing parameters are defined with respect to this edge.

PCI_RST* - PCI Reset (input)
Reset is used to bring PCI-specific registers, sequencers, and signals to a consistent state.

PERR* - Parity Error (bi-directional)
Parity Error is only for the reporting of data parity errors during all PCI transactions except a Special Cycle. The PERR* pin is sustained tri-state and must be driven active by the agent receiving data two clocks following the data when a data parity error is detected. The minimum duration of PERR* is one clock for each data phase in which a data parity error is detected.

SERR* - System Error (bi-directional)
System Error is for reporting address parity errors, data parity errors on the special cycle command, or any other system error where the result will be catastrophic. If an agent does not want a non-maskable interrupt (NMI) to be generated, a different reporting mechanism is required.

SYSEN* - PCI Host (input)
This active low input specifies the configuration of the device. At boot-up time, if SYSEN* is sampled at a low level, the device is configured as the host of the PCI bus.
If SYSEN* is sampled at a high level, the device is configured as a satellite.

CPU Core

This section discusses the SPARC core architecture in general. The main function of the CPU core is to ensure correct program execution. The CPU must therefore be able to access memories, perform calculations, control peripherals, and handle interrupts. The AT697 CPU core is based on the LEON2 architecture.

Figure 2. Block diagram of the AT697 Integer Unit architecture
(Diagram not reproduced. It shows the five pipeline stages Fetch, Decode, Execute, Memory and Write, with the I-cache feeding the Fetch stage, the register file supplying rs1 and rs2, the alu/shift and mul/div units in the Execute stage, and the D-cache accessed in the Memory stage.)

SPARC Architecture Overview

The AT697 integer unit (IU) implements SPARC integer instructions as defined in the SPARC Architecture Manual version 8. The IU is designed for highly dependable space and military applications by including fault tolerance features. To execute instructions at a rate approaching one instruction per clock cycle, the IU employs a five-stage instruction pipeline that permits parallel execution of multiple instructions.
• Instruction Fetch: If the instruction cache is enabled, the instruction is fetched from the instruction cache. Otherwise, the fetch is forwarded to the memory controller. The instruction is valid at the end of this stage and is latched inside the IU.
• Decode: The instruction is decoded and the operands are read. Operands may come from the register file or from internal data bypasses. CALL and Branch target addresses are generated in this stage.
• Execute: ALU, logical, and shift operations are performed. For memory operations and for JMPL/RETT, the address is generated.
• Memory: Data cache is accessed. For cache reads, the data will be valid by the end of this stage, at which point it is aligned as appropriate.
  Store data read out in the Execute stage is written to the data cache at this time.
• Write: The result of any ALU, logical, shift, or cache read operation is written back to the register file.

All five stages operate in parallel, working on up to five different instructions at a time. A basic 'single-cycle' instruction enters the pipeline and completes in five cycles. By the time it reaches the write stage, four more instructions have entered and are moving through the pipeline behind it. So, after the first five cycles, a single-cycle instruction exits the pipeline and a single-cycle instruction enters the pipeline on every cycle. Of course, a 'single-cycle' instruction actually takes five cycles to complete, but such instructions are called single-cycle because with this type of instruction the processor can complete one instruction per cycle after the initial five-cycle delay.

In order to maximize performance and parallelism, the AT697 SPARC implementation uses a powerful AMBA bus. Instructions in the program memory are executed through a five-stage pipeline. While one instruction is being executed, the next instruction is pre-fetched from the program memory. This concept enables an instruction to be executed in every clock cycle.

Program Counters

Two 32-bit program counters (PC and nPC) are provided. The 32-bit PC contains the address of the instruction currently being executed by the IU. The nPC holds the address of the next instruction to be executed (assuming a trap does not occur). When a trap occurs, the PC address is saved in the local register l1 while the nPC address is saved in the local register l2. When returning from the trap, the l1 value is copied back to PC and the l2 value is copied back to nPC.

ALU - Arithmetic Logic Unit

The high-performance ALU operates in direct connection with all the 32 general purpose working registers.
Within a single clock cycle, arithmetic operations between general purpose registers or between a register and an immediate memory address are executed. The implementation of the architecture also provides a powerful multiplier/divider supporting both signed and unsigned multiplication/division. Support for high performance 64-bit operation is also provided. The 32-bit Y register contains the most significant word of the double-precision product of an integer multiplication, as a result of either an integer multiply instruction, or of a routine that uses the integer multiply step instruction. The Y register also holds the most significant word of the double-precision dividend for an integer divide instruction.

Register File - Windows

The fast access register file contains 8 SPARC register windows. Each window consists of a 32-register set. When a program is running, it has access to 32 32-bit processor registers which include 8 global registers plus 24 registers that belong to the current register window.
• The first 8 registers in the window are called the 'in registers' (i0-i7). When a function is called, these registers may contain arguments that can be used.
• The next 8 are the 'local registers' (l0-l7), which are scratch registers that can be used for anything while the function executes.
• The last 8 registers are the 'out registers' (o0-o7), which the function uses to pass arguments to functions that it calls.

The AT697 register file implementation is based on two dual-port RAMs. The first dual-port RAM corresponds to the %rs1 operand of a SPARC instruction while the second corresponds to the %rs2 operand. The contents of the two dual-port RAMs are always equal.

When one function calls another, the called function can choose to execute a SAVE instruction. This instruction decrements an internal counter, the current window pointer (cwp), shifting the register window downward.
The caller's out registers then become the called function's in registers, and the called function gets a new set of local and out registers for its own use. Only the pointer changes because the registers and return address do not need to be stored on a stack. The RESTORE instruction acts in the opposite way.

Figure 3. Overlapping Windows
(Diagram not reproduced. It shows the eight windows w0 to w7 arranged in a ring around the shared globals, each window with its ins, locals and outs, the outs of one window being the ins of the adjacent window; Save and Restore move cwp around the ring in opposite directions.)

The Window Invalid Mask register (WIM) is controlled by supervisor software and is used by hardware to determine whether a window overflow or underflow trap is to be generated by a SAVE, RESTORE, or RETT instruction. When a SAVE, RESTORE, or RETT instruction is executed, the current value of the CWP is compared against the WIM register. If the SAVE, RESTORE, or RETT instruction would cause the CWP to point to an "invalid" register set, a window_overflow or window_underflow trap is caused.

To prevent erroneous operations from SEU errors in the main register file, each word is protected with a 7-bit EDAC checksum. The EDAC checksums are checked when the register is used as an operand in an instruction. Any single-bit error is corrected and written back to the register file before the instruction is executed. If an un-correctable error is detected, a register hardware error trap (trap 0x20) is generated. The protection can be enabled or disabled by programming the 'di' bit of the register file protection control register. By setting the 'te' bit, errors can be inserted in the register file to test the protection function. When the 'te' bit is set, the register checksum is combined with the 'tcb' field before being written to the register file.
Due to the presence of the two dual-port RAMs for the register file implementation, the following rules apply to the error injection test process.
• Test checkbits TCB[2:0] are XORed with checkbit[6:4] corresponding to the %rs1 operand.
• Test checkbits TCB[5:3] are XORed with checkbit[6:4] corresponding to the %rs2 operand.

Here is a simple example for the test of a single error in register file %rs1:

    ! 0x32 =
    !   register file test enable
    !   tcb[2:0] = 0x4
    !   tcb[5:3] = 0x1
    mov 0x32, %l1
    mov %l1, %asr16

    ! clear %l3
    ! => write 0x0 to %l3
    ! forces 0x08 as checkbit for %l3 (error insertion in %rs1 dual-port ram)
    mov %g0, %l3

    ! disable EDAC test mode
    mov %g0, %asr16

    ! access to %l3 as %rs1 operand
    ! => single error detection and correction
    add %l3,%l2,%l1

A correction counter 'cnt' is provided for error management. The 'cnt' field is incremented each time a register correction is performed. It saturates at "111".

State Register

The State Register (PSR) contains information about the result of the most recently executed arithmetic instruction. This information can be used for altering program flow in order to perform conditional operations. Note that the State Register is updated after all ALU operations, as specified in the SPARC architecture specification. This will in many cases remove the need for using the dedicated compare instructions, resulting in faster and more compact code. The State Register also provides some global information on the current window used, the authorized interrupts and peripheral (FPU and coprocessor) presence. Global interrupt management is provided through the processor state register. Traps and interrupts can be individually enabled or disabled from within this register.

Instruction Set

AT697 instructions fall into six functional categories: load/store, arithmetic/logical/shift, control transfer, read/write control register, floating-point, and miscellaneous.
Please refer to the SPARC V8 Architecture manual, which presents all the implemented instructions.

Floating Point Unit

The FPU is designed to provide execution of single and double-precision floating-point instructions. During the execution of floating-point instructions the processor pipeline is held. The FPU is designed for highly dependable space and military applications, by including fault tolerance features like error detection and correction and triple modular redundancy.

The FPU depends upon the IU to access all addresses and control signals for memory access. Floating-point loads and stores are executed in conjunction with the IU, which provides addresses and control signals while the FPU supplies or stores the data. Instruction fetch for integer and floating-point instructions is provided by the IU.

The FPU contains 32 32-bit floating-point f registers, which are numbered from f[0] to f[31]. Unlike the windowed r registers, at a given time an instruction has access to any of the 32 f registers. The f registers can be read and written by FPop (FPop1/FPop2 format) instructions, and by load/store single/double floating-point instructions (LDF, LDDF, STF, STDF).

Rounding Direction

Rounding direction for floating point results follows the ANSI/IEEE Standard 754-1985, with the following encoding:
• 0 = round to nearest
• 1 = round to zero
• 2 = round to +infinity
• 3 = round to -infinity

Figure 4. Rounding Direction Schematic
(Diagram not reproduced. It shows, for values below and above zero, the direction in which a result moves under round to -∞, round to +∞ and round to zero.)

Fault Tolerance

The processor has been especially designed for space application. To prevent erroneous operations from single event transient (SET) and single event upset (SEU) errors, the AT697 processor implements a set of protection features including:
• Full triple modular redundancy (TMR) architecture
  The TMR architecture is based on a fully triplicated clock distribution (CLK1, CLK2 and CLK3).
  The PCI clock and the CPU clock are built as three-clock trees. The same triplication is applied to the PCI reset and to the CPU reset. See figure 5 for an overview of the TMR architecture. Programmable skews on the clock trees are also provided to protect the processor against arbitrary single-event transient errors. Refer to the 'clock' section for detailed information on TMR implementation and skew implementation.
• EDAC protection on Regfile
• EDAC protection on external memory interface
• Parity protection on instruction and data caches

Figure 5. TMR structure - Clock triplication principle

Watch Points

The integer unit contains four hardware watch-points allowing generation of a trap on an arbitrary memory address range. Any binary aligned address range can be watched (the two least significant bits are ignored). Each watch-point consists of a pair of application-specific registers:
• break address register
  The break address defines a reference address for testing.
• mask register
  The mask indicates which bits of the break address register are to be effectively taken into account during the address test.

Configuration

A watchpoint is enabled by setting to logical one at least one of the three bits IF, DL or DS in the watchpoint address and mask registers. When all three bits are set to logical zero, the watchpoint is disabled.

If the instruction fetch bit (IF) from the watchpoint address register is set to logical one, any attempt to fetch an instruction from one of the addresses defined by ADDR and MASK results in a trap generation.

If the data store bit (DS) from the watchpoint address register is set to logical one, any attempt to store data to one of the addresses defined by ADDR and MASK results in a trap generation.

If the data load bit (DL) from the watchpoint mask register is set to logical one, any attempt to load data from one of the addresses defined by ADDR and MASK results in a trap generation.
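
As a rough illustration of the address test that the Operation paragraph describes (XOR with ADDR, AND with MASK, hit when the result is zero), here is a minimal Python sketch. The function name and the convention of passing the 30-bit ADDR and MASK fields as plain integers are assumptions made for the example, not part of the AT697 register definitions; the IF, DS and DL enable bits are left out.

```python
FIELD_BITS = 30                      # address bits 31..2 take part in the test
FIELD_MASK = (1 << FIELD_BITS) - 1


def watchpoint_hit(address: int, addr_field: int, mask_field: int) -> bool:
    """Return True when a 32-bit address falls in the watched range.

    addr_field and mask_field are the 30-bit ADDR and MASK fields
    (hypothetical plain-integer form, where bit 0 is address bit 2).
    """
    word_address = (address >> 2) & FIELD_MASK   # the two low bits are ignored
    difference = word_address ^ (addr_field & FIELD_MASK)
    return (difference & mask_field & FIELD_MASK) == 0


# Watch the 16-byte block starting at 0x40001000: compare every field bit
# except the two lowest ones (address bits 3..2).
ADDR = 0x40001000 >> 2
MASK = FIELD_MASK & ~0x3
```

With MASK set to all ones the watched range shrinks to a single 4-byte word; each low-order MASK bit that is cleared doubles the size of the range.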
Operation

To detect whether an address is part of the memory address range that traps, address bits 31 down to 2 are XORed with the ADDR field of the watchpoint address register. This operation is based on the following segmentation of an address.

Table 7. Address Segmentation
bit num.    31 ... 2    1 ... 0
field       Address     ignored

With such segmentation, it is possible to define a trap segment from 4 bytes up to 1 Gbyte. The result of the XOR is then ANDed with the MASK field of the watchpoint mask register. If the result is zero, the address is in the watched range: a watchpoint hit is detected and trap 0x0B is generated. If the result is different from zero, the address is outside the watched address range.

Figure 6. Watchpoint Hit Principle

Traps and Interrupts

Overview

The AT697 supports two types of traps:
- synchronous traps
- asynchronous traps, also called interrupts.

Synchronous traps are caused by hardware responding to a particular instruction: they occur during the instruction that caused them. Asynchronous traps occur when an external event interrupts the processor. They are not related to any particular instruction and occur between the execution of instructions.

A trap is a vectored transfer of control to the supervisor through a special trap table that contains the first four instructions of each trap handler. The base address of the table is established by the supervisor in the trap base register (TBR), and the displacement within the table is determined by the trap type. A trap causes the current window pointer to advance to the next register window and the hardware to write the program counters (PC & nPC) into two registers of the new window.

Synchronous Traps

The AT697 follows the general SPARC trap model.
The table below shows the implemented traps and their individual priority.

Table 8. Trap Overview
Trap                            TT (trap type)   Priority   Description
reset                           0x00             1          Power-on reset
write error                     0x2B             2          Write buffer error
instruction_access_exception    0x01             3          Error during instruction fetch; EDAC uncorrectable error during instruction fetch
illegal_instruction             0x02             5          UNIMP or other un-implemented instruction
privileged_instruction          0x03             4          Execution of privileged instruction in user mode
fp_disabled                     0x04             6          FP instruction while FPU disabled
cp_disabled                     0x24             6          Co-processor instruction while co-processor disabled
watchpoint_detected             0x0B             7          Instruction or data watchpoint match
window_overflow                 0x05             8          SAVE into invalid window
window_underflow                0x06             8          RESTORE into invalid window
register_hardware_error         0x20             9          Register file uncorrectable EDAC error
mem_address_not_aligned         0x07             10         Memory access to un-aligned address
fp_exception                    0x08             11         FPU exception
data_access_exception           0x09             13         Access error during load or store instruction
tag_overflow                    0x0A             14         Tagged arithmetic overflow
divide_exception                0x2A             15         Divide by zero
trap_instruction                0x80 - 0xFF      16         Software trap instruction (TA)

Traps Description

• reset - A reset trap is caused by an external reset request. It causes the processor to begin executing at virtual address 0. After a reset trap, no special memory states are defined except the PSR’s ‘et’ and ‘s’ bits, which are initialized to ‘0’ and ‘1’ respectively.
• write error - An error exception occurred on a data store to memory.
• instruction_access_exception - A blocking error exception occurred on an instruction access.
• illegal_instruction - An attempt was made to execute an instruction with an unimplemented opcode, or an UNIMP instruction, or an instruction that would result in illegal processor state.
• privileged_instruction - An attempt was made to execute a privileged instruction while the supervisor bit (s) in the PSR is ‘0’ (not in supervisor mode).
• fp_disabled - An attempt was made to execute an FPU instruction while the FPU is not enabled or not present.
• cp_disabled - An attempt was made to execute a co-processor instruction while the co-processor is not enabled or not present.
• watchpoint_detected - An instruction fetch memory address or load/store data memory address matched the contents of a pre-loaded implementation-dependent “watchpoint” register.
• window_overflow - A SAVE instruction attempted to cause the current window pointer (CWP) to point to an invalid window in the WIM.
• window_underflow - A RESTORE or RETT instruction attempted to cause the current window pointer (CWP) to point to an invalid window in the WIM.
• register_hardware_error - An error exception occurred on a read only register access. A register file uncorrectable error was detected.
• mem_address_not_aligned - A load/store instruction would have generated a memory address that was not properly aligned according to the instruction, or a JMPL or RETT instruction would have generated a non-word-aligned address.
• fp_exception - An FPU instruction generated an IEEE_754_exception and its corresponding trap enable mask (TEM) bit was 1, or the FPU instruction was unimplemented, or the FPU instruction did not complete, or there was a sequence or hardware error in the FPU. The type of floating-point exception is encoded in the FSR’s ftt field.
• data_access_exception - A blocking error exception occurred on a load/store data access (EDAC uncorrectable error).
• tag_overflow - A tagged arithmetic instruction was executed, and either arithmetic overflow occurred or at least one of the tag bits of the operands was non zero.
• trap_division_by_zero - An integer divide instruction attempted to divide by zero.
• trap_instruction - A software instruction (Ticc) was executed and the trap condition evaluated to true.
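The priority column of Table 8 and the trap table layout described in the overview (four instructions per trap handler entry) can be combined into a small model. This is a hedged sketch: the names `TRAPS`, `select_trap` and `handler_address` are hypothetical, priority 1 is assumed to be the highest (reset has priority 1 in Table 8), and the table base is assumed to be 4 Kbyte aligned as in the SPARC V8 TBR layout.

```python
# (tt, priority) pairs copied from Table 8; a lower number is assumed
# to mean a higher priority.
TRAPS = {
    "reset": (0x00, 1),
    "write_error": (0x2B, 2),
    "instruction_access_exception": (0x01, 3),
    "privileged_instruction": (0x03, 4),
    "illegal_instruction": (0x02, 5),
    "fp_disabled": (0x04, 6),
    "cp_disabled": (0x24, 6),
    "watchpoint_detected": (0x0B, 7),
    "window_overflow": (0x05, 8),
    "window_underflow": (0x06, 8),
    "register_hardware_error": (0x20, 9),
    "mem_address_not_aligned": (0x07, 10),
    "fp_exception": (0x08, 11),
    "data_access_exception": (0x09, 13),
    "tag_overflow": (0x0A, 14),
    "divide_exception": (0x2A, 15),
}


def select_trap(pending):
    """Return the pending trap name with the smallest priority number."""
    return min(pending, key=lambda name: TRAPS[name][1])


def handler_address(trap_base, tt):
    """Trap table entry address: 4 instructions x 4 bytes = 16 bytes per tt."""
    return (trap_base & 0xFFFFF000) | ((tt & 0xFF) << 4)


taken = select_trap(["data_access_exception", "illegal_instruction"])
print(taken, hex(handler_address(0x40000000, TRAPS[taken][0])))
```

Because each table entry holds only four instructions, a handler typically uses them to branch to the full routine.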
When multiple synchronous traps occur in the same cycle (e.g. hardware errors), the highest priority trap is taken and lower priority traps are ignored.

Asynchronous Traps / Interrupts

The AT697 handles 11 interrupts. Interrupts can be due to external interrupt requests not directly related to any particular instruction, or to an exception caused by a previously executed instruction.

Figure 7. Interrupt Controller Block Diagram

Operation

When an interrupt is generated, the corresponding bit is set in the interrupt pending register (ITP). The pending bits are ANDed with the interrupt mask register and then forwarded to the priority selector. The highest unmasked pending interrupt from priority level 1 is forwarded to the IU. If no unmasked pending interrupt exists on priority level 1, the highest unmasked interrupt from priority level 0 is forwarded. When the IU acknowledges the interrupt, the corresponding pending bit is automatically cleared.

An interrupt can also be forced by setting a bit in the interrupt force register. In this case, the IU acknowledgement clears the force bit rather than the pending bit.

After reset, the interrupt mask register is set to all zeros while the remaining control registers are undefined.

Interrupt List

The following table presents the assignment of the interrupts.

Table 9.
Interrupt Overview
Interrupt   TT (Trap Type)   Source
15          0x1F             unused
14          0x1E             PCI
13          0x1D             unused
12          0x1C             unused
11          0x1B             DSU trace buffer
10          0x1A             unused
9           0x19             Timer 2
8           0x18             Timer 1
7           0x17             I/O interrupt [3]
6           0x16             I/O interrupt [2]
5           0x15             I/O interrupt [1]
4           0x14             I/O interrupt [0]
3           0x13             UART 1
2           0x12             UART 2
1           0x11             Internal bus error

I/O interrupts

As an alternate function of the general purpose interface, the AT697 can accept interrupts from external devices. Up to four external interrupts can be programmed at the same time. The four interrupts are assigned to interrupts 4, 5, 6 and 7.

Each I/O interrupt is configured through four fields in the I/O interrupt register (IOIT): ENx, LEx, PLx and ISELx.

An I/O interrupt is enabled by setting the ENx bit to logical one in the IOIT register. Setting this bit to logical zero disables the interrupt.

The ISELx field in the IOIT register defines which port of the general purpose interface generates I/O interrupt x. The port can be selected from PIO[15:0] and D[15:0]*.

The trigger mode and polarity of each I/O interrupt can be configured individually.

When the LEx bit is set to logical one, the corresponding I/O interrupt is edge triggered. If the polarity bit (PLx) is set to logical one, the interrupt triggers when a rising edge is applied on the pin. If the polarity bit is set to logical zero, the interrupt triggers when a falling edge is applied on the pin.

When the LEx bit is set to logical zero, the corresponding I/O interrupt is level sensitive. If the polarity bit (PLx) is set to logical one, the interrupt triggers when a high level is applied on the pin. If the polarity bit is set to logical zero, the interrupt triggers when a low level is applied on the pin.

The following table summarizes the I/O interrupt configurations.

Table 10.
I/O Interrupt Configuration
LEx   PLx   Trigger
0     0     low level
0     1     high level
1     0     falling edge
1     1     rising edge

Interrupt Priority

The 15 interrupts handled by the AT697 are prioritised, with interrupt 15 (TT = 0x1F) having the highest priority and interrupt 1 (TT = 0x11) the lowest.

The priority of an interrupt can be changed using the two priority levels of the interrupt mask and priority register (ITMP). Each interrupt can be assigned to one of two levels as programmed in this register. Level 1 has higher priority than level 0. Within each level the interrupts are prioritised.

Cache Memories

Overview

The AT697 processor implements a Harvard architecture with separate instruction and data buses, connected to two independent cache controllers. To improve the speed of the CPU core, multi-set caches are used for both the instruction and data caches.

Both caches use a least recently used (LRU) replacement policy: the least recently used set of the cache is replaced when new data needs to be stored in the cache.

Cache mapping

Most of the main memory areas can be cached. The cacheable areas are the PROM and RAM areas. The following table presents the caching capabilities of the processor.

Table 11. Cache Capability List
Address Range              Area       Cache status
0x00000000 - 0x1FFFFFFF    PROM       Cached
0x20000000 - 0x3FFFFFFF    I/O        Non-cacheable
0x40000000 - 0x7FFFFFFF    RAM        Cached
0x80000000 - 0xFFFFFFFF    Internal   Non-cacheable

Operation

During normal operation, the processor accesses instructions and data using ASI 0x8 - 0xB as defined in the SPARC standard. Using the LDA/STA instructions, alternative address spaces such as the caches can be accessed. ASI[3:0] are used for the mapping; ASI[7:4] have no influence on operation.
• Access with ASI 0 - 3 will force a cache miss, and update the cache if the data was previously cached, or allocate a new line if the data was not in the cache and the address refers to a cacheable location.
• Access with ASI 4 and 7 will force a cache miss and update the cache if the data was previously cached.

The following table shows the ASI implementation on the AT697.

Table 12. ASI Usage
ASI                    Usage
0x0, 0x1, 0x2, 0x3     Forced cache miss (replace if cacheable)
0x4, 0x7               Forced cache miss (update on hit)
0x5                    Flush instruction cache
0x6                    Flush data cache
0x8, 0x9, 0xA, 0xB     Normal cached access (replace if cacheable)
0xC                    Instruction cache tags
0xD                    Instruction cache data
0xE                    Data cache tags
0xF                    Data cache data

Note: Please refer to the SPARC V8 specification for detailed information on ASI usage.

Instruction Cache

Overview

The AT697 instruction cache is a 32 kbyte multi-set cache divided into 4 memory sets. Using a multi-set cache improves the speed of the core. The instruction cache is divided into cache lines of 32 bytes of data. Each line has an associated cache tag consisting of a tag field and one valid bit per 4-byte sub-block.

The instruction cache operations are controlled with the cache control register (CCR). On an instruction cache miss to a cacheable location, the instruction is fetched and the corresponding tag and data line are updated.

The instruction cache always works in one of three modes:
• disabled,
• enabled or
• frozen.

Cache Control Operation

The current state of the instruction cache is reported in the instruction cache state field (ICS) of the cache control register (CCR).

Disabled mode
If disabled, no cache operation is performed and load and store requests are passed directly to the memory controller.

Enabled mode
If enabled, the cache operates as described above.
Freeze mode
In the frozen state, the cache is accessed and kept in synchronisation with the main memory as if it were enabled, but no new lines are allocated on read misses. If the freeze on interrupt bit (IF) is set to logical one in the cache control register (CCR), the instruction cache is frozen when an asynchronous interrupt is taken. This can be beneficial in real-time systems, as it allows a more accurate calculation of the worst-case execution time of a code segment. The execution of the interrupt handler will not evict any cache lines, and when control is returned to the interrupted task, the cache state is identical to what it was before the interrupt.

If a cache has been frozen by an interrupt, it can only be enabled again by enabling the cache in the CCR. This is typically done at the end of the interrupt handler, before control is returned to the interrupted task.

Burst fetch
Instruction burst fetch mode is enabled by setting the burst fetch bit (IB) to logical one in the cache control register. If burst fetch is enabled, the cache line is filled from main memory starting at the missed address and continuing until the end of the line. At the same time, the instructions are forwarded to the IU. If the IU cannot accept the streamed instructions due to internal dependencies or a multi-cycle instruction, the IU is halted until the line fill is completed. If the IU executes a control transfer instruction during the line fill, the line fill is terminated on the next fetch. If instruction burst fetch is enabled, instruction streaming is enabled even when the cache is disabled. In this case, the fetched instructions are only forwarded to the IU and the cache is not updated.

Cache Flush
The instruction cache can be flushed by executing the FLUSH instruction, by setting the flush instruction cache bit (FI) to logical one in the cache control register, or by writing any location with ASI=0x5.
The flush operation takes one cycle per line, during which the IU is not halted but the cache is disabled. When the flush operation is completed, the cache resumes the state indicated in the cache control register.

Error reporting
If a memory access error occurs during a line fill with the IU halted, the corresponding valid bit in the cache tag is not set. If the IU later fetches an instruction from the failed address, a cache miss occurs, triggering a new access to the failed address. If the error remains, an instruction access error trap (tt=0x1) is generated.

Instruction Cache Parity

Error detection of cache tags and data is implemented using two parity bits per tag and per 4-byte data sub-block. The tag parity is generated from the tag value and the valid bits. The data parity is derived from the sub-block data. The parity bits are written simultaneously with the associated tag or sub-block and checked on each access. The two parity bits correspond to the parity of the odd and even data (tag) bits.

If a tag parity error is detected during a cache access, a cache miss is generated. The tag and the data are automatically updated. All valid bits except the one corresponding to the newly loaded data are cleared. Each error is reported in the instruction cache tag error counter of the CCR. The instruction cache tag error counter (ITE) is incremented after each instruction cache tag error detection.

If a data sub-block parity error occurs, a miss is also generated but only the failed sub-block is updated with data from main memory. Each error is reported in the instruction cache data error counter of the CCR. The instruction cache data error counter (IDE) is incremented after each instruction cache data error detection.

Data Cache

Overview

The AT697 data cache is a 16 kbyte multi-set cache divided into 2 memory sets. Using a multi-set cache improves speed. The data cache is divided into cache lines of 16 bytes of data.
Each line has an associated cache tag consisting of a tag field and one valid bit per 4-byte sub-block. The data cache operations are controlled with the cache control register (CCR).

Cache Control Operation

Write
The write policy for stores is write-through with no-allocate on write-miss. The write buffer (WRB) consists of three 32-bit registers used to temporarily hold store data until it is sent to the destination device. For half-word or byte stores, the stored data is replicated into the proper byte alignment for writing to a word-addressed device before being loaded into one of the WRB registers. The WRB is emptied prior to a load-miss cache-fill sequence to prevent stale data from being read into the data cache.

Read
On a data cache read-miss to a cacheable location, 4 bytes of data are loaded into the cache from main memory.

Cache Flush
The data cache can be flushed by executing the FLUSH instruction, by setting the flush data cache bit (FD) to logical one in the cache control register, or by writing any location with ASI=0x6. The flush operation takes one cycle per line, during which the IU is not halted but the cache is disabled. When the flush operation is completed, the cache resumes the state indicated in the cache control register.

Since the processor executes in parallel with the write buffer, a write error will not cause an exception on the store instruction itself. Depending on memory and cache activity, the write cycle may not occur until several clock cycles after the store instruction has completed. If a write error occurs, the currently executing instruction takes trap 0x2B.

Note: the 0x2B trap handler should flush the data cache, since a write hit would update the cache while the memory would keep the old value due to the write error.

Error Reporting
If a memory access error occurs during a data load, the corresponding valid bit in the cache tag will not be set
and a data access error trap (tt=0x09) is generated.

Data Cache Parity

Error detection of cache tags and data is implemented using two parity bits per tag and per 4-byte data sub-block. The tag parity is generated from the tag value and the valid bits. The data parity is derived from the sub-block data. The parity bits are written simultaneously with the associated tag or sub-block and checked on each access. The two parity bits correspond to the parity of the odd and even data (tag) bits.

If a tag parity error is detected during a cache access, a cache miss is generated. The tag and the data are automatically updated. All valid bits except the one corresponding to the newly loaded data are cleared. Each error is reported in the data cache tag error counter of the CCR. The data cache tag error counter (DTE) is incremented after each data cache tag error detection.

If a data sub-block parity error occurs, a miss is also generated but only the failed sub-block is updated with data from main memory. Each error is reported in the data cache data error counter of the CCR. The data cache data error counter (DDE) is incremented after each data cache data error detection.

Data Cache Snooper

In addition to the cache controller, a snooper is implemented in the on-chip cache sub-system. The cache snooper is enabled by setting the snoop enable bit (DS) to logical one in the cache control register.

The snooper detects whether a master on the internal bus accesses and modifies cached data. If a master modifies data in memory and this data is cached, the snooper invalidates the corresponding cache tag. The next time the IU accesses the modified data, a cache miss is generated because the tag is no longer valid.

Diagnostic Cache Access

Tags and data in the instruction and data caches can be accessed through ASI address spaces 0xC, 0xD, 0xE and 0xF by executing LDA and STA instructions.
Address bits making up the cache offset are used to index the tag to be accessed, while the least significant bits of the bits making up the address tag are used to index the cache set.

Diagnostic read of tags is possible by executing an LDA instruction with ASI=0xC for instruction cache tags and ASI=0xE for data cache tags. The cache line and the cache set are indexed by the address bits making up the cache offset and the least significant bits of the address bits making up the address tag. Similarly, the data sub-blocks may be read by executing an LDA instruction with ASI=0xD for instruction cache data and ASI=0xF for data cache data. The sub-block to be read in the indexed cache line and set is selected by A[4:2].

The tags can be directly written by executing an STA instruction with ASI=0xC for the instruction cache tags and ASI=0xE for the data cache tags. The cache line and cache set are indexed by the address bits making up the cache offset and the least significant bits of the address bits making up the address tag. D[31:10] is written into the ATAG field and the valid bits are written with D[7:0] of the write data. The data sub-blocks can be directly written by executing an STA instruction with ASI=0xD for the instruction cache data and ASI=0xF for the data cache data. The sub-block to be written in the indexed cache line and set is selected by A[4:2].

Note: Diagnostic access to the cache is not possible during a FLUSH operation and will cause a data exception (trap=0x09) if attempted.

Memory Interface

Overview

The AT697 provides a 32-bit bus capable of interfacing PROM, memory-mapped I/O devices, asynchronous static RAM (SRAM) and synchronous dynamic RAM (SDRAM). The memory bus can be configured for 8-bit, 16-bit, 32-bit or 40-bit accesses.

The memory controller manages up to 2 Gbytes of external memory. The following table presents the memory controller address map.

Table 13.
Memory Controller address map
Address Range              Size    Mapping
0x00000000 - 0x1FFFFFFF    512M    PROM
0x20000000 - 0x2FFFFFFF    256M    I/O
0x40000000 - 0x7FFFFFFF    1G      SRAM/SDRAM

For applications that require smaller memory areas and/or lower performance, some memory spaces can be configured with an 8-bit or 16-bit wide data bus.

The memory interface is configured through the three memory controller registers: MCFG1, MCFG2 and MCFG3. MCFG1 is dedicated to PROM and I/O configuration. SRAM and SDRAM are configured through MCFG2 and MCFG3.

The figure below gives an overview of the 32-bit interconnection between the AT697 and external memories.

Figure 8. Memory Interface Overview

To improve the bandwidth of the memory bus, accesses to consecutive addresses can be performed in burst mode. Burst transfers are generated when the memory controller is accessed using a burst request from the internal bus. These include instruction cache-line fills, double loads and double stores. The timing of a burst cycle is identical to the programmed basic cycle, except that during read cycles the lead-out cycle only occurs after the last transfer.

RAM Interface

The memory controller can control up to 1 Gbyte of RAM. The global RAM area supports two RAM types: asynchronous static RAM (SRAM) and synchronous dynamic RAM (SDRAM).

SRAM interface

Overview

The SRAM interface can manage up to five SRAM banks. SRAM accesses are controlled through a standard set of pins, including chip select (RAMS*x), output enable (RAMOE*x) and write enable (RWE*x) lines.

The bank size of the first four banks of the SRAM area can be configured by setting the value of the SRAM bank size field in MCFG2.
The bank size can be programmed in binary steps from 8 Kbytes to 256 Mbytes. Whatever the size of the first four banks, they are always contiguous. These memory banks are selected with RAMS*[3] down to RAMS*[0].

The fifth SRAM bank, controlled by RAMS*[4], has a fixed size. This bank always resides at the upper address 0x60000000 and is always 256 Mbytes large.

Figure 9. SRAM bank organisation (memory assignment for 256MB, 128MB and 64MB bank sizes, from 0x40000000 upwards)

Notes:
1. If the SRAM bank size is set to 256 Mbytes, SRAM bank 2 and bank 3 overlap SRAM bank 4. In this case, the bank 2 and bank 3 control signals are never asserted. Bank 4 has priority.
2. When SDRAM is enabled, priority is given to the SDRAM. Any access to addresses higher than 0x60000000 is driven to SDRAM. No SRAM control is activated.

SRAM Read Access

A read access to SRAM consists of two data cycles and between zero and three waitstates. On non-consecutive accesses, a lead-out cycle is added after a read cycle to prevent bus contention due to the slow turn-off time of memories or I/O devices. On consecutive accesses, no lead-out cycle is performed between the accesses; only one is performed at the end of the operations (RAMSN and RAMOE are not deasserted).

When a read access to SRAM is performed, a separate output enable signal is provided for each SRAM bank and is only asserted when that bank is selected.

Figure 10.
SRAM read cycle (0-waitstate)

SRAM Write Access

Each byte lane has an individual write strobe (RAMWE*) to allow efficient byte and half-word writes. Each write access to SRAM consists of three cycles and between zero and three waitstates. The three mandatory cycles are one write setup cycle, one data cycle and one lead-out cycle.

Figure 11. SRAM write cycle (0-waitstate)

If the external memory uses a common write strobe for the full 16- or 32-bit data, set the read-modify-write bit in MCFG2. This enables read-modify-write cycles for sub-word writes.

Waitstates

For applications using slow SRAM memories, the SRAM controller can insert waitstates during SRAM accesses. Two types of waitstates can be inserted:
• Programmed delay, available for bank 0 up to bank 3
• ‘Hardware’ delay, available for bank 4 only

Up to three waitstates can be programmed for SRAM accesses. Read and write waitstates can be programmed individually. The RAMRWS value in the MCFG2 register defines the number of waitstates inserted during an SRAM read. The RAMWWS value in the MCFG2 register defines the number of waitstates inserted during an SRAM write.

Figure 12. RAM read access with one programmed waitstate

If the application needs more delay during the SRAM transfer, additional delay can be introduced by activating the hardware bus ready (BRDY*) detection in MCFG2. If the BRDY bit is set to logical one in MCFG2 and the BRDY* pin is high, the processor waits before ending the transfer. As soon as the BRDY* pin is driven low, the processor ends the access. If the BRDY bit is set to logical zero in MCFG2, no additional delay is inserted.

Figure 13.
RAM read access with one BRDY* controlled waitstate

Bus width

To support applications with low memory and performance requirements, the SRAM area can be configured for 8-bit operation. The SRAM is configured in 8-bit mode by programming the SRAM bus width field in the memory configuration register MCFG2. When the SRAM bus is configured as an 8-bit wide bus, data lines 31 down to 24 shall be used as the interface.

Figure 14. SRAM 8-bit bus width connection

Since access to memory is always done on a 32-bit word basis, a read access to 8-bit memory is transformed into a burst of four read cycles. If EDAC protection is active, 5 read cycles are necessary to complete the access (please refer to “Error Management EDAC” on page 41 for more details). During write operations, only the necessary bytes are written.

In addition to the 8-bit mode, the SRAM area can be configured for 16-bit accesses. In this configuration, the SRAM device is accessed with a burst of two 16-bit accesses. No EDAC protection can be used with such a configuration. When the bus is configured as a 16-bit wide bus, data lines 31 down to 16 shall be used as the interface.

Figure 15. SRAM 16-bit bus width connection

Write Protection

Write protection is provided to prevent accidental overwriting of the RAM area. Two block protection units are available for the RAM area. Each one is controlled through a write protection register (WPRn). Two major fields are defined: a TAG and a MASK.
• The TAG defines the 15 most significant bits of the address of the block to be write protected.
• The MASK specifies which bits of the TAG are relevant for the protection.
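How one protection unit could decide whether a write is trapped, using the TAG and MASK fields together with the EN and BP bits described in the Operation paragraph, can be sketched as follows. This is an illustrative model only: the function name `write_trapped` is hypothetical, and it assumes the 15-bit TAG lines up with address bits 29:15, which matches the 32 Kbyte block granularity of Table 14.

```python
def write_trapped(address, tag, mask, bp, en=True):
    """Return True when a write to `address` would be refused (trap 0x2B).

    tag, mask -- 15-bit TAG and MASK fields of one write protection register
    bp        -- block protect bit: True protects the inside of the block,
                 False protects everything outside it
    en        -- enable bit of the protection unit
    """
    if not en:
        return False
    # Assumption: TAG is compared with address bits 29:15.
    addr_tag = (address >> 15) & 0x7FFF
    # XOR finds differing bits; AND keeps the ones MASK marks as relevant.
    in_block = ((addr_tag ^ tag) & mask & 0x7FFF) == 0
    return in_block if bp else not in_block


# Protect the single 32 Kbyte block starting at 0x40008000 (TAG = 0x0001).
print(write_trapped(0x4000FFFC, tag=0x0001, mask=0x7FFF, bp=True))
```

Clearing low-order MASK bits enlarges the protected block in powers of two, since fewer TAG bits then take part in the comparison.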
Operation

The write protection on the RAM area is enabled by setting the enable bit (EN) to logical one in the write protect register (WPRn). If this bit is set to logical zero, no protection is active.

Two protection modes can be programmed. If the block protect bit (BP) of the write protect register (WPRn) is set to logical one, the protection is active within the segment. If the BP bit is set to logical zero, the exterior of the segment is protected.

Figure 16. RAM Protection Mode Overview (panels: Segment mode, bp = 0; Block mode, bp = 1)

To detect whether the written address is part of a protected segment (or block), address bits 29 down to 15 are XORed with the TAG field of the write protect register. This operation is based on the following segmentation of an address.

Table 14. Address Segmentation
bit num.    31 ... 30    29 ... 15                14 ... 0
field       area         most significant byte    32Kbyte protected block

With such segmentation, memory blocks in the range of 32 Kbytes up to 1 Gbyte can be protected. The result of the XOR is then ANDed with the MASK field of the write protect register. If the result is zero, the address is in the protected range. If the result is different from zero, the address is outside the protected address range.

If a write protection error is detected, the write cycle is stopped. A memory access error is then generated and trap 0x2B is taken.

Figure 17. RAM Write Protection Overview

SDRAM

The synchronous dynamic RAM interface can manage up to two SDRAM banks. SDRAM accesses are controlled through a standard set of pins, including chip select (SDCS*x), write enable (SDWE*), data mask (SDDQM*x) and clock lines.
The bank size of the two SDRAM banks is configured through the SDRAM bank size field in MCFG2. The bank size can be programmed in binary steps from 4 Mbytes to 512 Mbytes. The controller supports 64M, 256M and 512M devices with 8 to 12 column-address bits, up to 13 row-address bits, and 4 banks. Only a 32-bit data bus width is supported for SDRAM banks.

Address Mapping
The start address of the SDRAM banks depends on the SRAM use in the application. If the SRAM disable bit (SI) and the SDRAM enable bit (SE) are both set to logical one in memory configuration register 2 (MCFG2), the SDRAM start address is 0x40000000. If SI is set to logical zero and SE is set to logical one, the SDRAM start address is 0x60000000. If SE is set to logical zero, no SDRAM can be used. The address bus of the SDRAMs shall be connected to A[14:2] and the bank address to A[16:15]. Devices with fewer than 13 address pins should use only the least significant bits of A[14:2].

Figure 18. SDRAM connection overview
(Connection diagram: SDCLK, SDCSN[1:0], SDRAS*, SDCAS*, SDWE* and SDDQM[3:0] connect to CLK, CSN, RAS, CAS, WE and DQM; A[16:15] connects to BA, A[14:2] to A and D[31:0] to D.)

SDRAM Timing Parameters
To provide optimum access cycles for different SDRAM devices, some SDRAM parameters can be programmed through memory configuration register 2 (MCFG2). The programmable SDRAM parameters are the following:

Table 15. SDRAM Programmable Timing Parameters
Function                       Parameter   Range        Unit
CAS latency                                2 - 3        clocks
Precharge to activate          tRP         2 - 3        clocks
Auto-refresh command period    tRFC        3 - 11       clocks
Auto-refresh interval                      10 - 32768   clocks

SDRAM Commands
The SDRAM controller can issue three SDRAM commands. The command to execute is programmed through the SDRAM command field in memory configuration register 2 (MCFG2).
When this field is written with a non-zero value, an SDRAM command is issued:
• if set to ‘01’, a Precharge command is sent,
• if set to ‘10’, an Auto-Refresh command is sent,
• if set to ‘11’, a Load Mode Register (LMR) command is sent.

When the LMR command is issued, the CAS delay programmed in MCFG2 is used. The command field is cleared after a command is executed. When changing the value of the CAS delay, a LOAD-MODE-REGISTER command should be generated at the same time.

The SDRAM controller also provides automatic refresh. It is enabled by setting the refresh enable bit (SDRREF) to logical one in the memory configuration register. The Auto-Refresh command periodically refreshes both SDRAM banks. The period between two Auto-Refresh commands is programmed in the refresh counter reload field of the third memory configuration register (MCFG3). Depending on the SDRAM type, the required period is typically 7.8 or 15.6 μs, which corresponds to 780 or 1560 clock cycles at 100 MHz. The refresh period is calculated as:

Refresh Period = (Reload value + 1) / sysclk

SDRAM Initialisation
After reset, the SDRAM controller automatically performs the SDRAM initialisation sequence. It consists of a PRECHARGE, two AUTO-REFRESH cycles and a LOAD-MODE-REGISTER command on both banks simultaneously. The controller programs the SDRAM to use page burst on read and single location access on write. A CAS latency of 3 is programmed by default. This value can be updated later by software.

SDRAM Read Access
A read cycle consists of three main operations. First, an ACTIVATE command to the desired bank and row is performed. Then, after the programmed CAS delay, a READ command is sent. The read cycle is terminated with a PRE-CHARGE command. No bank is left open between two accesses. A burst read is performed if a burst access is requested on the internal bus.

SDRAM Write Access
A write cycle consists of three main operations.
First, an ACTIVATE command to the desired bank and row is performed. Then, a WRITE command is sent. The write cycle is terminated with a PRE-CHARGE command. A burst write on the internal bus generates a burst of write commands without idle cycles in between.

Access Error
An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting the BEXC* bit to logical one in memory configuration register 1 (MCFG1), the BEXC* signal is sampled with the data. If the BEXC* signal is driven low by the external device during the access, an error response is generated on the internal bus:
• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data load is in progress
• Trap 0x2B is taken if a data store is in progress

PROM Interface
Overview
The memory controller can control up to 512 Mbytes of PROM. The PROM interface can manage up to two PROM banks. PROM accesses are controlled with a standard set of pins, including chip selects (ROMS*x), output enable (OE*), read (READ) and write (WRITE*) lines.

The bank size of the PROM banks is not programmable. The lower half of the PROM area (0x00000000 up to 0x0FFFFFFF) is controlled by the ROMS0* PROM select signal. The upper half of the PROM area (0x10000000 up to 0x1FFFFFFF) is controlled by the ROMS1* PROM select signal.

PROM Read Access
A read access to PROM consists of two data cycles plus any programmed waitstates. On non-consecutive accesses, a lead-out cycle is added after a read cycle to prevent bus contention due to the slow turn-off time of memories or I/O devices. On consecutive accesses, no lead-out cycle is performed between the accesses; only one is performed at the end of the operations.

Figure 19. PROM Read Cycle (0 Waitstate)
(Timing diagram: data1, data2 and lead-out cycles; signals CLK, A, ROMS*, OE* and D.)

PROM Write Access
Each write access to PROM consists of three cycles plus any programmed waitstates.
The three mandatory cycles are one write setup cycle, one data cycle and one lead-out cycle. The write operation is strobed by the WRITE* signal.

Figure 20. PROM Write Cycle (0 waitstate)
(Timing diagram: lead-in, data and lead-out cycles; signals CLK, A, ROMS*, WRITE* and D.)

Waitstates
For applications using slow PROM memories, the PROM controller can insert waitstates during accesses. Two types of waitstates can be inserted:
• Programmed delay,
• ‘Hardware’ delay.

Up to 30 waitstates can be programmed for PROM accesses. Read and write waitstates can be programmed individually. The PRRWS field in the MCFG1 register defines the number of waitstates inserted during a PROM read access. The PRWWS field in the MCFG1 register defines the number of waitstates inserted during a PROM write access. The PRRWS and PRWWS fields can take values from 0 to 15. The effective number of waitstates applied during an access is twice the programmed value. For example, programming two waitstates results in the insertion of four wait cycles during the access.

If the application needs more delay during a PROM access, additional delay can be introduced through the bus ready line (BRDY*). If the BRDY* pin is held high, the processor waits before ending the transfer. As soon as the BRDY* pin is driven low, the processor ends the access.

After a reset of the processor (or at power-up), the read and write waitstate fields for the PROM area default to 15, resulting in 30 effective waitstates.

Write Protection
Write protection is provided to prevent accidental overwriting of the PROM area. It is controlled through the PROM write enable bit (PRWE*) in memory configuration register 1 (MCFG1). When set to 1, this bit enables writes to PROM. When set to 0, no PROM write cycle is available.

Bus width
To support applications with low memory and performance requirements, the PROM area can be configured for 8-bit operation.
The 8-bit mode is selected by programming the ROM bus width field in memory configuration register 1 (MCFG1). When the PROM bus is configured as an 8-bit wide bus, data lines D[31:24] shall be used as the interface.

Figure 21. PROM 8-bit bus width connection
(Connection diagram: AT697 A[27:0] and D[31:24] connect to the PROM address and data pins; ROMS0*, OE* and WRITE* connect to CS, OE and WE.)

Since memory is always accessed on a 32-bit word basis, a read access to 8-bit memory is transformed into a burst of four read cycles. If EDAC protection is active, five read cycles are necessary to complete the access (refer to the protection section for more details). During a write operation, only the necessary bytes are written.

In addition to the 8-bit mode, the PROM area can be configured for 16-bit accesses. In this configuration, the PROM device is accessed with a burst of two 16-bit accesses. No EDAC protection can be used with such a configuration. When the bus is configured as a 16-bit wide bus, data lines D[31:16] shall be used as the interface.

Figure 22. PROM 16-bit bus width connection
(Connection diagram: the PROM address and data pins connect to A[27:1] and D[31:16]; ROMS0*, OE* and WRITE* connect to CS, OE and WE.)

During power-up or reset, the PROM bus width field in MCFG1 is set to the value of the PIO[1:0] inputs.

Access Error
An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting the BEXC* bit to logical one in memory configuration register 1 (MCFG1), the BEXC* signal is sampled with the data.
• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data load is in progress
• Trap 0x2B is taken if a data store is in progress

Memory Mapped I/O
Overview
The memory controller can control up to 256 Mbytes of I/O. The I/O area consists of a single large bank. I/O accesses are controlled with a standard set of pins, including chip selects (IOS*x), output enable (OE*), read (READ) and write (WRITE*) lines.
The size of the I/O bank is not programmable. The entire I/O area (0x20000000 up to 0x2FFFFFFF) is controlled by the IOS* select signal.

I/O Read Access
A read access to I/O consists of a lead-in cycle, two data cycles, any programmed waitstates and a lead-out cycle. On non-consecutive accesses, the lead-out cycle is used to prevent bus contention due to the slow turn-off time of memories or I/O devices. On consecutive accesses, no lead-out cycle is performed between the accesses; only one is performed at the end of the operations. The I/O select signal (IOSEL*) is delayed by one clock to provide a stable address.

Figure 23. Single I/O read cycle with lead-out
(Timing diagram: lead-in, data 1, data 2 and lead-out cycles; signals CLK, A, IOS*, OE* and D.)

Figure 24. Consecutive I/O read cycles without lead-out
(Timing diagram: lead-in, data 1 and data 2 cycles for addresses A1 and A2, followed by a single lead-out cycle; signals CLK, A, IOS*, OE* and D.)

I/O Write Access
Each write access to I/O consists of three cycles plus any programmed waitstates. The three mandatory cycles are one write setup cycle, one data cycle and one lead-out cycle. The write operation is strobed by the WRITE* signal.

Figure 25. I/O write cycle
(Timing diagram: lead-in, data and lead-out cycles; signals CLK, A, IOS*, WRITE* and D.)

Waitstates
For applications using slow I/O devices, the I/O controller can insert waitstates during accesses. Two types of waitstates can be inserted:
• Programmed delay,
• ‘Hardware’ delay.

Up to 15 waitstates can be programmed for I/O accesses. Read and write waitstates are programmed together: the IOWS field in the MCFG1 register defines the number of waitstates inserted during any access to or from the I/O area. The IOWS field can take values from 0 to 15.

If the application needs more delay during an I/O access, additional delay can be introduced through the bus ready line (BRDY*).
If the bus ready bit (BRDY*) is set to logical one in MCFG1 and the BRDY* pin is held high, the processor waits before ending the transfer. As soon as the BRDY* pin is driven low, the processor ends the access.

Write Protection
Read and write protections are provided to prevent accidental accesses to the I/O area. Protection is controlled through the I/O protection bit (‘iop’) in memory configuration register 1 (MCFG1).

Bus width
To support applications with low memory and performance requirements, the I/O area can be configured for 8-bit operation. The 8-bit mode is selected by programming the I/O bus width field in memory configuration register 1 (MCFG1). In this configuration, the I/O device is not accessed by multiple 8-bit accesses as the other memory areas are; only a single access is performed. When the I/O bus is configured as an 8-bit wide bus, data lines D[31:24] shall be used as the interface.

Figure 26. I/O 8-bit bus width connection
(Connection diagram: AT697 A[27:0] and D[31:24] connect to the I/O device address and data pins; IOS*, OE* and WRITE* connect to CS, OE and WE.)

In addition to the 8-bit mode, the I/O area can be configured for 16-bit accesses. In this configuration, the I/O device is not accessed by multiple accesses as the other memory areas are; only a single access is performed. When the bus is configured as a 16-bit wide bus, data lines D[31:16] shall be used as the interface.

Figure 27. I/O 16-bit bus width connection
(Connection diagram: AT697 A[27:0] and D[31:16] connect to the I/O device address and data pins; IOS*, OE* and WRITE* connect to CS, OE and WE.)

Access Error
An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting the BEXC* bit to logical one in memory configuration register 1 (MCFG1), the BEXC* signal is sampled with the data.
• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data load is in progress
• Trap 0x2B is taken if a data store is in progress

Error Management EDAC
Overview
The AT697 processor implements an on-chip error detector and corrector (EDAC).
The on-chip memory EDAC can correct one error and detect two errors in a 32-bit word. The EDAC implementation corrects data on the fly, so no timing penalty occurs during correction.

EDAC capability mapping
Data error management with the EDAC can be used on both the PROM and RAM memory areas. The following table presents the EDAC protection capabilities provided by the processor.

Table 16. EDAC capability on Memories
Address Range              Area    Bus width   EDAC Protected
0x00000000 - 0x1FFFFFFF    PROM    8 bits      yes
                                   16 bits     no
                                   32 bits     yes
0x20000000 - 0x3FFFFFFF    I/O     All         no
0x40000000 - 0x7FFFFFFF    RAM     8 bits      yes
                                   16 bits     no
                                   32 bits     yes

PROM protection
Data protection is enabled by setting the PROM EDAC enable bit (PE) to logical one in memory configuration register MCFG3. For each read and write cycle to the PROM area, the EDAC then acts as an error detector and an error corrector. When PE is set to logical zero, the EDAC is transparent for PROM accesses. At power-on or at reset, the value of the PE bit is copied directly from the PIO2 pin. It is therefore possible to start the application with the EDAC enabled by driving PIO2 high during the power-on (or reset) sequence.

RAM protection
Data protection is enabled by setting the RAM EDAC enable bit (RE) to logical one in memory configuration register MCFG3. For each read and write cycle to the RAM area, the EDAC then acts as an error detector and an error corrector. When RE is set to logical zero, the EDAC is transparent for RAM accesses.

Operation
Hamming code
The processor uses an EDAC based on a seven-bit Hamming code that detects any double error and corrects any single error on a 32-bit bus. For each 32-bit data word, a 7-bit checksum is generated.
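The checkbit generation can be sketched in Python by transcribing the seven CB0 to CB6 equations listed in this section. This is a minimal sketch that takes those equations literally; the names `CHECKBIT_TAPS`, `checkbits` and `syndrome` are hypothetical, and no claim is made about how the on-chip syndrome generator decodes the result.

```python
# Data bit numbers XORed into each checkbit, CB0 first (from the CBx equations).
CHECKBIT_TAPS = [
    (0, 4, 6, 7, 8, 9, 11, 14, 17, 18, 19, 21, 26, 28, 29, 31),
    (0, 1, 2, 4, 6, 8, 10, 12, 16, 17, 18, 20, 22, 24, 26, 28),
    (0, 3, 4, 7, 9, 10, 13, 15, 16, 19, 20, 23, 25, 26, 29, 31),
    (0, 1, 5, 6, 7, 11, 12, 13, 16, 17, 21, 22, 23, 27, 28, 29),
    (2, 3, 4, 5, 6, 7, 14, 15, 18, 19, 20, 21, 22, 23, 30, 31),
    (8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31),
    (0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31),
]


def checkbits(word):
    """Compute the 7-bit checksum of a 32-bit data word."""
    cb = 0
    for index, taps in enumerate(CHECKBIT_TAPS):
        parity = 0
        for bit in taps:
            parity ^= (word >> bit) & 1
        cb |= parity << index
    return cb


def syndrome(word, stored_cb):
    """XOR of the recomputed and stored checksums; zero means no mismatch."""
    return checkbits(word) ^ stored_cb


word = 0xDEADBEEF
stored = checkbits(word)
print(syndrome(word, stored))              # unchanged word
print(syndrome(word ^ (1 << 16), stored))  # data bit 16 flipped
```

Because every data bit appears in at least one equation, flipping any single data bit yields a non-zero syndrome in this model.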
The equations below show how the Hamming checkbits (CBx) are generated:

CB0 = D0 ^ D4 ^ D6 ^ D7 ^ D8 ^ D9 ^ D11 ^ D14 ^ D17 ^ D18 ^ D19 ^ D21 ^ D26 ^ D28 ^ D29 ^ D31
CB1 = D0 ^ D1 ^ D2 ^ D4 ^ D6 ^ D8 ^ D10 ^ D12 ^ D16 ^ D17 ^ D18 ^ D20 ^ D22 ^ D24 ^ D26 ^ D28
CB2 = D0 ^ D3 ^ D4 ^ D7 ^ D9 ^ D10 ^ D13 ^ D15 ^ D16 ^ D19 ^ D20 ^ D23 ^ D25 ^ D26 ^ D29 ^ D31
CB3 = D0 ^ D1 ^ D5 ^ D6 ^ D7 ^ D11 ^ D12 ^ D13 ^ D16 ^ D17 ^ D21 ^ D22 ^ D23 ^ D27 ^ D28 ^ D29
CB4 = D2 ^ D3 ^ D4 ^ D5 ^ D6 ^ D7 ^ D14 ^ D15 ^ D18 ^ D19 ^ D20 ^ D21 ^ D22 ^ D23 ^ D30 ^ D31
CB5 = D8 ^ D9 ^ D10 ^ D11 ^ D12 ^ D13 ^ D14 ^ D15 ^ D24 ^ D25 ^ D26 ^ D27 ^ D28 ^ D29 ^ D30 ^ D31
CB6 = D0 ^ D1 ^ D2 ^ D3 ^ D4 ^ D5 ^ D6 ^ D7 ^ D24 ^ D25 ^ D26 ^ D27 ^ D28 ^ D29 ^ D30 ^ D31

Write operation
When the processor performs a write operation to a memory protected by the EDAC, it also outputs the seven-bit checksum on the CB[6:0] pins.

Read operation
During a read operation from a protected memory, the seven-bit checksum is sampled from the CB[6:0] inputs. The EDAC then verifies the checksum to check for the presence of an error. Using the checksum equations, the EDAC calculates its own checksum. A syndrome generator then uses the calculated and the read checksums to determine whether there is no error, one error or two errors in the read word.

Correctable error
If a single error is detected, it is a correctable error. The correction is done on the fly during the current access and no timing penalty is induced. The read-modify-write bit (RMW) in MCFG2 shall be set to enable write-back of the corrected data. The correctable error detection event is reported in the fail address register (FAILAR) and in the fail status register (FAILSR). If unmasked, interrupt 1 (trap 0x11) is generated.

Uncorrectable error
If a double error is detected, it is an uncorrectable error.
An uncorrectable error detected during a data access leads to a data access exception (trap 0x09). If the double error is detected during an instruction fetch, it leads to an instruction access error (trap 0x01).

Figure 28. EDAC overview
(Block diagram: the EDAC sits between the data bus, the address bus and CB[7:0], is configured through MCFG3, reports to FAILAR and FAILSR, and raises traps 0x01 and 0x09.)

EDAC on 8-bit areas
EDAC protection on a memory configured in 8-bit mode is also possible, but the EDAC checksum bus (CB[7:0]) is not used. The protection is done by allocating the top 25% of the memory bank to the EDAC checksums. If the EDAC is enabled, a read access reads the data bytes from the nominal address and the EDAC checksum from the top part of the bank. A write cycle is performed the same way. The memory assignment is then:
• 75% of the bank memory available as program or data memory,
• 18.75% used for checkbits,
• 6.25% unused.

EDAC testing
The operation of the EDAC can be tested through the MCFG3 memory configuration register.

Figure 29. EDAC testing overview
(Block diagram with two panels, write test and read test, showing the WB and RB bits and the 8-bit TCB field of MCFG3 between the data bus, the EDAC and CB[7:0].)

If the write bypass bit (WB) in MCFG3 is set to logical one, the value of the test checkbit field (TCB) replaces the normal checkbits during memory write cycles. During memory read cycles, if the read bypass bit (RB) in MCFG3 is set to logical one, the memory checkbits of the loaded data are stored in the test checkbit field (TCB) of MCFG3.

Timer Unit
Prescaler
Timer/Counter1, Timer/Counter2 and the watchdog share the same prescaler. The prescaler consists of a 10-bit down counter clocked by the system clock. The prescaler is decremented on each clock cycle. When the prescaler underflows, it is automatically reloaded with the content of the prescaler reload register. A count tick is generated for the two timers and the watchdog.
The effective division rate is equal to the prescaler reload register value + 1.

Figure 30. Prescaler Block Diagram
(Block diagram: reload register SCAR, counter register SCAC and control logic generating the count tick from the clock.)

Caution: The two timers and the watchdog share the same decrementer. The minimum allowed prescaler division factor is 4 (reload register = 3).

Timer/Counter 1 & Timer/Counter 2
Timer/Counter1 and Timer/Counter2 are two general purpose 24-bit timers. They share the same decrementer. The timer value is decremented each time the prescaler generates a timer pulse. Each timer is controlled through a dedicated Timer Control register (TIMCTR). A timer is enabled or disabled with the enable bit (en) in the timer control register. Each time a timer underflows, an interrupt is generated. These interrupts can be masked with the Interrupt Mask and Priority register (ITMP). When the reload bit (rl) in the Timer Control register is set, the content of the reload register (TIMR) is automatically reloaded into the Timer Counter register (TIMC) after an underflow and the timer continues running. If the reload bit is reset, the timer stops running after its first underflow. The Timer Counter can be forced to the Timer Reload value at any time by asserting the load bit (ld) in the Timer Control register.

Figure 31. Timer/Counter 1/2 Block Diagram
(Block diagram: control register TIMCTRn, reload register TIMRn and counter register TIMCn, driven by the count tick and generating the timer interrupts (irq 8 & 9).)

Watchdog
The watchdog operates the same way as the timers, except that it is always enabled and, upon underflow, asserts the external signal WDOG. This signal can be used to generate a system reset. If the watchdog counter is refreshed by writing to the WDG register before the counter reaches zero, the counter restarts counting from the new value. If the counter is not refreshed before it reaches zero, the WDOG signal is asserted.
After reset, the watchdog is automatically enabled and starts running.
Note: Reading the wdc field of the watchdog register returns the loaded (or reloaded) value, not the current count value.

Figure 32. Watchdog Block Diagram
(Block diagram: watchdog register and control logic, clocked from the data bus side and driving the WDOG output.)

General Purpose Interface
GPI as 32-bit I/O port
The general purpose interface (GPI) consists of a 32-bit wide I/O port with alternate facilities. The interface is based on bidirectional I/O ports. The port is split in two parts, with the lower 16 bits accessible through the parallel I/O pads and the upper 16 bits through the data bus.

lower 16-bits
The lower 16 bits of the general purpose interface are accessible through PIO[15:0]. All I/O ports have true Read-Modify-Write functionality when used as general I/O ports. This means that the direction of one port pin can be changed without unintentionally changing the direction of any other pin. The same applies when changing the drive value of the port.

Figure 33. I/O port block diagram - PIO[15:0]
(Block diagram: IO direction register IODIR and IO data register IODAT between the data bus and pin PIOx.)

configuring the pin
Each pin of PIO[15:0] is controlled by two register bits: IODIRx and IODATx. As shown in the “Register Description” section, the IODIRx bits are accessed at the IODIR address and the IODATx bits at the IODAT address. The IODIRx bit in the IODIR register selects the direction of port x. If IODIRx is written with logic one, the corresponding pin is configured as an output. If written with logic zero, the pin is configured as an input. When the pin is configured as an input, a read of the IODATx bit in the IODAT register returns the current value of the pin. When the pin is configured as an output, writing a logical one to the IODATx bit drives port x high, and writing a logical zero drives port x low.
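The per-pin use of the IODIRx and IODATx bits described above can be sketched in Python as operations on register images. This is only an illustration: the helper names are hypothetical, the values stand for copies of the IODIR and IODAT registers held in software, and the real register addresses and bit positions are those given in the “Register Description” section.

```python
def _update_bit(value, pin, state):
    """Return `value` with bit `pin` set or cleared, other bits untouched."""
    if not 0 <= pin <= 15:
        raise ValueError("PIO pin number must be in the range 0 to 15")
    bit = 1 << pin
    return (value | bit) if state else (value & ~bit)


def set_pin_direction(iodir, pin, output):
    """IODIRx = 1 configures the pin as an output, 0 as an input."""
    return _update_bit(iodir, pin, output)


def set_pin_level(iodat, pin, high):
    """IODATx = 1 drives an output pin high, 0 drives it low."""
    return _update_bit(iodat, pin, high)


iodir = set_pin_direction(0x0000, 3, True)   # PIO[3] becomes an output
iodat = set_pin_level(0x0000, 3, True)       # PIO[3] is driven high
print(hex(iodir), hex(iodat))
```

Only the selected bit changes, which mirrors the requirement that one pin can be reconfigured without disturbing the others.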
switching between input & output
When port x is switched from input to output by changing IODIRx, the value of IODATx is immediately driven on the corresponding pin. When it is switched from output to input by toggling IODIRx, the value of the pin is immediately written to IODATx.

upper 16-bits
The upper 16 bits of the general purpose interface are accessible through D[15:0]. They can only be used when all memory areas (ROM, RAM and I/O) are 8-bit or 16-bit wide. If the SDRAM controller is enabled, the upper 16 bits cannot be used.

Figure 34. I/O port block diagram - D[15:0]
(Block diagram: IO direction fields MEDDIR/LOWDIR and IO data fields MEDDAT/LOWDAT between the data bus and pin Dx.)

configuring the pin
The upper 16 bits of the general purpose interface can only be configured as outputs or inputs on a byte basis. D[15:8] is referred to as the medium byte and D[7:0] as the lower byte. Each byte of D[15:0] is controlled by two register fields. As shown in the “Register Description” section, the direction fields are accessed at the IODIR address and the data fields at the IODAT address. The MEDDIR bit and the LOWDIR bit in the IODIR register select the direction of the medium byte (D[15:8]) and the lower byte (D[7:0]) respectively. If MEDDIR (or LOWDIR) is written with logic one, the corresponding byte of D[15:0] is configured as an output. If written with logic zero, the byte is configured as an input.

When the medium byte is configured as an input, a read of the MEDDAT field in the IODAT register returns the current value of D[15:8]. When it is configured as an output, the value of the MEDDAT field is driven on the D[15:8] bus.

When the lower byte is configured as an input, a read of the LOWDAT field in the IODAT register returns the current value of D[7:0]. When it is configured as an output, the value of the LOWDAT field is driven on the D[7:0] bus.
switching between input & output
When the medium byte (or the lower byte) is switched from input to output by changing MEDDIR (or LOWDIR), the value of MEDDAT (or LOWDAT) is immediately driven on the corresponding pins. When it is switched from output to input by toggling MEDDIR (or LOWDIR), the value of the pins is immediately written to MEDDAT (or LOWDAT).

GPI Alternate functions
Most GPI pins have alternate functions in addition to being general I/O. Facilities such as serial communication links, interrupt inputs and configuration are made available through these functions. The following table summarises the assignment of the alternate functions.

Table 17. GPI alternate functions
GPI port pin   Alternate function
PIO[15]        TXD1 - UART1 transmitter data
PIO[14]        RXD1 - UART1 receiver data
PIO[13]        RTS1 - UART1 request-to-send
PIO[12]        CTS1 - UART1 clear-to-send
PIO[11]        TXD2 - UART2 transmitter data
PIO[10]        RXD2 - UART2 receiver data
PIO[9]         RTS2 - UART2 request-to-send
PIO[8]         CTS2 - UART2 clear-to-send
PIO[3]         UART clock - Use as alternative UART clock
PIO[2]         EDAC enable - Enable EDAC checking at reset
PIO[1:0]       Prom width - Defines PROM bus width at reset

In addition to these alternate functions, each GPI interface pin can be configured as an interrupt input to catch interrupts from external devices. Up to four interrupts can be configured on the GPI interface by programming the I/O interrupt register (IOIT). For a detailed description of the external interrupt configuration, refer to the “Traps and Interrupts” section.

PCI Arbiter
A PCI arbiter is embedded on the AT697 chip. The PCI arbiter arbitrates between 4 PCI agents numbered from 3 down to 0. A round-robin algorithm is implemented as the arbitration policy. The PCI arbiter is totally independent of the PCI interface.

An agent on the PCI bus requests the bus by driving its REQ* line low.
When the arbiter determines that the bus can be granted to an agent, it drives the corresponding GNT* line low. When the bus is granted to a PCI agent, the agent keeps the bus for only one transaction. If the agent needs more accesses, it shall keep its REQ* line asserted and wait to be granted the bus again.

Operation
Round Robin
The round-robin algorithm used for the arbitration is based on loops with different priority levels. The implementation in the AT697 is based on two priority loops. A high priority loop is defined as level 0. A low priority loop is defined as level 1. The arbitration is done by checking the REQ* lines of the PCI agents one after the other. The level 0 loop is checked first. If a REQ* line is active and no master is currently granted the bus, the corresponding GNT* line is driven low and the agent is granted the bus. At each complete round in level 0, one step is done in level 1. The following figure illustrates the operation of the arbiter.

Figure 35. Arbiter operation - Agent Operation
(Grant sequence over time: Agent 0, Agent 1, Agent 2, Agent 0, Agent 1, Agent 3, Agent 0, Agent 1, Agent 2, with agents 0 and 1 at level 0 and agents 2 and 3 at level 1.)

If all agents have a request at the same time, the following probabilities of access are implemented:
• All agents in one level have equal probability.
• All agents in level 1 together have the same probability of access as one agent in level 0.
• If no agent is in level 0, or no agent in level 0 has a request, all agents in level 1 are granted with equal probability.

Bus Parking
As long as no bus request is active on the arbiter, the bus is granted to the last owner. It remains granted to the last owner until another agent requests the bus. When another request is asserted, re-arbitration occurs after one turnover cycle. After reset, the bus is parked on agent 0, which is the default owner.
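The two-level round-robin policy described above can be illustrated with a small Python model that reproduces the grant sequence of Figure 35. This is a simplified sketch under stated assumptions: the set of requesting agents is static, each grant covers one transaction, and the function name `arbitration_sequence` is hypothetical. Bus parking and re-arbitration timing are not modelled.

```python
def arbitration_sequence(level0, level1, requests, grants):
    """Return the first `grants` bus owners for a fixed set of requests.

    level0, level1: lists of agent numbers in each priority loop.
    requests: set of agents that keep their REQ* line asserted.
    """
    order = []
    step1 = 0  # current position in the level 1 loop
    while len(order) < grants:
        progressed = False
        # One complete round of the high priority loop.
        for agent in level0:
            if agent in requests:
                order.append(agent)
                progressed = True
        # Then a single step in the low priority loop.
        for _ in range(len(level1)):
            agent = level1[step1 % len(level1)]
            step1 += 1
            if agent in requests:
                order.append(agent)
                progressed = True
                break
        if not progressed:
            break  # nobody is requesting the bus
    return order[:grants]


# Agents 0 and 1 at level 0, agents 2 and 3 at level 1, all requesting.
print(arbitration_sequence([0, 1], [2, 3], {0, 1, 2, 3}, 9))
```

With all four agents requesting, the two level 1 agents together receive as many grants as a single level 0 agent, which matches the access probabilities listed above.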
Re-arbitration
When a master is managing a transfer and another one makes a request to the arbiter, re-arbitration occurs. Only one re-arbitration is performed during a transfer. A new arbitration takes place when the master that was granted the bus frees it. As long as all the PCI agents have no request pending, the arbitration is performed. A re-arbitration cycle also occurs when leaving the bus parking state.

Priority definition
Two priority levels are defined for the PCI arbiter. Level 0 is the high priority level. Level 1 is the low priority level. The priority level of the PCI agents is programmable through the arbiter configuration register (ACR). Each PCI agent can be individually configured to operate on level 0 or on level 1, except agent 3, which is fixed by hardware at low priority (level 1). Setting the Px bit to logical one in the arbiter configuration register assigns agent x to the low priority level. Setting this bit to logical zero assigns it to the high priority level. After reset, all the PCI agents are configured in the low priority loop.

PCI Interface
Overview
The PCI interface implementation is compliant with the PCI 2.2 specification. It is a high performance 32-bit bus interface with multiplexed address and data lines. It is intended for use as an interconnect mechanism between processor/memory systems and peripheral controller components.

The AT697 processor embeds the In-Silicon PCI core. It is interfaced to the processor core through the PCI to AMBA bridge developed by the European Space Agency. PCI bus operations can be clocked at a frequency of up to 33 MHz, independently of the processor clock. Synchronization between the PCI interface and the AT697 core relies on several FIFOs. This implementation allows the device to be used for Initiator (Master) and Target operations. In each mode, single word and burst transfers can be executed.
Two different operating modes can be used with the PCI interface:
• Host Bridge: The host bridge connects the local bus of a processor to the PCI bus. Its PCI configuration registers are accessible locally by the processor, but not through PCI configuration cycles. The host bridge initialises other satellite devices through PCI configuration commands.
• Satellite: The satellite is a PCI device, configurable via PCI configuration cycles and the idsel line, but not locally.

Both host bridge and satellites can be initiator and/or target on the bus. The present interface has universal functionality, allowing both operating modes. The mode is configured via a hardware bootstrap on the SYSEN* pin.

Other features supported by this interface include:
• Target lock support
• Zero-latency Fast Back-to-Back transfers
• Zero wait state burst mode transfers
• Support for memory read line/multiple
• Support for memory write and invalidate commands
• Delayed read support
• Flexible error reporting by polling

The PCI bus is multiplexed: address and data travel over the same lines. This is why PCI communication is based on two-phase burst transfers. Each transfer is composed of the following phases:
• An address phase: During the address phase, the initiator of the communication drives the 32-bit address concerned by the transfer and the command for this transfer. The command defines the address space concerned by the transfer and the direction of the transfer.
• A data phase: During the data phase, the initiator of the communication drives the byte enable signals so that only the active part of the bus is enabled. When reading, the initiator drives the enable bits and the target places the data on the bus.

PCI Initiator (Master)
The PCI initiator mode of the AT697 gives direct memory-mapped (initiator) access to the PCI bus.
Any access to a memory address in the PCI address range is automatically translated by the interface into the appropriate PCI transaction. In this configuration, the PCI bus is accessed by the same instructions as the main memory. The SPARC instruction set provides various load/store instruction types. The PCI bus provides 32-bit wide transactions with byte enables for each byte lane.

Initiator Mapping
For standard operation, the PCI interface only works in a limited address range. The address range for such initiator transactions is limited to addresses between 0xA0000000 and 0xF0000000. PCI addresses outside this predefined range can be accessed only via DMA transactions.

Instructions of different widths (byte, half-word, word, double) can be performed for each address of the PCI address range. The three least significant bits of the address, A[2:0], determine which PCI byte enable lines C/BE*[3:0] are active during the transaction. According to the SPARC architecture, big-endian mapping is implemented: the most significant byte is at the lower address (0x..00) and the least significant byte is at the upper address (0x..03). A byte write to A[1:0] = 00 results in the byte enable pattern 0111, indicating that the most significant byte lane (bits 31:24) of the PCI data bus is selected. The following table presents the transaction widths authorized for PCI transfers.

Table 18. Byte Enable Settings
Width  Assembler      C-datatype  A[2:0]=000    A[2:0]=100   A[2:0]=x01   A[2:0]=x10   A[2:0]=x11
8      ld[s/u]b, stb  char        0111          0111         1011         1101         1110
16     ld[s/u]h, sth  short       0011          0011         not aligned  1100         not aligned
32     ld, st         int         0000          0000         not aligned  not aligned  not aligned
64     ldd, std       long long   0000 (burst)  not aligned  not aligned  not aligned  not aligned

Note: PCI byte enables are active low. For non-aligned accesses, the byte enable pattern (1111) is issued on PCI, to avoid destroying data in the remote PCI target.
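The mapping in Table 18 can be reproduced with a few lines of arithmetic. The following Python sketch is only an illustration of the table, not a description of the hardware implementation; the function name byte_enables and the constant NOT_ALIGNED are hypothetical names chosen for this example.

```python
# Illustrative model of Table 18 (names are hypothetical, not from the datasheet).
# Patterns are returned as 4-bit integers in C/BE*[3:0] order, active low.

NOT_ALIGNED = 0b1111  # pattern issued on PCI for non-aligned accesses


def byte_enables(width_bits: int, addr: int) -> int:
    """Return the C/BE*[3:0] pattern for an access of the given width."""
    if width_bits not in (8, 16, 32, 64):
        raise ValueError("width must be 8, 16, 32 or 64 bits")
    nbytes = width_bits // 8
    if addr % nbytes != 0:
        return NOT_ALIGNED
    if width_bits >= 32:
        # Word access, or double word access (issued as a burst): all lanes active.
        return 0b0000
    # Big-endian mapping: byte offset 0 is the most significant lane (C/BE*[3]).
    offset = addr & 0b11
    active = ((1 << nbytes) - 1) << (4 - nbytes - offset)
    return ~active & 0b1111  # active low


if __name__ == "__main__":
    for width, addr in [(8, 0xA0000000), (16, 0xA0000002), (32, 0xA0000004)]:
        print(f"{width:2d}-bit access at {addr:#010x}: C/BE* = {byte_enables(width, addr):04b}")
```

The helper returns 1111 for every combination marked "not aligned" in the table, which matches the note above about protecting the remote target.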
Memory cycles
Memory cycles such as memory read/write and memory read line/write and invalidate can be issued by the processor with the ordinary SPARC instruction set. The command to execute is selected through the command field (CMD) of the PCI initiator configuration register (PCIIC). Setting the CMD field to logical ‘01’ generates memory read/write accesses when a PCI address is accessed. Setting it to logical ‘11’ generates memory read line or memory write and invalidate accesses, consistent with the procedure below.

For the memory commands, the address issued on the PCI bus is a word address with bits (1:0) set to 00. This indicates that the linear incrementing mode is used.

Operation
The following procedure shall be used to engage a memory cycle on the PCI interface:
1. Select the initiator mode by setting the MOD bit in the PCI initiator configuration register to logical one.
2. Select the memory load/store command or the memory read-line/write and invalidate command in the PCI initiator configuration register. The CMD field shall be set to logical ‘01’ for simple load/store operation and to logical ‘11’ for read-line/write-and-invalidate.
3. Enabling the interrupt signalling is optional. It can be enabled by setting the initiator error bits in the PCI interrupt enable register (PCIITE) to logical one. Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to a memory address mapped in the PCI address range.

IO transaction cycles

Operation
The following procedure shall be used to engage an I/O cycle on the PCI interface:
1. Select the initiator mode by setting the MOD bit in the PCI initiator configuration register to logical one.
2. Select the I/O load/store command in the PCI initiator configuration register. The CMD field shall be set to logical ‘00’ for I/O operation.
3. Enabling the interrupt signalling is optional.
It can be enabled by setting the initiator error bits in the PCI interrupt enable register (PCIITE) to logical one. Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to an I/O address mapped in the PCI address range.

Configuration cycles

Target selection
Accesses to the configuration address space require the target device to be selected. Due to the address range limitation, the chip-select (IDSEL) connection necessary for device selection shall be done using only A/D[27:16]. This allows up to 12 PCI devices to be connected on the bus. Devices with their chip-select line connected to A/D[31:28] cannot be configured through standard operations. DMA configuration cycles shall be used to configure the devices connected to A/D[31:28].

PCI bus configuration cycles can be performed using the same instructions as for the main memory. To generate such a configuration cycle with the standard instructions, the command type field (COMMSB) of the PCI initiator configuration register (PCIIC) shall be programmed to ‘10’, as in the procedure below. Then, if a load (or store) cycle is performed to an address in the PCI address range, a physical configuration cycle is performed on the PCI bus. The full 32-bit address defined on the internal bus is propagated on the PCI bus. Once a target is selected (DEVSEL* asserted).

Operation
The following procedure shall be used to engage a configuration cycle on the PCI interface:
1. Select the initiator mode by setting the MOD bit in the PCI initiator configuration register to logical one.
2. Select the configuration load/store command in the PCI initiator configuration register. The CMD field shall be set to logical ‘10’ for configuration operation.
3. Enabling the interrupt signalling is optional. It can be enabled by setting the initiator error bits in the PCI interrupt enable register (PCIITE) to logical one.
Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to the configuration space.

Limitation
Configuration cycles shall only be generated by the PCI host of the bus or by a PCI-to-PCI bridge.

Special cycles
By default, all requests are translated into single cycle PCI transactions, each transaction consisting of an address phase followed by a single data phase.

Linear incrementing store-word
Linear incrementing store-word sequences are translated into PCI write bursts of undetermined length, up to a maximum of 255 words. The PCI burst mode is then maintained as long as possible, provided that the read/write direction is unchanged and that the address An+1 = An + 4. When the sequence is discontinued, the PCI burst stops with a last data phase during which the byte enables are 1111.

Double word load/store
Double word load/store requests can be executed as two-word bursts (one address phase, two data phases) on PCI. A double word read is executed as a two-word burst when the DWR bit is set to logical one in the PCI initiator configuration register. When it is set to logical zero, a double word read is translated to the PCI as two single read accesses. A double word write is executed as a two-word burst when the DWW bit is set to logical one in the PCI initiator configuration register. When it is set to logical zero, a double word write is translated to the PCI as a burst of undefined length as long as the addresses are sequential. The double word mode accelerates the transfer on the PCI side, except where linear incrementing bursts are produced by consecutive store-double instructions (An+1 = An + 8). In this case the double word write bit DWW shall be set to logical zero. In general, it is recommended to set both DWR and DWW to logical one and to use the DMA to transfer large data blocks.

Fast back2back cycles
The PCI implementation only supports fast back2back cycles to the same target.
Before using fast back-2-back transfers, fast back-2-back cycles shall be enabled by setting bit COM9 in the status command register (PCISC) to logical one. Bit COM9 shall only be set to one if all targets on the bus support fast back-2-back transfers. A fast back-2-back transfer is issued by setting the B2B bit in the PCIDMA register to logical one.
Note: Fast back-2-back can only be generated by the initiator. It is not accepted by the AT697 PCI target.

Error reporting

Fatal (abort) and address parity errors
On a fatal error (or address parity error), the interface flushes the current buffer request and all other buffer requests. Then, the interface reports the fatal error by setting the PCI core error bit (CMFER) in the PCI interrupt pending register to logical one. The PCI core is restarted as soon as a new request is engaged.

Data parity errors
During load/read transactions, one PCI parity error is recoverable in hardware. If the PERR bit is set in the PCI initiator configuration register, the interface ignores the erroneous PCI data and retries the request. However, if the data parity error persists at the same address, it is considered to be unrecoverable. In that case, an error on the internal bus is detected and the PCI initiator error is reported in the IMIER bit of the PCI interrupt pending register. Parity error recovery is also impossible in cases where the transaction is already finished on the local bus when the error is detected/reported on the PCI bus. The parity error is then reported in the initiator parity error bit (IMPER) and error recovery must be done in software.

DMA transfer
A DMA facility is available on the AT697 processor. DMA transfers are performed through the PCI interface. The DMA controller executes data transfers between the local memory and a remote target on the PCI bus. The processor core only intervenes for the initiation of the transfer.
Once the transfer is initiated, the DMA controller is fully autonomous. DMA transfers take place in the background of the processor core activity. Interrupts are therefore provided to help synchronise the application with the start and end of the transfer. The DMA interface executes only word-size transactions with all 4 byte lanes enabled.

Operation
The DMA is enabled by setting the MOD bit in the PCI initiator configuration register (PCIIC) to logical one. To synchronize the application with the start and the end of the transaction, two interrupts can be enabled: DMAER for transfer control and IMIER for error control. Each DMA sequence shall program the following parameters:
• PCI start address
• PCI command type
• number of words to be transferred
• the start address in the local memory

A DMA transfer is performed when the following operations are done in the given order:
1. Write the PCI start address of the burst to the PCI start address register (PCISA). The PCISA register shall be rewritten each time a DMA transfer is initiated, even if the address is identical to the address of the previous DMA request.
2. Write together the PCI command and the number of words to be transferred in the PCI DMA configuration register (PCIDMA). Writing to PCIDMA passes the PCI address, the word count and the PCI command to the PCI core and initiates the transaction on the PCI bus.
3. Write the start address in the local memory map to the PCI DMA address register (PCIDMAA).

Once the three operations are executed, the data transfer is started in the background. Once the specified number of words is transferred, the interface sets the DMA end of transfer bit (DMAER) to logical one and generates an interrupt if enabled. Then the DMA controller goes back to the idle state.

Error Reporting
If the PCI core does not accept the DMA cycle request, the DMA state controller remains locked and an error is reported as an initiator error, with the IMIER bit set to logical one.
If the request on the PCI core was just delayed, rewriting PCIDMAA may succeed. If the problem persists, reset the interface by writing –1 (0xFFFFFFFF) to PCIIC.

Transfer Limitation
A DMA transaction may never cross a 1 KByte border. The value represented by PCIDMAA(9:2) + PCIDMA(7:0) must be less than 256. If this restriction is not respected, the data transfer stops at the 1 KByte border. Then the PCI core is flushed. Simultaneously, the DMA error bit (DMAER) and the initiator error bit (IMIER) are set to logical one in the PCI interrupt pending register (PCIITP). If enabled in the PCI interrupt enable register (PCIITE) and unmasked in the general interrupt mask register, PCI interrupt 14 is generated (TT = 0x1E).

Debug Facilities
Not implemented for application use.

Target Mode Transfer
In the target mode, the PCI interface receives requests originated from remote PCI initiators (masters). Target data transfers are executed in the background without AT697 core intervention. The AT697 core intervenes only in the configuration of the target.
• In host bridge mode, the target is configured by the AT697 core
• In satellite mode, the configuration is done by a remote device using the PCI command set

Target Programming
The target is configured through the following registers:
• PCISC register
– bits 0/1 for memory and I/O command response
– bit 6 for check of data and address parity error
– bit 7 for response to data and address parity error
• base address registers
– memory base address: MEMBAR1, MEMBAR2
– I/O base address: IOBAR
• PCITPA register to indicate the storage location
• PCITSC(7) bit to write data in memory

Transaction Ordering
As specified in the PCI standard, delayed read functionality is implemented, obeying the following rules:
• The interface stores one delayed read at a time. When a read request was retried (because local data was not yet available), the interface remains locked for any other target read (targeting different addresses).
The initiator of the original read has to repeat its request to the same address.
• A retried (delayed) read can be interrupted by one or more PCI write accesses. The PCI standard requires this write command to be processed first, to prevent a system lock-up. Meanwhile, the interface prefetches read data into the TXMT FIFO. After the (interfering) write, when the read request is repeated and the requested data is available in the FIFO, the delayed transfer completes normally.
• All target read accesses are generally prefetching, including reads with the I/O command. Once a start address is given, the interface prefetches up to 8 words into the TXMT FIFO. After the last required data word has been transferred to PCI, the PCI core automatically flushes the FIFO to discard the unused prefetched data. The interface assumes the complete local address space to be ‘prefetchable’, defined here as the fact that reading from an address does not alter the data. This behaviour must be taken into account if non-prefetchable devices (for example the UARTs) are to be read through the PCI target.

PCI Error Reporting
According to the PCI standard, error and status bits are implemented in the PCI status register. The PCI standard provides a single parity check, by which bus errors can be detected, but not corrected.
• Read data parity errors can possibly be retried by the hardware.
• In other cases, recovery must be done in software.

Therefore, events which occur in the PCI interface or on the PCI bus are saved in status bits and, optionally, the PCI interrupt (IRQ14) is asserted. Different events can be selected to assert the interrupt. The interrupt enable register (PCIITE) selects the events which assert IRQ14. An interrupt handler can then read the interrupting event in the status register (PCIITP). Furthermore, interrupts can be forced for test purposes by writing to PCIITF.
In host-bridge configuration, this allows error detection by polling. Certain events and errors are also reported by the interface in the interrupt status register. For each bit of this register, interrupt generation can be programmed individually. All PCI interrupts generated are then reported to the AT697 core through the PCI interrupt (IT14). The different interrupt causes are distinguished by the interrupt status register settings. Please refer to the register description chapter for more details on the interrupt status register.

UARTs (UART1 and UART2)

Overview
The Universal Asynchronous Receiver and Transmitter (UART) is a highly flexible serial communication module. The AT697 implements two UARTs: UART1 and UART2. The UARTs are defined as alternate functions of the general purpose interface (GPI). The two UARTs provide double buffering. Each UART consists of a transmitter holding register, a receiver holding register, a transmitter shift register, and a receiver shift register. Each of these registers is 8 bits wide.

Figure 36. UART Block Diagram
[Diagram labels: UART scaler register (UASCAn), UART control register (UACn), UART status register (UASn), UART data register (UADn), baud-rate generator, control logic, transmitter shift register, receiver shift register, transmitter holding register, receiver holding register, data bus, CTS, RTS, RX, TX]

Each UART is fully controlled by a set of four registers:
• a control register
• a status register
• a scaler register
• a data register

Serial Frame

Frame formats
A serial frame is defined as one character of data bits with synchronisation bits (start and stop bits), and optionally a parity bit for error checking. Two frame formats are accepted by the AT697 UARTs, the only difference being the presence or the absence of the parity bit. All the frames are built on an eight data bits basis. A frame starts with the synchronization start bit followed by the least significant data bit.
The remaining data bits follow, up to a total of eight, ending with the most significant bit. If enabled by setting the PE bit in the UART control register (UACn), the parity bit is inserted after the data bits and before the stop bit. The following figure illustrates the accepted frame formats.

Figure 37. Data frame format
Data frame, no parity:    Start D0 D1 D2 D3 D4 D5 D6 D7 Stop
Data frame with parity:   Start D0 D1 D2 D3 D4 D5 D6 D7 Parity Stop

The parity bit is calculated by doing an exclusive-or of all the data bits. Odd parity is selected by setting the PS bit in the UART control register (UACn) to logical one. In this case, the result of the exclusive-or is inverted. Even parity is selected by setting the PS bit to logical zero. If used, the parity bit is located between the last data bit and the stop bit of the serial frame. The relation between the parity bit and the data bits is as follows:

$$ P_{even} = d_7 \oplus \dots \oplus d_3 \oplus d_2 \oplus d_1 \oplus d_0 \oplus 0 $$
$$ P_{odd} = d_7 \oplus \dots \oplus d_3 \oplus d_2 \oplus d_1 \oplus d_0 \oplus 1 $$

where:
• P_even : parity bit using even parity
• P_odd : parity bit using odd parity
• d_n : data bit n of the character

Clock Generation
The clock generation logic generates the base clock for the transmitters and receivers. The bit rate of the UART is derived by the clock generator from the input clock of the clock module and a scaler. Two clock inputs can be used by the clock generator:
• An internal clock
• An external clock

Uart Clock
Each UART can be configured to use either the internal or the external clock source by programming the EC bit in the UART control register (UACn). If EC is set to logical zero, the UART is clocked by the internal clock. If EC is set to logical one, the UART is clocked by the external clock. When using the external configuration, the UART clock shall be provided by PIO[3] from the general purpose interface. This clock input is used as an alternate function for PIO[3].
Caution: When using the external clock source, the frequency of PIO[3] must be less than half the frequency of the system clock.

Baud Rate Generation
To generate the bit rate, each UART has a programmable 12-bit clock divider (UASCAn). Depending on the configuration of the EC bit in the UART control register, the scaler is clocked either by the system clock or by an external clock. Each time the scaler underflows, a UART tick is generated. The scaler is automatically reloaded with the value of the UART scaler register after each underflow. The resulting UART tick frequency should be 8 times the desired baud rate.

The following equation shall be used to calculate the scaler value, depending on the clock source and the expected baud rate:

$$ scaler = \frac{\dfrac{uartclk \times 10}{baudrate \times 8} - 5}{10} $$

Variable description:
• uartclk : frequency of the UART clock
• baudrate : expected baud rate
• scaler : value to set in UASCAn to reach the expected baud rate

For example, with uartclk = 100 MHz and a baud rate of 38400, the equation gives about 325.02, which corresponds to an integer scaler value of 325.

Communication Operations

Transmitter Operation
UART operations are controlled through the UART control registers (UACn) and the UART status registers (UASn). The transmitter is enabled by setting the TE bit in the UART control register to logical one. When ready to transmit, data is transferred from the transmitter holding register to the transmitter shift register and converted to a serial frame on the transmitter serial output pin (TX). Following the transmission of the stop bit, if a new character is not available in the transmitter holding register, the transmitter serial data output remains high and the transmitter shift register empty bit (TS) in the UART status register is set to logical one. Transmission resumes and the TS bit is cleared when a new character is loaded in the transmitter holding register. If the transmitter is disabled, it continues operating until the character currently being transmitted is completely sent out.
The transmitter holding register cannot be loaded when the transmitter is disabled. If flow control is enabled, the CTS input must be low in order for the character to be transmitted. If it is deasserted in the middle of a transmission, the character in the shift register is transmitted and the transmitter serial output then remains inactive until CTS is asserted again. If CTS is connected to a receiver's RTS, overrun can effectively be prevented.

Receiver Operation
The receiver is enabled for data reception when the receiver enable bit (RE) in the UART control register is set to logical one. The receiver looks for a high to low transition of a start bit on the receiver serial data input pin. If a transition is detected, the state of the serial input is sampled half a bit time later. If the serial input is sampled high, the start bit is invalid and the search for a valid start bit continues. If the serial input is still low, a valid start bit is assumed and the receiver continues to sample the serial input at one bit time intervals until the proper number of data bits and the parity bit have been assembled and one stop bit has been detected. During this process the least significant bit is received first. The serial input is sampled three times for each bit and averaged to filter out noise. The data is then transferred to the receiver holding register and the data ready bit (DR) is set to logical one in the UART status register. The parity, framing and overrun error bits are set at the received byte boundary, at the same time as the data ready bit is set. If both the receiver holding and shift registers contain an unread character when a new start bit is detected, the character held in the receiver shift register is lost and the overrun bit (OV) is set to logical one in the UART status register.
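As an illustration of the frame format and of the checks applied at the byte boundary, the following Python sketch builds and decodes one serial frame in software. It is a simplified model under stated assumptions, not the UART logic itself: the function names are hypothetical, the line is modelled as a list of bit levels (one per bit time), and a framing error is assumed to mean that the stop bit was not sampled high.

```python
# Software model of one UART frame (hypothetical helper names).
# Frame layout: start bit (0), D0..D7 (least significant bit first),
# optional parity bit, stop bit (1).

def parity_bit(data: int, odd: bool) -> int:
    """Exclusive-or of the eight data bits, inverted for odd parity."""
    p = 0
    for n in range(8):
        p ^= (data >> n) & 1
    return p ^ 1 if odd else p


def encode_frame(data: int, parity=None) -> list:
    """Build a frame; parity is None, 'even' or 'odd'."""
    bits = [0] + [(data >> n) & 1 for n in range(8)]
    if parity is not None:
        bits.append(parity_bit(data, odd=(parity == "odd")))
    bits.append(1)
    return bits


def decode_frame(bits: list, parity=None) -> tuple:
    """Return (data, parity_error, framing_error) for one frame."""
    expected_len = 10 if parity is None else 11
    if len(bits) != expected_len or bits[0] != 0:
        raise ValueError("frame must start with a low start bit")
    data = sum(bit << n for n, bit in enumerate(bits[1:9]))
    parity_error = (parity is not None
                    and bits[9] != parity_bit(data, odd=(parity == "odd")))
    framing_error = bits[-1] != 1  # assumption: stop bit must be high
    return data, parity_error, framing_error


if __name__ == "__main__":
    frame = encode_frame(0x55, parity="even")
    print(frame, decode_frame(frame, parity="even"))
```

Flipping a single data bit of an encoded frame before decoding makes decode_frame report a parity error, which is the condition the PE status bit signals.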
If flow control is enabled, RTS is negated (high) when a valid start bit is detected and the receiver holding register contains an unread character. When the holding register is read, RTS is automatically reasserted. A correctly received byte is indicated by the data ready bit (DR) in the UART status register (UASn). In case of error (framing error, stop bit error,...), the respective bits FE, PE, ... are set to logical one in the UART status register while the data ready bit remains at logical zero.

Interrupt Generation
The two UARTs can be configured to generate an interrupt each time a byte is received or a byte is sent. If the TI bit in the UART control register is set to logical one, an interrupt is issued after each character is sent. If it is set to logical zero, no interrupt is issued when a character is sent. If the RI bit in the UART control register is set to logical one, an interrupt is issued after each character reception. If it is set to logical zero, no interrupt is issued after a character reception. If the receiver interrupt is enabled, an interrupt is also generated when an error is detected during the reception of a character. To identify the origin of the transaction failure, refer to the UART status register bits (PE, FE, OV), which indicate whether it is a parity, a framing or an overrun error.

Loop back mode
If the LB bit in the UART control register is set, the UART is in loop back mode. In this mode, the transmitter output is internally connected to the receiver input and RTS is connected to CTS. It is then possible to perform loop back tests to verify the operation of the receiver, the transmitter and the associated software routines. In this mode, the outputs remain in the inactive state, in order to avoid sending out data.

Debug Support Unit - DSU

Overview
The AT697 processor includes a hardware debug support unit to aid software debugging on target hardware.
The support is provided through two modules: a debug support unit (DSU) and a debug communication link (DCL). The DSU can put the processor in debug mode, allowing read/write access to all processor registers and cache memories. The DSU also contains a trace buffer which stores executed instructions or data transfers on the internal bus. The debug communication link implements a simple read/write protocol and uses standard asynchronous UART communications.

Figure 38. Debug Support Unit and Communication Link
[Diagram labels: AT697 processor, trace buffer, debug support unit, debug comm. link, AT697 SPARC V8 integer unit, I-Cache, D-Cache, debug I/F, AHB interface, AMBA AHB, signals DSUEN, DSUBRE, DSUACT, DSUTX, DSURX]

It is possible to debug the processor through any master on the internal bus. The PCI interface is built in as a master on the internal bus. All debug features are available from any PCI master.

Debug Support Unit
The debug support unit is used to control the trace buffer and the processor debug mode. The DSU master occupies a 2 Mbyte address space on the internal bus. Through this address space, any other master, such as PCI, can access the processor registers and the contents of the trace buffer. The DSU control registers can be accessed at any time, while the processor registers and caches can only be accessed when the processor has entered debug mode. The trace buffer can be accessed only when tracing is disabled or completed. In debug mode, the processor pipeline is held and the processor is controlled by the DSU.
Debug mode can be entered on the following events:
• executing a breakpoint instruction (ta 1)
• integer unit hardware breakpoint/watchpoint hit (trap 0x0B)
• rising edge of the external break signal (DSUBRE)
• setting the break-now (BN) bit in the DSU control register
• a trap that would cause the processor to enter error mode
• occurrence of any, or a selection of traps as defined in the DSU control register
• after a single-step operation
• DSU breakpoint hit

The debug mode can only be entered when the debug support unit is enabled through an external pin (DSUEN). Driving the DSUEN pin high enables the debug mode. When the debug mode is entered, the following actions are taken:
• PC and nPC are saved in temporary registers (accessible by the debug unit)
• an output signal (DSUACT) is asserted to indicate the debug state
• the timer unit is (optionally) stopped to freeze the AT697 timers and watchdog

The instruction that caused the processor to enter debug mode is not executed, and the processor state is kept unmodified. Execution is resumed by clearing the BN bit in the DSU control register or by de-asserting DSUEN. The timer unit is re-enabled and execution continues from the saved PC and nPC. Debug mode can also be entered after the processor has entered error mode, for instance when an application has terminated and halted the processor. The error mode can be reset and the processor restarted at any address.

DSU Breakpoint
The DSU contains two breakpoint registers for matching either internal bus addresses or executed processor instructions. A breakpoint hit is typically used to freeze the trace buffer, but can also put the processor in debug mode. The freeze operation can be delayed by programming the TDELAY field in the DSU control register to a non-zero value. In this case, the TDELAY value is decremented for each additional trace until it reaches zero, after which the trace buffer is frozen.
If the break on trace freeze bit (BT) is set to logical one in the DSU control register, the DSU forces the processor into debug mode when the trace buffer is frozen.
Note: Due to pipeline delays, up to 4 additional instructions can be executed before the processor is placed in debug mode.

A mask register is associated with each breakpoint, allowing breaking on a block of addresses. Only address bits with the corresponding mask bit set to ‘1’ are compared during breakpoint detection.

Time Tag
The DSU implements a time tag counter. This counter is decremented each clock as long as the processor is running. The counter is stopped when the processor enters debug mode. It is restarted when execution is resumed. This time tag counter is stored in the trace as an execution time reference.

Trace Buffer
The trace buffer consists of a circular buffer that stores the executed instructions or the internal bus data transfers. The size of the trace buffer is 512 lines of 16 bytes. The trace buffer operation is controlled through the DSU control register (DSUC) and the trace buffer control register (TBC). When the processor enters debug mode, tracing is suspended. The trace buffer can contain the executed instructions, the transfers on the internal bus or both (mixed mode). The trace buffer control register (TBC) contains two counters (BCNT and ICNT) that store the address of the trace buffer location that will be written on the next trace. Since the buffer is circular, this location actually holds the oldest entry in the buffer. The indexes are automatically incremented after each stored trace entry.

Instruction trace
The instruction trace mode is enabled by setting the trace instruction enable bit (TI) in the trace buffer control register (TBC) to logical one. During instruction tracing, one instruction is stored per line in the trace buffer with the exception of multi-cycle instructions.
Multi-cycle instructions can be entered two or three times in the trace buffer:
• For store instructions, bits [63:32] correspond to the store address on the first entry and to the stored data on the second entry (and third in case of STD). Bit 126 is set to logical one on the second and third entry to indicate this.
• A double load (LDD) is entered twice in the trace buffer, with bits [63:32] containing the loaded data.
• Multiply and divide instructions are entered twice, but only the last entry contains the result. Bit 126 is set for the second entry.
• For FPU operations producing a double-precision result, the first entry puts the most significant 32 bits of the result in bits [63:32] while the second entry puts the least significant 32 bits in this field.

Table 19. Trace buffer data allocation, Instruction tracing mode
Bits     Name                        Definition
127      Instruction breakpoint hit  Set to ‘1’ if a DSU instruction breakpoint hit occurred.
126      Multi-cycle instruction     Set to ‘1’ on the second and third instance of a multi-cycle instruction (LDD, ST or FPOP)
125:96   DSU counter                 The value of the DSU counter
95:64    Load/Store parameters       Instruction result, Store address or Store data
63:34    Program counter             Program counter (2 lsb bits removed since they are always zero)
33       Instruction trap            Set to ‘1’ if traced instruction trapped
32       Processor error mode        Set to ‘1’ if the traced instruction caused processor error mode
31:0     Opcode                      Instruction opcode

When a trace is frozen, interrupt 11 is generated.

Bus Trace
The bus trace mode is enabled by setting the AHB trace enable bit (TA) in the trace buffer control register (TBC) to logical one. During bus tracing, one operation of the internal bus is stored per line in the trace buffer.

Table 20. Trace Buffer Data Allocation, Internal bus Tracing Mode
Bits     Name                 Definition
127      AHB breakpoint hit   Set to ‘1’ if a DSU AHB breakpoint hit occurred.
Unused The value of the DSU counter Processor interrupt request input Processor interrupt level (psr.pil) Processor trap type (psr.tt) 125:96 DSU counter 95:92 91:88 95:80 IRL PIL Trap type 66 AT697E 4226G–AERO–05/09 AT697 Bits 79 78:77 76:74 73:71 70:67 66 65:64 63:32 31:0 Name Hwrite Htrans Hsize Hburst Hmaster Hmastlock Hresp Load/Store data Load/Store address Definition AHB HWRITE AHB HTRANS AHB HSIZE AHB HBURST AHB HMASTER AHB HMASTLOCK AHB HRESP AHB HRDATA or HWDATA AHB HADDR Mixed Trace In mixed mode, the buffer is divided on two halves, with instructions stored in the lower half and bus transfers in the upper half. The MSB bit of the AHB index counter is then automatically kept high, while the MSB of the instruction index counter is kept low. Table 21. DSU Map Address 0x800000c4 0x800000c8 0x800000cc 0x90000000 0x90000004 0x90000008 0x90000010 0x90000014 0x90000018 0x9000001C 0x90010000 - 0x90020000 ..0 ...4 ...8 ...C 0x90020000 - 0x90040000 0x90080000 - 0x90100000 0x90080000 0x90080004 Register DSU UART status register DSU UART control register DSU UART scaler register DSU control register Trace buffer control register Time tag counter AHB break address 1 AHB mask 1 AHB break address 2 AHB mask 2 Trace buffer Trace bits 127 - 96 Trace bits 95 - 64 Trace bits 63 - 32 Trace bits 31 - 0 IU/FPU register file IU special purpose registers Y register PSR register DSU Memory Map 67 4226G–AERO–05/09 Address 0x90080008 0x9008000C 0x90080010 0x90080014 0x90080018 0x9008001C 0x90080040 - 0x9008007C 0x90100000 - 0x90140000 0x90140000 - 0x90180000 0x90180000 - 0x901C0000 0x901C0000 - 0x90200000 Register WIM register TBR register PC register NPC register FSR register DSU trap register ASR16 - ASR31 (when implemented) Instruction cache tags Instruction cache data Data cache tags Data cache data The addresses of the IU/FPU registersis defined according to how many register windows has been implemented. 
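To make the instruction-trace layout of Table 19 concrete, the following Python sketch splits one 128-bit trace line into its fields. It assumes the four 32-bit words of the line have already been read from the trace buffer area (offsets ..0, ..4, ..8 and ..C in Table 21) by some debug tool; the function name and the dictionary keys are illustrative only and are not part of any AT697 software.

```python
from typing import Dict, Sequence


def decode_instruction_trace(words: Sequence[int]) -> Dict[str, object]:
    """Split one instruction-trace line into the fields of Table 19.

    `words` holds the four 32-bit words of one trace line, most significant
    first: trace bits 127-96, 95-64, 63-32 and 31-0.
    """
    if len(words) != 4 or any(not 0 <= w <= 0xFFFFFFFF for w in words):
        raise ValueError("expected four 32-bit words")
    w127_96, w95_64, w63_32, w31_0 = words
    return {
        "breakpoint_hit": bool((w127_96 >> 31) & 1),  # bit 127
        "multi_cycle": bool((w127_96 >> 30) & 1),     # bit 126
        "dsu_counter": w127_96 & 0x3FFFFFFF,          # bits 125:96
        "load_store": w95_64,                         # bits 95:64
        # Bits 63:34 hold the PC without its two (always zero) LSBs,
        # so masking the two low bits restores the byte address.
        "pc": w63_32 & 0xFFFFFFFC,
        "trapped": bool((w63_32 >> 1) & 1),           # bit 33
        "error_mode": bool(w63_32 & 1),               # bit 32
        "opcode": w31_0,                              # bits 31:0
    }


# Hypothetical trace line, for illustration only.
entry = decode_instruction_trace([0x80000010, 0x12345678, 0x40001003, 0x01000000])
```

A bus-trace line (Table 20) could be decoded the same way by replacing the shifts and masks with the AHB field positions.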
The IU/FPU registers can be accessed at the following addresses (NWINDOWS = number of SPARC register windows = 8):
• %on: 0x90020000 + (((psr.cwp * 64) + 32 + n) mod (NWINDOWS*64))
• %ln: 0x90020000 + (((psr.cwp * 64) + 64 + n) mod (NWINDOWS*64))
• %in: 0x90020000 + (((psr.cwp * 64) + 96 + n) mod (NWINDOWS*64))
• %gn: 0x90020000 + (NWINDOWS*64) + 128
• %fn: 0x90020000 + (NWINDOWS*64)

Debug Operations

Instruction Breakpoints

To insert instruction breakpoints, the breakpoint instruction (ta 1) should be used. This leaves the four IU hardware breakpoints free to be used as data watchpoints. Since cache snooping is only done on the data cache, the instruction cache must be flushed after the insertion or removal of breakpoints. To minimize the influence on execution, it is enough to clear the corresponding instruction cache tag (which is accessible through the DSU). The DSU hardware breakpoints should only be used to freeze the trace buffer, and not for software debugging, since there is a 4-cycle delay from the breakpoint hit before the processor enters debug mode.

Single Stepping

By writing the SS bit and resetting the BN bit in the DSU control register, the processor will resume execution for one instruction and then automatically enter debug mode.

DSU Trap

The DSU trap register (DTR) is a read-only register that indicates which SPARC trap type caused the processor to enter debug mode. When debug mode is forced by setting the BN bit in the DSU control register, the trap type is 0x0B.

DSU Communication Link

The DSU communication link consists of a UART connected to the internal bus as a master.

Figure 39. DSU Communication Link Block Diagram
(Blocks shown: baud-rate generator (8*bitclk), serial port controller on the AMBA APB, receiver shift register (DSURX), transmitter shift register (DSUTX), AHB master interface with AHB data/response on the AMBA AHB.)

A simple communication protocol is supported to transmit access parameters and data.
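As a rough illustration of this protocol, the Python sketch below builds the byte sequences of Figure 41 (DSU Commands). It assumes that the control byte carries the command code in bits 7:6 (11 for write, 10 for read) and the Length-1 field in bits 5:0, and that addresses and data are sent most significant byte first, as the figure suggests. The function names are hypothetical, and sending the bytes over a serial port is left out.

```python
import struct
from typing import Dict, Sequence

CMD_WRITE = 0b11  # upper two bits of the control byte for a write
CMD_READ = 0b10   # upper two bits of the control byte for a read


def _control_byte(command: int, n_words: int) -> int:
    # Assumption: Length-1 occupies the six low bits, so 1..64 words per command.
    if not 1 <= n_words <= 64:
        raise ValueError("a command transfers between 1 and 64 words")
    return (command << 6) | (n_words - 1)


def dsu_write_command(address: int, data_words: Sequence[int]) -> bytes:
    """Control byte, 32-bit address, then the data words (all big-endian)."""
    frame = bytes([_control_byte(CMD_WRITE, len(data_words))])
    frame += struct.pack(">I", address)
    for word in data_words:
        frame += struct.pack(">I", word)
    return frame


def dsu_read_command(address: int, n_words: int = 1) -> bytes:
    """Control byte and 32-bit address; the target answers with n_words words."""
    return bytes([_control_byte(CMD_READ, n_words)]) + struct.pack(">I", address)


def decode_response_byte(resp: int) -> Dict[str, object]:
    """Response byte: bit 2 = DMODE, bits 1:0 = HRESP."""
    return {"dmode": bool((resp >> 2) & 1), "hresp": resp & 0b11}


# Example: read four words starting at the PSR entry of the DSU map.
frame = dsu_read_command(0x90080004, 4)
```

Keeping the framing in pure functions that return `bytes` makes it easy to check the encoding before any hardware is attached.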
A link command consists of a control byte, followed by a 32-bit address, followed by optional write data. If the LR bit in the DSU control register is set, a response byte will be sent after each AHB transfer. If the LR bit is not set, a write access does not return any response, while a read access only returns the read data.

Data Frame

Data is sent on an 8-bit basis.

Figure 40. DSU UART Data Frame
Start  D0  D1  D2  D3  D4  D5  D6  D7  Stop

Commands

Through the communication link, a read or write transfer can be generated to any address on the internal bus. A response byte can optionally be sent when the processor goes from execution mode to debug mode. Block transfers can be performed by setting the length field to n-1, where n denotes the number of transferred words. For write accesses, the control byte and address are sent once, followed by the data words to be written. The address is automatically incremented after each data word. For read accesses, the control byte and address are sent once and the corresponding number of data words is returned.

Figure 41. DSU Commands
DSU Write Command
  Send:     11 + Length-1, Addr[31:24], Addr[23:16], Addr[15:8], Addr[7:0], Data[31:24], Data[23:16], Data[15:8], Data[7:0]
  Receive:  Resp. byte (optional)
DSU Read Command
  Send:     10 + Length-1, Addr[31:24], Addr[23:16], Addr[15:8], Addr[7:0]
  Receive:  Data[31:24], Data[23:16], Data[15:8], Data[7:0], Resp. byte (optional)
Response byte encoding
  bit 7:3 = 00000
  bit 2 = DMODE
  bit 1:0 = HRESP

Clock Generation

The UART contains a 14-bit down-counting scaler to generate the desired baud rate. The scaler is clocked by the system clock and generates a UART tick each time it underflows. The scaler is reloaded with the value of the UART scaler reload register after each underflow. The resulting UART tick frequency should be 8 times the desired baud rate. If not programmed by software, the baud rate is discovered automatically.
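For manual programming, the scaler arithmetic can be sketched in Python as follows. The first function applies the manual-programming formula given in this section, scaler = ((sysclk × 10) / (baudrate × 8) - 5) / 10, with integer division, and checks the result against the 14-bit scaler width. The second function relies on an assumption that is not stated explicitly in the text: a reload value of N produces one tick every N + 1 system clocks. Both function names are illustrative.

```python
SCALER_BITS = 14  # the UART scaler is a 14-bit down-counter


def uart_scaler(sysclk_hz: int, baudrate: int) -> int:
    """Scaler reload value from the datasheet formula (integer arithmetic)."""
    if sysclk_hz <= 0 or baudrate <= 0:
        raise ValueError("clock and baud rate must be positive")
    scaler = ((sysclk_hz * 10) // (baudrate * 8) - 5) // 10
    if not 0 <= scaler < (1 << SCALER_BITS):
        raise ValueError("baud rate not reachable with a 14-bit scaler")
    return scaler


def achieved_baudrate(sysclk_hz: int, scaler: int) -> float:
    """Baud rate obtained for a reload value.

    Assumption: the counter underflows every (scaler + 1) system clocks,
    and the UART tick rate is 8 times the baud rate.
    """
    return sysclk_hz / ((scaler + 1) * 8)


# Example: 100 MHz system clock, 115200 baud target.
reload_value = uart_scaler(100_000_000, 115200)
error = abs(achieved_baudrate(100_000_000, reload_value) - 115200) / 115200
```

Comparing the achieved rate with the target gives a quick check that the rounding error stays within what the remote UART tolerates.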
Automatic baud-rate discovery works by searching for the shortest period between two falling edges of the received data (corresponding to two bit periods). When three identical two-bit periods have been found, the corresponding scaler reload value is latched into the reload register, and the BL bit is set in the UART control register. If the BL bit is reset by software, the baud-rate discovery process is restarted. The baud-rate discovery is also restarted when a ‘break’ is received by the receiver, allowing the baud rate to be changed from the external transmitter. For proper baud-rate detection, the value 0x55 should be transmitted to the receiver after reset or after sending a break.

The best scaler value for manually programming the baud rate can be calculated as follows:

\[
\text{scaler} = \frac{\dfrac{\text{sysclk} \times 10}{\text{baudrate} \times 8} - 5}{10}
\]

Booting from DSU

By asserting DSUEN and DSUBRE at reset time, the processor will directly enter debug mode without executing any instructions. The system can then be initialised from the communication link, and applications can be downloaded and debugged. Additionally, external (flash) PROMs for standalone booting can be re-programmed.

JTAG Interface

Overview

The AT697 implements a standard interface compliant with the IEEE 1149.1 JTAG specification. This interface can be used for PCB testing using the JTAG boundary-scan capability. The JTAG interface is accessed through five dedicated pins. In JTAG terminology, these pins constitute the Test Access Port (TAP). The following table summarizes the TAP pins and their function at JTAG level.

Table 22. TAP Pins
Pin    Name               Type     Description
TCK    Test Clock         Input    Used to clock serial data into the boundary scan latches and to control the sequence of the test state machine. TCK can be asynchronous with CLK.
TMS    Test Mode Select   Input    Primary control signal for the state machine. Synchronous with TCK. A sequence of values on TMS adjusts the current state of the TAP.
TDI    Test Data Input    Input    Serial input data to the boundary scan latches. Synchronous with TCK.
TDO    Test Data Output   Output   Serial output data from the boundary scan latches. Synchronous with TCK.
TRST   Test Reset         Input    Resets the test state machine. Can be asynchronous with TCK.

For more details, please refer to the ‘IEEE Standard Test Access Port and Boundary Scan’ specification.

Any AT697 based system will contain several JTAG compatible chips. These are connected using the minimum (single TMS signal) configuration. This configuration contains three broadcast signals (TMS, TCK and TRST), which are fed from the JTAG master to all JTAG slaves in parallel, and a serial path formed by a daisy-chain connection of the serial test data pins (TDI and TDO) of all slaves.

The TAP supports a BYPASS instruction which places a minimum shift path (1 bit) between the chip’s TDI and TDO pins. This allows efficient access to any single chip in the daisy-chain without board-level multiplexing.

Figure 42. JTAG Serial connection using 1 TMS Signal
(Parts 1 to n are chained from TDI to TDO; TMS, TCK and TRST are connected to all parts in parallel.)

TAP Architecture

The TAP implemented in the AT697 consists of a TAP interface, a TAP controller, plus a number of shift registers including an instruction register (IR) and a set of test data registers.

Figure 43. AT697 TAP Architecture
(Blocks shown: TAP controller driven by TMS, TCK and TRST; instruction register with instruction decode; test data registers (boundary scan register, device ID register, bypass register, design-specific data); output multiplexer to TDO. Control signals: Clock DR, Shift DR, Update DR, Reset, Clock IR, Shift IR, Update IR, Select, TCK, Ena TDO.)

TAP Controller

The TAP controller is a synchronous finite state machine (FSM) which controls the sequence of operations of the JTAG test circuitry, in response to changes at the JTAG bus.
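A small software model can help when preparing TMS sequences for this state machine. The Python sketch below encodes the standard 16-state IEEE 1149.1 TAP controller, which the AT697 is stated to implement; the state names follow the labels of Figure 44, while the table layout and the function name are illustrative assumptions.

```python
from typing import Dict, Iterable, Tuple

# For each state: (next state when TMS = 0, next state when TMS = 1),
# sampled on each rising edge of TCK.
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "Test Logic Reset": ("Run Test/Idle", "Test Logic Reset"),
    "Run Test/Idle": ("Run Test/Idle", "Select DR Scan"),
    "Select DR Scan": ("Capture DR", "Select IR Scan"),
    "Capture DR": ("Shift DR", "Exit_1 DR"),
    "Shift DR": ("Shift DR", "Exit_1 DR"),
    "Exit_1 DR": ("Pause DR", "Update DR"),
    "Pause DR": ("Pause DR", "Exit_2 DR"),
    "Exit_2 DR": ("Shift DR", "Update DR"),
    "Update DR": ("Run Test/Idle", "Select DR Scan"),
    "Select IR Scan": ("Capture IR", "Test Logic Reset"),
    "Capture IR": ("Shift IR", "Exit_1 IR"),
    "Shift IR": ("Shift IR", "Exit_1 IR"),
    "Exit_1 IR": ("Pause IR", "Update IR"),
    "Pause IR": ("Pause IR", "Exit_2 IR"),
    "Exit_2 IR": ("Shift IR", "Update IR"),
    "Update IR": ("Run Test/Idle", "Select DR Scan"),
}


def step_tap(state: str, tms_bits: Iterable[int]) -> str:
    """Return the TAP state reached after clocking in the given TMS values."""
    for tms in tms_bits:
        if tms not in (0, 1):
            raise ValueError("TMS values must be 0 or 1")
        state = TRANSITIONS[state][tms]
    return state


# From Run Test/Idle, TMS = 1, 1, 0, 0 leads to Shift IR,
# where a 3-bit instruction can be shifted in.
state = step_tap("Run Test/Idle", [1, 1, 0, 0])
```

Holding TMS high for five TCK cycles brings this model back to Test Logic Reset from any state, which is the usual way to synchronize with a TAP whose state is unknown.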
The controller reacts specifically to changes at the TMS input with respect to the TCK input. The TAP controller FSM implements the 16-state diagram shown in the following figure.

The IR is a 3-bit register which allows a test instruction to be shifted into the AT697. The instruction selects the test to be performed and the test data register to be accessed. Although any number of loops may be supported by the TAP, the finite state machine in the TAP controller only distinguishes between the IR and a DR. The specific DR can be decoded from the instruction in the IR.

Figure 44. TAP - State Machine
(States shown: Test Logic Reset, Run Test/Idle, Select DR Scan, Capture DR[1], Shift DR, Exit_1 DR, Pause DR, Exit_2 DR, Update DR[1], Select IR Scan, Capture IR, Shift IR, Exit_1 IR, Pause IR, Exit_2 IR, Update IR. Transitions between states are controlled by the TMS input value.)
[1] Due to the scan cell layout, "Capture DR" and "Update DR" are states without associated action during the scanning of internal chains.

TAP Instructions

The following instructions are supported by the AT697 TAP.

Table 23. TAP instruction set
Binary Value   Instruction Name   Data Register            Scan Chain Accessed
000            EXTEST             Boundary scan register   Boundary scan chain
001            SAMPLE/PRELOAD     Boundary scan register   Boundary scan chain
010            BYPASS             Bypass register          Bypass scan chain
111            IDCODE             Device id register       ID register scan chain

BYPASS

This instruction is binary coded "010". It is used to speed up shifting at board level through components that are not to be activated.

EXTEST

This instruction is binary coded "000". It is used to test connections between components at board level. Component output pins are controlled by the boundary scan register during Capture DR on the rising edge of TCK.
SAMPLE/PRELOAD

This instruction is binary coded "001". It is used to get a snapshot of the normal operation by sampling I/O states during Capture DR on the rising edge of TCK. It also allows a value to be preloaded on the output latches during Update DR on the falling edge of TCK. It does not modify system behaviour.

IDCODE

This instruction is binary coded "111". The value of the IDCODE is loaded during Capture DR.

Test Data Registers

The following data registers are supported in the AT697 TAP:

Bypass Register

The bypass register, containing a single shift register stage, is connected between TDI and TDO.

Figure 45. Bypass Register Cell
(Signals shown: from TDI, Shift DR, Clock DR, to TDO.)

Device ID register

The device ID register is a read-only 32-bit register. It is connected between TDI and TDO.

Figure 46. Device ID Register
Bits    Field               Value
31:28   Vers.               0001
27:12   Part ID             1011.0110.0100.0101
11:1    Manufacturer’s ID   000.0101.1000
0       Const.              1

ID register value: 0x1b6450b1

Field definitions:
[31:28]: Vers - Version number - 0x1
[27:12]: Part ID - Represents the part number as assigned by the vendor - 0xb645
[11:1]: Manufacturer’s ID - Represents the manufacturer’s ID as per JEDEC - 0x058
[0]: Const - Constant tied to logic ’1’.

Boundary Scan Register

A single scan chain consisting of all of the boundary scan cells (input, output and in/out cells). The purpose of the boundary scan is the support of scan-based board testing. The boundary scan register is connected between TDI and TDO. To use the boundary scan feature, the PLL must be in bypass mode, i.e. the BYPASS signal tied to VCC.

Checker Scan Register

A single scan chain consisting of all of the scan cells of the IU parity checkers. The checker scan is only used for factory test. The checker scan register is connected between TDI and TDO.

Execution Mode

Reset Mode

When the RESET input is asserted for at least two cycles, the processor enters reset mode. In this mode, the CPU and all the peripherals are halted.
Only the following registers are affected by the reset. All other registers maintain their value or are undefined.

Table 24. Reset Operation
Register      Description                 Reset Value
PC            program counter             0x0000 0000
nPC           new program counter         0x0000 0004
PSR           processor status register   et = 0, s = 1
CCR           cache control register      0x0000 0000
MCFG1[9:8]    PROM bus width              PIO[1:0]
MCFG3[8]      PROM EDAC enable            PIO[2]

When RESET is deasserted, execution restarts from address 0.

Debug Mode

Debug mode can be entered when the DSU is enabled through the external DSUEN pin. This allows read/write access to all processor registers and cache memories. In debug mode, the processor pipeline is held and the processor is controlled by the DSU.

Power-down/Idle Mode

The AT697 can be idled by writing any value to the power-down register. During power-down mode, only the integer unit is halted. All other functions and peripherals operate as nominal. When a single write to the idle register is performed, idle mode is entered on the next load instruction. Idle mode is terminated when an unmasked interrupt with a higher level than the current processor interrupt level is pending. The integer unit is then re-enabled.

Here is a simple example allowing Idle mode entry:

    ! write any value to Idle register
    st %g2, [%g1 + 0x18]
    ! enter Idle mode
    ld [%o1 + 0x08], %g3

System Clock

Overview

The AT697 clock system is based on two main clock trees: the PCI clock and the CPU clock. The following figure presents the clock system of the processor and its distribution.

Figure 47. Clock Distribution
(Blocks shown: the CPU clock feeds the CPU core, caches, register file, memory control, interrupt controller, timers, GPI and UARTs, and is output as SDCLK; the PCI clock feeds the PCI core and the PCI wrapper. The CPU clock comes from CLK, either directly or through the PLL (BYPASS, LFT, PDIV4, LOCK). The UART control register bit UACn selects the alternate UART clock.)

PCI Clock

External Clock

The PCI clock is dedicated to the PCI Interface. It is used in particular by the PCI wrapper that shares its activity between the two clock domains.
The PCI interface and its associated wrapper can only be driven from an external clock. The PCI clock shall be connected to the PCI_CLK pin of the PCI interface. This input shall be driven at a frequency in the range of 0 up to 33MHz.

CPU Clock

The CPU clock is routed to the parts of the system concerned with operation of the SPARC core, such as the CPU core itself and the register files. The CPU clock is also used by the majority of the I/O modules like Timers, Memory controller and Interrupt Controller, with the exception of the PCI Interface. The CPU clock is driven either directly by an external oscillator or by the internal PLL.

External Clock

To drive the device directly from an external clock source, the CLK input shall be driven by an external clock generator while the BYPASS pin is driven high. In that way, the CPU clock is the direct representation of the clock applied to CLK. When the external CPU clock source is selected, the clock input can be driven at a frequency in the range of 0MHz up to 100MHz.

PLL

Overview

The CPU clock can be issued from the internal PLL. This PLL contains a phase/frequency detector, charge pump, voltage controlled oscillator, lock detector and divider.

Figure 48. PLL Block Diagram
(Signals shown: LFT, CLK, cpu clock, PDIV4, LOCK, Divider.)

The PLL is configured by hardware to provide a CPU clock frequency four times the frequency of the input clock.

PLL control

The PLL control is done by hardware through dedicated ports, including a bypass, a clock input and a filter input. The following table presents the assignment and functions of the PLL control signals.

Table 25. PLL ports description
Pin name   Function
LFT        External passive loop filter input
LOCK       Lock
CLK        Board clock input
BYPASS     Bypass

PLL filter

To ensure the functionality of the PLL, an external low pass filter shall be connected to the filter input (LFT) of the PLL. The filter to set up on the LFT pin is shown in the following figure.
Figure 49. Low Pass Filter Connection

The optimal values for this filter are the following:
• R1 = 100 ohms +/- 10%
• C1 = 100nF +/- 10%
• C2 = 10nF +/- 10%

Operation

To drive the device from the internal PLL, the CLK input shall be driven by an external clock generator while the BYPASS pin is driven low. In that way, the CPU clock frequency is four times the frequency of the clock applied to CLK. When the PLL based CPU clock source is selected, the clock input shall be driven at a frequency in the range of 20MHz up to 25MHz.

Fault Tolerance & Clock

To prevent erroneous operations from single event transient (SET) errors and single event upset (SEU), the AT697 processor is based on a full triple modular redundancy (TMR) architecture.

Figure 50. TMR structure

Such an architecture is based on a fully triplicated clock distribution (CLK1, CLK2 and CLK3). Accordingly, the PCI clock and the CPU clock are each built as three clock trees.

Skew

To protect the processor from corruption by single event transient (SET) phenomena, additional skew can be programmed on the clock trees. The two dedicated pins SKEW1 and SKEW0 are used to program the delay induced by the skew. Here is a short description of the skew implementation:

Figure 51. CPU clock tree overview
(Signals shown: CLK and PLL with BYPASS produce the cpu clock; SKEW[1:0] selects among inputs i1, i2, i3 for each of the CLK1, CLK2 and CLK3 trees through delay elements D1, D2, D3, D4, with D2 = D1 and D4 = D3 = 2 * D1.)

Three skew configurations are available:
• SKEW[1:0] = ’00’: natural skew corresponding to the intrinsic routing of the chip
• SKEW[1:0] = ’01’: medium skew ‘artificially’ injected
• SKEW[1:0] = ’10’: maximum skew ‘artificially’ injected

The remaining configuration (SKEW[1:0] = ’11’) is reserved and must not be used at application level.

Table 26. SKEW assignments
SKEW[1:0]   Delay CLK1 -> CLK2   Delay CLK1 -> CLK3   Comments
‘00’        natural              natural              natural skew
‘01’        D1                   D3                   medium skew
‘10’        D1 + D2              D3 + D4              maximum skew
‘11’        Reserved

Use of a high level of skew improves the efficiency of SET prevention but reduces operating performance: maximum speed is decreased and timings on the interfaces are slower than with natural skew. Refer to the ’Electrical Characteristics’ section for detailed timings at each skew.

Package - MCGA 349

Mechanical Outlines

Symbol   mm min   mm max   inch min   inch max
D/E      24.8     25.2     0.976      0.992
D1/E1    22.86             0.9
A1       1.4      1.85     0.055      0.073
A2       2.4      3.45     0.094      0.136
A        4.3      5.9      0.169      0.232
b        0.79     0.99     0.031      0.04
e        1.27              0.05

Socket / Adapter

Socket reference

In order to support the MCGA 349 package on an evaluation board that may require exchange of the chip, ATMEL had a dedicated socket developed by Adapters-Plus. The reference of the socket for the MCGA349 package is CL349SA1912F.

Figure 52. CL349SA1912E socket
(Top view and side view.)

A direct link to information on this socket is available at:
http://www.adaptplus.com/products/ic_sockets/datasheets/ds_MCGA_lockingskt.htm

Provider

The CL349SA1912F socket is provided by Adapters-Plus:
Adapters-Plus
15 W 8TH STREET STE B.
Tracy, Ca 95376 - USA
Phone: 209-839-0200
Fax: 209-839-0235
www.adapt-plus.com

QFP256 package

Package Description

Registers Description

Table 27. Register legend
Each register description gives the register address (for example, Address = 0x01010101) and, for bit numbers 31 to 0:
• the field name (reserved bits are marked as such)
• the access type: r = read access, w = write access, r/w = read and write access
• the default value after reset: 0, 1, or x = undefined or not affected by reset

Integer Unit Registers

Table 28. Processor State Register - PSR
Field    impl    ver     n z v c   reserved   ec   ef   pil    s   ps   et   cwp
Bits     31:28   27:24   23:20     19:14      13   12   11:8   7   6    5    4:0
Access   r       r       r/w       r/w        r    r    r/w    r/w (s, ps, et)   r/w
Reset    0001    0001    xxxx      xxxxxx     0    x    xxxx   1   x    0    xxxxx

Bit Number   Mnemonic   Description
31..28       impl       Implementation or class of implementations of the architecture.
27..24       ver        Identifies one or more particular implementations, or is a readable and writable state field whose properties are implementation-dependent.
23           n          Indicates whether the ALU result was negative for the last instruction that modified the icc field. 1 = negative, 0 = not negative.
22           z          Indicates whether the ALU result was zero for the last instruction that modified the icc field. 1 = zero, 0 = not zero.
21           v          Indicates whether the ALU result was within the range of (was representable in) 32-bit 2’s complement notation for the last instruction that modified the icc field. 1 = overflow, 0 = no overflow.
20           c          Indicates whether a 2’s complement carry out (or borrow) occurred for the last instruction that modified the icc field. Carry is set on addition if there is a carry out of bit 31. Carry is set on subtraction if there is a borrow into bit 31. 1 = carry, 0 = no carry.
13           ec         Determines whether the implementation-dependent coprocessor is enabled. If disabled, a coprocessor instruction will trap. 1 = enabled, 0 = disabled. If an implementation does not support a coprocessor in hardware, PSR.EC should always read as 0 and writes to it should be ignored.
12           ef         Determines whether the FPU is enabled. If disabled, a floating-point instruction will trap. 1 = enabled, 0 = disabled. If an implementation does not support a hardware FPU, PSR.EF should always read as 0 and writes to it should be ignored.
11..8        pil        Identifies the interrupt level above which the processor will accept an interrupt.
7            s          Determines whether the processor is in supervisor or user mode. 1 = supervisor mode, 0 = user mode.
6            ps         Contains the value of the S bit at the time of the most recent trap.
5            et         Determines whether traps are enabled. A trap automatically resets ET to 0. When ET=0, an interrupt request is ignored and an exception trap causes the IU to halt execution, which typically results in a reset trap that resumes execution at address 0. 1 = traps enabled, 0 = traps disabled.
4..0         cwp        Comprises the current window pointer, a counter that identifies the current window into the r registers. The hardware decrements the CWP on traps and SAVE instructions, and increments it on RESTORE and RETT instructions (modulo NWINDOWS).

Table 29. Window Invalid Mask - WIM
Bit Number: 31 ... 0
reserved 7 r 0 0 0 Bit 0