AT697F

  • Manufacturer: ATMEL
  • Description: AT697F - Rad-Hard 32 bit SPARC V8 Processor - ATMEL Corporation

AT697F Datasheet
Rad-Hard 32 bit SPARC V8 Processor
AT697F
Advance Information
7703C–AERO–6/09

Features
  • SPARC V8 High Performance Low-power 32-bit Architecture
    – 8 Register Windows
  • Advanced Architecture:
    – On-chip Amba Bus
    – 5 Stage Pipeline
    – 16 kbyte Multi-sets Data Cache
    – 32 kbyte Multi-sets Instruction Cache
  • On-chip Peripherals:
    – Memory Interface: PROM Controller, SRAM Controller, SDRAM Controller
    – Timers: Two 32-bit Timers, Watchdog 32-bit Timer
    – Two 8-bit UARTs
    – Interrupt Controller with 8 External Programmable Inputs
    – 32 Parallel I/O Interface
    – 33MHz PCI Interface Compliant with 2.2 PCI Specification
  • Integrated 32/64-bit IEEE 754 Floating-point Unit
  • Fault Tolerance by Design
    – Full Triple Modular Redundancy (TMR)
    – EDAC Protection
    – Parity Protection
  • Debug and Test Facilities
    – Debug Support Unit (DSU) for Trace and Debug
    – IEEE 1149.1 JTAG Interface
    – Four Hardware Watchpoints
  • 8 and 40-bit boot-PROM Interface Possibilities
  • Operating range
    – Voltages: 3.3V +/- 0.30V for I/O, 1.8V +/- 0.15V for Core
    – Temperature: -55°C to 125°C
  • Clock: 0MHz up to 100MHz
  • Power consumption: 1W at 100MHz
  • Performance:
    – 86MIPS (Dhrystone 2.1)
    – 23MFLOPS (Whetstone)
  • Radiation Performance
    – Tested up to a total dose of 300Krads (Si) according to the MIL-STD-883 method 1019
    – SEU error rate better than 1 E-5 error/device/day
    – No Single Event Latchup below a LET threshold of 70 MeV.cm²/mg
  • Package: MCGA349 and MQFPF256
  • Mass: 9g
  • Development Kit Including
    – AT697F Evaluation Board
    – AT697F Sample

Description
The AT697F is a highly integrated, high-performance 32-bit RISC embedded processor based on the SPARC V8 architecture. The implementation is based on the European Space Agency (ESA) LEON2 fault tolerant model. By executing powerful instructions in a single clock cycle, the AT697F achieves throughputs approaching 1MIPS per MHz, allowing the system designer to optimize power consumption versus processing speed.
The AT697F is designed to be used as a building block in computers for on-board embedded real-time applications. It brings up-to-date functionality and performance to space applications. The AT697F only requires memory and application-specific peripherals to be added to form a complete on-board computer.

The AT697F contains an on-chip Integer Unit (IU), a Floating Point Unit (FPU), separate instruction and data caches, a hardware multiplier and divider, an interrupt controller, a debug support unit with trace buffer, two 32-bit timers, parallel and serial interfaces, a watchdog, a PCI interface and a flexible memory controller. The design is highly testable with the support of a Debug Support Unit (DSU) and a boundary scan through the JTAG interface.

An Idle mode holds the processor pipeline and allows the Timer/Counter, serial ports and interrupt system to continue functioning.

The processor is manufactured using the Atmel 0.18 µm CMOS process. It has been especially designed for space, by implementing on-chip concurrent transient and permanent error detection and correction.

The AT697F is pinout compatible with the AT697E. Refer to section "Differences between AT697F and AT697E", page 146, for a detailed description of the differences between the AT697F and the AT697E.

Figure 1. AT697F Block Diagram
[Block diagram: Integer Unit (SPARC V8) with I-Cache and D-Cache, FPU, DSU, JTAG, Memory Controller (PROM, SRAM, SDRAM), AMBA AHB with AMBA bridge to APB and PCI/AMBA bridge, Clock Generator, Reset, and APB peripherals (interrupt controller, PIO, RS232 UARTs, timers, watchdog).]

Pin Configuration

MCGA349 package

Table 1. AT697F MCGA349 pinout - Advanced Information
A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 VDD18 VSS18 N.C. PIO[13] CB[1] CB[6] D[3] D[8] D[12] D[17] D[21] D[25] D[30] VSS18 VDD18 VSS18 VDD18 VDD18 N.C.
PIO[10] VSS33 CB[4] N.C. D[5] VSS33 D[18] D[23] N.C. N.C. VSS18 VDD18 VSS18 B C VDD18 VDD18 VSS18 PIO[9] PIO[11] VCC33 N.C. D[2] D[1] VCC33 VCC33 D[11] VCC33 D[22] D[26] D[28] VSS18 VDD18 VDD18 D VSS18 PIO[0] VCC33 N.C. N.C. Reserved PIO[15] VCC33 VSS33 VSS33 D[13] VSS33 VCC33 D[27] D[29] VCC33 D[31] VCC33 VSS18 E PIO[6] N.C. PIO[2] PIO[5] N.C. CB[0] VSS33 CB[7] D[6] Reserved D[7] D[14] VSS33 N.C. N.C. N.C. N.C. A[0] A[2] F PIO[1] PIO[4] N.C. PIO[3] VSS33 N.C. PIO[12] CB[2] VCC33 D[10] D[15] D[16] VSS33 VSS33 N.C. N.C. A[7] A[4] VSS33 G RAMS*[1] RAMS*[2] RAMOE*[3] RAMS*[4] RAMOE*[1] VSS33 PIO[7] PIO[8] CB[3] D[4] N.C. D[19] A[1] A[3] A[12] A[6] VSS33 A[8] A[9] 4 AT697F ADVANCE INFORMATION 7703C–AERO–6/09 AT697F ADVANCE INFORMATION Table 2. AT697F MCGA349 pinout - Advanced Information H 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 RAMOE*[0] RAMOE*[2] VCC33 RAMOE*[4] RWE*[1] RWE*[3] RAMS*[0] RAMS*[3] CB[5] D[9] D[20] D[24] N.C. A[10] N.C. A[11] A[19] A[13] A[15] j VSS33 ROMS*[1] ROMS*[0] RWE*[0] WRITE* RWE*[2] N.C. VCC33 PIO[14] D[0] A[5] A[14] VCC33 VCC33 VSS33 VSS33 A[17] A[18] A[20] k READ TCK TDI TDO VSS33 IOS* TRST OE* VSS33 N.C. A[16] A[26] A[21] A[27] VCC33 A[23] VSS33 A[22] A[25] l DSUACT DSURX DSUTX DSUEN TMS VSS33 SDDQM[0] BRDY* SDRAS* A/D[14] N.C. VDD_PLL N.C. LOCK A[24] RESET* VCC33 VSS33 ERROR* m BEXC* SDCLK DSUBRE SDDQM[2] N.C. VSS33 VSS33 VCC33 A/D[22] VSS33 A/D[12] AGNT*[3] N.C. SKEW[1] Reserved N.C. WDOG* VSS_PLL SKEW[0] n VCC33 VSS33 SDDQM[1] N.C. SDDQM[3] GNT* VCC33 A/D[21] A/D[16] PERR* A/D[9] A/D[1] VSS33 A/D[0] BYPASS AREQ*[2] N.C. AREQ*[3] VCC33 p SDWE* PCI_CLK VSS33 SDCS*[0] SDCAS* A/D[24] A/D[30] A/D[18] A/D[17] IRDY* A/D[15] A/D[8] A/D[5] AGNT*[1] CLK VSS33 VSS33 N.C. AREQ*[1] 5 7703C–AERO–6/09 Table 3. AT697F MCGA349 pinout - Advanced Information r 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 REQ* N.C. PCI_RST* N.C. N.C. N.C. SYSEN* VSS33 TRDY* PCI_LOCK* VSS33 N.C. VCC33 VCC33 N.C. N.C. VCC33 N.C. 
AREQ*[0] t VSS18 SDCS*[1] A/D[31] A/D[29] N.C. A/D[27] VSS33 VSS33 VCC33 DEVSEL* VCC33 A/D[11] A/D[7] VSS33 A/D[2] VCC33 AGNT*[0] AGNT*[2] VSS18 u VDD18 VDD18 VSS18 VCC33 A/D[26] IDSEL VCC33 FRAME* N.C. STOP* VSS33 PAR A/D[10] C/BE*[0] VCC33 N.C. VSS18 VDD18 VDD18 VSS18 VDD18 VSS18 N.C. VSS33 C/BE*[3] A/D[20] C/BE*[2] VCC33 C/BE*[1] VSS33 VSS33 A/D[4] N.C. VDD18 VDD18 VSS18 VDD18 VSS18 A/D[28] A/D[25] A/D[23] A/D[19] VSS33 VCC33 SERR* A/D[13] VSS33 A/D[6] A/D[3] VSS18 VDD18 v w

Notes:
1. 'Reserved' pins shall not be driven to any voltage
2. N.C. refers to unconnected pins

QFP256 Package

Table 4. AT697F QFP256 pinout
pin   name          pin   name          pin   name
1     VCC33         31    TCK           61    PIO[1]
2     PCI_REQ*      32    TMS           62    PIO[2]
3     PCI_GNT*      33    VSS           63    PIO[3]
4     PCI_CLK       34    TDI           64    PIO[4]
5     PCI_RST*      35    TDO           65    PIO[5]
6     SDCS*[0]      36    WRITE*        66    PIO[6]
7     VSS           37    READ          67    VCC33
8     VDD18         38    OE*           68    PIO[7]
9     SDCS*[1]      39    IOS*          69    PIO[8]
10    SDWE*         40    VCC33         70    PIO[9]
11    SDRAS*        41    ROMS*[0]      71    VSS
12    VSS           42    ROMS*[1]      72    VDD18
13    VSS           43    RWE*[0]       73    PIO[10]
14    SDCAS*        44    RWE*[1]       74    PIO[11]
15    VCC33         45    RWE*[2]       75    Reserved
16    SDDQM[0]      46    RWE*[3]       76    PIO[12]
17    SDDQM[1]      47    RAMOE*[0]     77    PIO[13]
18    SDDQM[2]      48    RAMOE*[1]     78    PIO[14]
19    SDDQM[3]      49    RAMOE*[2]     79    PIO[15]
20    SDCLK         50    RAMOE*[3]     80    VCC33
21    BRDY*         51    RAMOE*[4]     81    CB[0]
22    BEXC*         52    RAMS*[0]      82    CB[1]
23    VSS           53    VCC33         83    CB[2]
24    VSS           54    RAMS*[1]      84    CB[3]
25    DSUEN         55    RAMS*[2]      85    VCC33
26    DSUTX         56    RAMS*[3]      86    CB[4]
27    DSURX         57    VSS           87    CB[5]
28    DSUBRE        58    VDD18         88    CB[6]
29    DSUACT        59    RAMS*[4]      89    CB[7]
30    TRST          60    PIO[0]        90    D[0]

Table 5.
AT697F QFP256 pinout
pin   name          pin   name          pin   name
91    VCC33         124   D[25]         157   A[19]
92    D[1]          125   D[26]         158   A[20]
93    D[2]          126   D[27]         159   A[21]
94    D[3]          127   D[28]         160   A[22]
95    D[4]          128   D[29]         161   VSS
96    D[5]          129   D[30]         162   VCC33
97    D[6]          130   VCC33         163   A[23]
98    Reserved      131   D[31]         164   A[24]
99    VCC33         132   N.C.          165   A[25]
100   D[7]          133   A[0]          166   A[26]
101   D[8]          134   A[1]          167   A[27]
102   D[9]          135   VSS           168   WDOG*
103   D[10]         136   VDD18         169   ERROR*
104   D[11]         137   A[2]          170   VCC33
105   D[12]         138   A[3]          171   RESET*
106   VCC33         139   A[4]          172   Reserved
107   D[13]         140   VCC33         173   LOCK
108   D[14]         141   A[5]          174   SKEW[1]
109   D[15]         142   A[6]          175   SKEW[0]
110   D[16]         143   A[7]          176   BYPASS
111   D[17]         144   A[8]          177   VSS_PLL
112   VSS           145   A[9]          178   N.C.
113   D[18]         146   A[10]         179   VDD_PLL
114   VCC33         147   VCC33         180   CLK
115   D[19]         148   A[11]         181   VCC33
116   D[20]         149   A[12]         182   PCI_AREQ*[3]
117   D[21]         150   A[13]         183   PCI_AGNT*[3]
118   D[22]         151   A[14]         184   PCI_AREQ*[2]
119   D[23]         152   A[15]         185   VSS
120   D[24]         153   A[16]         186   VDD18
121   VSS           154   VCC33         187   PCI_AGNT*[2]
122   VDD18         155   A[17]         188   PCI_AREQ*[1]
123   VCC33         156   A[18]         189   VCC33

Table 6.
AT697F MQFP256 pinout
pin   name          pin   name          pin   name
190   PCI_AGNT*[1]  213   A/D[12]       236   A/D[19]
191   PCI_AREQ*[0]  214   A/D[13]       237   SYSEN*
192   PCI_AGNT*[0]  215   A/D[14]       238   A/D[20]
193   A/D[0]        216   A/D[15]       239   VCC33
194   VCC33         217   VCC33         240   A/D[21]
195   A/D[1]        218   C/BE*[1]      241   A/D[22]
196   A/D[2]        219   PAR           242   A/D[23]
197   A/D[3]        220   SERR*         243   IDSEL
198   A/D[4]        221   PERR*         244   C/BE*[3]
199   VSS           222   VCC33         245   VCC33
200   VDD18         223   PCI_LOCK*     246   A/D[24]
201   VCC33         224   STOP*         247   A/D[25]
202   A/D[5]        225   DEVSEL*       248   A/D[26]
203   A/D[6]        226   TRDY*         249   VSS
204   A/D[7]        227   VCC33         250   VDD18
205   C/BE*[0]      228   IRDY*         251   A/D[27]
206   VSS           229   FRAME*        252   VCC33
207   VCC33         230   VSS           253   A/D[28]
208   A/D[8]        231   C/BE*[2]      254   A/D[29]
209   A/D[9]        232   A/D[16]       255   A/D[30]
210   A/D[10]       233   VCC33         256   A/D[31]
211   A/D[11]       234   A/D[17]
212   VCC33         235   A/D[18]

Notes:
1. 'Reserved' pins shall not be driven to any voltage
2. N.C. refers to unconnected pins

Pin Description

ATMEL Convention
'*' attached to a signal (e.g. OE*) designates an active-low signal. When a bit of a register is written in C-like style (e.g. MCFG2.RAMWWS), it must be read as the RAMWWS bit in the register MCFG2.

IU and FPU Signals

A[27:0] - Address bus (output)
The A[27:0] bus carries the addresses during accesses to external memory. When an access to cache memory is performed, the address of the last external memory access remains driven on the address bus.

D[31:0] - Data bus (bi-directional)
The D[31:0] bus carries the data during accesses to memory. The processor automatically configures the bus as an output and drives the lines during write transactions. During accesses to 8-bit areas, only D[31:24] are used.

CB[7:0] - Check bits (bi-directional)
The CB[6:0] bus carries the EDAC checkbits during memory accesses. CB[7] (see Note 1) takes the value of tcb[7] in the error control register. The processor only drives CB[7:0] during write transactions to areas programmed to be EDAC protected.

Note: 1.
CB[7] is implemented to enable programming of flash memories. While only 7 bits are useful for EDAC protection, 8 are needed for programming.

Memory Interface Signals

General management

OE* - Output enable (output)
This active low output is asserted during read transactions on the memory bus.

BRDY* - Bus ready (input)
When driven low, this input indicates to the processor that the current memory access can be terminated on the next rising clock edge. When driven high, this input indicates to the processor that it must wait and not end the current access.

READ - Read transaction (output)
This active high output is asserted during read transactions on the memory bus.

WRITE* - Write enable (output)
This active low output provides a write strobe during write transactions on the memory bus.

PROM

ROMS*[1:0] - PROM chip-select (output)
These active low outputs provide the chip-select signal for the PROM area. ROMS*[0] is asserted when the lower half of the PROM area is accessed (0 - 0x10000000), while ROMS*[1] is asserted for the upper half.

SRAM

RAMOE*[4:0] - RAM output enable (output)
These active low signals provide an individual output enable for each RAM bank.

RAMS*[4:0] - RAM chip-select (output)
These active low outputs provide the chip-select signals for each RAM bank.

RWE*[3:0] - RAM write enable (output)
These active low outputs provide individual write strobes for each byte. RWE*[0] controls D[31:24], RWE*[1] controls D[23:16], etc.

I/O

IOS* - I/O select (output)
This active low output is the chip-select signal for the memory mapped I/O area.

SDRAM Interface

SDCLK - SDRAM clock (output)
SDRAM clock provides the SDRAM interface clock reference.

SDCAS* - SDRAM column address strobe (output)
This active low signal provides a common CAS for all SDRAM devices.

SDCS*[1:0] - SDRAM chip select (output)
These active low outputs provide the chip select signals for the two SDRAM banks.
SDDQM[3:0] - SDRAM data mask (output)
These active low outputs provide the DQM signals for both SDRAM banks.

SDRAS* - SDRAM row address strobe (output)
This active low signal provides a common RAS for all SDRAM devices.

SDWE* - SDRAM write strobe (output)
This active low signal provides a common write strobe for all SDRAM devices.

System Signals

CLK - Processor clock (input)
The CLK input provides the main processor clock reference.

RESET* - Processor reset (input)
When asserted, this active low input resets the processor and all on-chip peripherals.

WDOG* - Watchdog time-out (open-drain output)
This active low output is asserted when the watchdog expires.

BEXC* - Bus exception (input)
This active low input is sampled simultaneously with the data during accesses on the memory bus. If asserted, a memory error will be generated.

ERROR* - Processor error (open-drain output)
This active low output is asserted when the processor has entered error state and is halted. This happens when traps are disabled and a synchronous (un-maskable) trap occurs.

PIO[15:0] - Parallel I/O port (bi-directional)
These bi-directional signals can be used as inputs or outputs to control external devices.

BYPASS - PLL bypass (input)
When driven to VCC, this active high input sets the PLL in bypass mode. The device is then directly clocked by the external clock. When grounded, the device is clocked through the PLL.

SKEW[1:0] - Clock tree skew (input)
These input signals configure the programmable skew on the triplicated clock trees.

LOCK - PLL lock (output)
This active high output is asserted when the PLL output (internal node) is locked at the frequency corresponding to four times the input command.

DSU Signals

DSUACT - DSU active (output)
This active high output is asserted when the processor is in debug mode and controlled by the DSU.
DSUBRE - DSU break enable (input)
A low-to-high transition on this active high input generates a break condition and puts the processor in debug mode.

DSUEN - DSU enable (input)
This active high input enables the DSU unit. If de-asserted, the DSU trace buffer will continue to operate but the processor will not enter debug mode.

DSURX - DSU receiver (input)
This active high input provides the data to the DSU communication link receiver.

DSUTX - DSU transmitter (output)
This active high output provides the output from the DSU communication link transmitter.

JTAG

TCK - Test Clock (input)
Used to clock serial data into boundary scan latches and control the sequence of the test state machine. TCK can be asynchronous with CLK.

TMS - Test Mode select (input)
Primary control signal for the state machine. Synchronous with TCK. A sequence of values on TMS adjusts the current state of the TAP.

TDI - Test data input (input)
Serial input data to the boundary scan latches. Synchronous with TCK.

TDO - Test data output (output)
Serial output data from the boundary scan latches. Synchronous with TCK.

TRST - Test Reset (input)
Resets the test state machine. Can be asynchronous with TCK. Shall be grounded for end application.

PCI Arbiter

AREQ*[3:0] - PCI bus request (input)
When asserted, these active low inputs indicate that a PCI agent is requesting the bus.

AGNT*[3:0] - PCI bus grant (output)
When asserted, these active low outputs indicate that a PCI agent is granted the PCI bus.

PCI interface

A/D[31:0] - PCI Address Data (bi-directional)
Address and Data are multiplexed on the same PCI pins. During the address phase, A/D[31::00] contain a physical address (32 bits). For I/O, this is a byte address; for configuration and memory, it is a DWORD address. During data phases, A/D[07::00] contain the least significant byte and A/D[31::24] contain the most significant byte.
C/BE[3:0]* - PCI Bus Command and Byte Enables (bi-directional)
During the address phase of a transaction, C/BE[3::0]* define the bus command. During the data phase, C/BE[3::0]* are used as Byte Enables. The Byte Enables are valid for the entire data phase.

PAR - Parity (bi-directional)
The number of "1"s on A/D[31::00], C/BE[3::0]*, and PAR equals an even number.

FRAME* - Cycle Frame (bi-directional)
FRAME* is driven by the current master to indicate the beginning and duration of an access. FRAME* is asserted to indicate a bus transaction is beginning. While FRAME* is asserted, data transfers continue. When FRAME* is deasserted, the transaction is in the final data phase or has completed.

IRDY* - Initiator Ready (bi-directional)
IRDY* indicates the initiating agent's ability to complete the current data phase of the transaction. IRDY* is used in conjunction with TRDY*. During a write, IRDY* indicates that valid data is present on A/D[31::00]. During a read, it indicates the master is prepared to accept data.

TRDY* - Target Ready (bi-directional)
TRDY* indicates the target agent's (selected device's) ability to complete the current data phase of the transaction. TRDY* is used in conjunction with IRDY*. During a read, TRDY* indicates that valid data is present on A/D[31::00]. During a write, it indicates the target is prepared to accept data.

STOP* - Stop (bi-directional)
STOP* indicates the current target is requesting the master to stop the current transaction.

PCI_LOCK* - Lock (bi-directional)
PCI_LOCK* indicates an atomic operation to a bridge that may require multiple transactions to complete.

IDSEL - Initialization Device Select (input)
Initialization Device Select is used as a chip select during configuration read and write transactions.

DEVSEL* - Device Select (bi-directional)
When actively driven, DEVSEL* indicates the driving device has decoded its address as the target of the current access. As an input, DEVSEL* indicates whether any device on the bus has been selected.
REQ* - PCI bus request (output)
REQ* indicates to the arbiter that this agent desires use of the bus. This is a point-to-point signal. Every master has its own REQ*, which must be tri-stated while RST* is asserted.

GNT* - PCI Bus Grant (input)
GNT* indicates to the agent that access to the bus has been granted. This is a point-to-point signal. Every master has its own GNT*, which must be ignored while RST* is asserted.

PCI_CLK - PCI clock (input)
PCI_CLK provides timing for all transactions on PCI. All other PCI signals, except RST*, are sampled on the rising edge of PCI_CLK and all other timing parameters are defined with respect to this edge.

RST* - PCI Reset (input)
Reset is used to bring PCI-specific registers, sequencers, and signals to a consistent state.

PERR* - Parity Error (bi-directional)
Parity Error is only for the reporting of data parity errors during all PCI transactions except a Special Cycle. The PERR* pin is sustained tri-state and must be driven active by the agent receiving data two clocks following the data when a data parity error is detected. The minimum duration of PERR* is one clock for each data phase in which a data parity error is detected.

SERR* - System Error (bi-directional)
System Error is for reporting address parity errors, data parity errors on the special cycle command, or any other system error where the result will be catastrophic. If an agent does not want a non-maskable interrupt (NMI) to be generated, a different reporting mechanism is required.

SYSEN* - PCI Host (input)
This active low input specifies the configuration of the device. At boot-up time, if SYSEN* is sampled at a low level, the device is configured as the host of the PCI bus. If SYSEN* is sampled at a high level, the device is configured as a satellite.

AT697F CPU Core

SPARC Architecture Overview
This section discusses the SPARC core architecture in general.
The main function of the CPU core is to ensure correct program execution. The CPU must therefore be able to access memories, perform calculations, control peripherals, and handle interrupts. The AT697F CPU core is based on the LEON2 architecture.

Figure 2. Block diagram of the AT697F Integer Unit architecture
[Figure: five-stage pipeline datapath (Fetch, Decode, Execute, Memory, Write) showing the I-cache and D-cache interfaces, the register file (rs1, rs2, rd), the alu/shift and mul/div units, and the y, tbr, wim and psr registers.]

The AT697F integer unit (IU) implements SPARC integer instructions as defined in SPARC Architecture Manual version 8. The IU is designed for highly dependable space and military applications by including fault tolerance features. To execute instructions at a rate approaching one instruction per clock cycle, the IU employs a five-stage instruction pipeline that permits parallel execution of multiple instructions.
  • Instruction Fetch: If the instruction cache is enabled, the instruction is fetched from the instruction cache. Otherwise, the fetch is forwarded to the memory controller. The instruction is valid at the end of this stage and is latched inside the IU.
  • Decode: The instruction is decoded and the operands are read. Operands may come from the register file or from internal data bypasses. CALL and Branch target addresses are generated in this stage.
  • Execute: ALU, logical, and shift operations are performed. For memory operations and for JMPL/RETT, the address is generated.
  • Memory: Data cache is accessed. For cache reads, the data will be valid by the end of this stage, at which point it is aligned as appropriate. Store data read out in the Execute stage is written to the data cache at this time.
  • Write: The results of any ALU, logical, shift, or cache read operations are written back to the register file.
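The overlap between the five stages listed above can be made concrete with a small model. The following Python sketch is not taken from the datasheet: it assumes an ideal pipeline (no stalls, cache misses, or multi-cycle instructions), and the names STAGES, pipeline_schedule and total_cycles are illustrative only.

```python
# Ideal five-stage pipeline model (assumption: no stalls or multi-cycle instructions).
STAGES = ["Fetch", "Decode", "Execute", "Memory", "Write"]


def pipeline_schedule(n_instructions):
    """Map each cycle (1-based) to the instruction index occupying each stage."""
    schedule = {}
    for instr in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            # Instruction `instr` enters Fetch on cycle instr + 1 and
            # advances by one stage per cycle.
            cycle = instr + offset + 1
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


def total_cycles(n_instructions):
    """Cycles needed to retire n single-cycle instructions in the ideal model."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1


if __name__ == "__main__":
    for cycle, stages in sorted(pipeline_schedule(6).items()):
        print(cycle, stages)
```

In this model the first instruction retires after five cycles, and each following instruction retires one cycle later, which is why a stream of single-cycle instructions approaches one instruction per clock.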
All five stages operate in parallel, working on up to five different instructions at a time. A basic 'single-cycle' instruction enters the pipeline and completes in five cycles. By the time it reaches the write stage, four more instructions have entered and are moving through the pipeline behind it. So, after the first five cycles, a single-cycle instruction exits the pipeline and a single-cycle instruction enters the pipeline on every cycle. Of course, a 'single-cycle' instruction actually takes five cycles to complete, but it is called single-cycle because with this type of instruction the processor can complete one instruction per cycle after the initial five-cycle delay.

In order to maximize performance and parallelism, the AT697F SPARC implementation uses a powerful AMBA bus. Instructions in the program memory are executed with a five-stage pipeline. While one instruction is being executed, the next instruction is pre-fetched from the program memory. This concept enables an instruction to be executed in every clock cycle.

Program Counters
Two 32-bit program counters (PC and nPC) are provided. The 32-bit PC contains the address of the instruction currently being executed by the IU. The nPC holds the address of the next instruction to be executed (assuming a trap does not occur). When a trap occurs, the PC address is saved in the local register l1 while the nPC address is saved in the local register l2. When returning from a trap, the l1 value is copied back to PC and the l2 value is copied back to nPC.

ALU - Arithmetic Logic Unit
The high-performance ALU operates in direct connection with all the 32 general purpose working registers. Within a single clock cycle, arithmetic operations between general purpose registers or between a register and an immediate memory address are executed. The implementation of the architecture also provides a powerful multiplier/divider supporting both signed and unsigned multiplication/division.
Support for high-performance 64-bit operation is also provided. The 32-bit Y register contains the most significant word of the double-precision product of an integer multiplication, as a result of either an integer multiply instruction or a routine that uses the integer multiply step instruction. The Y register also holds the most significant word of the double-precision dividend for an integer divide instruction.

Register File Windows
The fast-access register file contains 8 SPARC register windows. Each window consists of a 32-register set. When a program is running, it has access to 32 32-bit processor registers, which include 8 global registers plus 24 registers that belong to the current register window.
  • The first 8 registers in the window are called the 'in registers' (i0-i7). When a function is called, these registers may contain arguments that can be used.
  • The next 8 are the 'local registers' (l0-l7), which are scratch registers that can be used for anything while the function executes.
  • The last 8 registers are the 'out registers' (o0-o7), which the function uses to pass arguments to functions that it calls.

The AT697F register file implementation is based on two dual-port RAMs. The first dual-port RAM corresponds to the %rs1 operand of a SPARC instruction while the second corresponds to the %rs2 operand. The contents of the two dual-port RAMs are always equal.

When one function calls another, the called function can choose to execute a SAVE instruction. This instruction decrements an internal counter, the current window pointer (cwp), shifting the register window downward. The caller's out registers then become the called function's in registers, and the called function gets a new set of local and out registers for its own use. Only the pointer changes because the registers and return address do not need to be stored on a stack. The RESTORE instruction acts in the opposite way.

Figure 3.
Overlapping Windows
[Figure: the eight register windows W0 to W7 arranged in a ring around the globals, each with its ins, locals and outs; the outs of one window overlap the ins of the adjacent window, and the cwp moves one way on Save and the other way on Restore.]

The Window Invalid Mask register (WIM) is controlled by supervisor software and is used by hardware to determine whether a window overflow or underflow trap is to be generated by a SAVE, RESTORE, or RETT instruction. When a SAVE, RESTORE, or RETT instruction is executed, the current value of the CWP is compared against the WIM register. If the SAVE, RESTORE, or RETT instruction would cause the CWP to point to an "invalid" register set, a window_overflow or window_underflow trap is caused.

To prevent erroneous operations from SEU errors in the main register file, each word is protected with a 7-bit EDAC checksum. The EDAC checksums are checked when the register is used as an operand in an instruction. Any single-bit error is corrected and written back to the register file before the instruction is executed. If an un-correctable error is detected, a register hardware error trap (trap 0x20) is generated. The protection can be enabled/disabled by programming the asr16.di bit of the register file protection control register.

By setting the asr16.te bit, errors can be inserted in the register file to test the protection function. When the asr16.te bit is set, the register checksum is combined with the asr16.tcb field before being written to the register file. Due to the presence of the two dual-port RAMs in the register file implementation, the following rules apply to the error injection test process.
  • Test checkbits TCB[2:0] are Xored with checkbit[6:4] corresponding to the %rs1 operand.
  • Test checkbits TCB[5:3] are Xored with checkbit[6:4] corresponding to the %rs2 operand.

Here is a simple example for the test of a single error in register file %rs1:

    ! 0x32 =
    !   register file test enable
    !   tcb[2:0] = 0x4
    !   tcb[5:3] = 0x1
    mov 0x32, %l1
    mov %l1, %asr16
    ! clear %l3
    ! => write 0x0 to %l3
    ! forces 0x08 as checkbit for %l3 (error insertion in %rs1 dual-port ram)
    mov %g0, %l3
    ! disable EDAC test mode
    mov %g0, %asr16
    ! access to %l3 as %rs1 operand
    ! => single error detection and correction
    add %l3,%l2,%l1

A correction counter asr16.cnt is provided for error management. The asr16.cnt field is incremented each time a register correction is performed. It saturates at "111".

State Register
The State Register (PSR) contains information about the result of the most recently executed arithmetic instruction. This information can be used for altering program flow in order to perform conditional operations. Note that the State Register is updated after all ALU operations, as specified in the SPARC architecture specification. This will in many cases remove the need for using the dedicated compare instructions, resulting in faster and more compact code.

The State Register also provides some global information on the current window used, the authorized interrupts and peripheral (FPU and coprocessor) presence. Global interrupt management is provided through the processor state register. Traps and interrupts can be individually enabled/disabled from within this register.

Instruction Set
AT697F instructions fall into six functional categories: load/store, arithmetic/logical/shift, control transfer, read/write control register, floating-point, and miscellaneous. Please refer to the SPARC V8 Architecture manual, which presents all the implemented instructions.

Floating Point Unit
The FPU is designed to provide execution of single and double-precision floating-point instructions. During the execution of floating-point instructions the processor pipeline is held. The FPU is designed for highly dependable space and military applications, by including fault tolerance features like error detection and correction and triple modular redundancy.
The FPU depends upon the IU to access all addresses and control signals for memory access. Floating-point loads and stores are executed in conjunction with the IU, which provides addresses and control signals while the FPU supplies or stores the data. Instruction fetch for integer and floating-point instructions is provided by the IU.

The FPU contains 32 32-bit floating-point f registers, which are numbered from f[0] to f[31]. Unlike the windowed r registers, at a given time an instruction has access to any of the 32 f registers. The f registers can be read and written by FPop (FPop1/FPop2 format) instructions, and by load/store single/double floating-point instructions (LDF, LDDF, STF, STDF).

Rounding Direction
The rounding direction for floating-point results is defined according to the ANSI/IEEE Standard 754-1985:
  • 0 = round to nearest
  • 1 = round to zero
  • 2 = round to +infinity
  • 3 = round to -infinity

Figure 4. Rounding Direction Schematic
[Figure: number line from -∞ through 0 to +∞ showing, for values < 0 and values > 0, the direction of round to -∞, round to +∞ and round to zero.]

Fault Tolerance
The processor has been especially designed for space applications. To prevent erroneous operations from single event transient (SET) and single event upset (SEU) errors, the AT697F processor implements a set of protection features including:
  • Full triple modular redundancy (TMR) architecture
    The TMR architecture is based on a fully triplicated clock distribution (CLK1, CLK2 and CLK3). The PCI clock and the CPU clock are built as three-clock trees. The same triplication is applied to the PCI reset and to the CPU reset. See figure 5 for an overview of the TMR architecture.
    Programmable skews on the clock trees are also provided to protect the processor against arbitrary single-event transient errors. Refer to the 'clock' section for detailed information on TMR implementation and skew implementation.
• EDAC protection on the register file
• EDAC protection on the external memory interface
• Parity protection on the instruction and data caches

Figure 5. TMR structure - Clock triplication principle

Watch Points

The integer unit contains four hardware watchpoints that allow a trap to be generated on an arbitrary memory address range. Any binary-aligned address range can be watched (the two least significant bits are ignored).

Each watchpoint consists of a pair of application-specific registers:
• break address register
The break address defines a reference address for testing.
• mask register
The mask indicates which bits of the break address register are actually taken into account during the address test.

Configuration

A watchpoint is enabled by setting to logical one at least one of the three bits IF, DL or DS in the watchpoint address and mask registers. When all three bits are set to logical zero, the watchpoint is disabled.

If the instruction fetch bit (IF) of the watchpoint address register is set to logical one, any attempt to fetch an instruction from one of the addresses defined by ADDR and MASK generates a trap.

If the data store bit (DS) of the watchpoint address register is set to logical one, any attempt to store data to one of the addresses defined by ADDR and MASK generates a trap.

If the data load bit (DL) of the watchpoint mask register is set to logical one, any attempt to load data from one of the addresses defined by ADDR and MASK generates a trap.

Operation

To detect whether an address is part of the memory address range that traps, address bits 31 down to 2 are XORed with the break address (BADx BADDx). This operation is based on the following segmentation of an address.

Table 7. Address Segmentation

bit num.   31 ... 2    1 0
field      Address     ignored

With such segmentation, it is possible to define a trap segment from 4 bytes up to 1 Gbyte.
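As an illustration of the address test in this section (XOR with the break address, then AND with the mask, a zero result meaning a hit), a minimal C model could look like the sketch below. The function name `watchpoint_hit` and its parameters are hypothetical, the 30-bit address and mask fields are assumed to sit in bits 31:2 as in Table 7, and the IF/DL/DS enable bits are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of any AT697F library.
 * addr       : address being fetched, loaded or stored
 * break_addr : reference address (bits 31:2 are significant)
 * mask       : 1 = compare this address bit, 0 = ignore it (bits 31:2)
 */
bool watchpoint_hit(uint32_t addr, uint32_t break_addr, uint32_t mask)
{
    const uint32_t field = 0xFFFFFFFCu;           /* bits 1:0 are ignored    */
    uint32_t diff = (addr ^ break_addr) & field;  /* XOR with break address  */
    return (diff & mask) == 0u;                   /* AND with mask; 0 = hit  */
}
```

With a mask of 0xFFFFF000, for instance, this model reports a hit for every address in the 4-Kbyte range that contains the break address.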
The result of the XOR is then ANDed with the mask (BMAx BMAx). If the result is zero, the address is in the watched range: a watchpoint hit error is generated and trap 0x0B is taken. If the result is different from zero, the address is outside the watched address range.

Figure 6. Watchpoint Hit Principle (the 30-bit address is compared against ADDR of the watchpoint address register %asrx and MASK of the watchpoint mask register %asry, qualified by IF, DL and DS, to generate trap 0x0B)

Traps and Interrupts

Overview

The AT697F supports two types of traps:
• synchronous traps
• asynchronous traps, also called interrupts

Synchronous traps are caused by hardware responding to a particular instruction. They occur during the instruction that caused them.

Asynchronous traps occur when an external event interrupts the processor. They are not related to any particular instruction and occur between the execution of instructions.

A trap is a vectored transfer of control to the supervisor through a special trap table that contains the first four instructions of each trap handler. The base address of the table is held in the trap base register (TBR) and is established by the supervisor; the displacement within the table is determined by the trap type.

A trap causes the current window pointer to advance to the next register window and the hardware to write the program counters (PC and nPC) into two registers of the new window.

Synchronous Traps

The AT697F follows the general SPARC trap model. The table below shows the implemented traps and their individual priority.

Table 8.
Trap Overview

Trap                           TT (trap type)   Priority   Description
reset                          0x00             1          Power-on reset
write error                    0x2B             2          Write buffer error
instruction_access_exception   0x01             3          Error during instruction fetch; EDAC uncorrectable error during instruction fetch
illegal_instruction            0x02             5          UNIMP or other un-implemented instruction
privileged_instruction         0x03             4          Execution of privileged instruction in user mode
fp_disabled                    0x04             6          FP instruction while FPU disabled
cp_disabled                    0x24             6          Co-processor instruction while co-processor disabled
watchpoint_detected            0x0B             7          Instruction or data watchpoint match
window_overflow                0x05             8          SAVE into invalid window
window_underflow               0x06             8          RESTORE into invalid window
register_hardware_error        0x20             9          Register file uncorrectable EDAC error
mem_address_not_aligned        0x07             10         Memory access to un-aligned address
fp_exception                   0x08             11         FPU exception
data_access_exception          0x09             13         Access error during load or store instruction
tag_overflow                   0x0A             14         Tagged arithmetic overflow
divide_exception               0x2A             15         Divide by zero
trap_instruction               0x80 - 0xFF      16         Software trap instruction (TA)

Traps Description

• reset - A reset trap is caused by an external reset request. It causes the processor to begin executing at virtual address 0. After a reset trap, no special memory states are defined except the PSR ET and PSR S bits, which are initialized to '0' and '1' respectively.
• write error - An error exception occurred on a data store to memory.
• instruction_access_exception - A blocking error exception occurred on an instruction access.
• illegal_instruction - An attempt was made to execute an instruction with an unimplemented opcode, or an UNIMP instruction, or an instruction that would result in illegal processor state.
• privileged_instruction - An attempt was made to execute a privileged instruction while supervisor bit PSR S is '0' (not in supervisor mode).
• fp_disabled - An attempt was made to execute an FPU instruction while the FPU is not enabled or not present.
• cp_disabled - An attempt was made to execute a co-processor instruction while the co-processor is not enabled or not present.
• watchpoint_detected - An instruction fetch memory address or load/store data memory address matched the contents of a pre-loaded implementation-dependent "watchpoint" register.
• window_overflow - A SAVE instruction attempted to cause the current window pointer (CWP) to point to an invalid window in the WIM.
• window_underflow - A RESTORE or RETT instruction attempted to cause the current window pointer (CWP) to point to an invalid window in the WIM.
• register_hardware_error - An error exception occurred on a read only register access. A register file uncorrectable error was detected.
• mem_address_not_aligned - A load/store instruction would have generated a memory address that was not properly aligned according to the instruction, or a JMPL or RETT instruction would have generated a non-word-aligned address.
• fp_exception - An FPU instruction generated an IEEE_754_exception and its corresponding trap enable mask (TEM) bit was 1, or the FPU instruction was unimplemented, or the FPU instruction did not complete, or there was a sequence or hardware error in the FPU. The type of floating-point exception is encoded in the FSR FTT.
• data_access_exception - A blocking error exception occurred on a load/store data access (EDAC uncorrectable error).
• tag_overflow - A tagged arithmetic instruction was executed, and either arithmetic overflow occurred or at least one of the tag bits of the operands was non-zero.
• divide_exception - An integer divide instruction attempted to divide by zero.
• trap_instruction - A software trap instruction (Ticc) was executed and the trap condition evaluated to true.

When multiple synchronous traps occur in the same cycle (e.g. hardware errors), the highest priority trap is taken, and lower priority traps are ignored.
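Because each trap table entry holds four instructions (16 bytes), the entry point of a handler follows directly from the trap type. The sketch below computes it according to the SPARC V8 trap base register layout, where the trap type occupies bits 11:4 below the 4-Kbyte-aligned table base. The function name `trap_vector_address` and the sample base address are illustrative assumptions only.

```c
#include <stdint.h>

/* Entry address for trap type tt in a trap table located at table_base.
 * SPARC V8 forms the vector as base[31:12] | tt[11:4] | 0000,
 * i.e. 16 bytes (four instructions) per trap type. */
uint32_t trap_vector_address(uint32_t table_base, uint8_t tt)
{
    return (table_base & 0xFFFFF000u) | ((uint32_t)tt << 4);
}
```

With the table placed at 0x40000000, for example, the watchpoint trap (TT = 0x0B) would vector to 0x400000B0 and the write error trap (TT = 0x2B) to 0x400002B0.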
Asynchronous Traps / Interrupts

The AT697F handles up to 15 interrupts. The interrupt controller prioritizes interrupt requests from internal or external devices and propagates them to the integer unit.

Figure 7. Interrupt Controller Block Diagram (interrupt sources PIO[15:0] through the I/O interrupt registers IOIT1 and IOIT2, plus internal interrupts such as Timer1 and Uart1, feed the interrupt pending (ITP), force (ITF), clear (ITC) and mask & priority (ITMP) registers, followed by trap 0x1x generation)

Operation

When an interrupt is generated, the corresponding bit is set in the interrupt pending register (ITP). The pending bits are ANDed with the interrupt mask register and then forwarded to the priority selector. The highest unmasked pending interrupt on priority level 1 is forwarded to the IU. If no unmasked pending interrupt exists on priority level 1, the highest unmasked interrupt on priority level 0 is forwarded. When the IU acknowledges the interrupt, the corresponding pending bit is automatically cleared.

An interrupt can also be forced by setting a bit in the interrupt force register. In this case, the IU acknowledgement clears the force bit rather than the pending bit.

After reset, the interrupt mask register is set to all zeros while the remaining control registers are undefined.

Interrupt List

The following table presents the assignment of the interrupts.

Table 9.
Interrupt Overview

Interrupt   TT (Trap Type)   Source
15          0x1F             I/O interrupt [7]
14          0x1E             PCI
13          0x1D             I/O interrupt [6]
12          0x1C             I/O interrupt [5]
11          0x1B             DSU trace buffer
10          0x1A             I/O interrupt [4]
9           0x19             Timer 2
8           0x18             Timer 1
7           0x17             I/O interrupt [3]
6           0x16             I/O interrupt [2]
5           0x15             I/O interrupt [1]
4           0x14             I/O interrupt [0]
3           0x13             UART 1
2           0x12             UART 2
1           0x11             Internal bus error

Non Maskable Interrupt (NMI)

The AT697F handles interrupt 15 (trap type TT = 0x1F) as a non-maskable interrupt: it cannot be masked by the integer unit of the processor. As the NMI of the processor, it shall be used with care.

I/O interrupts

As an alternate function of the general purpose interface, the AT697F accepts interrupts from external devices. Up to eight external interrupts can be programmed at the same time. The external interrupts are assigned to interrupts 4, 5, 6, 7, 10, 12, 13 and 15.

Two registers are defined for configuration of the I/O interrupts:
• IOIT1 register is used for control of I/O interrupts 0, 1, 2 and 3
• IOIT2 register is used for control of I/O interrupts 4, 5, 6 and 7

Each I/O interrupt is controlled through four fields in one of the above registers (IOITx): ENx, LEx, PLx and ISELx.

An I/O interrupt is enabled by setting IOITx ENx to logical one. Setting this bit to logical zero disables the interrupt.

IOITx ISELx defines which port of the general purpose interface generates I/O interrupt x. The port can be selected from within PIO[15:0] and D[15:0]*.

Each I/O interrupt can have its trigger mode and its polarity individually configured.

When bit IOITx LEx is set to logical one, the corresponding I/O interrupt is edge triggered. If the polarity bit IOITx PLx is set to logical one, the interrupt triggers when a rising edge is applied on the pin. If the polarity bit is set to logical zero, the interrupt triggers when a falling edge is applied on the pin.
When bit IOITx LEx is set to logical zero, the corresponding I/O interrupt is level sensitive. If the polarity bit IOITx PLx is set to logical one, the interrupt triggers when a high level is applied on the pin. If the polarity bit is set to logical zero, the interrupt triggers when a low level is applied on the pin.

The following table summarizes the I/O interrupt configurations.

Table 10. I/O Interrupt Configuration

LEx   PLx   Trigger
0     0     low level
0     1     high level
1     0     falling edge
1     1     rising edge

Interrupt Priority

The 15 interrupts handled by the AT697F are prioritised, with interrupt 15 (TT = 0x1F) having the highest priority and interrupt 1 (TT = 0x11) the lowest.

The priority of an interrupt can be changed using the two priority levels of the interrupt mask and priority register (ITMP). Each interrupt is assigned to one of the two levels as programmed in that register. Level 1 has higher priority than level 0. Within each level the interrupts are prioritised by interrupt number.

Memory Interface

Overview

The AT697F provides a 32-bit bus capable of interfacing PROM, memory-mapped I/O devices, asynchronous static RAM (SRAM) and synchronous dynamic RAM (SDRAM). The memory bus can be configured for 8-bit, 32-bit or 40-bit accesses.

The memory controller manages up to 2 Gbytes of external memory. The following table presents the memory controller address map.

Table 11. Memory Controller address map

Address Range             Size   Mapping
0x00000000 - 0x1FFFFFFF   512M   PROM
0x20000000 - 0x2FFFFFFF   256M   I/O
0x40000000 - 0x7FFFFFFF   1G     SRAM/SDRAM

For applications that require smaller memory areas and/or lower performance, some memory spaces can be configured with an 8-bit wide data bus.

The memory interface is entirely configured through the three memory controller registers: MCFG1, MCFG2 and MCFG3.
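The address map of Table 11 can be expressed as a small decoder, which can be handy in a simulator or when sanity-checking a linker map. This is only a sketch: the enum and function names are hypothetical, and any address outside the three ranges of Table 11 is simply reported as not belonging to the memory controller.

```c
#include <stdint.h>

typedef enum { AREA_PROM, AREA_IO, AREA_RAM, AREA_OTHER } mem_area_t;

/* Classify an address according to the memory controller address map. */
mem_area_t mem_area(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu)
        return AREA_PROM;                              /* 512M PROM      */
    if (addr >= 0x20000000u && addr <= 0x2FFFFFFFu)
        return AREA_IO;                                /* 256M I/O       */
    if (addr >= 0x40000000u && addr <= 0x7FFFFFFFu)
        return AREA_RAM;                               /* 1G SRAM/SDRAM  */
    return AREA_OTHER;                                 /* not in Table 11 */
}
```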
MCFG1 is the register dedicated to PROM and I/O configuration. SRAM and SDRAM are configured through MCFG2 and MCFG3.

The following figure gives an overview of the 32-bit interconnection between the AT697F and external memories.

Figure 8. Memory Interface Overview (ROMS*[1:0], OE* and WRITE* to PROM; IOS* to I/O; RAMS*[4:0], RAMOE*[4:0] and RWE*[3:0] to SRAM; SDCLK, SDCSN[1:0], SDRAS*, SDCAS*, SDWE* and SDDQM[3:0] to SDRAM, with A[16:15] as bank address and A[14:2] as address; A[27:0] and D[31:0] shared by all devices)

To improve the bandwidth of the memory bus, accesses to consecutive addresses can be performed in burst mode. Burst transfers are generated when the memory controller is accessed using a burst request from the internal bus. These include instruction cache-line fills, double loads and double stores. The timing of a burst cycle is identical to the programmed basic cycle, except that during read transactions the lead-out cycle only occurs after the last transfer.

RAM Interface

The memory controller can control up to 1 Gbyte of RAM. The global RAM area supports two RAM types: asynchronous static RAM (SRAM) and synchronous dynamic RAM (SDRAM).

SRAM interface

Overview

The SRAM interface can manage up to five SRAM banks. SRAM accesses are controlled with a standard set of pins, including chip selects (RAMS*x), output enable (RAMOE*x) and write enable (RWE*x) lines.

The bank size of the first four banks of the SRAM area is configured by setting the value of MCFG2 RAMBS. The bank size can be programmed in binary steps from 8 Kbytes to 256 Mbytes. Whatever their size, the first four banks are always contiguous. These memory banks are selected with RAMS*[3] down to RAMS*[0].

The fifth SRAM bank, controlled by RAMS*[4], has a fixed size. This bank always resides at the upper address 0x60000000 and is always 256 Mbytes large.

Figure 9.
SRAM bank organisation

Start address of each bank for three SRAM bank sizes (address ranges not listed are unused):

Bank size   RAMS*[0]     RAMS*[1]     RAMS*[2]           RAMS*[3]           RAMS*[4](2)
256MB       0x40000000   0x50000000   never asserted(1)  never asserted(1)  0x60000000
128MB       0x40000000   0x48000000   0x50000000         0x58000000         0x60000000
64MB        0x40000000   0x44000000   0x48000000         0x4C000000         0x60000000

Notes:
1. If the SRAM bank size is set to 256 Mbytes, SRAM bank 2 and bank 3 overlap SRAM bank 4. In this case, the bank 2 and bank 3 control signals are never asserted. Bank 4 has priority.
2. When SDRAM is enabled, priority is given to the SDRAM. Any access to addresses higher than 0x60000000 is driven to SDRAM. No SRAM control signal is activated.

SRAM Read Access

A read access to SRAM consists of two data cycles and between zero and three waitstates. On non-consecutive accesses, a lead-out cycle is added after a read cycle to prevent bus contention due to the slow turn-off time of memories or I/O devices. On consecutive accesses, no lead-out cycle is performed between the accesses; a single one is performed at the end of the operations (RAMS* and RAMOE* are not deasserted in between).

When a read access to SRAM is performed, a separate output enable signal is provided for each SRAM bank and it is only asserted when that bank is selected.

Figure 10. SRAM read transaction (0-waitstate) (cycles: data1, data2, lead-out; signals: CLK, A, RAMS*, RAMOE*, D)

SRAM Write Access

Each byte lane has an individual write strobe (RWE*) to allow efficient byte and half-word writes.

Each write access to SRAM consists of three states and between zero and three waitstates. The three mandatory states are one write setup cycle, one data cycle and one lead-out cycle.

Figure 11.
SRAM write transaction (0-waitstate) (cycles: lead-in, data, lead-out; signals: CLK, A, RAMS*, RWE*, D)

If the external memory uses a common write strobe for the full 32-bit data, set MCFG2 RMW. This enables read-modify-write transactions for sub-word writes.

Waitstates

For applications using slow SRAM memories, the SRAM controller can insert wait-states during SRAM accesses. Two types of wait-states can be inserted:
• Programmed delay, available for bank 0 up to bank 4
• 'Hardware' bus delay, available for bank 4 only

Up to three waitstates can be programmed for SRAM accesses. Read and write waitstates can be individually programmed. The MCFG2 RAMRWS value defines the number of waitstates inserted during an SRAM read. The MCFG2 RAMWWS value defines the number of waitstates inserted during an SRAM write.

Figure 12. RAM read access with one programmed waitstate (cycles: data1, data2, waitstate, lead-out; signals: CLK, A, RAMS*, OE*, D)

For RAM bank 4 only, if the application needs more delay during the SRAM transfer, additional delay can be introduced by activating the hardware bus ready (BRDY*) detection in MCFG2. Refer to paragraph "BRDY Wait states", page 38.

Bus width

To support applications with low memory performance requirements, the SRAM area can be configured for 8-bit operations. SRAM is configured in 8-bit mode by programming MCFG2 RAMWDH, the SRAM bus width field.

When the SRAM bus is configured as an 8-bit wide bus, data lines 31 down to 24 shall be used as the interface.

Figure 13. SRAM 8-bit bus width connection (RAMS0*, RAMOE0* and RWE0* drive CS, OE and WE of the SRAM; A[27:0] and D[31:24] connect to its address and data pins)

Since access to memory is always done on a 32-bit word basis, a read access to 8-bit memory is transformed into a burst of four read transactions.
If EDAC protection is active, 5 read cycles are necessary to complete the access (refer to "Error Management - EDAC", page 40 for more details). During a write operation, only the necessary bytes are written.

Write Protection

Two write protection schemes are provided to prevent accidental over-writing of the RAM area: the "Start/End address Scheme" and the "Mask Scheme". These two schemes are explained in the following two sub-chapters.

Start/End address Scheme

Two memory areas are defined, each by a start-address and an end-address register. The first address of an area is calculated as 0x40000000 + START*4. The last address of an area is calculated as 0x40000000 + END*4, where END is the value of the STOP field.

Table 12. Start Address Register (WPSTAx)

Bit     31   30   29:2    1    0
Field   0    0    START   BP   0

Table 13. End Address Register (WPSTOx)

Bit     31   30   29:2    1    0
Field   0    0    STOP    US   SU

When WPSTAx BPx is set to logical one, any write access inside the two areas defined by the start/end registers causes a memory exception (trap 0x2B). The first write-protected address is 0x40000000 + START*4. The first address outside the protected memory area is 0x40000000 + END*4 + 4.

When WPSTAx BPx is set to logical zero, the area between the start address and the end address defines the memory where write access is permitted, and a write access outside both areas causes a memory exception (trap 0x2B). The first address where a write operation is permitted is 0x40000000 + START*4. The first address outside the permitted area is 0x40000000 + END*4 + 4.

The start/end address protection scheme is enabled when at least one of the user mode protection and the supervisor mode protection is valid.
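As a rough model of the start/end scheme for a single area, the sketch below derives the area bounds from the START and END field values and decides whether a write would trap. The function names are hypothetical, the 28-bit fields are passed as plain integers, and the interaction between the two areas and the US/SU qualifiers is deliberately not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_BASE 0x40000000u

/* First byte address of the area: 0x40000000 + START*4. */
uint32_t wp_first(uint32_t start) { return RAM_BASE + start * 4u; }

/* First byte address beyond the area: 0x40000000 + END*4 + 4. */
uint32_t wp_beyond(uint32_t end) { return RAM_BASE + end * 4u + 4u; }

/* bp = 1: writes inside the area trap.
 * bp = 0: writes outside the area trap. */
bool wp_write_traps(uint32_t addr, uint32_t start, uint32_t end, bool bp)
{
    bool inside = addr >= wp_first(start) && addr < wp_beyond(end);
    return bp ? inside : !inside;
}
```

With START = 0x100 and END = 0x1FF, for instance, the area covers 0x40000400 up to 0x400007FF in this model.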
The write protection can be configured to block user and/or supervisor write accesses:
• Memory is protected against user writes when the WPSTOx USx bit is set to logical 1
• Memory is protected against supervisor writes when the WPSTOx SUx bit is set to logical 1

Figure 14. RAM Protection Mode Overview (segment mode, bp = 0: writes trap outside Segment 1 [START1..END1] and Segment 2 [START2..END2]; block mode, bp = 1: writes trap inside the two segments)

Mask Scheme

Two block protection units are available for the RAM area. Each one is controlled through a write protection register (WPRn). Two major fields are defined, a TAG and a MASK:
• The TAG defines the 15 most significant bits of the address of the block to be write protected.
• The MASK specifies which bits of the TAG are actually relevant for the protection.

The write protection on the RAM area is enabled by setting WPRx EN to logical one. If this bit is set to logical zero, no protection is activated.

Two protection modes can be programmed. If WPRx BP is set to logical one, the protection is active within the segment. If this bit is set to logical zero, the exterior of the segment is protected.

Figure 15. RAM Protection Mode Overview (segment mode, bp = 0: writes trap outside Segment 1 and Segment 2; block mode, bp = 1: writes trap inside the two segments)

To detect whether the written address is part of a protected segment (or block), address bits 29 down to 15 are XORed with the WPRx TAG. This operation is based on the following segmentation of an address.

Table 14. Address Segmentation

bit num.   31 30   29 ... 15               14 ... 0
field      area    most significant bits   32Kbyte protected block

With such segmentation, a memory block in the range of 32 Kbytes up to 1 Gbyte can be protected.
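The mask scheme comparison can be sketched in C as follows: the 15 address bits above the 32-Kbyte block are XORed with TAG, the result is ANDed with MASK, and a zero result means the address falls in the segment. All names are hypothetical, and the TAG is assumed to line up with address bits 29:15 as Table 14 suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr falls in the segment described by tag/mask (15-bit fields). */
bool wpr_match(uint32_t addr, uint16_t tag, uint16_t mask)
{
    uint16_t addr_tag = (uint16_t)((addr >> 15) & 0x7FFFu);  /* bits 29:15 */
    return (((addr_tag ^ tag) & mask) & 0x7FFFu) == 0u;
}

/* en = 0: unit disabled, never traps.
 * bp = 1: writes inside the segment trap.
 * bp = 0: writes outside the segment trap. */
bool wpr_write_traps(uint32_t addr, uint16_t tag, uint16_t mask,
                     bool bp, bool en)
{
    if (!en)
        return false;
    bool inside = wpr_match(addr, tag, mask);
    return bp ? inside : !inside;
}
```

Clearing low-order MASK bits widens the segment: in this model a MASK of 0x7FFC with TAG 0 covers the first 128 Kbytes of the RAM area.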
The result of the XOR is then ANDed with WPRx MASK. If the result is zero, the address is in the protected range. If the result is different from zero, the address is outside the protected address range.

If a write protection error is detected, the write transaction is stopped. A memory access error is then generated and trap 0x2B is taken.

Figure 16. RAM Write Protection Overview (the 15 upper address bits are compared against TAG and MASK of the write protection register WPRn, qualified by BP and EN, to generate trap 0x2B)

Protection Priorities

The two RAM write protection schemes can be used simultaneously. Combining the write protection schemes leads to the following behaviors:
• If all the enabled protection units are configured in block protect mode (BP = 1), a write protect error is generated when any of the units signals a write protection hit. In this mode, if at least one protection error is triggered, the write protection trap is raised.
• If at least one of the protection units operates in segment mode (BP = 0), a write protect error is generated only if all units configured in segment mode signal a protection error.

SDRAM

The synchronous dynamic RAM interface can manage up to two SDRAM banks. SDRAM accesses are controlled with a standard set of pins, including chip selects (SDCS*x), write enable (SDWE*), data masks (SDDQM*x) and clock lines.

The bank size of the two SDRAM banks is configured by setting the value of MCFG2 SDRBS. The bank size can be programmed in binary steps from 4 Mbytes to 512 Mbytes.

The controller supports 64M, 256M and 512M devices with 8 to 12 column-address bits, up to 13 row-address bits, and 4 banks. Only a 32-bit data bus width is supported for SDRAM banks.

Address Mapping

The start address of the SDRAM banks depends on whether SRAM is used in the application.
If the SRAM disable bit MCFG2 SI and the SDRAM enable bit MCFG2 SE are both set to logical one, the SDRAM start address is 0x40000000.

If the SRAM disable bit MCFG2 SI is set to logical zero and the SDRAM enable bit MCFG2 SE is set to logical one, the SDRAM start address is 0x60000000.

If MCFG2 SE is set to logical zero, no SDRAM can be used.

The address bus of the SDRAMs shall be connected to A[14:2], and the bank address to A[16:15]. Devices with fewer than 13 address pins should only use the least significant bits of A[14:2].

Figure 17. SDRAM connection overview (SDCLK, SDCSN[1:0], SDRAS*, SDCAS*, SDWE* and SDDQM[3:0] drive CLK, CSN, RAS, CAS, WE and DQM of the SDRAM; A[16:15] connects to BA, A[14:2] to the address pins and D[31:0] to the data pins)

SDRAM Timing Parameters

To provide optimum access cycles for different SDRAM devices, some SDRAM parameters can be programmed through the MCFG2 register. The programmable SDRAM parameters are the following:

Table 15. SDRAM Programmable Timing Parameters

Function                      Parameter   Range        Unit
CAS latency                   -           2 - 3        clocks
Precharge to activate         tRP         2 - 3        clocks
Auto-refresh command period   tRFC        3 - 11       clocks
Auto-refresh interval         -           10 - 32768   clocks

SDRAM Commands

The SDRAM controller can issue three SDRAM commands. The command to be executed is programmed through MCFG2 SDRCMD. When this field is written with a non-zero value, an SDRAM command is issued:
• if set to '01', a Precharge command is sent,
• if set to '10', an Auto-Refresh command is sent,
• if set to '11', a Load Mode Reg (LMR) command is sent.

When the LMR command is issued, the CAS delay programmed in MCFG2 SDRCAS is used. MCFG2 SDRCMD is cleared after a command is executed. When changing the value of the CAS delay, a LOAD-MODE-REGISTER command should be generated at the same time.

The SDRAM controller also provides a refresh command. It is enabled by setting MCFG2 SDRREF to logical one.

The Auto-Refresh command enables a periodic refresh of both SDRAM banks. The period between two Auto-Refresh commands is programmed in MCFG3 SRCRV.
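The reload value for a target refresh interval follows from the datasheet relation refresh period = (reload value + 1) / sdclk frequency. The helper below is a hypothetical sketch (its name and units are assumptions): it works in nanoseconds and hertz with integer arithmetic and rounds down, so the programmed interval never exceeds the target.

```c
#include <stdint.h>

/* Reload value such that (reload + 1) / sdclk_hz <= period_ns.
 * Returns 0 if the requested period is shorter than one clock cycle. */
uint32_t sdram_refresh_reload(uint32_t period_ns, uint32_t sdclk_hz)
{
    uint64_t cycles = ((uint64_t)period_ns * sdclk_hz) / 1000000000u;
    return cycles > 0u ? (uint32_t)(cycles - 1u) : 0u;
}
```

For a 7.8 μs interval at 100 MHz this gives a reload value of 779, that is 780 clock cycles between refreshes.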
Depending on the SDRAM type, the required refresh period is typically 7.8 or 15.6 μs. This corresponds to 780 or 1560 clock cycles at 100 MHz. The refresh period is calculated as:

Refresh Period = (Reload value + 1) / sdclk frequency

SDRAM Initialisation

After reset, the SDRAM controller automatically performs the SDRAM initialisation sequence. It consists of a PRECHARGE, two AUTO-REFRESH cycles and a LOAD-MODE-REG on both banks simultaneously. The controller programs the SDRAM to use page burst on read and single location access on write. A CAS latency of 3 is programmed by default. This value can be updated later by software.

SDRAM Read Access

A read transaction consists of three main operations. First, an ACTIVATE command to the desired bank and row is performed. Then, after the programmed CAS delay, a READ command is sent. The read transaction is terminated with a PRE-CHARGE command. No bank is left open between two accesses. A burst read is performed if a burst access is requested on the internal bus.

SDRAM Write Access

A write transaction consists of three main operations. First, an ACTIVATE command to the desired bank and row is performed. Then, a WRITE command is sent. The write transaction is terminated with a PRE-CHARGE command. A burst write on the internal bus generates a burst of write commands without idle cycles in between.

Access Error

An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting MCFG1 BEXC to logical one, the BEXC* signal is sampled with the data. If the BEXC* signal is driven low by the external device during the access, an error response is generated on the internal bus.
• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data space access is in progress
• Trap 0x2B is taken if a data store is in progress

PROM Interface

Overview

The memory controller can control up to 512 Mbytes of PROM.

The PROM interface can manage up to two PROM banks. PROM accesses are controlled with a standard set of pins, including chip selects (ROMS*x), output enable (OE*), read (READ) and write (WRITE*) lines.

The bank size of the PROM banks is not programmable. The lower half of the PROM area (0x00000000 up to 0x0FFFFFFF) is controlled by the ROMS0* PROM select signal. The upper half of the PROM area (0x10000000 up to 0x1FFFFFFF) is controlled by the ROMS1* PROM select signal.

PROM Read Access

A read access to PROM consists of two data cycles and any programmed waitstates. On non-consecutive accesses, a lead-out cycle is added after a read transaction to prevent bus contention due to the slow turn-off time of memories or I/O devices. On consecutive accesses, no lead-out cycle is performed between the accesses; a single one is performed at the end of the operations.

Figure 18. PROM Read transaction (0 Waitstate) (cycles: data1, data2, lead-out; signals: CLK, A, ROMS*, OE*, D)

PROM Write Access

Each write access to PROM consists of three states and any programmed waitstates. The three mandatory states are one write setup cycle, one data cycle and one lead-out cycle. The write operation is strobed by the WRITE* signal.

Figure 19. PROM Write transaction (0 waitstate) (cycles: lead-in, data, lead-out; signals: CLK, A, ROMS*, WRITE*, D)

Waitstates

For applications using slow ROM memories, the ROM controller can insert wait-states during the accesses. Two types of wait-states can be inserted:
• Programmed delay
• 'Hardware' bus ready delay

Up to 30 waitstates can be programmed for PROM accesses.
Read and write waitstates can be individually programmed. MCFG1 PRRWS defines the number of waitstates inserted during a PROM read access. MCFG1 PRWWS defines the number of waitstates inserted during a PROM write.

MCFG1 PRRWS and MCFG1 PRWWS can be programmed to values from 0 up to 15. The effective number of waitstates applied during an access is twice the programmed value. Programming a value of two therefore inserts four wait cycles during the access.

Figure 20. ROM read access with PRRWS=1 (two programmed waitstates) (cycles: data1, data2, waitstate, waitstate, lead-out; signals: CLK, A, ROMS*, OE*, D)

If the application needs more time for a ROM transfer, additional delay can be introduced by activating the hardware bus ready bit MCFG1 PBRDY. Refer to paragraph "BRDY Wait states", page 38.

After a reset of the processor (or at power up), the MCFG1 PRRWS and MCFG1 PRWWS waitstates for the PROM area default to 15, resulting in 30 effective waitstates, and MCFG1 PBRDY is set to 0.

Write Protection

Write protection is provided to prevent accidental over-writing of the PROM area. It is controlled through the PROM write enable bit MCFG1 PRWE. When set to 1, this bit enables writes to PROM. When set to 0, no PROM write transaction is possible.

Bus width

To support applications with low memory and performance requirements, the PROM area can be configured for 8-bit operations. PROM is configured in 8-bit mode by programming MCFG1 PRWDH.

When the PROM bus is configured as an 8-bit wide bus, data lines 31 down to 24 shall be used as the interface.

Figure 21. PROM 8-bit bus width connection (ROMS0*, OE* and WRITE* drive CS, OE and WE of the PROM; A[27:0] and D[31:24] connect to its address and data pins)

Since access to memory is always done on a 32-bit word basis, a read access to 8-bit memory is transformed into a burst of four read transactions.
If EDAC protection is active, 5 read cycles are necessary to complete the access (refer to the protection section for more details). During a write operation, only the necessary bytes are written.

Access Error

An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting MCFG1 BEXC to logical one, the BEXC* signal is sampled with the data.
• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data space access is in progress
• Trap 0x2B is taken if a data store is in progress

Memory Mapped I/O

Overview

The memory controller can control up to 256 Mbytes of I/O.

The I/O area consists of a single large bank. I/O accesses are controlled with a standard set of pins, including chip select (IOS*), output enable (OE*), read (READ) and write (WRITE*) lines.

The size of the I/O bank is not programmable. The entire I/O area (0x20000000 up to 0x2FFFFFFF) is controlled by the IOS* select signal.

I/O Read Access

A read access to I/O consists of a lead-in cycle, two data cycles, any programmed waitstates and a lead-out cycle. On non-consecutive accesses, the lead-out cycle is used to prevent bus contention due to the slow turn-off time of memories or I/O devices. The I/O select signal (IOS*) is delayed by one clock to provide a stable address.

Figure 22. Single I/O read transaction with lead-out (cycles: lead-in, data 1, data 2, lead-out; signals: CLK, A, IOS*, OE*, D)

I/O Write Access

Each write access to I/O consists of three states and any programmed waitstates. The three mandatory states are one write setup cycle, one data cycle and one lead-out cycle. The write operation is strobed by the WRITE* signal.

Figure 23. I/O write transaction (cycles: lead-in, data, lead-out; signals: CLK, A, IOS*, WRITE*, D)

Waitstates

For applications using slow I/O devices, the I/O controller can insert waitstates during the accesses.
Two types of waitstates can be inserted:

• Programmed delay
• ‘Hardware’ delay

Up to 15 waitstates can be programmed for I/O accesses. Read and write waitstates are programmed together: MCFG1 IOWS defines the number of waitstates to insert during any access to or from the I/O area, and can take values from 0 to 15.

If the application needs more time for an I/O transfer, additional delay can be introduced by activating the hardware bus ready detection bit MCFG1 IOBRDY. Refer to paragraph “BRDY Wait states”, page 38.

Write Protection

Read and write protection is provided to prevent accidental accesses to the I/O area. Protection is controlled through the I/O protection bit MCFG1 IOP.

Bus width

To support applications with low memory and performance requirements, the I/O area can be configured for 8-bit operation. The 8-bit mode is selected by programming the I/O bus width in MCFG1 IOWDH. In this configuration, unlike the other memory areas, the I/O device is not accessed with multiple 8-bit accesses: only a single access is performed. When the I/O bus is configured as an 8-bit wide bus, data lines D[31:24] shall be used as the interface.

Figure 24. I/O 8-bit bus width connection
[Connection diagram: AT697F signals A[27:0], D[31:24], IOS*, OE*, WRITE*; I/O device pins A, D, CS, OE, WE]

Access Error

An access error can be indicated to the processor by asserting the BEXC* signal. If enabled by setting MCFG1 BEXC to logical one, the BEXC* signal is sampled with the data.

• Trap 0x01 is taken if an instruction fetch is in progress
• Trap 0x09 is taken if a data load is in progress
• Trap 0x2B is taken if a data store is in progress

BRDY Wait states

For PROM accesses, I/O accesses and RAM bank 4 (but not the other RAM banks), additional wait states determined by the peripherals can be introduced with the BRDY* mechanism.
This capability can be enabled separately by the respective configuration bits MCFG1 PBRDY, MCFG1 IOBRDY and MCFG2 RAMBRDY. If the configuration bit is set to one, the processor waits before ending the transfer for as long as the BRDY* pin is driven high. If the configuration bit is set to zero (reset state), the BRDY* pin is ignored.

The BRDY* induced wait states can be terminated in two different modes:

• If MCFG1 ABRDY is set to zero (reset state), BRDY* must be asserted low synchronously with respect to SDCLK, respecting the setup and hold times t19 and t20 (refer to section “AC Characteristics”, page 130). The processor terminates the access at the rising clock edge immediately following the rising edge at which BRDY* was low, by de-asserting OE* and the select signal (RAMS*[4], IOS* or ROMS*), as shown in the figures.
• If MCFG1 ABRDY is set to one, BRDY* is double synchronised in the processor and can be asserted asynchronously, without respecting t19 and t20, provided it is held low for at least 1.5 clock cycles. Asynchronous BRDY* timing implies an uncertainty: the access terminates at the second or third edge after assertion, and read data must be kept stable until OE* and the select signal (RAMS*[4], IOS* or ROMS*) are de-asserted.

Note that the BRDY* mechanism adds to the nominal duration of an access (one or two data cycles depending on the access type) and to the fixed wait states programmed in the “WS” fields (MCFG2 RAMWWS, MCFG1 PRWWS, MCFG1 IOWS). Even when BRDY* goes low earlier, the transaction does not terminate until the programmed wait states have expired.

Figure 25. Read access with one BRDY* controlled waitstate, MCFG1 AB=0
[Timing diagram: CLK, A, ROMS*, OE*, D, BRDY*; phases data1, data2, waitstate, lead-out]

Figure 26.
Read access with one BRDY* controlled waitstate, MCFG1 AB=1
[Timing diagram: CLK, A, ROMS*, OE*, D, BRDY*; phases data1, data2, waitstate, waitstate, lead-out]

Error Management - EDAC

Overview

The AT697F processor implements an on-chip error detector and corrector (EDAC). The EDAC can correct one error and detect two errors in a 32-bit word. The EDAC implementation corrects data on-the-fly, so that no timing penalty occurs during correction.

EDAC capability mapping

Data error management with the EDAC can be used on both the PROM and RAM memory areas. The following table presents the EDAC protection capabilities provided by the processor.

Table 16. EDAC capability on Memories

Address Range             Area   Bus width   EDAC Protected
0x00000000 - 0x1FFFFFFF   PROM   8 bits      yes
                                 32 bits     yes
0x20000000 - 0x3FFFFFFF   I/O    All         no
0x40000000 - 0x7FFFFFFF   RAM    8 bits      yes
                                 32 bits     yes

PROM protection

Setting the PROM EDAC enable bit MCFG3 PE to logical one enables data protection. For each read and write transaction to the PROM area, the EDAC acts as an error detector and an error corrector. When the bit is set to logical zero, the EDAC is transparent for PROM accesses.

At power-on or at reset, the value of MCFG3 PE is copied directly from the PIO2 pin. It is therefore possible to start the application with the EDAC enabled by driving PIO2 high during the power-on (or reset) sequence.

RAM protection

Setting the RAM EDAC enable bit MCFG3 RE to logical one enables data protection. For each read and write transaction to the RAM area, the EDAC acts as an error detector and an error corrector. When the bit is set to logical zero, the EDAC is transparent for RAM accesses.

The processor uses an EDAC based on a seven-bit Hamming code that detects any double error and corrects any single error on a 32-bit bus. Note that when the EDAC is enabled, the read-modify-write bit MCFG2 RMW must be set. For each 32-bit data word, a 7-bit checksum is generated.
Hamming code

The equations below show how the Hamming checkbits (CBx) are generated:

CB0 = D0 ^ D4 ^ D6 ^ D7 ^ D8 ^ D9 ^ D11 ^ D14 ^ D17 ^ D18 ^ D19 ^ D21 ^ D26 ^ D28 ^ D29 ^ D31
CB1 = D0 ^ D1 ^ D2 ^ D4 ^ D6 ^ D8 ^ D10 ^ D12 ^ D16 ^ D17 ^ D18 ^ D20 ^ D22 ^ D24 ^ D26 ^ D28
CB2 = D0 ^ D3 ^ D4 ^ D7 ^ D9 ^ D10 ^ D13 ^ D15 ^ D16 ^ D19 ^ D20 ^ D23 ^ D25 ^ D26 ^ D29 ^ D31
CB3 = D0 ^ D1 ^ D5 ^ D6 ^ D7 ^ D11 ^ D12 ^ D13 ^ D16 ^ D17 ^ D21 ^ D22 ^ D23 ^ D27 ^ D28 ^ D29
CB4 = D2 ^ D3 ^ D4 ^ D5 ^ D6 ^ D7 ^ D14 ^ D15 ^ D18 ^ D19 ^ D20 ^ D21 ^ D22 ^ D23 ^ D30 ^ D31
CB5 = D8 ^ D9 ^ D10 ^ D11 ^ D12 ^ D13 ^ D14 ^ D15 ^ D24 ^ D25 ^ D26 ^ D27 ^ D28 ^ D29 ^ D30 ^ D31
CB6 = D0 ^ D1 ^ D2 ^ D3 ^ D4 ^ D5 ^ D6 ^ D7 ^ D24 ^ D25 ^ D26 ^ D27 ^ D28 ^ D29 ^ D30 ^ D31

Operation

Write operation

When the processor performs a write operation to a memory protected by the EDAC, it also outputs the seven-bit checksum on the CB[6:0] pins.

Read operation

During a read operation from a protected memory, the seven-bit checksum is sampled from the CB[6:0] inputs. The EDAC then calculates its own checksum from the read data according to the checksum equations. A syndrome generator uses the calculated and the read checksums to determine whether the read word contains no error, one error or two errors.

Correctable error

If a single error is detected, this leads to a correctable error. The correction is done on-the-fly during the current access and induces no timing penalty, but the corrected data is not automatically written back to the memory.

The correctable error event is reported in the fail address register (FAILAR) and in the fail status register (FAILSR). If unmasked, interrupt 1 (trap 0x11) is generated. The interrupt can then be attached to a low priority interrupt handler that scrubs the failing memory location.
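The checkbit equations and the read-side check described above can be modelled in a few lines of Python. This is only an illustrative sketch: the names `CB_TERMS`, `checkbits` and `classify` are hypothetical, and the rule used to tell single from double errors (odd versus even number of set syndrome bits) is an assumption, since the datasheet does not detail the syndrome generator. The assumption is consistent with the equations as listed, where every data bit appears in an odd number of checkbit equations.

```python
# Data-bit indices feeding each checkbit, copied from the CB0..CB6 equations.
CB_TERMS = [
    [0, 4, 6, 7, 8, 9, 11, 14, 17, 18, 19, 21, 26, 28, 29, 31],      # CB0
    [0, 1, 2, 4, 6, 8, 10, 12, 16, 17, 18, 20, 22, 24, 26, 28],      # CB1
    [0, 3, 4, 7, 9, 10, 13, 15, 16, 19, 20, 23, 25, 26, 29, 31],     # CB2
    [0, 1, 5, 6, 7, 11, 12, 13, 16, 17, 21, 22, 23, 27, 28, 29],     # CB3
    [2, 3, 4, 5, 6, 7, 14, 15, 18, 19, 20, 21, 22, 23, 30, 31],      # CB4
    [8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31],  # CB5
    [0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31],        # CB6
]


def checkbits(word):
    """Return the 7-bit checksum (CB6..CB0) of a 32-bit data word."""
    cb = 0
    for k, terms in enumerate(CB_TERMS):
        parity = 0
        for bit in terms:
            parity ^= (word >> bit) & 1  # XOR of the listed data bits
        cb |= parity << k
    return cb


def classify(word_read, cb_read):
    """Compare the recomputed and the read checksums.

    Assumed rule: a zero syndrome means no error, an odd number of set
    syndrome bits means a single error, an even number a double error.
    """
    syndrome = checkbits(word_read) ^ cb_read
    if syndrome == 0:
        return "none"
    return "single" if bin(syndrome).count("1") % 2 else "double"


word = 0x12345678
stored_cb = checkbits(word)
print(classify(word, stored_cb))             # clean read
print(classify(word ^ (1 << 5), stored_cb))  # one data bit flipped
print(classify(word ^ 0b11, stored_cb))      # two data bits flipped
```

The model only detects and classifies errors; locating and correcting the failing bit, as the on-chip EDAC does, is not shown.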
Uncorrectable error

If a double error is detected, this leads to an uncorrectable error. An uncorrectable error detected during a data access leads to a data access exception (trap 0x09). If the double error is detected during an instruction fetch, it leads to an instruction access error (trap 0x01).

Figure 27. EDAC overview
[Block diagram: EDAC with Data Bus, Address Bus and CB[7:0]; Memory Configuration Reg. MCFG3; Fail Address Reg. FAILAR; Fail Status Reg. FAILSR; trap 0x01, trap 0x09, trap 0x11]

EDAC on 8-bit areas

The 8-bit mode applies to RAM and PROM, while SDRAM always uses 32-bit accesses. When a memory area is configured in 8-bit mode, the EDAC checkbit bus (CB[7:0]) is not used, but EDAC protection is still available. The data, mapped on D[31:24], is always accessed on a 32-bit word basis (4 bytes at a time). The corresponding checkbits are located at the top of the selected memory bank, at an address derived as follows:

• The address A[27:2] of the 32-bit data word is inverted
• The resulting address is shifted right twice to become a byte address
• The checkbits are written to the derived byte address while the chip select of the data address is kept active, so that the same memory area remains selected.

A word written as four bytes to addresses 0, 1, 2, 3 has its checkbits at address 0x0FFFFFFF, a word at addresses 4, 5, 6, 7 at 0x0FFFFFFE, and so on.

Here is an example of checkbit addressing:

• The data is written at address 0x00000004
• Inversion of this address leads to 0xFFFFFFFB
• Once shifted, we have 0xFFFFFFFE
• The checkbits are located at address 0xFFFFFFFE in the same memory bank as the data (0x0FFFFFFE on A[27:0]).

All the address bits up to the maximum bank size are inverted while the same chip select is always asserted. This way all bank sizes can be supported and no memory is left unused (except for a maximum of 4 bytes in the gap between the data and checkbit areas).

Here is an overview of the memory organization when EDAC is enabled on an 8-bit area.

Figure 28.
Memory Organization when EDAC enabled

Address                           Content
0x0FFFFFFF (memory top address)   checksum1
0x0FFFFFFE                        checksum2
...
0x00000007                        data2 byte3
0x00000006                        data2 byte2
0x00000005                        data2 byte1
0x00000004                        data2 byte0
0x00000003                        data1 byte3
0x00000002                        data1 byte2
0x00000001                        data1 byte1
0x00000000                        data1 byte0

checksum1 and checksum2 are the checksums corresponding to data1 and data2.

Note: In addition, only byte writes shall be performed to the ROM area when the EDAC is enabled. In this case, only the corresponding bytes are written.

EDAC testing

The operation of the EDAC can be tested through the MCFG3 memory configuration register.

Figure 29. EDAC testing overview
[Block diagram: Data Bus; Memory Configuration Reg. MCFG3 with WB, RB and 8-bit TCB; EDAC; CB[7:0]]

Write test

If the write bypass bit MCFG3 WB is set to logical one, the test checksum from the MCFG3 TCB field replaces the normal checkbits during memory write transactions.

Read test

During memory read transactions, if the read bypass bit MCFG3 RB is set to logical one, the memory checkbits of the loaded data are stored in the test checkbit field MCFG3 TCB.

Cache Memories

Overview

The AT697F processor implements a Harvard architecture with separate instruction and data buses, connected to two independent cache controllers. To improve the performance of the CPU core, multi-set caches are used for both the instruction and data caches.

The replacement policy used for both caches is the least recently used (LRU) algorithm: the least recently used set of the cache is replaced when new data needs to be stored in the cache.

Cache mapping

Most of the main memory areas can be cached. The cacheable areas are the PROM and RAM areas. The following table presents the caching capabilities of the processor.

Table 17.
Cache Capability List

Address Range             Area       Cache status
0x00000000 - 0x1FFFFFFF   PROM       Cached
0x20000000 - 0x3FFFFFFF   I/O        Non-cacheable
0x40000000 - 0x7FFFFFFF   RAM        Cached
0x80000000 - 0xFFFFFFFF   Internal   Non-cacheable

Operation

During normal operation, the processor accesses instructions and data using ASI 0x8 - 0xB as defined in the SPARC standard. Using the LDA/STA instructions, alternate address spaces such as the caches can be accessed. ASI[3:0] are used for the mapping, while ASI[7:4] have no influence on operation.

• Access with ASI 0 - 3 will force a cache miss, update the cache if the data was previously cached, or allocate a new line if the data was not in the cache and the address refers to a cacheable location.
• Access with ASI 4 and 7 will force a cache miss and update the cache if the data was previously cached.

The following table shows the ASI implementation on the AT697F.

Table 18. ASI Usage

ASI                  Usage
0x0, 0x1, 0x2, 0x3   Forced cache miss (replace if cacheable)
0x4, 0x7             Forced cache miss (update on hit)
0x5                  Flush instruction cache
0x6                  Flush data cache
0x8, 0x9, 0xA, 0xB   Normal cached access (replace if cacheable)
0xC                  Instruction cache tags
0xD                  Instruction cache data
0xE                  Data cache tags
0xF                  Data cache data

Note: Please refer to the SPARC V8 specification for detailed information on ASI usage.

Instruction Cache

Overview

The AT697F instruction cache is a multi-set cache of 32 kbyte divided in 4 memory sets. Using a multi-set cache improves the performance of the core. The instruction cache is divided into cache lines of 32 bytes of data. Each line has an associated cache tag consisting of a tag field and one valid bit per 4-byte sub-block. Instruction cache operation is controlled through the cache control register (CCR). On an instruction cache miss to a cacheable location, the instruction is fetched and the corresponding tag and data line are updated.

The instruction cache always works in one of three modes:

• disabled,
• enabled,
• or frozen.
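The figures quoted in the instruction cache overview imply a simple geometry, worked out in the sketch below. It is an illustration only: the helper name `cache_geometry` is hypothetical, and it assumes that 1 kbyte means 1024 bytes and that the cache is split evenly between its sets.

```python
def cache_geometry(total_bytes, sets, line_bytes, subblock_bytes=4):
    """Derive per-set size, line count and valid bits per line.

    Assumes the total size is divided evenly between the sets and that
    each line carries one valid bit per sub-block.
    """
    set_bytes = total_bytes // sets
    return {
        "set_bytes": set_bytes,
        "lines_per_set": set_bytes // line_bytes,
        "valid_bits_per_line": line_bytes // subblock_bytes,
    }


# Instruction cache: 32 kbyte, 4 sets, 32-byte lines, 4-byte sub-blocks.
icache = cache_geometry(32 * 1024, sets=4, line_bytes=32)
print(icache)
```

Under these assumptions each instruction cache set holds 8 kbyte organised as 256 lines, and each tag carries 8 valid bits.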
Cache Control

Operation

The current state of the instruction cache is reported in the instruction cache state field CCR ICS.

Disabled mode

If disabled, no cache operation is performed and load and store requests are passed directly to the memory controller.

Enabled mode

If enabled, the cache operates as described above.

Freeze mode

In the frozen state, the cache is accessed and kept synchronised with the main memory as if it were enabled, but no new lines are allocated on read misses.

If CCR IF is set to logical one, the instruction cache is frozen when an asynchronous interrupt is taken. This can be beneficial in real-time systems, as it allows a more accurate calculation of the worst-case execution time of a code segment. The execution of the interrupt handler does not evict any cache lines, and when control is returned to the interrupted task, the cache state is identical to what it was before the interrupt. If a cache has been frozen by an interrupt, it can only be enabled again by enabling the cache in the CCR. This is typically done at the end of the interrupt handler, before control is returned to the interrupted task.

Burst fetch

An instruction burst fetch mode can be enabled by setting CCR IB to logical one. If burst fetch is enabled, the cache line is filled from main memory starting at the missed address and continuing until the end of the line. At the same time, the instructions are forwarded to the IU. If the IU cannot accept the streamed instructions due to internal dependencies or a multi-cycle instruction, the IU is halted until the line fill is completed. If the IU executes a control transfer instruction during the line fill, the line fill is terminated on the next fetch. If instruction burst fetch is enabled, instruction streaming is enabled even when the cache is disabled. In this case, the fetched instructions are only forwarded to the IU and the cache is not updated.
Cache Flush

The instruction cache can be flushed by executing the FLUSH instruction, by setting CCR FI to logical one, or by writing any location with ASI=0x5. The flush operation takes one cycle per line, during which the IU is not halted but the cache is disabled. When the flush operation is completed, the cache resumes the state indicated in the cache control register.

Error reporting

If a memory access error occurs during a line fill with the IU halted, the corresponding valid bit in the cache tag is not set. If the IU later fetches an instruction from the failed address, a cache miss occurs, triggering a new access to the failed address. If the error remains, an instruction access error trap (tt=0x1) is generated.

Instruction Cache Parity

Error detection for cache tags and data is implemented using two parity bits per tag and per 4-byte data sub-block. The tag parity is generated from the tag value and the valid bits. The data parity is derived from the sub-block data. The parity bits are written simultaneously with the associated tag or sub-block and checked on each access. The two parity bits correspond to the parity of the odd and even data (tag) bits.

If a tag parity error is detected during a cache access, a cache miss is generated. The tag and the data are automatically updated. All valid bits except the one corresponding to the newly loaded data are cleared. Each error is reported in the instruction cache tag error counter of the CCR: CCR ITE is incremented after each instruction cache tag error detection.

If a data sub-block parity error occurs, a miss is also generated but only the failed sub-block is updated with data from main memory. Each error is reported in the instruction cache data error counter from the CCR.
The instruction cache data error counter CCR IDE is incremented after each instruction cache data error detection.

Data Cache

Overview

The AT697F data cache is a multi-set cache of 16 kbyte divided in 2 memory sets. Using a multi-set cache improves performance. The data cache is divided into cache lines of 16 bytes of data. Each line has an associated cache tag consisting of a tag field and one valid bit per 4-byte sub-block. Data cache operation is controlled through the cache control register (CCR).

Cache Control

Operation

Write

The write policy for stores is write-through with no-allocate on write-miss. The write buffer (WRB) consists of three 32-bit registers used to temporarily hold store data until it is sent to the destination device. For half-word or byte stores, the stored data is replicated into the proper byte alignment for writing to a word-addressed device before being loaded into one of the WRB registers. The WRB is emptied prior to a load-miss cache-fill sequence to prevent stale data from being read into the data cache.

Read

On a data cache read miss to a cacheable location, 4 bytes of data are loaded into the cache from main memory.

Cache Flush

The data cache can be flushed by executing the FLUSH instruction, by setting CCR FD to logical one in the cache control register, or by writing any location with ASI=0x6. The flush operation takes one cycle per line, during which the IU is not halted but the cache is disabled. When the flush operation is completed, the cache resumes the state indicated in the cache control register.

Error Reporting

Since the processor executes in parallel with the write buffer, a write error does not cause an exception on the store instruction itself. Depending on memory and cache activity, the write transaction may not occur until several clock cycles after the store instruction has completed. If a write error occurs, the currently executing instruction will take trap 0x2B.
Note: the 0x2B trap handler should flush the data cache, since a write hit would update the cache while the memory would keep the old value due to the write error.

If a memory access error occurs during a data load, the corresponding valid bit in the cache tag is not set, and a data access error trap (tt=0x09) is generated.

Data Cache Parity

Error detection for cache tags and data is implemented using two parity bits per tag and per 4-byte data sub-block. The tag parity is generated from the tag value and the valid bits. The data parity is derived from the sub-block data. The parity bits are written simultaneously with the associated tag or sub-block and checked on each access. The two parity bits correspond to the parity of the odd and even data (tag) bits.

If a tag parity error is detected during a cache access, a cache miss is generated. The tag and the data are automatically updated. All valid bits except the one corresponding to the newly loaded data are cleared. Each error is reported in the data cache tag error counter of the CCR: CCR DTE is incremented after each data cache tag error detection.

If a data sub-block parity error occurs, a miss is also generated but only the failed sub-block is updated with data from main memory. Each error is reported in the data cache data error counter of the CCR: CCR DDE is incremented after each data cache data error detection.

Data Cache Snooper

In addition to the cache controller, a snooper is implemented in the on-chip cache subsystem. The cache snooper is enabled by setting CCR DS to logical one. The snooper detects when a master on the internal bus accesses and modifies cached data. If a master modifies data in memory and this data is cached, the snooper invalidates the corresponding cache tag. The next time the IU accesses the modified data, a cache miss is generated because the tag is no longer valid.
Diagnostic Cache Access

Tags and data in the instruction and data caches can be accessed through ASI address spaces 0xC, 0xD, 0xE and 0xF by executing LDA and STA instructions. The address bits making up the cache offset are used to index the tag to be accessed, while the least significant bits of the address bits making up the address tag are used to index the cache set.

Diagnostic read of tags is possible by executing an LDA instruction with ASI=0xC for instruction cache tags and ASI=0xE for data cache tags. The cache line and the cache set are indexed by the address bits making up the cache offset and the least significant bits of the address bits making up the address tag. Similarly, the data sub-blocks may be read by executing an LDA instruction with ASI=0xD for instruction cache data and ASI=0xF for data cache data. The sub-block to be read in the indexed cache line and set is selected by A[4:2].

The tags can be directly written by executing an STA instruction with ASI=0xC for the instruction cache tags and ASI=0xE for the data cache tags. The cache line and cache set are indexed by the address bits making up the cache offset and the least significant bits of the address bits making up the address tag. D[31:10] is written into the ATAG field and the valid bits are written with D[7:0] of the write data. The data sub-blocks can be directly written by executing an STA instruction with ASI=0xD for the instruction cache data and ASI=0xF for the data cache data. The sub-block to be written in the indexed cache line and set is selected by A[4:2].

Note: Diagnostic access to the cache is not possible during a FLUSH operation and will cause a data exception (trap 0x09) if attempted.

Timer Unit

Prescaler

Timer/Counter 1, Timer/Counter 2 and the watchdog share the same prescaler. The prescaler consists of a 10-bit down counter clocked by the system clock. The prescaler is decremented on each clock cycle.
When the prescaler underflows, it is automatically reloaded with the content of the prescaler reload register, and a count tick is generated for the two timers and the watchdog. The effective division rate is equal to the prescaler reload register value + 1.

Figure 30. Prescaler Block Diagram
[Block diagram: Reload Reg. SCAR, Counter Reg. SCAC (=0x3FF), Control Logic, Data Bus, clock, load, count tick]

Note: The reset value for SCAR is 0. This is not a legal value; it is however equivalent to a value of 3 and leads to a division rate of 4.

Caution: The two timers and the watchdog share the same decrementer. The minimum allowed prescaler division factor is 4 (reload register = 3).

Timer/Counter 1 & Timer/Counter 2

Timer/Counter 1 and Timer/Counter 2 are two general purpose 32-bit timers. They share the same decrementer. The timer value is decremented each time the prescaler generates a count tick. Each timer is controlled through a dedicated Timer Control register (TIMCTR). A timer is enabled or disabled through the TIMCTRx ENx bit.

Each time a timer underflows, an interrupt is generated. These interrupts can be masked with the Interrupt Mask and Priority register (ITMP). When TIMCTRx RLx is set, the content of the reload register (TIMR) is automatically reloaded into the Timer Counter register (TIMC) after an underflow and the timer continues running. If the reload bit is cleared, the timer stops running after its first underflow. The Timer Counter can be forced to the Timer Reload value at any time by asserting the load bit TIMCTRx LDx in the Timer Control register.

Figure 31. Timer/Counter 1/2 Block Diagram
[Block diagram: Control Reg. TIMCTRn, Reload Reg. TIMRn, Counter Reg. TIMCn (=0xFFFFFFFF), Control Logic, Data Bus, count tick, load, enable/disable, timer interrupts (irq 8 & 9)]

Watchdog

The watchdog operates the same way as the timers, with the difference that it is always enabled and that, upon underflow, it asserts the external signal WDOG. This signal can be used to generate a system reset.
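The prescaler and timer relationships above can be turned into a small period calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a counter loaded with N underflows after N + 1 count ticks, by analogy with the "reload value + 1" rule stated for the prescaler.

```python
PRESCALER_MAX = 0x3FF  # the prescaler is a 10-bit down counter


def prescaler_division(reload):
    """Effective division rate for a given prescaler reload value."""
    if not 0 <= reload <= PRESCALER_MAX:
        raise ValueError("prescaler reload must fit in 10 bits")
    if reload == 0:
        return 4  # per the note: the reset value 0 behaves like 3
    if reload < 3:
        raise ValueError("minimum allowed division factor is 4 (reload = 3)")
    return reload + 1


def timer_period_cycles(prescaler_reload, timer_reload):
    """Clock cycles between two underflows of a 32-bit counter.

    Assumption: a counter loaded with N underflows after N + 1 ticks.
    """
    if not 0 <= timer_reload <= 0xFFFFFFFF:
        raise ValueError("timer reload must fit in 32 bits")
    return prescaler_division(prescaler_reload) * (timer_reload + 1)


def timer_period_seconds(prescaler_reload, timer_reload, clock_hz):
    """Same period expressed in seconds for a given system clock."""
    return timer_period_cycles(prescaler_reload, timer_reload) / clock_hz


# Example: 100 MHz clock, prescaler reload 99 (1 MHz tick), timer reload 999.
print(timer_period_cycles(99, 999))
print(timer_period_seconds(99, 999, 100_000_000))
```

With an all-ones 32-bit count and the default division rate of 4, the same formula gives 2^34 clock cycles.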
If the watchdog counter is refreshed by writing to the WDG register before the counter reaches zero, the counter restarts counting from the new value. If the counter is not refreshed before it reaches zero, the WDOG signal is asserted.

After reset, the watchdog is automatically enabled and starts running. The watchdog counter is reset to all ones. Together with the default prescaler ratio of 4, the time until the first expiration of the watchdog after reset is about 2^34 clock cycles.

Note: A read access returns the current count value of the watchdog; the reload value itself is not stored in the processor.

Figure 32. Watchdog Block Diagram
[Block diagram: Watchdog Reg. WDG (=0xFFFFFFFF), Control Logic, Data Bus, clock, WDOG]

General Purpose Interface

GPI as 32-bit I/O port

The general purpose interface (GPI) consists of a 32-bit wide I/O port with alternate facilities. The interface is based on bi-directional I/O ports. The port is split in two parts, with the lower 16 bits accessible through the parallel I/O pads and the upper 16 bits through the data bus.

lower 16-bits

The lower 16 bits of the general purpose interface are accessible through PIO[15:0]. All I/O ports have true read-modify-write functionality when used as general I/O ports. This means that the direction of one port pin can be changed without unintentionally changing the direction of any other pin. The same applies when changing the drive value of the port.

Figure 33. I/O port block diagram - PIO[15:0]
[Block diagram: IO Direction Reg. IODIR, IO Data Reg. IODAT, Data Bus, PIOx pin, clock]

configuring the pin

Each pin of PIO[15:0] is controlled by two register bits: IODIRx and IODATx. As shown in the “Register Description” section, the IODIRx bits are accessed at the IODIR address and the IODATx bits at the IODAT address.

The IODIR IODIRx bit selects the direction of port number x. If IODIRx is written logic one, the corresponding pin is configured as an output.
If written logic zero, the pin is configured as an input. When the pin is configured as an input, a read of the IODAT IODATx bit returns the current value of the pin. When the pin is configured as an output, writing a logical one to the IODAT IODATx bit drives port x high, and writing a logical zero drives port x low.

switching between input & output

When port x is switched from input to output by changing IODIRx, the value of IODATx is immediately driven on the corresponding pin. When it is switched from output to input by changing IODIRx, the value from the pin is immediately written to IODATx.

upper 16-bits

The upper 16 bits of the general purpose interface are accessible through D[15:0]. They can only be used when all memory areas (ROM, RAM and I/O) are 8 bits wide. If the SDRAM controller is enabled, the upper 16 bits cannot be used.

Figure 34. I/O port block diagram - D[15:0]
[Block diagram: IO Direction Reg. MEDDIR/LOWDIR, IO Data Reg. MEDDAT/LOWDAT, Data Bus, Dx pin, clock]

configuring the pin

The upper 16 bits of the general purpose interface can only be configured as outputs or inputs on a byte basis. D[15:8] is referred to as the medium byte and D[7:0] as the lower byte.

Each byte of D[15:0] is controlled by two register fields. As shown in the “Register Description” section, the direction fields are accessed at the IODIR address and the data fields at the IODAT address.

The IODIR MEDDIR bit and the IODIR LOWDIR bit select the direction of the medium byte (D[15:8]) and the lower byte (D[7:0]) respectively. If MEDDIR (or LOWDIR) is written logic one, the corresponding byte of D[15:0] is configured as an output. If written logic zero, the byte is configured as an input.

When configured as an input, a read of the IODAT MEDDAT field returns the current value of D[15:8]. When configured as an output, the logical value of the IODAT MEDDAT field is driven on the D[15:8] bus.
When configured as an input, a read of the IODAT LOWDAT field returns the current value of D[7:0]. When configured as an output, the logical value of the IODAT LOWDAT field is driven on the D[7:0] bus.

switching between input & output

When the medium byte (or the lower byte) is switched from input to output by changing MEDDIR (or LOWDIR), the value of MEDDAT (or LOWDAT) is immediately driven on the corresponding pins. When it is switched from output to input by changing MEDDIR (or LOWDIR), the value from the pins is immediately written to MEDDAT (or LOWDAT).

GPI Alternate functions

Most GPI pins have alternate functions in addition to being general I/O. Facilities such as serial communication links, interrupt inputs and configuration are made available through these functions. The following table summarizes the assignment of the alternate functions.

Table 19. GPI alternate functions

GPI port pin   Alternate function
PIO[15]        TXD1 - UART1 transmitter data
PIO[14]        RXD1 - UART1 receiver data
PIO[13]        RTS1 - UART1 request-to-send
PIO[12]        CTS1 - UART1 clear-to-send
PIO[11]        TXD2 - UART2 transmitter data
PIO[10]        RXD2 - UART2 receiver data
PIO[9]         RTS2 - UART2 request-to-send
PIO[8]         CTS2 - UART2 clear-to-send
PIO[3]         UART clock - Use as alternative UART clock
PIO[2]         EDAC enable - Enable EDAC checking at reset
PIO[1:0]       Prom width - Defines PROM bus width at reset

In addition to these alternate functions, each GPI interface pin can be configured as an interrupt input to catch interrupts from external devices. Up to four interrupts can be configured on the GPI interface by programming the I/O interrupt register (IOIT). For a detailed description of the external interrupt configuration, please refer to the “Traps and Interrupts” section.

PCI Arbiter

A PCI arbiter is embedded on the AT697 chip. The PCI arbiter enables the arbitration of 4 PCI agents numbered from 3 down to 0.
A round-robin algorithm is implemented as the arbitration policy. The PCI arbiter is totally independent from the PCI interface.

Operation

An agent on the PCI bus requests the bus by driving its REQ* line low. When the arbiter determines that the bus can be granted to an agent, it drives the corresponding GNT* line low. When the bus is granted to a PCI agent, the agent keeps the bus for only one transaction. If the agent needs more accesses, it shall continue to assert its REQ* line and wait to be granted the bus again.

Round Robin

The round robin algorithm used for the arbitration is based on loops with different priority levels. The implementation in the AT697 is based on two priority loops. The high priority loop is defined as level 0. The low priority loop is defined as level 1.

The arbitration is done by checking the REQ* lines of the PCI agents one after the other. The level 0 loop is checked first. If a REQ* is active and no master is currently granted the bus, the corresponding GNT* line is driven low and the agent is granted the bus. At each complete round in level 0, one step is done in level 1. The following figure illustrates the operation of the arbiter.

Figure 35. Arbiter operation, with agents 0 and 1 at level 0 and agents 2 and 3 at level 1
[Grant order over time: Agent 0, Agent 1, Agent 2, Agent 0, Agent 1, Agent 3, Agent 0, Agent 1, Agent 2]

If all agents have a request at the same time, the following probabilities of access are implemented:

• All agents in one level have equal probability
• All agents in level 1 together have the same probability of access as one agent in level 0
• If no agent is in level 0, or no agent in level 0 has a request, all agents in level 1 are granted with equal probability

Bus Parking

As long as no bus request is active on the arbiter, the bus is granted to the last owner. It remains granted to the last owner until another agent requests the bus.
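The two-level round robin described in this section can be illustrated with a short simulation. This is a simplified sketch rather than a model of the arbiter hardware: `grant_order` is a hypothetical helper, every listed agent is assumed to request the bus continuously, and bus parking and re-arbitration timing are ignored.

```python
from collections import Counter
from itertools import islice


def grant_order(level0, level1):
    """Yield the order of bus grants when all listed agents request at once.

    One complete round of the level 0 loop is followed by a single step
    in the level 1 loop, as described for the two priority loops.
    """
    step = 0
    while level0 or level1:
        for agent in level0:
            yield agent
        if level1:
            yield level1[step % len(level1)]
            step += 1


# Agents 0 and 1 at level 0, agents 2 and 3 at level 1.
grants = list(islice(grant_order([0, 1], [2, 3]), 12))
print(grants)
print(Counter(grants))
```

Over twelve grants this model gives agents 0 and 1 four grants each and agents 2 and 3 two grants each, so the two level 1 agents together receive the same share as one level 0 agent.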
When another request is asserted, re-arbitration occurs after one turnover cycle. After reset, the bus is parked on agent 0, which is the default owner.

Re-arbitration
When a master is managing a transfer and another one makes a request to the arbiter, re-arbitration occurs. Only one re-arbitration is performed during a transfer. A new arbitration takes place when the master which was granted the bus frees the bus. As long as all the PCI agents have no request pending, the arbitration is performed. A re-arbitration cycle also occurs when leaving the bus parking state.

Priority definition
Two different priority levels are defined for the PCI arbiter. Level 0 is the high priority level and level 1 is the low priority level. The priority level of each PCI agent is programmable through the arbiter configuration register (ACR). Each PCI agent can be individually configured to operate either on level 0 or on level 1, except agent 3, which is fixed by hardware at low priority (level 1). Setting PCIA Px to logical one places agent x at low priority. Setting this bit to logical zero places it at high priority. After reset, all the PCI agents are configured in the low priority loop.

PCI Interface
Overview
The PCI interface implementation is compliant with the PCI 2.2 specification. It is a high performance 32-bit bus interface with multiplexed address and data lines. It is intended for use as an interconnect mechanism between processor/memory systems and peripheral controller components. The AT697 processor embeds the In-Silicon PCI core. It is interfaced to the processor core through the PCI to AMBA bridge developed by the European Space Agency. The PCI bus operations can be clocked at a frequency up to 33MHz, independently of the processor clock. Synchronization between the PCI interface and the AT697 core relies on numerous FIFOs.
This implementation allows the device to be used for Initiator (Master) and Target operations. In each mode, single word and burst transfers can be executed. Two different operating modes can be used with the PCI interface:
• Host Bridge
The host-bridge connects the local bus of a processor to the PCI bus. Its PCI configuration registers are accessible locally by the processor, but not through PCI configuration cycles. The host-bridge initialises other satellite devices through PCI configuration commands.
• Satellite
The satellite is a PCI device, configurable via PCI configuration cycles and the idsel line, but not locally.

Both host-bridge and satellites can be initiator and/or target on the bus. The present interface has universal functionality, allowing both operation modes. The mode is configured via a hardware bootstrap on the SYSEN* pin. The state of the SYSEN* pin is copied into PCIIS SYS. This enables plug and play boot programs to load the appropriate driver depending on the hardware configuration. In the same manner, the configuration registers are made visible as read only when the device is configured as satellite.

Some other features are supported by this interface, such as:
• Target lock support
• Zero-latency Fast Back-to-Back transfers
• Zero wait state burst mode transfers
• Support for memory read line/multiple
• Support for memory write and invalidate commands
• Delayed read support
• Flexible error reporting by polling

The PCI bus is a multiplexed bus: address and data travel over the same lines. That is why PCI communication is based on two-phase burst transfers. Each transfer is composed of the following phases:
• An address phase
During the address phase, the initiator of the communication drives the 32-bit address concerned by the transfer and the command involved in this transfer. The command defines the address space concerned by the transfer and the direction of the transfer.
• A data phase
During the data phase, the initiator of the communication drives the byte enable signals so that only the active part of the bus is enabled. When reading, the initiator drives the byte enables and the target sets the data on the bus.

PCI Initiator (Master)
The PCI initiator mode of the AT697 gives a direct memory-mapped (initiator) access to the PCI bus. Any access to a memory address in the PCI address range is automatically translated by the interface into the appropriate PCI transaction. In this configuration, the PCI bus is accessed by the same instructions as the main memory. The SPARC instruction set provides various load/store instruction types. The PCI bus provides 32-bit wide transactions with byte enables for each byte lane.

Initiator Mapping
For standard operation, the PCI interface only works in a limited address range. The address range for such initiator transactions is limited to addresses between 0xA0000000 and 0xF0000000. PCI addresses outside of this predefined range can be accessed only via DMA transactions. Instructions of different widths (byte, half-word, word, double) can be performed for each address of the PCI address range. The three least significant bits of the address A[2:0] are used to determine which PCI byte enable lines C/BE*[3:0] should be active during the transaction. According to the SPARC architecture, big-endian mapping is implemented, with the most significant byte at the lower address (0x..00) and the least significant byte at the upper address (0x..03). A byte write to A[1:0] = 00 results in the byte enable pattern 0111, indicating that the most significant byte lane (bits 31:24) of the PCI data bus is selected. The following table presents the transaction widths authorized for PCI transfers.

Table 20.
Byte Enable Settings
Width  Assembler       C-datatype  A[2:0]=000    A[2:0]=100   A[2:0]=x01   A[2:0]=x10   A[2:0]=x11
8      ld[s/u]b, stb   char        0111          0111         1011         1101         1110
16     ld[s/u]h, sth   short       0011          0011         not aligned  1100         not aligned
32     ld, st          int         0000          0000         not aligned  not aligned  not aligned
64     ldd, std        long long   0000 (burst)  not aligned  not aligned  not aligned  not aligned
Note: PCI byte enables are active low. For non-aligned accesses, the byte enable pattern (1111) is issued on PCI, to avoid destroying data in the remote PCI target.

Memory cycles
Many memory transactions, such as memory read/write and memory read line/write and invalidate, can be issued from the processor with the common SPARC instruction set. The command to execute is selected through PCIIC COMMSB. Setting PCIIC COMMSB to logical ‘01’ generates a memory read/write access when a PCI address is accessed. A logical value of ‘11’ generates a memory read line or write and invalidate access on a PCI address access. For the memory commands, the address issued on the PCI bus is a word address with bits (1:0) set to 00. This indicates that the linear incrementing mode is used.

Operation
The following procedure shall be used to engage a memory transaction on the PCI interface:
1. Select the initiator mode by setting PCIIC MOD to logical one.
2. Select the memory load/store command or the memory read line/write and invalidate command in the PCI initiator configuration register. PCIIC COMMSB shall be set to logical ‘01’ for simple load/store operation and to logical ‘11’ for read line/write and invalidate.
3. Enabling the interrupt signalling is optional. It can be enabled by setting PCIITE IMIER to logical one. Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to a memory address mapped in the PCI address range.
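As an illustration of Table 20, the byte enable pattern that results from a given access width and address can be computed as shown below. This is a minimal sketch of the table only. The function name `pci_byte_enables` is hypothetical and the code models the documented patterns, not the interface logic itself.

```c
#include <stdint.h>

/* Returns the C/BE*[3:0] pattern (active low) listed in Table 20 for an
 * access of width_bits (8, 16, 32 or 64) at address addr.
 * Non-aligned accesses return 1111, so that no byte lane is enabled. */
uint8_t pci_byte_enables(unsigned width_bits, uint32_t addr)
{
    unsigned a = addr & 0x7u;    /* A[2:0] */

    switch (width_bits) {
    case 8:                      /* one lane; A[1:0] = 00 selects bits 31:24 */
        return (uint8_t)(0xFu & ~(0x8u >> (a & 3u)));
    case 16:                     /* upper or lower half-word */
        if (a & 1u)
            return 0xFu;
        return (uint8_t)(0xFu & ~(0xCu >> (a & 2u)));
    case 32:                     /* all four lanes, word aligned */
        return (a & 3u) ? 0xFu : 0x0u;
    case 64:                     /* burst, double-word aligned */
        return a ? 0xFu : 0x0u;
    default:                     /* width not listed in Table 20 */
        return 0xFu;
    }
}
```

For example, a byte store to an address ending in 0x3 yields 1110 (only bits 7:0 enabled), while a half-word access to an odd address yields 1111.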
I/O transaction cycles
Operation
The following procedure shall be used to engage an I/O transaction on the PCI interface:
1. Select the initiator mode by setting PCIIC MOD to logical one.
2. Select the I/O load/store command in the PCI initiator configuration register. PCIIC COMMSB shall be set to logical ‘00’ for I/O operation.
3. Enabling the interrupt signalling is optional. It can be enabled by setting PCIITE IMIER to logical one. Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to an I/O address mapped in the PCI address range.

Configuration cycles
Target selection
Accesses to a configuration address space require the target device to be selected. Due to the address range limitation, the chip-select (IDSEL) connection necessary for device selection shall be made using only A/D[27:16]. This allows up to 12 PCI devices to be connected on the bus. Devices with a chip-select line connected to A/D[31:28] cannot be configured through standard operations. DMA configuration cycles shall be used to configure the devices connected to A/D[31:28].

The PCI bus configuration cycles can be performed using the same instructions as the main memory. To generate such a configuration cycle with the standard instructions, PCIIC COMMSB shall be programmed to ‘10’. Then, if a load (or store) cycle is performed to an address in the PCI address range, a physical configuration cycle is performed on the PCI bus. The full 32-bit address defined on the internal bus is propagated on the PCI bus. Once a target is selected (DEVSEL* asserted).

Operation
The following procedure shall be used to engage a configuration cycle on the PCI interface:
1. Select the initiator mode by setting PCIIC MOD to logical one.
2. Select the configuration load/store command in the PCI initiator configuration register. PCIIC COMMSB shall be set to logical ‘10’ for configuration operation.
3. Enabling the interrupt signalling is optional.
It can be enabled by setting PCIITE IMIER to logical one. Up to four interrupt sources can be defined: Initiator Error, Initiator Parity Error, PCI core error and system error.
4. Engage an access to a configuration space.

Limitation
Configuration cycles shall only be generated by the PCI host of the bus or by a PCI-to-PCI bridge.

Special cycles
By default, all requests are translated into single cycle PCI transactions, each transaction consisting of an address phase followed by a single data phase.

Linear incrementing store-word
Linear incrementing store-word sequences are translated into PCI write bursts of undetermined length, up to a maximum of 255 words. The PCI burst mode is then maintained as long as possible. The read/write direction is unchanged and the address An+1 = An + 4. When the sequence is discontinued, the PCI burst stops with a last data phase during which the byte enables are 1111.

Fast back2back cycles
The PCI implementation only supports fast back-2-back cycles to the same target. Before using fast back-2-back transfers, fast back-2-back cycles shall be enabled by setting the bit COM9 in the status command register (PCISC) to logical one. PCISC COM9 shall only be set to one if all targets on the bus support fast back-2-back transfers. A fast back-2-back transfer is issued by setting PCIDMA B2B to logical one.
Note: Fast back-2-back can only be generated by the initiator. It is not accepted by the AT697 PCI target.

Error reporting
Fatal (abort) and address parity errors
On a fatal error (or address parity error), the interface flushes all the current buffer requests and all other buffer requests. Then, the interface reports the fatal error by driving PCIITE CMFER to logical one. The PCI core is restarted as soon as a new request is engaged.

DMA transfer
A DMA facility is available on the AT697 processor. The DMA transfers are performed through the PCI interface.
The DMA controller executes data transfers between the local memory and a remote target on the PCI bus. The processor core only intervenes to initiate the transfer. Once the transfer is initiated, the DMA controller is fully autonomous. DMA transfers take place in the background of the processor core activity. Interrupts are therefore provided to help synchronise the application with the start and end of the transfer. The DMA interface executes only word-size transactions with all 4 byte lanes enabled.

Operation
The DMA is enabled by setting PCIIC MOD to logical one. To synchronize the application with the start and the end of the transaction, two interrupts can be enabled: PCIITE DMAER for transfer control and PCIITE IMIER for error control. Each DMA sequence shall program the following parameters:
• PCI start address
• PCI command type
• number of words to be transferred
• the start address in the local memory

A DMA transfer is performed assuming the following operations are done in the given order:
1. Write the PCI start address of the burst to the PCI start address register (PCISA). The PCISA register shall be rewritten each time a DMA transfer is initiated, even if the address is identical to the address of the previous DMA request.
2. Write together the PCI command and the number of words to be transferred to the PCI DMA configuration register (PCIDMA). Writing to PCIDMA passes the PCI address, the word count and the PCI command to the PCI core and initiates the transaction on the PCI bus.
3. Write the start address in the local memory map to the PCI DMA address register (PCIDMAA).

Once the three operations are executed, the data transfer is started in the background. Once the specified number of words is transferred, the interface sets PCIITE DMAER to logical one and generates an interrupt if enabled. Then the DMA controller goes back to the idle state.
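The transfer limitation of this DMA interface (a transaction may not cross a 1 KByte border, that is, PCIDMAA(9:2) + PCIDMA(7:0) must be less than 256) can be checked in software before the values of steps 2 and 3 are written. The sketch below assumes only the two bit fields named in that rule. The function name `dma_within_1k` and the example addresses are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a DMA request respects the 1 KByte border rule:
 * PCIDMAA(9:2) + PCIDMA(7:0) < 256.
 * pcidmaa: value intended for the PCI DMA address register (local address)
 * pcidma : value intended for the PCI DMA configuration register; only the
 *          word count in bits 7:0 is examined here. */
bool dma_within_1k(uint32_t pcidmaa, uint32_t pcidma)
{
    uint32_t word_index = (pcidmaa >> 2) & 0xFFu;  /* PCIDMAA(9:2) */
    uint32_t word_count = pcidma & 0xFFu;          /* PCIDMA(7:0)  */

    return word_index + word_count < 256u;
}
```

For example, 16 words starting at a local address ending in 0x3F0 would cross the border (word index 252 plus 16 words), so an application might split such a request into two DMA sequences.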
Error Reporting
If the PCI core does not accept the DMA cycle request, the DMA state controller remains locked and an error is reported as an initiator error, with the PCIITE IMIER bit set to logical one. If the request to the PCI core was merely delayed, rewriting PCIDMAA may succeed. If the problem persists, reset the interface by writing -1 (0xFFFFFFFF) to PCIIC.

Transfer Limitation
A DMA transaction may never cross a 1 KByte border. The value represented by PCIDMAA(9:2) + PCIDMA(7:0) must be less than 256. If this restriction is not respected, the data transfer stops at the 1 KByte border. Then the PCI core is flushed. Simultaneously, in the PCI interrupt pending register (PCIITP), the DMA error bit PCIITE DMAER and the initiator error bit PCIITE IMIER are set to logical one. If enabled in the PCI interrupt enable register (PCIITE) and unmasked in the general interrupt mask register, the PCI interrupt 14 is generated (TT = 0x1E).

Debug Facilities
Not implemented for application use.

Target Mode Transfer
In the target mode, the PCI interface receives requests originating from remote PCI initiators (masters). Target data transfers are executed in the background without AT697 core intervention. The AT697 core can only intervene in the configuration of the target.
• In host bridge mode, the target is configured by the AT697 core.
• In satellite mode, the configuration is done by a remote device using the PCI command set.

Target Programming
The target is configured through the following registers:
• PCISC register
  bits 0/1 for memory and I/O command response
  bit 6 for check of data and address parity error
  bit 7 for response to data and address parity error
• base address registers
  memory base address: MEMBAR1, MEMBAR2
  I/O base address: IOBAR
• PCITPA register to indicate the storage location
• PCITSC FRTY bit to write data in memory

Transaction Ordering
As specified in the PCI standard, delayed read functionality is implemented, obeying the following rules:
• The interface stores one delayed read at a time. When a read request has been retried (because local data is not yet available), the interface remains locked for any other target read (targeting different addresses). The initiator of the original read has to repeat its request to the same address.
• A retried (delayed) read can be interrupted by one or more PCI write accesses. The PCI standard requires this write command to be processed first, to prevent a system lock-up. Meanwhile, the interface prefetches read data into the TXMT FIFO. After the (interfering) write, when the read request is repeated and the requested data is available in the FIFO, the delayed transfer completes normally.
• All target read accesses are generally prefetching, including reads with an I/O command. Once a start address is given, the interface prefetches up to 8 words into the TXMT FIFO. After the last required data word has been transferred to PCI, the PCI core automatically flushes the FIFO to discard the unused prefetched data.

The interface assumes the complete local address space to be ‘prefetchable’, defined here as the fact that reading from an address does not alter the data.
This behaviour is to be considered if non-prefetchable devices (for example the UARTs) are to be read through the PCI target.

PCI Error Reporting
According to the PCI standard, error and status bits are implemented in the PCI status register (PCISC). The PCI standard provides a single parity check, by which bus errors can be detected, but not corrected. Errors which occur in the PCI interface or on the PCI bus are also saved in status bits in the PCIITP register, and optionally, the PCI interrupt (IRQ14) is asserted. Different events can be selected to assert the interrupt. The interrupt enable register (PCIITE) selects the interrupt events which assert IRQ14. An interrupt handler can then read the interrupting event in the status register (PCIITP). Furthermore, interrupts can be forced for test purposes by writing to PCIITF. In host-bridge configuration, this allows error detection by polling.

Certain events and errors are also reported by the interface in the interrupt status register. For each bit of this register, interrupt generation can be programmed individually. All PCI interrupts generated are then reported to the AT697 core through the PCI interrupt (IT14). The different interrupt causes are distinguished by the interrupt status register settings. Please refer to the register description chapter for more details on the interrupt status register.

UARTs (UART1 and UART2)
Overview
The Universal Asynchronous Receiver and Transmitter (UART) is a highly flexible serial communication module. The AT697 implements two UARTs: UART1 and UART2. The UARTs on the processor are defined as alternate functions of the general purpose interface (GPI). The two UARTs provide double buffering. Each UART consists of a transmitter holding register, a receiver holding register, a transmitter shift register, and a receiver shift register. Each of these registers is 8 bits wide.
Figure 36. UART Block Diagram (blocks shown: UART scaler register UASCAn, UART control register UACn, UART status register UASn, baud-rate generator, control logic, receiver shift register, transmitter shift register, receiver holding register, transmitter holding register, UART data register UADn; signals: CTS, RTS, TX, RX, data bus)

Each UART is fully controlled by a set of four registers:
• a control register
• a status register
• a scaler register
• a data register

Serial Frame
Frame formats
A serial frame is defined as one character of data bits with synchronisation bits (start and stop bits), and optionally a parity bit for error checking. Two frame formats are accepted by the AT697 UARTs, the only difference being the presence or the absence of the parity bit. All frames are built on eight data bits. A frame starts with the synchronization start bit followed by the least significant data bit. The next data bits follow, up to a total of eight, ending with the most significant bit. If enabled by setting UACx PEx, the parity bit is inserted after the data bits and before the stop bit. The following figure illustrates the accepted frame formats.

Figure 37. Data frame format
Data frame, no parity:   Start D0 D1 D2 D3 D4 D5 D6 D7 Stop
Data frame with parity:  Start D0 D1 D2 D3 D4 D5 D6 D7 Parity Stop

Parity bit
The parity bit is calculated by doing an exclusive-or of all the data bits. Odd parity is configured by setting UACx PSx to logical one. In this case, the result of the exclusive-or is inverted. Even parity is selected by setting UACx PSx to logical zero. If used, the parity bit is located between the last data bit and the stop bit of the serial frame.
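The frame layout and parity rule above can be sketched in C as follows. The helper names `uart_parity_bit` and `uart_frame` are hypothetical and the code models only the bit sequence on the line, not the UART registers. It assumes the usual asynchronous line levels, with the start bit at 0 and the stop bit at 1.

```c
#include <stdint.h>

/* Parity bit for one character: exclusive-or of the eight data bits,
 * inverted when odd parity is selected (odd != 0). */
unsigned uart_parity_bit(uint8_t data, unsigned odd)
{
    unsigned p = 0u;

    for (unsigned i = 0u; i < 8u; i++)
        p ^= (data >> i) & 1u;
    return p ^ (odd ? 1u : 0u);
}

/* Builds one serial frame. Bit 0 of the result is the first bit on the line:
 * start bit (0), D0..D7, optional parity bit, stop bit (1).
 * The number of bits in the frame is returned through *nbits. */
uint16_t uart_frame(uint8_t data, int parity_enabled, unsigned odd,
                    unsigned *nbits)
{
    uint16_t frame = (uint16_t)((unsigned)data << 1); /* start bit + data */
    unsigned n = 9u;

    if (parity_enabled) {
        frame |= (uint16_t)(uart_parity_bit(data, odd) << n);
        n++;
    }
    frame |= (uint16_t)(1u << n);                     /* stop bit */
    *nbits = n + 1u;
    return frame;
}
```

A frame without parity is therefore 10 bits long and a frame with parity is 11 bits long.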
The relation between the parity bit and the data bits is as follows:

$$P_{even} = d_7 \oplus \dots \oplus d_3 \oplus d_2 \oplus d_1 \oplus d_0 \oplus 0$$
$$P_{odd} = d_7 \oplus \dots \oplus d_3 \oplus d_2 \oplus d_1 \oplus d_0 \oplus 1$$

where:
• $P_{even}$ : parity bit using even parity
• $P_{odd}$ : parity bit using odd parity
• $d_n$ : data bit n of the character

Clock Generation
The clock generation logic generates the base clock for the transmitters and receivers. The bit rate of the UART is derived by the clock generator from the input clock of the clock module and a scaler. Two clock inputs can be used by the clock generator:
• An internal clock
• An external clock

Uart Clock
Each UART can be configured to use either the internal or the external clock source by programming UACx ECx. If it is set to logical zero, the UART is clocked by the internal clock. If UACx ECx is set to logical one, the UART is clocked by the external clock. When using the external configuration, the UART clock shall be provided on PIO[3] of the general purpose interface. This clock input is an alternate function of PIO[3].
Caution: When using the external clock source, the frequency of PIO[3] must be less than half the frequency of the system clock.

Baud Rate Generation
To generate the bit rate, each UART has a programmable 12-bit clock divider (UASCAx). Depending on the configuration of UACx ECx, the scaler is clocked either by the system clock or by an external clock. Each time the scaler underflows, a UART tick is generated. The scaler is automatically reloaded with the value of the UART scaler register after each underflow. The resulting UART tick frequency should be 8 times the desired baud rate. The following equation shall be used to calculate the scaler value, depending on the clock source and the expected baud rate.
$$\mathrm{scaler} = \frac{\dfrac{\mathrm{uartclk} \times 10}{\mathrm{baudrate} \times 8} - 5}{10}$$

Variable description:
• uartclk : frequency of the UART clock
• baudrate : expected baud rate
• scaler : value to set in (UASCAx) to reach the expected baud rate

Communication Operations
Transmitter Operation
UART operations are controlled through the UART control registers (UACx) and the UART status registers (UASx). The transmitter is enabled by setting UACx TEx to logical one. When ready to transmit, data is transferred from the transmitter holding register to the transmitter shift register and converted to a serial frame on the transmitter serial output pin (TX). Following the transmission of the stop bit, if a new character is not available in the transmitter holding register, the transmitter serial data output remains high and the transmitter shift register empty bit UASx TSx is set. Transmission resumes and UASx TSx is cleared when a new character is loaded into the transmitter holding register. If the transmitter is disabled, it continues operating until the character currently being transmitted is completely sent out. The transmitter holding register cannot be loaded when the transmitter is disabled.

If flow control is enabled, the CTS input must be low in order for the character to be transmitted. If it is deasserted in the middle of a transmission, the character in the shift register is transmitted and the transmitter serial output then remains inactive until CTS is asserted again. If CTS is connected to a receiver's RTS, overrun can effectively be prevented.

Receiver Operation
The receiver is enabled for data reception when the receiver enable bit UACx REx is set to logical one. The receiver looks for a high to low transition of a start bit on the receiver serial data input pin.
If a transition is detected, the state of the serial input is sampled half a bit time later. If the serial input is sampled high, the start bit is invalid and the search for a valid start bit continues. If the serial input is still low, a valid start bit is assumed and the receiver continues to sample the serial input at one bit time intervals until the proper number of data bits and the parity bit have been assembled and one stop bit has been detected. During this process the least significant bit is received first. The serial input is sampled three times for each bit and averaged to filter out noise. The data is then transferred to the receiver holding register and the data ready bit UASx DRx is set to logical one. The parity, framing and overrun error bits are set at the received byte boundary, at the same time as the data ready bit is set. If both the receiver holding and shift registers contain an unread character when a new start bit is detected, the character held in the receiver shift register is lost and the overrun bit UASx OVx is set to logical one.

If flow control is enabled, RTS is negated (high) when a valid start bit is detected and the receiver holding register contains an unread character. When the holding register is read, RTS is automatically reasserted. A correctly received byte is indicated by the data ready bit UASx DRx. In case of error (framing error, stop bit error, ...), the respective bits UASx FEx, UASx PEx, ... are set to logical one while the data ready bit remains logical zero.

Interrupt Generation
The two UARTs can be configured to generate an interrupt each time a byte is received or a byte is sent. If UACx TIx is set to logical one, an interrupt is issued after each character is sent. If it is set to logical zero, no interrupt is issued when a character is sent. If UACx RIx is set to logical one, an interrupt is issued after each character reception.
If it is set to logical zero, no interrupt is issued after a character reception. If the receiver interrupt is enabled, an interrupt is also generated when an error is detected during the reception of a character. To identify the origin of the transaction failure, refer to the UART status register bits (UASx OVx, UASx PEx, UASx FEx), which indicate whether it is an overrun, a parity or a framing error.

Loop back mode
If UACx LBx is set, the UART is in loop back mode. In this mode, the transmitter output is internally connected to the receiver input and RTS is connected to CTS. It is then possible to perform loop back tests to verify the operation of the receiver, the transmitter and the associated software routines. In this mode, the outputs remain in the inactive state, in order to avoid sending out data.

Debug Support Unit - DSU
Overview
The AT697 processor includes a hardware debug support unit to aid software debugging on target hardware. The support is provided through two modules: a debug support unit (DSU) and a debug communication link (DCL). The DSU can put the processor in debug mode, allowing read/write access to all processor registers and cache memories. The DSU also contains a trace buffer which stores executed instructions or data transfers on the internal bus. The debug communication link implements a simple read/write protocol and uses standard asynchronous UART communications.

Figure 38. Debug Support Unit and Communication Link (blocks shown within the AT697 processor: trace buffer, debug support unit, I-cache, D-cache, SPARC V8 integer unit with debug interface, AHB interface on the AMBA AHB bus, debug communication link; pins: DSUEN, DSUBRE, DSUACT, DSUTX, DSURX)

It is possible to debug the processor through any master on the internal bus. The PCI interface is built in as a master on the internal bus. All debug features are available from any PCI master.

Debug Support Unit
The debug support unit is used to control the trace buffer and the processor debug mode.
The DSU master occupies a 2 Mbyte address space on the internal bus. Through this address space, any other master such as the PCI interface can access the processor registers and the contents of the trace buffer. The DSU control registers can be accessed at any time, while the processor registers and caches can only be accessed when the processor has entered debug mode. The trace buffer can be accessed only when tracing is disabled or completed. In debug mode, the processor pipeline is held and the processor is controlled by the DSU. Debug mode can be entered on the following events:
• executing a breakpoint instruction (ta 1)
• integer unit hardware breakpoint/watchpoint hit (trap 0x0B)
• rising edge of the external break signal (DSUBRE)
• setting the break-now bit DSUC BN
• a trap that would cause the processor to enter error mode
• occurrence of any, or a selection of, traps as defined in the DSU control register
• after a single-step operation
• DSU breakpoint hit

The debug mode can only be entered when the debug support unit is enabled through an external pin (DSUEN). Driving the DSUEN pin high enables the debug mode. When the debug mode is entered, the following actions are taken:
• PC and nPC are saved in temporary registers (accessible by the debug unit)
• an output signal (DSUACT) is asserted to indicate the debug state
• the timer unit is (optionally) stopped to freeze the AT697 timers and watchdog

The instruction that caused the processor to enter debug mode is not executed, and the processor state is kept unmodified. Execution is resumed by clearing DSUC BN or by deasserting DSUEN. The timer unit is then re-enabled and execution continues from the saved PC and nPC. Debug mode can also be entered after the processor has entered error mode, for instance when an application has terminated and halted the processor. The error mode can be reset and the processor restarted at any address.
DSU Breakpoint
The DSU contains two breakpoint registers for matching either internal bus addresses or executed processor instructions. A breakpoint hit is typically used to freeze the trace buffer, but can also put the processor in debug mode. The freeze operation can be delayed by programming DSUC DCNT to a non-zero value. In this case, the DSUC DCNT value is decremented for each additional trace until it reaches zero, after which the trace buffer is frozen. If the break on trace freeze bit DSUC BT is set to logical one, the DSU forces the processor into debug mode when the trace buffer is frozen.
Note: Due to pipeline delays, up to 4 additional instructions can be executed before the processor is placed in debug mode.

A mask register is associated with each breakpoint, allowing breaking on a block of addresses. Only address bits with the corresponding mask bit set to ‘1’ are compared during breakpoint detection.

Time Tag
The DSU implements a time tag counter. This counter is decremented each clock as long as the processor is running. The counter is stopped when the processor enters debug mode. It is restarted when execution is resumed. This time tag counter is stored in the trace as an execution time reference.

Trace Buffer
The trace buffer consists of a circular buffer that stores the executed instructions or the internal bus data transfers. The size of the trace buffer is 512 lines of 16 bytes. The trace buffer operation is controlled through the DSU control register (DSUC) and the trace buffer control register (TBC). When the processor enters debug mode, tracing is suspended. The trace buffer can contain the executed instructions, the transfers on the internal bus or both (mixed mode). The trace buffer control register (TBC) contains two counters, TBC BCNT and TBC ICNT, that store the address of the trace buffer location that will be written on the next trace. Since the buffer is circular, each counter actually points to the oldest entry in the buffer.
The indexes are automatically incremented after each stored trace entry.

Instruction trace
The instruction trace mode is enabled by setting the trace instruction enable bit TBC TI to logical one. During instruction tracing, one instruction is stored per line in the trace buffer, with the exception of multi-cycle instructions. Multi-cycle instructions can be entered two or three times in the trace buffer:
• For store instructions, bits [95:64] correspond to the store address on the first entry and to the stored data on the second entry (and third in case of STD). Bit 126 is set to logical one on the second and third entry to indicate this.
• A double load (LDD) is entered twice in the trace buffer, with bits [95:64] containing the loaded data.
• Multiply and divide instructions are entered twice, but only the last entry contains the result. Bit 126 is set for the second entry.
• For an FPU operation producing a double-precision result, the first entry puts the most significant 32 bits of the result in bits [95:64] while the second entry puts the least significant 32 bits in this field.

Table 21. Trace buffer data allocation, Instruction tracing mode
Bits     Name                        Definition
127      Instruction breakpoint hit  Set to ‘1’ if a DSU instruction breakpoint hit occurred.
126      Multi-cycle instruction     Set to ‘1’ on the second and third instance of a multi-cycle instruction (LDD, ST or FPOP)
125:96   DSU counter                 The value of the DSU counter
95:64    Load/Store parameters       Instruction result, Store address or Store data
63:34    Program counter             Program counter (2 lsb bits removed since they are always zero)
33       Instruction trap            Set to ‘1’ if the traced instruction trapped
32       Processor error mode        Set to ‘1’ if the traced instruction caused processor error mode
31:0     Opcode                      Instruction opcode

When a trace is frozen, interrupt 11 is generated.

Bus Trace
The bus trace mode is enabled by setting the corresponding enable bit TBC TA to logical one.
During bus tracing, one operation of the internal bus is stored per line in the trace buffer.

Table 22. Trace Buffer Data Allocation, Internal Bus Tracing Mode

Bits     Name                  Definition
127      AHB breakpoint hit    Set to ‘1’ if a DSU AHB breakpoint hit occurred.
126      -                     Unused
125:96   DSU counter           The value of the DSU counter
95:92    IRL                   Processor interrupt request input
91:88    PIL                   Processor interrupt level (psr.pil)
87:80    Trap type             Processor trap type (psr.tt)
79       Hwrite                AHB HWRITE
78:77    Htrans                AHB HTRANS
76:74    Hsize                 AHB HSIZE
73:71    Hburst                AHB HBURST
70:67    Hmaster               AHB HMASTER
66       Hmastlock             AHB HMASTLOCK
65:64    Hresp                 AHB HRESP
63:32    Load/Store data       AHB HRDATA or HWDATA
31:0     Load/Store address    AHB HADDR

Mixed Trace

In mixed mode, the buffer is divided into two halves, with instructions stored in the lower half and bus transfers in the upper half. The most significant bit of the AHB index counter is then automatically kept high, while the most significant bit of the instruction index counter is kept low.

DSU Memory Map

Table 23.
DSU Map

Address                    Register
0x800000c4                 DSU UART status register
0x800000c8                 DSU UART control register
0x800000cc                 DSU UART scaler register
0x90000000                 DSU control register
0x90000004                 Trace buffer control register
0x90000008                 Time tag counter
0x90000010                 AHB break address 1
0x90000014                 AHB mask 1
0x90000018                 AHB break address 2
0x9000001C                 AHB mask 2
0x90010000 - 0x90020000    Trace buffer
  ..0                      Trace bits 127 - 96
  ...4                     Trace bits 95 - 64
  ...8                     Trace bits 63 - 32
  ...C                     Trace bits 31 - 0
0x90020000 - 0x90040000    IU/FPU register file
0x90080000 - 0x90100000    IU special purpose registers
0x90080000                 Y register
0x90080004                 PSR register
0x90080008                 WIM register
0x9008000C                 TBR register
0x90080010                 PC register
0x90080014                 NPC register
0x90080018                 FSR register
0x9008001C                 DSU trap register
0x90080040 - 0x9008007C    ASR16 - ASR31 (when implemented)
0x90100000 - 0x90140000    Instruction cache tags
0x90140000 - 0x90180000    Instruction cache data
0x90180000 - 0x901C0000    Data cache tags
0x901C0000 - 0x90200000    Data cache data

The addresses of the IU/FPU registers are defined according to how many register windows have been implemented. The registers can be accessed at the following addresses (NWINDOWS = number of SPARC register windows = 8):
• %on: 0x90020000 + (((psr.cwp * 64) + 32 + n) mod (NWINDOWS*64))
• %ln: 0x90020000 + (((psr.cwp * 64) + 64 + n) mod (NWINDOWS*64))
• %in: 0x90020000 + (((psr.cwp * 64) + 96 + n) mod (NWINDOWS*64))
• %gn: 0x90020000 + (NWINDOWS*64) + 128
• %fn: 0x90020000 + (NWINDOWS*64)

Debug Operations

Instruction Breakpoints

To insert instruction breakpoints, the breakpoint instruction (ta 1) should be used. This leaves the four IU hardware breakpoints free to be used as data watchpoints. Since cache snooping is only done on the data cache, the instruction cache must be flushed after the insertion or removal of breakpoints. To minimize the influence on execution, it is enough to clear the corresponding instruction cache tag (which is accessible through the DSU).
The DSU hardware breakpoints should only be used to freeze the trace buffer, and not for software debugging, since there is a 4-cycle delay from the breakpoint hit before the processor enters debug mode.

Single Stepping

By writing the TBC SS bit and resetting the TBC BN bit, the processor resumes execution for one instruction and then automatically enters debug mode.

DSU Trap

The DSU trap register (DTR) is a read-only register that indicates which SPARC trap type caused the processor to enter debug mode. When debug mode is forced by setting the TBC BN bit, the trap type is 0x0B.

DSU Communication Link

The DSU communication link consists of a UART connected to the internal bus as a master.

Figure 39. DSU Communication Link Block Diagram
(Figure labels: baud-rate generator, 8*bitclk, serial port controller, AMBA APB, DSURX, receiver shift register, transmitter shift register, DSUTX, AHB master interface, AHB data/response, AMBA AHB.)

A simple communication protocol is supported to transmit access parameters and data. A link command consists of a control byte, followed by a 32-bit address, followed by optional write data. If the TBC LR bit is set, a response byte is sent after each AHB transfer. If the TBC LR bit is not set, a write access does not return any response, while a read access only returns the read data.

Data Frame

Data is sent on an 8-bit basis.

Figure 40. DSU UART Data Frame
Start | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | Stop

Commands

Through the communication link, a read or write transfer can be generated to any address on the internal bus. A response byte can optionally be sent when the processor goes from execution mode to debug mode. Block transfers can be performed by setting the length field to n-1, where n denotes the number of transferred words. For write accesses, the control byte and address are sent once, followed by the number of data words to be written. The address is automatically incremented after each data word.
For read accesses, the control byte and address are sent once and the corresponding number of data words is returned.

Figure 41. DSU Commands

DSU write command
  Send:     11 | Length-1, Addr[31:24], Addr[23:16], Addr[15:8], Addr[7:0], Data[31:24], Data[23:16], Data[15:8], Data[7:0]
  Receive:  Resp. byte (optional)

DSU read command
  Send:     10 | Length-1, Addr[31:24], Addr[23:16], Addr[15:8], Addr[7:0]
  Receive:  Data[31:24], Data[23:16], Data[15:8], Data[7:0], Resp. byte (optional)

Response byte encoding
  bit 7:3 = 00000
  bit 2   = DMODE
  bit 1:0 = HRESP

Clock Generation

The UART contains a 14-bit down-counting scaler to generate the desired baud rate. The scaler is clocked by the system clock and generates a UART tick each time it underflows. The scaler is reloaded with the value of the UART scaler reload register after each underflow. The resulting UART tick frequency should be 8 times the desired baud rate.

If not programmed by software, the baud rate is discovered automatically. This is done by searching for the shortest period between two falling edges of the received data (corresponding to two bit periods). When three identical two-bit periods have been found, the corresponding scaler reload value is latched into the reload register, and the DSUUC BL bit is set. If the DSUUC BL bit is reset by software, the baud-rate discovery process is restarted. The baud-rate discovery is also restarted when a ‘break’ is received by the receiver, allowing the baud rate to be changed from the external transmitter. For proper baud-rate detection, the value 0x55 should be transmitted to the receiver after reset or after sending a break.

The best scaler value for manually programming the baud rate can be calculated as follows:

  scaler = (((sdclk frequency x 10) / (baudrate x 8)) - 5) / 10

Booting from DSU

By asserting DSUEN and DSUBRE at reset time, the processor directly enters debug mode without executing any instructions.
The system can then be initialised from the communication link, and applications can be downloaded and debugged. Additionally, external (flash) PROMs for standalone booting can be re-programmed.

JTAG Interface

Overview

The AT697 implements a standard interface compliant with the IEEE 1149.1 JTAG specification. This interface can be used for PCB testing using the JTAG boundary-scan capability. The JTAG interface is accessed through five dedicated pins. In JTAG terminology, these pins constitute the Test Access Port (TAP). The following table summarizes the TAP pins and their function at JTAG level.

Table 24. TAP Pins

Pin    Name               Type     Description
TCK    Test Clock         Input    Used to clock serial data into the boundary scan latches and to control the sequence of the test state machine. TCK can be asynchronous with CLK.
TMS    Test Mode Select   Input    Primary control signal for the state machine. Synchronous with TCK. A sequence of values on TMS adjusts the current state of the TAP.
TDI    Test Data Input    Input    Serial input data to the boundary scan latches. Synchronous with TCK.
TDO    Test Data Output   Output   Serial output data from the boundary scan latches. Synchronous with TCK.
TRST   Test Reset         Input    Resets the test state machine. Can be asynchronous with TCK.

For more details, please refer to the ‘IEEE Standard Test Access Port and Boundary Scan’ specification.

Any AT697 based system will contain several JTAG compatible chips. These are connected using the minimum (single TMS signal) configuration. This configuration contains three broadcast signals (TMS, TCK and TRST), which are fed from the JTAG master to all JTAG slaves in parallel, and a serial path formed by a daisy-chain connection of the serial test data pins (TDI and TDO) of all slaves. The TAP supports a BYPASS instruction which places a minimum shift path (1 bit) between the chip’s TDI and TDO pins.
This allows efficient access to any single chip in the daisy-chain without board-level multiplexing.

Figure 42. JTAG Serial Connection Using 1 TMS Signal
(Figure labels: Part 1, Part 2, Part 3 ... Part n, each with TDI, TDO, TMS, TCK and TRST pins.)

TAP Architecture

The TAP implemented in the AT697 consists of a TAP interface, a TAP controller, plus a number of shift registers including an instruction register (IR) and some test data registers.

Figure 43. AT697 TAP Architecture
(Figure labels: TDI, TDO, TMS, TCK, TRST, boundary scan register, device ID register, bypass register, design-specific data, test data registers, TAP controller with Clock DR, Shift DR, Update DR, Reset, Clock IR, Shift IR and Update IR outputs, instruction register, instruction decode.)

TAP Controller

The TAP controller is a synchronous finite state machine (FSM) which controls the sequence of operations of the JTAG test circuitry in response to changes at the JTAG bus (specifically, in response to changes at the TMS input with respect to the TCK input). The TAP controller FSM implements the 16-state diagram shown in Figure 44.

The IR is a 3-bit register which allows a test instruction to be shifted into the AT697. The instruction selects the test to be performed and the test data register to be accessed. Although any number of loops may be supported by the TAP, the finite state machine in the TAP controller only distinguishes between the IR and a DR. The specific DR can be decoded from the instruction in the IR.

Figure 44. TAP - State Machine
(States shown include Test Logic Reset, Run Test/Idle, Select DR Scan, Select IR Scan, Capture DR[1], Capture IR, Shift DR and Shift IR.)
Transitions between states are controlled by the TMS input value.
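As an illustration of how TMS alone steers the controller, the state machine can be written as a lookup table. The sketch below is not taken from the AT697 documentation: the transition arcs follow the standard IEEE 1149.1 TAP controller diagram, and the identifiers (`tap_state_t`, `tap_step`, `tap_walk`) are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* The 16 TAP controller states (names are illustrative). */
typedef enum {
    TEST_LOGIC_RESET, RUN_TEST_IDLE,
    SELECT_DR_SCAN, CAPTURE_DR, SHIFT_DR, EXIT1_DR, PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SELECT_IR_SCAN, CAPTURE_IR, SHIFT_IR, EXIT1_IR, PAUSE_IR, EXIT2_IR, UPDATE_IR,
    TAP_STATE_COUNT
} tap_state_t;

/* tap_next[state][tms]: standard IEEE 1149.1 transitions (assumed, not
 * read from the extracted figure). Column 0 is TMS = 0, column 1 is TMS = 1. */
static const tap_state_t tap_next[TAP_STATE_COUNT][2] = {
    [TEST_LOGIC_RESET] = { RUN_TEST_IDLE, TEST_LOGIC_RESET },
    [RUN_TEST_IDLE]    = { RUN_TEST_IDLE, SELECT_DR_SCAN },
    [SELECT_DR_SCAN]   = { CAPTURE_DR,    SELECT_IR_SCAN },
    [CAPTURE_DR]       = { SHIFT_DR,      EXIT1_DR },
    [SHIFT_DR]         = { SHIFT_DR,      EXIT1_DR },
    [EXIT1_DR]         = { PAUSE_DR,      UPDATE_DR },
    [PAUSE_DR]         = { PAUSE_DR,      EXIT2_DR },
    [EXIT2_DR]         = { SHIFT_DR,      UPDATE_DR },
    [UPDATE_DR]        = { RUN_TEST_IDLE, SELECT_DR_SCAN },
    [SELECT_IR_SCAN]   = { CAPTURE_IR,    TEST_LOGIC_RESET },
    [CAPTURE_IR]       = { SHIFT_IR,      EXIT1_IR },
    [SHIFT_IR]         = { SHIFT_IR,      EXIT1_IR },
    [EXIT1_IR]         = { PAUSE_IR,      UPDATE_IR },
    [PAUSE_IR]         = { PAUSE_IR,      EXIT2_IR },
    [EXIT2_IR]         = { SHIFT_IR,      UPDATE_IR },
    [UPDATE_IR]        = { RUN_TEST_IDLE, SELECT_DR_SCAN },
};

/* Advance the state machine by one TCK cycle with the given TMS level. */
tap_state_t tap_step(tap_state_t state, int tms)
{
    return tap_next[state][tms ? 1 : 0];
}

/* Apply n TMS values taken from tms_bits, least significant bit first. */
tap_state_t tap_walk(tap_state_t state, uint32_t tms_bits, int n)
{
    for (int i = 0; i < n; i++) {
        state = tap_step(state, (int)((tms_bits >> i) & 1u));
    }
    return state;
}
```

With this table, holding TMS high for five TCK cycles returns the controller to Test Logic Reset from any state, and the TMS sequence 0, 1, 1, 0, 0 applied from Test Logic Reset ends in Shift IR, ready to shift in one of the 3-bit instructions.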
(Figure 44 also shows the Exit_1 DR, Exit_1 IR, Pause DR, Pause IR, Exit_2 DR, Exit_2 IR, Update DR[1] and Update IR states.)

[1] Due to the scan cell layout, "Capture DR" and "Update DR" are states without associated action during the scanning of internal chains.

TAP Instructions

The following instructions are supported by the AT697 TAP.

Table 25. TAP Instruction Set

Binary Value   Instruction Name   Data Register            Scan Chain Accessed
000            EXTEST             Boundary scan register   Boundary scan chain
001            SAMPLE/PRELOAD     Boundary scan register   Boundary scan chain
010            BYPASS             Bypass register          Bypass scan chain
111            IDCODE             Device ID register       ID register scan chain

BYPASS
This instruction is binary coded "010". It is used to speed up shifting at board level through components that are not to be activated.

EXTEST
This instruction is binary coded "000". It is used to test connections between components at board level. Component output pins are controlled by the boundary scan register during Capture DR on the rising edge of TCK.

SAMPLE/PRELOAD
This instruction is binary coded "001". It is used to get a snapshot of the normal operation by sampling I/O states during Capture DR on the rising edge of TCK. It also allows a value to be preloaded on the output latches during Update DR on the falling edge of TCK. It does not modify system behaviour.

IDCODE
This instruction is binary coded "111". The value of the IDCODE is loaded during Capture DR.

Test Data Registers

The following data registers are supported in the AT697 TAP:

Bypass Register
The bypass register, containing a single shift register stage, is connected between TDI and TDO.

Figure 45. Bypass Register Cell
(Figure labels: from TDI, Shift DR, Clock DR, to TDO.)

Device ID Register
The device ID register is a read-only 32-bit register. It is connected between TDI and TDO.

Figure 46. Device ID Register

Bits     31:28    27:12                 11:1                0
Field    Vers.    Part ID               Manufacturer’s ID   Const.
Value    0001     1011.0110.0100.0101   000.0101.1000       1

ID
register value: 0x1b6450b1

Field Definitions:
[31:28]: Vers - Version number - 0x1
[27:12]: Part ID - Represents the part number as assigned by the vendor - 0xb645
[11:1]: Manufacturer’s ID - Represents the manufacturer’s ID as per JEDEC - 0x058
[0]: Const - Constant tied to logic ’1’.

Boundary Scan Register
A single scan chain consisting of all of the boundary scan cells (input, output and in/out cells). The purpose of the boundary scan is the support of scan-based board testing. The boundary scan register is connected between TDI and TDO. To use the boundary scan feature, the PLL shall be in bypass mode, i.e. the BYPASS signal tied to VCC.

Checker Scan Register
A single scan chain consisting of all of the scan cells of the IU parity checkers. The checker scan is only used for factory test. The checker scan register is connected between TDI and TDO.

Execution Mode

Reset Mode

When the RESET input is asserted for at least two cycles, the processor enters reset mode. In this mode, the CPU and all the peripherals are halted. Only the following registers are affected by the reset. All other registers maintain their value or are undefined.

Table 26. Reset Operation

Register       Description                 Reset Value
PC             program counter             0x0000 0000
nPC            new program counter         0x0000 0004
PSR            processor status register   et = 0, s = 1
CCR            cache control register      0x0000 0000
MCFG1 PRWDH    PROM bus width              PIO[1:0]
MCFG3 PE       PROM EDAC enable            PIO[2]

When RESET is deasserted, execution restarts from address 0.

Debug Mode

Debug mode can be entered when the DSU is enabled through the external DSUEN pin. This allows read/write access to all processor registers and cache memories. In debug mode, the processor pipeline is held and the processor is controlled by the DSU.

Power-down/Idle Mode

The AT697 can be idled by writing any value to the power-down register. During power-down mode, only the integer unit is halted. All other functions and peripherals operate as nominal. When a single write to the idle register is performed, idle mode is entered on the next load instruction. Idle mode is terminated when an unmasked interrupt with a higher level than the current processor interrupt level is pending. The integer unit is then re-enabled.

Here is a simple example of idle mode entry:

    ! write any value to Idle register
    st %g2, [%g1 + 0x18]
    ! enter Idle mode
    ld [%o1 + 0x08], %g3

System Clock

Overview

The AT697F clock system is based on two main clock trees: the PCI clock and the CPU clock. The following figure presents the clock system of the processor and its distribution.

Figure 47. Clock Distribution
(Figure labels: SDCLK, Interrupt Controller, Timers, GPI, Memory Control, PCI Core, Caches, Reg. File, CPU clock, CPU Core, PCI Wrapper, PCI clock, Uarts, BYPASS, Uart Control Reg. UACn, PLL, PDIV4, LOCK, Alternate UART clock, CLK.)

PCI Clock

External Clock

The PCI clock is dedicated to the PCI interface. It is used in particular by the PCI wrapper, which shares its activity between the two clock domains. The PCI interface and its associated wrapper can only be driven from an external clock. The PCI clock shall be connected to the PCI_CLK pin of the PCI interface. This input shall be driven at a frequency in the range of 0 up to 33MHz.

CPU Clock

The CPU clock is routed to the parts of the system concerned with operation of the SPARC core, such as the CPU core itself and the register files. The CPU clock is also used by the majority of the I/O modules, such as the timers, memory controller and interrupt controller, with the exception of the PCI interface. The CPU clock is driven either directly by an external oscillator or by the internal PLL.
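The two CPU clock sources detailed in this section can be summarised in a small helper. This is only a sketch under the figures stated in this section (BYPASS high: CPU clock equals CLK, 0MHz to 100MHz; BYPASS low: CPU clock is four times CLK, with CLK between 18MHz and 25MHz); the function name `cpu_clock_hz` and the convention of returning 0 for an out-of-range input are assumptions made for this example.

```c
#include <stdbool.h>

/* Returns the resulting CPU clock in Hz for a given CLK input frequency,
 * or 0 if the input is outside the range stated for the selected source.
 * bypass_high mirrors the level applied to the BYPASS pin. */
unsigned long cpu_clock_hz(unsigned long clk_hz, bool bypass_high)
{
    if (bypass_high) {
        /* External clock: CPU clock is the clock applied to CLK. */
        return (clk_hz <= 100000000UL) ? clk_hz : 0UL;
    }
    /* Internal PLL: CPU clock is four times the clock applied to CLK. */
    if (clk_hz < 18000000UL || clk_hz > 25000000UL) {
        return 0UL;
    }
    return 4UL * clk_hz;
}
```

For example, a 25MHz oscillator with BYPASS low gives a 100MHz CPU clock, while a 10MHz input with BYPASS low is rejected because it is below the PLL input range.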
External Clock

To drive the device directly from an external clock source, the CLK input shall be driven by an external clock generator while the BYPASS pin is driven high. In that way, the CPU clock is the direct representation of the clock applied to CLK. When the external CPU clock source is selected, the clock input can be driven at a frequency in the range of 0MHz up to 100MHz.

PLL

Overview

The CPU clock can be issued from the internal PLL. This PLL contains a phase/frequency detector, charge pump, voltage-controlled oscillator, low-pass filter, lock detector and divider. The PLL is configured by hardware to provide a CPU clock frequency four times the frequency of the input clock.

PLL Control

The PLL control is done by hardware through dedicated ports, including a bypass, a clock input and a filter input. The following table presents the assignment and functions of the PLL control signals.

Table 27. PLL Ports Description

Pin name   Function
LOCK       Lock
CLK        Board clock input
BYPASS     Bypass

Operation

To drive the device from the internal PLL, the CLK input shall be driven by an external clock generator while the BYPASS pin is driven low. In that way, the CPU clock frequency is four times the frequency of the clock applied to CLK. When the PLL based CPU clock source is selected, the clock input shall be driven at a frequency in the range of 18MHz up to 25MHz.

Fault Tolerance & Clock

To prevent erroneous operations from single event transient (SET) errors and single event upsets (SEU), the AT697F processor is based on a full triple modular redundancy (TMR) architecture.

Figure 48. TMR Structure

Such an architecture is based on a fully triplicated clock distribution (CLK1, CLK2 and CLK3). In that way, the PCI clock and the CPU clock are each built as three clock trees.

Skew

To prevent the processor from corruption by single event transient (SET) phenomena, additional skew can be programmed on the clock trees.
The two dedicated pins SKEW1 and SKEW0 are used to program the delay induced by the skew. Here is a short description of the skew implementation:

Figure 49. CPU Clock Tree Overview
(Figure labels: BYPASS, PLL, CLK, cpu clock, SKEW[1:0], delays D1 to D4 with D2 = D1 and D4 = D3 = 2 * D1, CLK1 tree, CLK2 tree, CLK3 tree.)

Three skew configurations are available:
• SKEW[1:0] = ’00’: natural skew corresponding to the intrinsic routing of the chip
• SKEW[1:0] = ’01’: medium skew ‘artificially’ injected
• SKEW[1:0] = ’10’: maximum skew ‘artificially’ injected

The remaining configuration (SKEW[1:0] = ’11’) is reserved and must not be used at application level.

Table 28. SKEW Assignments

SKEW[1:0]   Delay CLK1 -> CLK2   Delay CLK1 -> CLK3   Comments
‘00’        natural              natural              natural skew
‘01’        D1                   D3                   medium skew
‘10’        D1 + D2              D3 + D4              maximum skew
‘11’        Reserved

Use of a high level of skew improves the efficiency of SET prevention but leads to a loss of operating performance. Maximum speed is decreased and timings on the interfaces are slower than with natural skew. Refer to the ’Electrical Characteristics’ section for detailed timings at each skew.

Package

MCGA 349

Mechanical Outlines

Symbol   mm (min / max)   inch (min / max)
D/E      24,8 / 25,2      0,976 / 0,992
D1/E1    22,86            0,9
A1       1,4 / 1,85       0,055 / 0,073
A2       2,4 / 3,45       0,094 / 0,136
A        4,3 / 5,9        0,169 / 0,232
b        0,79 / 0,99      0,031 / 0,04
e        1,27             0,05

QFP256 package

Package Description

Registers Description

Table 29. Register Legend

Address = 0x01010101
Bit number:                  31 30 29 28 27 26 25 24 23 ... 9 8 7 6 5 4 3 2 1 0
Field name:                  name of the field, or reserved bit
Access type:                 r = read access, w = write access, r/w = read and write access
Default value after reset:   binary value (for example 0, 100 or 1); x = undefined or not affected by reset

Integer Unit Registers

Table 30.
Processor State Register - PSR

Field layout (bit 31 down to bit 0): impl[3:0], ver[3:0], n, z, v, c, reserved, ec, ef, pil[3:0], s, ps, et, cwp[4:0]
Access and reset values legible in the layout: impl[3:0] r, 0001; ver[3:0] r, 0001; n, z, v, c r/w, x; cwp[4:0] r/w, 00000.

Bit Number   Mnemonic    Description
31..28       impl[3:0]   Implementation or class of implementations of the architecture.
27..24       ver[3:0]    Identifies one or more particular implementations, or is a readable and writable state field whose properties are implementation-dependent.
23           n           Indicates whether the ALU result was negative for the last instruction that modified the icc field. 1 = negative, 0 = not negative.
22           z           Indicates whether the ALU result was zero for the last instruction that modified the icc field. 1 = zero, 0 = not zero.
21           v           Indicates whether the ALU result was within the range of (was representable in) 32-bit 2’s complement notation for the last instruction that modified the icc field. 1 = overflow, 0 = no overflow.
20           c           Indicates whether a 2’s complement carry out (or borrow) occurred for the last instruction that modified the icc field. Carry is set on addition if there is a carry out of bit 31. Carry is set on subtraction if there is a borrow into bit 31. 1 = carry, 0 = no carry.
13           ec          Determines whether the implementation-dependent coprocessor is enabled. If disabled, a coprocessor instruction will trap. 1 = enabled, 0 = disabled. If an implementation does not support a coprocessor in hardware, PSR.EC should always read as 0 and writes to it should be ignored.
12           ef          Determines whether the FPU is enabled. If disabled, a floating-point instruction will trap. 1 = enabled, 0 = disabled. If an implementation does not support a hardware FPU, PSR.EF should always read as 0 and writes to it should be ignored.
11..8        pil[3:0]    Identifies the interrupt level above which the processor will accept an interrupt.
7            s           Determines whether the processor is in supervisor or user mode. 1 = supervisor mode, 0 = user mode.
6            ps          Contains the value of the S bit at the time of the most recent trap.
5            et          Determines whether traps are enabled. A trap automatically resets ET to 0. When ET=0, an interrupt request is ignored and an exception trap causes the IU to halt execution, which typically results in a reset trap that resumes execution at address 0. 1 = traps enabled, 0 = traps disabled.
4..0         cwp[4:0]    Comprises the current window pointer, a counter that identifies the current window into the r registers. The hardware decrements the CWP on traps and SAVE instructions, and increments it on RESTORE and RETT instructions (modulo NWINDOWS).

Table 31. Window Invalid Mask - WIM

Field layout: bits 31..8 reserved (r, read as 0); bits 7..0 windows 7 to 0 (r/w).

Bit Number   Mnemonic   Description
7..0         windows    Indicates whether the window is a ‘valid’ or an ‘invalid’ one. ‘0’: valid, ‘1’: invalid.
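The PSR and WIM bit positions listed in Tables 30 and 31 can be unpacked with a few shifts and masks. The following is a minimal sketch, assuming the raw register values have already been read (for instance through the DSU); the type `psr_fields_t` and the functions `psr_decode` and `wim_window_invalid` are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* One member per PSR field of Table 30 (reserved bits are skipped). */
typedef struct {
    unsigned impl, ver;      /* bits 31..28 and 27..24 */
    unsigned n, z, v, c;     /* integer condition codes, bits 23..20 */
    unsigned ec, ef;         /* coprocessor and FPU enable, bits 13 and 12 */
    unsigned pil;            /* processor interrupt level, bits 11..8 */
    unsigned s, ps, et;      /* supervisor, previous supervisor, trap enable */
    unsigned cwp;            /* current window pointer, bits 4..0 */
} psr_fields_t;

/* Extract bits msb..lsb of word (field widths here are at most 5 bits). */
static unsigned field(uint32_t word, unsigned msb, unsigned lsb)
{
    return (unsigned)((word >> lsb) & ((1u << (msb - lsb + 1u)) - 1u));
}

/* Split a raw PSR value into its named fields. */
psr_fields_t psr_decode(uint32_t psr)
{
    psr_fields_t f;
    f.impl = field(psr, 31, 28);
    f.ver  = field(psr, 27, 24);
    f.n    = field(psr, 23, 23);
    f.z    = field(psr, 22, 22);
    f.v    = field(psr, 21, 21);
    f.c    = field(psr, 20, 20);
    f.ec   = field(psr, 13, 13);
    f.ef   = field(psr, 12, 12);
    f.pil  = field(psr, 11, 8);
    f.s    = field(psr, 7, 7);
    f.ps   = field(psr, 6, 6);
    f.et   = field(psr, 5, 5);
    f.cwp  = field(psr, 4, 0);
    return f;
}

/* Returns 1 if the given window (0 to 7) is marked invalid in WIM. */
int wim_window_invalid(uint32_t wim, unsigned window)
{
    return (int)((wim >> (window & 7u)) & 1u);
}
```

As a usage example, decoding the value 0x11000080 yields impl = 1, ver = 1, s = 1, et = 0 and cwp = 0, which agrees with the reset state given in Table 26 (et = 0, s = 1).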