AVR32UC_10

Manufacturer:
ATMEL
Package:
Description:
AVR32UC_10 - Technical Reference Manual - ATMEL Corporation


AVR32UC_10 Datasheet

Feature Summary
• Small area, high clock frequency.
• 32-bit load/store AVR32A RISC architecture.
• 15 general-purpose 32-bit registers.
• 32-bit Stack Pointer, Program Counter and Link Register reside in register file.
• Fully orthogonal instruction set.
• Pipelined architecture allows one instruction per clock cycle for most instructions.
• Byte, half-word, word and double word memory access.
• Fast interrupts and multiple interrupt priority levels.
• Privileged and unprivileged modes enabling efficient and secure Operating Systems.
• Optional MPU allows for operating systems with memory protection.
• Innovative instruction set together with variable instruction length ensuring industry leading code density.
• DSP extension with saturating arithmetic, and a wide variety of multiply instructions.
• Memory Read-Modify-Write instructions.
• Optional advanced On-Chip Debug system.
• FlashVault™ support through Secure state for executing trusted code alongside nontrusted code on the same CPU.
• Optional floating-point hardware.

AVR32UC Technical Reference Manual 32002F–03/2010

1. Introduction
AVR32 is a new high-performance 32-bit RISC microprocessor core, designed for cost-sensitive embedded applications, with particular emphasis on low power consumption and high code density. In addition, the instruction set architecture has been tuned to allow for a variety of microarchitectures, enabling the AVR32 to be implemented as low-, mid- or high-performance processors. AVR32 extends the AVR family into the world of 32- and 64-bit applications.

1.1 The AVR family
The AVR family was launched by Atmel in 1996 and has had remarkable success in the 8- and 16-bit flash microcontroller market. AVR32 complements the current AVR microcontrollers.
Through the AVR32 family, the AVR is extended into a new range of higher performance applications that is currently served by 32- and 64-bit processors. To truly exploit the power of a 32-bit architecture, the new AVR32 architecture is not binary compatible with earlier AVR architectures.

In order to achieve high code density, the instruction format is flexible, providing both compact 16-bit instructions and extended 32-bit instructions. While the instruction length is only 16 bits for most instructions, powerful 32-bit instructions are implemented to further increase performance. Compact and extended instructions can be freely mixed in the instruction stream.

1.2 The AVR32 Microprocessor Architecture
The AVR32 is a new innovative microprocessor architecture. It is a fully synchronous synthesisable RTL design with industry standard interfaces, ensuring easy integration into SoC designs with legacy intellectual property (IP).

Through a quantitative approach, a large set of industry recognized benchmarks has been compiled and analyzed to achieve the best code density in its class of microprocessor architectures. In addition to lowering the memory requirements, a compact code size also contributes to the core's low power characteristics.

The processor supports byte and half-word data types without penalty in code size and performance. Memory load and store operations are provided for byte, half-word, word and double word data, with automatic sign- or zero-extension of half-word and byte data. The C compiler is closely linked to the architecture and is able to exploit code optimization features, both for size and speed.

In order to reduce code size to a minimum, some instructions have multiple addressing modes. As an example, instructions with immediates often have a compact format with a smaller immediate, and an extended format with a larger immediate. In this way, the compiler is able to use the format giving the smallest code size.
Another feature of the instruction set is that frequently used instructions, like add, have a compact format with two operands as well as an extended format with three operands. The larger format increases performance, allowing an addition and a data move in the same instruction in a single cycle.

Load and store instructions have several different formats in order to reduce code size and speed up execution:
• Load/store to an address specified by a pointer register
• Load/store to an address specified by a pointer register with postincrement
• Load/store to an address specified by a pointer register with predecrement
• Load/store to an address specified by a pointer register with displacement
• Load/store to an address specified by a small immediate (direct addressing within a small page)
• Load/store to an address specified by a pointer register and an index register.

The register file is organized as sixteen 32-bit registers and includes the Program Counter, the Link Register, and the Stack Pointer. In addition, one register is designed to hold return values from function calls and is used implicitly by some instructions.

The AVR32 core defines several microarchitectures in order to capture the entire range of applications. The microarchitectures are named AVR32A, AVR32B and so on. Different microarchitectures are suited to different end applications, allowing the designer to select a microarchitecture with the optimum set of parameters for a specific application.

1.3 Exceptions and Interrupts
The AVR32 incorporates a powerful exception handling scheme. The different exception sources, like Illegal Op-code and external interrupt requests, have different priority levels, ensuring a well-defined behavior when multiple exceptions are received simultaneously. Additionally, pending exceptions of a higher priority class may preempt handling of ongoing exceptions of a lower priority class.
Each priority class has dedicated registers to keep the return address and status register, thereby removing the need to perform time-consuming memory operations to save this information.

There are four levels of external interrupt requests, all executing in their own context. An interrupt controller does the priority handling of the external interrupts and provides the prioritized interrupt vector to the processor core.

1.4 Java Support
Some AVR32 implementations provide Java hardware acceleration. To reduce gate count, AVR32UC does not implement any such hardware.

1.5 FlashVault
Revision 3 of the AVR32 architecture introduced a new CPU state called Secure State. This state is instrumental in the new security technology named FlashVault. This innovation allows the on-chip flash and other memories to be partially programmed and locked, creating a safe on-chip storage for secret code and valuable software intellectual property. Code stored in the FlashVault will execute as normal, but reading, copying or debugging the code is not possible. This allows a device with FlashVault code protection to carry a piece of valuable software such as a math library or an encryption algorithm from a trusted location to a potentially untrustworthy partner where the rest of the source code can be developed, debugged and programmed.

1.6 Microarchitectures
The AVR32 architecture defines different microarchitectures, AVR32A and AVR32B. This enables implementations that are tailored to specific needs and applications. The microarchitectures provide different performance levels at the expense of area and power consumption.

The AVR32A microarchitecture is targeted at cost-sensitive, lower-end applications like smaller microcontrollers. This microarchitecture does not provide dedicated hardware registers for shadowing of register file registers in interrupt contexts. Additionally, it does not provide hardware registers for the return address registers and return status registers.
Instead, all this information is stored on the system stack. This saves chip area at the expense of slower interrupt handling.

Upon interrupt initiation, registers R8-R12 are automatically pushed to the system stack. These registers are pushed regardless of the priority level of the pending interrupt. The return address and status register are also automatically pushed to the stack. The interrupt handler can therefore use R8-R12 freely. Upon interrupt completion, the old R8-R12 registers and status register are restored, and execution continues at the return address popped from the stack.

The stack is also used to store the status register and return address for exceptions and scall. Executing the rete or rets instruction at the completion of an exception or system call will pop this status register and continue execution at the popped return address.

1.7 The AVR32UC architecture
The first implementation of the AVR32A architecture is called AVR32UC. This implementation targets low- and medium-performance applications, and provides an optional, advanced OCD system, no data or instruction caches, and an optional Memory Protection Unit (MPU). Java acceleration is not implemented.

AVR32UC provides three memory interfaces: one High Speed Bus (HSB) master for instruction fetch, one HSB master for data access, and one HSB slave interface allowing other bus masters to access data RAMs internal to the CPU. Keeping data RAMs internal to the CPU allows fast access to the RAMs, reduces latency and guarantees deterministic timing. Power consumption is also reduced, because memory accesses do not need a full HSB access. A dedicated data RAM interface is provided for communicating with the internal data RAMs.

If an optional MPU is present, all memory accesses are checked for privilege violations. If an access is attempted to an illegal memory address, the access is aborted and an exception is taken.
The following figure displays the contents of AVR32UC:

Figure 1-1. Overview of AVR32UC.
[Block diagram. Recoverable labels: interrupt controller interface, reset interface, OCD interface, OCD system, power/reset control, AVR32UC CPU pipeline, MPU, instruction memory controller (High Speed Bus master), data memory controller (High Speed Bus master, High Speed Bus slave), data RAM interface, CPU Local Bus master.]

1.8 AVR32UC CPU revisions
Three revisions of the AVR32UC CPU currently exist:
• Revision 1 implementing revision 1 of the AVR32 architecture.
• Revision 2 implementing revision 2 of the AVR32 architecture, and with a faster divider.
• Revision 3 implementing revision 3 of the AVR32 architecture, and with optional floating-point hardware.

Revision 2 of the AVR32UC CPU added the following instructions:
• movh Rd, imm
• {add, sub, and, or, eor}{cond4}, Rd, Rx, Ry
• ld.{sb, ub, sh, uh, w}{cond4} Rd, Rp[disp]
• st.{b, h, w}{cond4} Rp[disp], Rs
• rsub{cond4} Rd, imm

Revision 3 of the AVR32UC CPU added the following instructions:
• sscall
• retss
• Floating-point instructions as described in Section 4.

Revision 3 of the AVR32UC CPU added the following system registers:
• SS_STATUS
• SS_ADRF, SS_ADRR, SS_ADR0, SS_ADR1
• SS_SP_SYS, SS_SP_APP
• SS_RAR, SS_RSR

Revision 3 of the AVR32UC CPU added the following bit in the status register:
• SS

AVR32UC CPU revision 2 is fully backward-compatible with revision 1, i.e. code compiled for revision 1 is binary-compatible with revision 2 CPUs. AVR32UC CPU revision 3 is fully backward-compatible with revisions 1 and 2, i.e. code compiled for revision 1 or 2 is binary-compatible with revision 3 CPUs.

The Architecture Revision field in the CONFIG0 system register identifies which architecture revision is implemented in a specific device. The "Processor and Architecture" chapter of the device datasheet identifies the CPU revision used.

2. Programming Model
This chapter describes the programming model and the set of registers accessible to the user. It also describes the implementation options in AVR32UC.

2.1 Architectural compatibility
AVR32UC is fully compatible with the Atmel AVR32A architecture. AVR32UC devices implementing revision 2 and devices implementing revision 3 of the AVR32 architecture both exist. Refer to the device datasheet or the device's CONFIG0 register to determine which architecture revision the device implements. Architecture revision 3 is fully backwards compatible with revision 2, and additionally implements:
• Secure state with associated programming model
• The automatic clearing of COUNT on COMPARE match is now optional, and is disabled by setting the NOCOMPRES bit in CPUCR.

2.2 Implementation options

2.2.1 Memory protection
AVR32UC optionally supports an MPU as specified by the AVR32 architecture.

2.2.2 Java support
AVR32UC does not implement Java hardware acceleration.

2.2.3 Floating-Point Hardware
AVR32UC optionally supports Floating-Point Hardware implemented as coprocessor instructions.

2.3 Register file configuration
The AVR32A architecture dictates a specific register file implementation, reproduced below. Secure state context and secure state system registers are only available in devices implementing revision 3 of the AVR32 architecture.

Figure 2-1. Register File in AVR32A
[Register file diagram. Contexts shown: Application, Supervisor, INT0, INT1, INT2, INT3, Exception, NMI and Secure. Every context shows PC, LR, a stack pointer, R12 down to R0, and SR, each 32 bits wide (Bit 31 to Bit 0). The stack pointer is SP_APP in the Application context, SP_SEC in the Secure context and SP_SYS in all other contexts. The Secure context additionally lists SS_STATUS, SS_ADRF, SS_ADRR, SS_ADR0, SS_ADR1, SS_SP_SYS, SS_SP_APP, SS_RAR and SS_RSR. The figure also carries the labels INT0PC, INT1PC, FINTPC and SMPC next to some of the general-purpose registers.]

2.4 The Status Register
The Status Register (SR) consists of two halfwords, one upper and one lower, see Figure 2-2 and Figure 2-3. The lower halfword contains the C, Z, N, V and Q flags, as well as the L and T bits, while the upper halfword contains information about the mode and state the processor executes in. The upper halfword can only be accessed from a privileged mode.

Figure 2-2. The Status Register high halfword (bit 31 down to bit 16)
Bit     Name   Initial value   Description
31      SS     see text        Secure State
30:28   -      0               Reserved
27      DM     0               Debug State Mask
26      D      0               Debug State
25      -      0               Reserved
24      M2     0               Mode Bit 2
23      M1     0               Mode Bit 1
22      M0     1               Mode Bit 0
21      EM     1               Exception Mask
20      I3M    0               Interrupt Level 3 Mask
19      I2M    0               Interrupt Level 2 Mask
18      I1M    0               Interrupt Level 1 Mask
17      I0M    0               Interrupt Level 0 Mask
16      GM     1               Global Interrupt Mask

Figure 2-3. The Status Register low halfword (bit 15 down to bit 0)
Bit     Name   Initial value   Description
15      -      0               Reserved
14      T      0               Scratch
13:6    -      0               Reserved
5       L      0               Lock
4       Q      0               Saturation
3       V      0               Overflow
2       N      0               Sign
1       Z      0               Zero
0       C      0               Carry

SS - Secure State
This bit indicates whether the processor is executing in the secure state. It is only implemented in devices implementing revision 3 of the AVR32 architecture, and is set to 0 in older revisions. The bit is initialized in an IMPLEMENTATION DEFINED way at reset. Refer to Section 5, "Secure State", for more information.

DM - Debug State Mask
If this bit is set, the Debug State is masked and cannot be entered. The bit is cleared at reset, and can be both read and written by software.

D - Debug State
The processor is in debug state when this bit is set. The bit is cleared at reset and should only be modified by debug hardware, the breakpoint instruction or the retd instruction. Undefined behaviour may result if the user tries to modify this bit using other mechanisms.

M2, M1, M0 - Execution Mode
These bits show the active execution mode. The settings for the different modes are shown in Table 2-1. M2 and M1 are cleared by reset while M0 is set, so that the processor is in supervisor mode after reset. These bits are modified by hardware when initiating interrupt or exception processing. Execution of the scall, rets or rete instructions will also change these bits. Undefined behaviour may result if the user tries to modify these bits using the mtsr, ssrf or csrf instructions.
If software needs to change these bits, scall, rets or rete should be used, possibly with prior modifications of the stack, to achieve the desired changes in a safe way. Refer to the AVR32 Architecture Manual for the behaviour of these instructions, and note especially how the stack is modified after their execution.

Table 2-1. Mode bit settings
M2   M1   M0   Mode
1    1    1    Non Maskable Interrupt
1    1    0    Exception
1    0    1    Interrupt level 3
1    0    0    Interrupt level 2
0    1    1    Interrupt level 1
0    1    0    Interrupt level 0
0    0    1    Supervisor
0    0    0    Application

EM - Exception mask
When this bit is set, exceptions are masked. Exceptions are enabled otherwise. The bit is automatically set when exception processing is initiated or Debug Mode is entered. Software may clear this bit after performing the necessary measures if nested exceptions should be supported. This bit is set at reset.

I3M - Interrupt level 3 mask
When this bit is set, level 3 interrupts are masked. If I3M and GM are cleared, INT3 interrupts are enabled. The bit is automatically set when INT3 processing is initiated. Software may clear this bit after performing the necessary measures if nested INT3s should be supported. This bit is cleared at reset.

I2M - Interrupt level 2 mask
When this bit is set, level 2 interrupts are masked. If I2M and GM are cleared, INT2 interrupts are enabled. The bit is automatically set when INT3 or INT2 processing is initiated. Software may clear this bit after performing the necessary measures if nested INT2s should be supported. This bit is cleared at reset.

I1M - Interrupt level 1 mask
When this bit is set, level 1 interrupts are masked. If I1M and GM are cleared, INT1 interrupts are enabled. The bit is automatically set when INT3, INT2 or INT1 processing is initiated. Software may clear this bit after performing the necessary measures if nested INT1s should be supported. This bit is cleared at reset.

I0M - Interrupt level 0 mask
When this bit is set, level 0 interrupts are masked. If I0M and GM are cleared, INT0 interrupts are enabled. The bit is automatically set when INT3, INT2, INT1 or INT0 processing is initiated. Software may clear this bit after performing the necessary measures if nested INT0s should be supported. This bit is cleared at reset.

GM - Global Interrupt Mask
When this bit is set, all interrupts are disabled. This bit overrides I0M, I1M, I2M and I3M. The bit is automatically set when exception processing is initiated, Debug Mode is entered, or a Java trap is taken. This bit is automatically cleared when returning from a Java trap. This bit is set after reset.

T - Scratch bit
This bit is not set or cleared implicitly by any instruction. The programmer can therefore use it as a custom flag, for example to signal events in the program. This bit is cleared at reset.

L - Lock flag
Used by the conditional store instruction to support atomic memory access. Automatically cleared by rete. This bit is cleared after reset.

Q - Saturation flag
The saturation flag indicates that a saturating arithmetic operation overflowed. The flag is sticky: once set, it has to be manually cleared by a csrf instruction after the desired action has been taken. See the Instruction set description for details.

V - Overflow flag
The overflow flag indicates that an arithmetic operation overflowed. See the Instruction set description for details.

N - Negative flag
The negative flag is modified by arithmetical and logical operations. See the Instruction set description for details.

Z - Zero flag
The zero flag indicates a zero result after an arithmetic or logic operation. See the Instruction set description for details.

C - Carry flag
The carry flag indicates a carry after an arithmetic or logic operation. See the Instruction set description for details.
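
The mode encoding of Table 2-1 and the interrupt mask rules of this section can be illustrated with a small host-side C sketch that decodes a raw 32-bit SR value. This is only an illustration under stated assumptions: the bit positions are inferred from the bit ordering in Figure 2-2 (bit 16 = GM up to bit 31 = SS), and every identifier (sr_mode, sr_interrupt_enabled, sr_reset_value and so on) is hypothetical rather than taken from any Atmel header. On a real device the SR would be read with the privileged mfsr instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, inferred from the ordering in Figure 2-2.
   Check them against the device headers before relying on them. */
enum {
    SR_GM_BIT  = 16,  /* Global Interrupt Mask */
    SR_I0M_BIT = 17,  /* I0M..I3M occupy bits 17..20 */
    SR_EM_BIT  = 21,  /* Exception Mask */
    SR_M0_BIT  = 22   /* M2:M0 occupy bits 24:22 */
};

/* Mode names indexed by the 3-bit value M2:M0, following Table 2-1. */
static const char *const sr_mode_names[8] = {
    "Application", "Supervisor",
    "Interrupt level 0", "Interrupt level 1",
    "Interrupt level 2", "Interrupt level 3",
    "Exception", "Non Maskable Interrupt"
};

/* Extract the execution mode field M2:M0 as a number 0..7. */
unsigned sr_mode(uint32_t sr)
{
    return (unsigned)((sr >> SR_M0_BIT) & 0x7u);
}

const char *sr_mode_name(uint32_t sr)
{
    return sr_mode_names[sr_mode(sr)];
}

/* INTn interrupts are enabled only when both GM and InM are cleared. */
int sr_interrupt_enabled(uint32_t sr, unsigned level)
{
    uint32_t gm  = (sr >> SR_GM_BIT) & 1u;
    uint32_t inm = (sr >> (SR_I0M_BIT + (level & 3u))) & 1u;
    return gm == 0u && inm == 0u;
}

/* SR with only the bits the text describes as set at reset: M0, EM, GM.
   SS is IMPLEMENTATION DEFINED at reset and is left at 0 here. */
uint32_t sr_reset_value(void)
{
    return (1u << SR_M0_BIT) | (1u << SR_EM_BIT) | (1u << SR_GM_BIT);
}

/* Sanity check: the reset state is supervisor mode with interrupts masked. */
void sr_selfcheck(void)
{
    assert(sr_mode(sr_reset_value()) == 1u);
    assert(!sr_interrupt_enabled(sr_reset_value(), 0u));
}
```

Under these assumptions, sr_mode_name(sr_reset_value()) yields the Supervisor entry of Table 2-1, and no interrupt level is reported as enabled until GM is cleared.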

2.5 System registers
The system registers are placed outside of the virtual memory space, and are only accessible using the privileged mfsr and mtsr instructions. Some of the System Registers can be altered automatically by hardware. The table below lists the system registers specified in AVR32UC. The programmer is responsible for maintaining correct sequencing of any instructions following an mtsr instruction.

Table 2-2. System Registers
Reg #     Address    Name         Function
0         0          SR           Status Register
1         4          EVBA         Exception Vector Base Address
2         8          ACBA         Application Call Base Address
3         12         CPUCR        CPU Control Register
4         16         ECR          Exception Cause Register
5         20         RSR_SUP      Unused in AVR32UC
6         24         RSR_INT0     Unused in AVR32UC
7         28         RSR_INT1     Unused in AVR32UC
8         32         RSR_INT2     Unused in AVR32UC
9         36         RSR_INT3     Unused in AVR32UC
10        40         RSR_EX       Unused in AVR32UC
11        44         RSR_NMI      Unused in AVR32UC
12        48         RSR_DBG      Return Status Register for Debug Mode
13        52         RAR_SUP      Unused in AVR32UC
14        56         RAR_INT0     Unused in AVR32UC
15        60         RAR_INT1     Unused in AVR32UC
16        64         RAR_INT2     Unused in AVR32UC
17        68         RAR_INT3     Unused in AVR32UC
18        72         RAR_EX       Unused in AVR32UC
19        76         RAR_NMI      Unused in AVR32UC
20        80         RAR_DBG      Return Address Register for Debug Mode
21        84         JECR         Unused in AVR32UC
22        88         JOSP         Unused in AVR32UC
23        92         JAVA_LV0     Unused in AVR32UC
24        96         JAVA_LV1     Unused in AVR32UC
25        100        JAVA_LV2     Unused in AVR32UC
26        104        JAVA_LV3     Unused in AVR32UC
27        108        JAVA_LV4     Unused in AVR32UC
28        112        JAVA_LV5     Unused in AVR32UC
29        116        JAVA_LV6     Unused in AVR32UC
30        120        JAVA_LV7     Unused in AVR32UC
31        124        JTBA         Unused in AVR32UC
32        128        JBCR         Unused in AVR32UC
33-63     132-252    Reserved     Reserved for future use
64        256        CONFIG0      Configuration register 0
65        260        CONFIG1      Configuration register 1
66        264        COUNT        Cycle Counter register
67        268        COMPARE      Compare register
68        272        TLBEHI       Unused in AVR32UC
69        276        TLBELO       Unused in AVR32UC
70        280        PTBR         Unused in AVR32UC
71        284        TLBEAR       Unused in AVR32UC
72        288        MMUCR        Unused in AVR32UC
73        292        TLBARLO      Unused in AVR32UC
74        296        TLBARHI      Unused in AVR32UC
75        300        PCCNT        Unused in AVR32UC
76        304        PCNT0        Unused in AVR32UC
77        308        PCNT1        Unused in AVR32UC
78        312        PCCR         Unused in AVR32UC
79        316        BEAR         Bus Error Address Register
80        320        MPUAR0       MPU Address Register region 0
81        324        MPUAR1       MPU Address Register region 1
82        328        MPUAR2       MPU Address Register region 2
83        332        MPUAR3       MPU Address Register region 3
84        336        MPUAR4       MPU Address Register region 4
85        340        MPUAR5       MPU Address Register region 5
86        344        MPUAR6       MPU Address Register region 6
87        348        MPUAR7       MPU Address Register region 7
88        352        MPUPSR0      MPU Privilege Select Register region 0
89        356        MPUPSR1      MPU Privilege Select Register region 1
90        360        MPUPSR2      MPU Privilege Select Register region 2
91        364        MPUPSR3      MPU Privilege Select Register region 3
92        368        MPUPSR4      MPU Privilege Select Register region 4
93        372        MPUPSR5      MPU Privilege Select Register region 5
94        376        MPUPSR6      MPU Privilege Select Register region 6
95        380        MPUPSR7      MPU Privilege Select Register region 7
96        384        MPUCRA       MPU Cacheable Register A
97        388        MPUCRB       MPU Cacheable Register B
98        392        MPUBRA       MPU Bufferable Register A
99        396        MPUBRB       MPU Bufferable Register B
100       400        MPUAPRA      MPU Access Permission Register A
101       404        MPUAPRB      MPU Access Permission Register B
102       408        MPUCR        MPU Control Register
103       412        SS_STATUS    Secure State Status Register
104       416        SS_ADRF      Secure State Address Flash Register
105       420        SS_ADRR      Secure State Address RAM Register
106       424        SS_ADR0      Secure State Address 0 Register
107       428        SS_ADR1      Secure State Address 1 Register
108       432        SS_SP_SYS    Secure State Stack Pointer System Register
109       436        SS_SP_APP    Secure State Stack Pointer Application Register
110       440        SS_RAR       Secure State Return Address Register
111       444        SS_RSR       Secure State Return Status Register
112-191   448-764    Reserved     Reserved for future use
192-247   768-988    IMPL         IMPLEMENTATION DEFINED
248       992        MSU_ADDRHI   Memory Service Unit Address High Register
249       996        MSU_ADDRLO   Memory Service Unit Address Low Register
250       1000       MSU_LENGTH   Memory Service Unit Length Register
251       1004       MSU_CTRL     Memory Service Unit Control Register
252       1008       MSU_STATUS   Memory Service Unit Status Register
253       1012       MSU_DATA     Memory Service Unit Data Register
254       1016       MSU_TAIL     Memory Service Unit Tail Register
255       1020       Reserved     Reserved for future use

SR - Status Register
The Status Register is mapped into the system register space. This allows it to be loaded into the register file to be modified, or to be stored to memory. The Status Register is described in detail in Section 2.4, "The Status Register".

EVBA - Exception Vector Base Address
This register contains a pointer to the exception routines. All exception routines start at this address, or at a defined offset relative to the address. Special alignment requirements may apply for EVBA, depending on the implementation of the interrupt controller. Exceptions are described in detail in the AVR32 Architecture Manual.

ACBA - Application Call Base Address
Pointer to the start of a table of function pointers. Subroutines can thereby be called by the compact acall instruction. This facilitates efficient reuse of code. Keeping this pointer as a register facilitates multiple function pointer tables. ACBA is a full 32-bit register, but the lowest two bits should be written to zero, making ACBA word aligned. Failing to do so may result in erroneous behaviour.

CPUCR - CPU Control Register
Register controlling the configuration and behaviour of the CPU. The following fields are defined:

Table 2-3. CPU control register
Name        Bit     Reset   Description
-           Other   -       Unused. Read as 0. Should be written as 0.
NOCOMPRES   17      0       If set, COUNT is not cleared on COMPARE match. If cleared, COUNT is cleared on COMPARE match.
LOCEN       16      0       Local Bus Enable. Must be written to 1 to enable the local bus. Any access attempted to the LOCAL section when this bit is cleared will result in a BUS ERROR.
SPL         15:11   16      Slave Pending Limit. The maximum number of clock cycles the slave interface can have a request pending due to the CPU owning the RAMs. After this period, the CPU will lose arbitration for the RAM, and the slave access can proceed.
CPL         10:6    16      CPU Pending Limit. The maximum number of clock cycles the CPU can have a request pending due to the slave interface owning the RAMs. After this period, the slave interface will lose arbitration for the RAM, and the CPU access can proceed.
COP         5:1     8       CPU Ownership Period. The number of cycles the CPU is guaranteed to own the RAM after it has won the arbitration for the RAM. No arbitration will be performed during this period.
SIE         0       1       Slave Interface Enable. If this bit is set, the slave interface is enabled. Otherwise, the slave interface is disabled and any slave access will be stalled.

ECR - Exception Cause Register
This register identifies the cause of the most recently executed exception. This information may be used to handle exceptions more efficiently in certain operating systems. The register is updated with a value equal to the EVBA offset of the exception, shifted 2 bit positions to the right. Only the 9 lowest bits of the EVBA offset are considered. As an example, an ITLB miss jumps to EVBA+0x50. The ECR will then be loaded with 0x50>>2 == 0x14. The ECR register is not loaded when an scall, Breakpoint or OCD Stop CPU exception is taken. Note that for interrupts, the offset is given by the autovector provided by the interrupt controller. The resulting ECR value may therefore overlap with an ECR value used by a regular exception.
This can be avoided by choosing the autovector offsets so that no such overlaps occur.

RSR_DBG - Return Status Register for Debug Mode
When Debug mode is entered, the status register contents of the original mode are automatically saved in this register. When the debug routine is finished, the retd instruction copies the contents of RSR_DBG into SR.

RAR_DBG - Return Address Register for Debug Mode
When Debug mode is entered, the Program Counter contents of the original mode are automatically saved in this register. When the debug routine is finished, the retd instruction copies the contents of RAR_DBG into PC.

CONFIG0 / 1 - Configuration Register 0 / 1
Used to describe the processor, its configuration and capabilities. The contents and functionality of these registers are described in detail in Section 2.7, "Configuration Registers".

COUNT - Cycle Counter Register
Can be used as a general counter, for example to measure execution time. Can also be used together with COMPARE to implement a periodic interrupt, for example for an OS timer. The contents and functionality of this register are described in detail in Section 2.6, "COMPARE and COUNT registers".

COMPARE - Cycle Counter Compare Register
Used together with COUNT to implement a periodic interrupt, for example for an OS timer. The contents and functionality of this register are described in detail in Section 2.6, "COMPARE and COUNT registers".

BEAR - Bus Error Address Register
Physical address that caused a Data Bus Error. This register is Read Only. Writes are allowed, but are ignored.

MPUARn - MPU Address Register n
Registers that define the base address and size of the protection regions. Refer to the AVR32 Architecture Manual for details.

MPUPSRn - MPU Privilege Select Register n
Registers that define which privilege register set to use for the different subregions in each protection region. Refer to the AVR32 Architecture Manual for details.

MPUCRA / MPUCRB - MPU Cacheable Register A / B
Registers that define if the different protection regions are cacheable. Refer to the AVR32 Architecture Manual for details.

MPUBRA / MPUBRB - MPU Bufferable Register A / B
Registers that define if the different protection regions are bufferable. Refer to the AVR32 Architecture Manual for details.

MPUAPRA / MPUAPRB - MPU Access Permission Register A / B
Registers that define the access permissions for the different protection regions. Refer to the AVR32 Architecture Manual for details.

MPUCR - MPU Control Register
Register that controls the operation of the MPU. Refer to the AVR32 Architecture Manual for details.

SS_STATUS - Secure State Status Register
Register that can be used to pass status or other information from the secure state to the nonsecure state. Refer to Section 5, "Secure State", for details.

SS_ADRF, SS_ADRR, SS_ADR0, SS_ADR1 - Secure State Address Registers
Registers used to partition memories into a secure and a nonsecure section. The 10 LSBs must always be written to zero. Refer to Section 5, "Secure State", for details.

SS_SP_SYS, SS_SP_APP - Secure State SP_SYS and SP_APP Registers
Read-only registers containing the SP_SYS and SP_APP values. Refer to Section 5, "Secure State", for details.

SS_RAR, SS_RSR - Secure State Return Address and Return Status Registers
Contain the address and status register of the sscall instruction that called secure state. Also used when returning to nonsecure state with the retss instruction. Refer to Section 5, "Secure State", for details.

MSU_ADDRHI, MSU_ADDRLO, MSU_LENGTH, MSU_CTRL, MSU_STATUS, MSU_DATA, MSU_TAIL - Memory Service Unit Registers
These registers are system register mappings of the Memory Service Unit Registers. Refer to Section 9.8, "Memory Service Unit", for details.
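
Three pieces of arithmetic from this section lend themselves to a short host-side C sketch: the register number to address mapping visible in Table 2-2 (each address is four times the register number), the ECR value derived from an EVBA offset, and the packing of the CPUCR fields listed in Table 2-3. All function and type names below are hypothetical and chosen for illustration only; on a device, the registers themselves would be accessed with the privileged mfsr and mtsr instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Table 2-2: the system register address is the register number times 4. */
uint32_t sysreg_address(uint32_t reg_number)
{
    return reg_number * 4u;
}

/* ECR: the 9 lowest bits of the EVBA offset, shifted right by 2. */
uint32_t ecr_from_evba_offset(uint32_t evba_offset)
{
    return (evba_offset & 0x1FFu) >> 2;
}

/* CPUCR fields as listed in Table 2-3 (hypothetical struct layout). */
typedef struct {
    unsigned nocompres; /* bit 17 */
    unsigned locen;     /* bit 16 */
    unsigned spl;       /* bits 15:11 */
    unsigned cpl;       /* bits 10:6 */
    unsigned cop;       /* bits 5:1 */
    unsigned sie;       /* bit 0 */
} cpucr_fields;

/* Pack the fields into a 32-bit CPUCR value; unused bits are written as 0. */
uint32_t cpucr_encode(cpucr_fields f)
{
    return ((uint32_t)(f.nocompres & 1u) << 17)
         | ((uint32_t)(f.locen & 1u) << 16)
         | ((uint32_t)(f.spl & 0x1Fu) << 11)
         | ((uint32_t)(f.cpl & 0x1Fu) << 6)
         | ((uint32_t)(f.cop & 0x1Fu) << 1)
         | (uint32_t)(f.sie & 1u);
}

/* Unpack a 32-bit CPUCR value into its fields. */
cpucr_fields cpucr_decode(uint32_t v)
{
    cpucr_fields f;
    f.nocompres = (v >> 17) & 1u;
    f.locen     = (v >> 16) & 1u;
    f.spl       = (v >> 11) & 0x1Fu;
    f.cpl       = (v >> 6) & 0x1Fu;
    f.cop       = (v >> 1) & 0x1Fu;
    f.sie       = v & 1u;
    return f;
}

/* Sanity checks against values stated in the text and tables. */
void sysreg_selfcheck(void)
{
    cpucr_fields reset = { 0u, 0u, 16u, 16u, 8u, 1u }; /* Table 2-3 reset column */
    assert(sysreg_address(64u) == 256u);          /* CONFIG0 */
    assert(ecr_from_evba_offset(0x50u) == 0x14u); /* ITLB miss example */
    assert(cpucr_decode(cpucr_encode(reset)).cop == 8u);
}
```

Packing the reset column of Table 2-3 this way gives 0x8411, a handy value to compare against when checking a freshly reset device, assuming the unused bits really read as 0.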
2.6 COMPARE and COUNT registers

The COUNT register increments once every clock cycle, regardless of pipeline stalls and flushes. The COUNT register can be both read and written. It can be used together with the COMPARE register to create a timer with a periodic interrupt. COUNT is written to zero upon reset. It is also written to zero upon compare match if the CPUCR[NOCOMPRES] bit is cleared; if the bit is set, COUNT is not reset on compare match. Incrementation of the COUNT register cannot be disabled. The COUNT register increments even while a compare interrupt is pending.

The COMPARE register holds a value that the COUNT register is compared against. The COMPARE register can be both read and written. When the COMPARE and COUNT registers match, a compare interrupt request is generated and, unless CPUCR[NOCOMPRES] is set, COUNT is reset to 0. COUNT then continues incrementing in the following clock cycle. The interrupt request is routed out to the interrupt controller, which may forward the request back to the processor as a normal interrupt request at a priority level determined by the interrupt controller. Writing a value to the COMPARE register clears any pending compare interrupt requests. The compare and exception generation feature is disabled if the COMPARE register contains the value zero. The COMPARE register is written to zero upon reset.

COUNT and COMPARE are clocked by a dedicated clock with the same frequency as the CPU clock. This allows them to operate in some of the sleep modes. They can therefore be used as timers even when the system uses sleep modes. Consult the clock system documentation for information on the sleep modes in which COUNT and COMPARE are operational.

2.7 Configuration Registers

Configuration registers are used to inform applications and operating systems about the setup and configuration of the processor on which they are running, see Figure 2-4 on page 17. AVR32UC implements the following read-only configuration registers.

Figure 2-4.
Configuration Registers

CONFIG0
Bits  | 31:24        | 23:20 | 19:16              | 15:13 | 12:10 | 9:7  | 6 | 5 | 4 | 3 | 2 | 1 | 0
Field | Processor ID | -     | Processor Revision | AT    | AR    | MMUT | F | J | P | O | S | D | R

CONFIG1
Bits  | 31:26   | 25:20   | 19:16 | 15:13 | 12:10 | 9:6  | 5:3  | 2:0
Field | IMMU SZ | DMMU SZ | ISET  | ILSZ  | IASS  | DSET | DLSZ | DASS

Table 2-4 on page 18 shows the CONFIG0 fields.

Table 2-4. CONFIG0 Fields

Name               | Bit   | Description
-------------------|-------|------------
Processor ID       | 31:24 | Specifies the type of processor. This allows the application to distinguish between different processor implementations.
RESERVED           | 23:20 | Reserved for future use.
Processor revision | 19:16 | Specifies the revision of the processor implementation.
AT                 | 15:13 | Architecture type. 0: AVR32A. 1: Unused in AVR32UC. Other: Reserved.
AR                 | 12:10 | Architecture Revision. 0: Unused in AVR32UC. 1: Revision 1. 2: Revision 2. 3: Revision 3. Other: Reserved.
MMUT               | 9:7   | MMU type. 0: None, using direct mapping and no segmentation. 1: Unused in AVR32UC. 2: Unused in AVR32UC. 3: Memory Protection Unit. Other: Reserved.
F                  | 6     | Floating-point unit implemented. 0: No FPU implemented. 1: Floating-Point Unit implemented.
J                  | 5     | Java extension implemented. 0: No Java extension implemented. 1: Unused in AVR32UC.
P                  | 4     | Performance counters implemented. 0: No Performance Counters implemented. 1: Unused in AVR32UC.
O                  | 3     | On-Chip Debug implemented. 0: No OCD implemented. 1: OCD implemented.
S                  | 2     | SIMD instructions implemented. 0: No SIMD instructions. 1: Unused in AVR32UC.
D                  | 1     | DSP instructions implemented. 0: Unused in AVR32UC. 1: DSP instructions implemented.
R                  | 0     | Memory Read-Modify-Write instructions implemented. 0: Unused in AVR32UC. 1: RMW instructions implemented.

Table 2-5 on page 19 shows the CONFIG1 fields.

Table 2-5. CONFIG1 Fields

Name    | Bit   | Description
--------|-------|------------
IMMU SZ | 31:26 | Unused in AVR32UC
DMMU SZ | 25:20 | Specifies the number of MPU entries.
ISET    | 19:16 | Unused in AVR32UC
ILSZ    | 15:13 | Unused in AVR32UC
IASS    | 12:10 | Unused in AVR32UC
DSET    | 9:6   | Unused in AVR32UC
DLSZ    | 5:3   | Unused in AVR32UC
DASS    | 2:0   | Unused in AVR32UC

3. Pipeline

3.1 Overview

AVR32UC is a pipelined processor with three pipeline stages: IF, ID and EX. All instructions are issued and complete in order. Some instructions may require several iterations through the EX stage in order to complete. The following figure shows an overview of the AVR32UC pipeline stages.

Figure 3-1. The AVR32UC pipeline stages.
[Figure: blocks shown are IF (Prefetch unit), ID (Decode unit), Regfile read, MUL (Multiply unit), ALU (ALU unit), LS (Load-store unit) and Regfile write.]

The following abbreviations are used in the figure:
• IF - Instruction Fetch
• ID - Instruction Decode
• EX - Instruction Execute
• MUL - Multiplier
• ALU - Arithmetic-Logic Unit
• LS - Load/Store Unit

3.2 Prefetch unit

The prefetch unit comprises the IF pipestage, and is responsible for feeding instructions to the decode unit. The prefetch unit fetches 32 bits at a time from the instruction memory interface and places them in a FIFO prefetch buffer. At the same time, one instruction, either RISC extended or compact, is fed to the decode stage.

3.3 Decode unit

The decode unit generates the necessary signals in order for the instruction to execute correctly. The ID stage accepts one instruction each clock cycle from the prefetch unit. This instruction is then decoded, and control signals and register file addresses are generated. If the instruction cannot be decoded, an illegal instruction or unimplemented instruction exception is issued. The ID stage also contains a state machine required for controlling multicycle instructions.

The ID stage performs the remapping of register file addresses from logical to physical addresses. This is used for remapping the stack pointer register into the SP_APP or SP_SYS registers.
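The stack pointer remapping performed by the ID stage can be illustrated with a small C model. This is a sketch only. It assumes that Application mode is encoded as SR[M2:M0] = B'000 and that it is the only mode using SP_APP, with every other mode using SP_SYS; neither assumption is stated in this section. The names phys_sp_t, MODE_APPLICATION and remap_sp are hypothetical.

```c
#include <stdint.h>

/* Physical stack pointer registers named in the text. */
typedef enum { PHYS_SP_APP, PHYS_SP_SYS } phys_sp_t;

/* Assumption: Application mode is SR[M2:M0] = B'000. */
#define MODE_APPLICATION 0u

/* Select the physical register behind the logical SP for the given
 * mode bits. Assumption: only Application mode uses SP_APP. */
phys_sp_t remap_sp(uint32_t mode_bits)
{
    return ((mode_bits & 0x7u) == MODE_APPLICATION) ? PHYS_SP_APP
                                                    : PHYS_SP_SYS;
}
```

Under these assumptions, remap_sp(0) selects SP_APP, while the exception mode value B'110 used later in this chapter selects SP_SYS.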
3.4 EX pipeline stage

The Execute (EX) pipeline stage performs register file reads, operations on registers and memory, and register file writes.

3.4.1 ALU section

The ALU pipeline performs most of the data manipulation instructions, such as arithmetical and logical operations. The ALU stage performs the following tasks:
• Target address calculation and condition check for change-of-flow instructions.
• Condition code checking for conditional instructions.
• Address calculation for memory accesses.
• Writeback address calculation for the LS pipeline.
• All flag setting for arithmetical and logical instructions.
• The saturation needed by satadd and satsub.
• The operation needed by satrnds, satrndu, sats and satu.
• Signed and unsigned division.

3.4.2 Multiply section

All multiply instructions execute in the multiply section. This section implements a 32 by 32 multiplier array, and 16x16, 32x16 and 32x32 multiplications and multiply-accumulates therefore have an issue latency of one cycle. Multiplication of 32 by 32 bits to a 64-bit result requires two iterations through the multiplier array, and therefore needs several cycles to complete. This stalls the multiply pipeline until the instruction is complete.

A special accumulator cache is implemented in the MUL section. This cache saves the multiply-accumulate result in dedicated registers in the MUL section, as well as writing it back to the register file. This allows subsequent MAC instructions to read the accumulator value from the cache, instead of from the register file. This speeds up MAC operations by one clock cycle. If a MAC instruction targets a register not found in the cache, one clock cycle is added to the MAC operation, loading the accumulator value from the register file into the cache. In the next cycle, the MAC operation is restarted automatically by hardware.
If an instruction, like an add, mul or load, is executed with a target address equal to that of a valid cached register, the instruction will update the cache. The accumulator cache can hold one doubleword accumulator value, or one word accumulator value. Hardware ensures that the accumulator cache is kept consistent. If another pipeline section writes to one of the registers kept in the accumulator cache, the cache is updated. The cache is automatically invalidated after reset.

3.4.3 Load-store section

The load-store (LS) pipeline is able to read or write one register per clock cycle. The address is calculated by the ALU section. Thereafter the address is passed on to the LS section and output to the memory interface, together with the data to write if the access is a write. If the access is a read, the read data is returned from the memory interface in the same cycle. If the read data requires typecasting or other manipulation, such as that performed by ldins or ldswp, this manipulation is performed in the same cycle.

Any load or store multiple registers instruction is decoded by the ID stage and passed on to the EX stage as a series of single load or store word operations.

The read-modify-write instructions memc, mems and memt are performed as a non-interruptible sequence of read from and write to memory. The load-store section generates the control signals required to perform this sequence. This sequence takes several clock cycles, so any following instructions requiring the use of the load-store section must stall until the sequence is finished. Following instructions that do not use the load-store section will not have to stall even if the sequence has not finished.

Some memory operations to slow memories, such as memories on the HSB bus, may require several clock cycles to perform. If required, the CPU pipeline will stall as long as necessary in order to perform the memory access.
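The effect of the read-modify-write instructions mentioned above can be sketched in C. The sketch assumes the usual reading of the mnemonics (mems sets, memc clears and memt toggles one bit of a memory word), which is defined in the instruction set reference and not in this section. The function names are hypothetical, and unlike the hardware sequence, the C version is not guaranteed to be non-interruptible.

```c
#include <stdint.h>

/* Each function reads a word, changes one bit (0..31) and writes the
 * word back. The hardware performs this read and write as a single
 * non-interruptible sequence; plain C gives no such guarantee. */

void model_mems(volatile uint32_t *word, unsigned bit)
{
    *word |= (UINT32_C(1) << bit);   /* set the bit */
}

void model_memc(volatile uint32_t *word, unsigned bit)
{
    *word &= ~(UINT32_C(1) << bit);  /* clear the bit */
}

void model_memt(volatile uint32_t *word, unsigned bit)
{
    *word ^= (UINT32_C(1) << bit);   /* toggle the bit */
}
```

The model makes clear why a following load or store must wait: the word is busy from the read until the write has completed.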
3.5 Support for unaligned addresses

All memory accesses must be performed with the correct alignment according to the data size. The only exception to this is doubleword accesses, which are performed as two word accesses, and therefore can be word-aligned. Any other unaligned memory access will cause a Data Address Exception. Instruction fetches must be halfword aligned. Any other alignment will cause an Instruction Address Exception.

3.6 Forwarding hardware and hazard detection

Since the register file is read and written in the same pipeline stage, no hazards can occur, and no forwarding is necessary. The programmer does not need to take any special considerations regarding data hazards when writing code.

3.7 Event handling

For various reasons, the CPU may be required to abort normal program execution in order to handle special, high-priority events. When handling of these events is complete, normal program execution can be resumed. Traditionally, events that are generated internally in the CPU are called exceptions, while events generated by sources external to the CPU are called interrupts. The possible sources of events are listed in Table 3-4 on page 28.

The AVR32 has a powerful event handling scheme. The different event sources, like Illegal Opcode and external interrupt requests, have different priority levels, ensuring a well-defined behaviour when multiple events are received simultaneously. Additionally, pending events of a higher priority class may preempt handling of ongoing events of a lower priority class.

When an event occurs, the execution of the instruction stream is halted, and execution control is passed to an event handler at an address specified in Table 3-4 on page 28. Most of the handlers are placed sequentially in the code space starting at the address specified by EVBA, with four bytes between each handler. This gives ample space for a jump instruction to be placed there, jumping to the event routine itself.
A few critical handlers have larger spacing between them, allowing the entire event routine to be placed directly at the address specified by the EVBA-relative offset generated by hardware. All external interrupt sources have autovectored interrupt service routine (ISR) addresses. This allows the interrupt controller to directly specify the ISR address as an address relative to EVBA. The autovector offset has 14 address bits, giving an offset range of 16384 bytes. The target address of the event handler is calculated as (EVBA | event_handler_offset), not (EVBA + event_handler_offset), so EVBA and exception code segments must be set up appropriately.

The same mechanisms are used to service all different types of events, including external interrupt requests, yielding a uniform event handling scheme.

Each pipeline stage has a pipeline register that holds the exception requests associated with the instruction in that pipeline stage. This allows the exception request to follow the contaminated instruction through the pipeline.

Exceptions are detected in two different pipeline stages. The EX stage detects all data-address related exceptions (DTLB Protection and Data Address). All other exceptions, including interrupts, are detected in the ID stage.

When an exception is detected in EX, the EX stage and all upstream stages are flushed.

Generally, all exceptions, including breakpoint, have the failing instruction as restart address. This allows a fixup exception routine to correct the error and restart the instruction. Interrupts (INT0-3, NMI) have the address of the first non-completed instruction as restart address.

3.7.1 Exceptions and interrupt requests

When an event other than scall or debug request is received by the core, the following actions are performed atomically:

1. The pending event will not be accepted if it is masked. The I3M, I2M, I1M, I0M, EM and GM bits in the Status Register are used to mask different events. Not all events can be masked. A few critical events (NMI, Unrecoverable Exception, TLB Multiple Hit and Bus Error) cannot be masked. When an event is accepted, hardware automatically sets the mask bits corresponding to all sources with equal or lower priority. This inhibits acceptance of other events of the same or lower priority, except for the critical events listed above. Software may choose to clear some or all of these bits after saving the necessary state if other priority schemes are desired. It is the event source's responsibility to ensure that its events are left pending until accepted by the CPU.

2. When a request is accepted, the Status Register and Program Counter of the current context are stored to the system stack. If the event is an INT0, INT1, INT2 or INT3, registers R8-R12 and LR are also automatically stored to stack. Storing the Status Register ensures that the core is returned to the previous execution mode when the current event handling is completed. When exceptions occur, both the EM and GM bits are set, and the application may manually enable nested exceptions if desired by clearing the appropriate bit. Each exception handler has a dedicated handler address, and this address uniquely identifies the exception source.

3. The Mode bits are set to reflect the priority of the accepted event, and the correct register file bank is selected. The address of the event handler, as shown in Table 3-4, is loaded into the Program Counter.

The execution of the event handler routine then continues from the effective address calculated.

The rete instruction signals the end of the event. When encountered, the Return Status Register and Return Address Register are popped from the system stack and restored to the Status Register and Program Counter. If the rete instruction returns from INT0, INT1, INT2 or INT3, registers R8-R12 and LR are also popped from the system stack.
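The entry and return sequence for INT0-INT3 can be followed in a small C model. It is a sketch only, following the push order and mask settings of the INTn pseudocode in Section 3.9.1 and assuming that rete pops in the reverse order of the pushes. The SR bit positions (I0M at bit 17, mode bits at 22..24), the cpu_t layout and all function names are assumptions made for illustration; masking checks before acceptance are not modelled.

```c
#include <stdint.h>

/* Assumed SR bit positions; not given in this section. */
#define SR_I0M        (UINT32_C(1) << 17)   /* I1M..I3M follow at 18..20 */
#define SR_MODE_SHIFT 22
#define SR_MODE_MASK  (UINT32_C(7) << SR_MODE_SHIFT)

typedef struct {
    uint32_t r[13];       /* R0-R12 */
    uint32_t lr, pc, sr;
    uint32_t evba;
    uint32_t stack[64];   /* toy system stack, grows downwards */
    unsigned sp;          /* index of last pushed word, 64 = empty */
} cpu_t;

void stack_push(cpu_t *c, uint32_t v) { c->stack[--c->sp] = v; }
uint32_t stack_pop(cpu_t *c)          { return c->stack[c->sp++]; }

/* Accept interrupt INT<level>, level 0..3. */
void accept_int(cpu_t *c, unsigned level, uint32_t restart_pc,
                uint32_t vector_offset)
{
    for (int i = 8; i <= 12; i++) stack_push(c, c->r[i]);
    stack_push(c, c->lr);
    stack_push(c, restart_pc);     /* first non-completed instruction */
    stack_push(c, c->sr);
    /* Mode bits: INT0 = B'010 ... INT3 = B'101. */
    c->sr = (c->sr & ~SR_MODE_MASK)
          | ((uint32_t)(level + 2u) << SR_MODE_SHIFT);
    /* Mask this level and all lower interrupt levels. */
    for (unsigned l = 0; l <= level; l++) c->sr |= SR_I0M << l;
    c->pc = c->evba | vector_offset;   /* OR, not ADD */
}

/* Return from an INT0-INT3 handler. */
void rete_from_int(cpu_t *c)
{
    c->sr = stack_pop(c);
    c->pc = stack_pop(c);
    c->lr = stack_pop(c);
    for (int i = 12; i >= 8; i--) c->r[i] = stack_pop(c);
}
```

Each accepted interrupt uses eight words of system stack in this model, which is the figure to use when sizing the system stack for nested interrupts.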
The restored Status Register contains information allowing the core to resume operation in the previous execution mode. This concludes the event handling.

Note that event priorities are only used to determine which event handler to call first when multiple events are received simultaneously. Once control is passed on to the event handler, handling of pending and lower priority events may be initiated if not masked. For instance, it is possible to make a supervisor call (SCALL) from an interrupt level 0 handler, even though the priority of a supervisor call event is lower than the active interrupt level 0 event.

3.7.2 Supervisor calls

The AVR32 instruction set provides a supervisor mode call instruction. The scall instruction is designed so that privileged routines can be called from any context. This facilitates sharing of code between different execution modes. The scall mechanism is designed so that a minimal execution cycle overhead is experienced when performing supervisor routine calls from time-critical event handlers.

The scall instruction behaves differently depending on which mode it is called from. The behaviour is detailed in the instruction set reference. In order to allow the scall routine to return to the correct context, a return from supervisor call instruction, rets, is implemented. In the AVR32A microarchitecture, scall and rets use the system stack to store the return address and the status register.

3.7.3 Debug requests

The AVR32 architecture defines a dedicated debug mode. When a debug request is received by the core, Debug mode is entered. Entry into Debug mode can be masked by the DM bit in the status register. Upon entry into Debug mode, hardware sets the SR[D] bit and jumps to the Debug Exception handler. By default, debug mode executes in the exception context, but with dedicated Return Address Register and Return Status Register.
These dedicated registers remove the need for storing this data to the system stack, thereby improving debuggability. Debug mode is exited by executing the retd instruction. This returns to the previous context.

3.8 Special concerns

3.8.1 System stack

Event handling in AVR32UC, like in all AVR32A architectures, uses the system stack pointed to by the system stack pointer, SP_SYS, for pushing and popping R8-R12, LR, status register and return address. Since exception code may be timing-critical, SP_SYS should point to memory addresses in the IRAM section, since the timing of accesses to this memory section is both fast and deterministic.

The user must also make sure that the system stack is large enough so that any event is able to push the required registers to stack. If the system stack is full, and an event occurs, the system will enter an UNDEFINED state.

3.8.2 Clearing of pending interrupt requests

When an interrupt request is accepted by the CPU, the interrupt handler will eventually be called. The interrupt handler is responsible for performing the required actions so that the requesting module disasserts the interrupt request before the interrupt routine is exited with rete. Failing to do so will cause the interrupt handler to be re-entered after the rete instruction has been executed, since the interrupt request is still active. Different interrupt sources have different ways of disasserting requests, for example reading an interrupt cause register or writing to specific control registers. Refer to the module-specific documentation for information on how to disassert interrupt requests.

Disasserting an interrupt request often requires that a bus access is performed to the requesting module. An example of such an access is to read an interrupt cause register. There will be a latency between the execution of the load or store instruction that is to disassert the interrupt request and the actual disassertion of the request.
This latency can be caused by the bus system and internal latencies in the interrupting module. It is important that the programmer makes sure that the interrupt request has actually been disasserted before returning from the interrupt with rete.

This can usually be ensured by scheduling the code sequence disasserting the interrupt request in such a way that one can be certain that the interrupt request has actually been disasserted before the rete instruction is executed.

Code 3-1. Clearing IRQs using code scheduling

    // Using scheduling of instructions in the IRQ handler to make sure that the
    // request has been disasserted before returning from the handler.
    // Assume that the IRQ is cleared by reading PERIPH_INTCAUSE, r0 points to
    // this register.
    irq_handler:
        ld.w r12, r0[0]     // Clear the IRQ
        rete

The mechanisms and timing required for disasserting an interrupt request from a module are specific to different modules. Usually, the request is disasserted within a few clock cycles after the load or store instruction has been received by the module. In this case, a simple way of making sure that the request has actually been disasserted is to use a data memory barrier (see "Data memory barriers" on page 64). The DMB will block the CPU pipeline until the interrupt request has been disasserted. At this point, the rete instruction can safely be executed.

Code 3-2. Clearing IRQs using data memory barriers

    // Using data memory barriers in the IRQ handler to make sure that the
    // request has been disasserted before returning from the handler.
    // Assume that the IRQ is cleared by writing a bitmask to PERIPH_INTCLEAR.
    // r0 points to this register, r1 contains the correct bitmask.
    irq_handler:
        st.w r0[0], r1
        ld.w r12, r0[0]     // data memory barrier
        rete

The programmer should consult the data sheets for the different peripheral modules to check if special timings or concerns related to disasserting of interrupt requests apply to the specific module.
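The store-then-load pattern of Code 3-2 can also be written in C. The following is a minimal sketch, assuming the peripheral exposes a write-to-clear register that can be read back; the function and parameter names are hypothetical. The volatile qualifier makes the compiler emit both accesses in order, but whether the generated load acts as a data memory barrier on a given device should be checked against the device documentation.

```c
#include <stdint.h>

/* Clear an interrupt request and read the register back.
 * intclear: pointer to a hypothetical write-to-clear register.
 * mask:     bitmask selecting the request(s) to clear. */
uint32_t clear_irq_with_readback(volatile uint32_t *intclear, uint32_t mask)
{
    *intclear = mask;    /* corresponds to st.w r0[0], r1 */
    return *intclear;    /* corresponds to ld.w r12, r0[0] */
}
```

An interrupt handler written in C would call this function as its last action before returning.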
3.8.3 Masking interrupt requests in peripheral modules

Handling an interrupt request involves several operations, such as pushing registers to stack, and takes several clock cycles. The required operations are controlled by sequencing logic in hardware. This sequencing hardware does not permit an asserted interrupt request to be disasserted while it is in the process of handling this interrupt request.

Hardware makes sure that manipulation of the GM and IxM bits in SREG can be performed safely at all times using the mtsr, csrf and ssrf instructions. The programmer does not need to take any special precautions when issuing one of these instructions.

All hardware connected to the CPU is implemented in such a way that once an interrupt request is asserted by the hardware, it can only be disasserted by explicit actions by the programmer.

Many peripheral modules that are able to assert interrupt requests have control registers or other means of masking one or more of their interrupt requests. For example, a USART can contain an interrupt mask register with individual bits for masking "TX ready" and "RX ready" interrupts. Writing to such a mask register may cause a pending interrupt request from that module to be disasserted.

The programmer must at all times make sure that an action that will disassert interrupts at the interrupt source is not performed if it is possible that the interrupt sequencing hardware is in the process of handling the interrupt request that will be disasserted by the action. It is safe to perform such an action if one of the following is true:
• The SREG GM or IxM bit corresponding to the priority of the interrupt request to be masked is set before the action is performed.
• It can be guaranteed that the interrupt request being masked by the action is disasserted when the action is initiated and being performed.

Code 3-3.
Masking IRQs in a peripheral module which may assert an IRQ at any time

    // Masking TX_READY IRQ in a peripheral by setting the TXMASK bit in the
    // IRQMASK register of the peripheral.
    // Could alternatively mask the SREG IxM bit associated with the IRQ source
    disassert_periph_tx_irq:
        ssrf AVR32_SREG_GM
        mems PERIPH_IRQMASK, PERIPH_TXMASK
        csrf AVR32_SREG_GM

If the interrupt request is disasserted during the critical clock cycles where the sequencing hardware is active handling this interrupt request, the CPU may enter an UNPREDICTABLE state.

3.9 Entry points for events

Several different event handler entry points exist. In AVR32UC, the reset address is 0x8000_0000. This places the reset address in the boot flash memory area.

TLB miss exceptions and scall have a dedicated space relative to EVBA where their event handler can be placed. This speeds up execution by removing the need for a jump instruction placed at the program address jumped to by the event hardware. All other exceptions have a dedicated event routine entry point located relative to EVBA. The handler routine address identifies the exception source directly.

AVR32UC uses the ITLB and DTLB protection exceptions to signal an MPU protection violation. ITLB and DTLB miss exceptions are used to signal that an access address did not map to any of the entries in the MPU. The TLB multiple hit exception indicates that an access address mapped to multiple TLB entries, signalling an error.

All external interrupt requests have entry points located at an offset relative to EVBA. This autovector offset is specified by an external Interrupt Controller. The programmer must make sure that none of the autovector offsets interfere with the placement of other code. The autovector offset has 14 address bits, giving an offset range of 16384 bytes.

Special considerations should be made when loading EVBA with a pointer.
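One such consideration follows from the way the handler address is formed, (EVBA | offset) and not (EVBA + offset). The C sketch below shows the calculation and a conservative check. The function names are hypothetical, and the check is a sufficient condition only: an EVBA with its 14 low bits cleared can never collide with an autovector offset, but other EVBA values may also work if the offsets in use do not overlap the bits set in EVBA.

```c
#include <stdbool.h>
#include <stdint.h>

#define AUTOVECTOR_BITS 14u
#define AUTOVECTOR_MASK ((UINT32_C(1) << AUTOVECTOR_BITS) - 1u)

/* Handler address as formed by hardware: bitwise OR, not addition. */
uint32_t handler_address(uint32_t evba, uint32_t offset)
{
    return evba | offset;
}

/* True if OR and ADD give the same result for every 14-bit offset,
 * which holds when the 14 low bits of EVBA are zero. */
bool evba_safe_for_any_autovector(uint32_t evba)
{
    return (evba & AUTOVECTOR_MASK) == 0u;
}
```

For example, with EVBA = 0x8000_4100 and an offset of 0x100, the OR yields 0x8000_4100 instead of 0x8000_4200, so the interrupt handler address coincides with EVBA itself.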
Due to security considerations, the event handlers should be located in non-writeable flash memory, or optionally in a privileged memory protection region if an MPU is present.

If several events occur on the same instruction, they are handled in a prioritized way. The priority ordering is presented in Table 3-4. If events occur on several instructions at different locations in the pipeline, the events on the oldest instruction are always handled before any events on any younger instruction, even if the younger instruction has events of higher priority than the oldest instruction. An instruction B is younger than an instruction A if it was sent down the pipeline later than A.

The addresses and priority of simultaneous events are shown in Table 3-4 on page 28. Some of the exceptions are unused in AVR32UC since it has no MMU or coprocessor interface.

The interrupt system requires that an interrupt controller is present outside the core in order to prioritize requests and generate a correct offset if more than one interrupt source exists for each priority level. Generating different offsets depending on the interrupt request source is referred to as autovectoring. Note that the interrupt controller should generate autovector addresses that do not conflict with addresses in use by other events or regular program code.

The addresses of the interrupt routines are calculated by OR-ing the address on the autovector offset bus with the value of the Exception Vector Base Address (EVBA). The INT0, INT1, INT2, INT3, and NMI signals indicate the priority of the pending interrupt. INT0 has the lowest priority, and NMI the highest priority of the interrupts.

Table 3-4.
Priority and handler addresses for events

Priority | Handler Address        | Name                        | Event source   | Stored Return Address
---------|------------------------|-----------------------------|----------------|----------------------
1        | 0x8000_0000            | Reset                       | External input | Undefined
2        | Provided by OCD system | OCD Stop CPU                | OCD system     | First non-completed instruction
3        | EVBA+0x00              | Unrecoverable exception     | Internal       | PC of offending instruction
4        | EVBA+0x04              | TLB multiple hit            | MPU            | PC of offending instruction
5        | EVBA+0x08              | Bus error data fetch        | Data bus       | First non-completed instruction
6        | EVBA+0x0C              | Bus error instruction fetch | Data bus       | First non-completed instruction
7        | EVBA+0x10              | NMI                         | External input | First non-completed instruction
8        | Autovectored           | Interrupt 3 request         | External input | First non-completed instruction
9        | Autovectored           | Interrupt 2 request         | External input | First non-completed instruction
10       | Autovectored           | Interrupt 1 request         | External input | First non-completed instruction
11       | Autovectored           | Interrupt 0 request         | External input | First non-completed instruction
12       | EVBA+0x14              | Instruction Address         | CPU            | PC of offending instruction
13       | EVBA+0x50              | ITLB Miss                   | MPU            | PC of offending instruction
14       | EVBA+0x18              | ITLB Protection             | MPU            | PC of offending instruction
15       | EVBA+0x1C              | Breakpoint                  | OCD system     | First non-completed instruction
16       | EVBA+0x20              | Illegal Opcode              | Instruction    | PC of offending instruction
17       | EVBA+0x24              | Unimplemented instruction   | Instruction    | PC of offending instruction
18       | EVBA+0x28              | Privilege violation         | Instruction    | PC of offending instruction
19       | EVBA+0x2C              | Floating-point              | UNUSED         |
20       | EVBA+0x30              | Coprocessor absent          | Coprocessor    | PC of offending instruction
21       | EVBA+0x100             | Supervisor call             | Instruction    | PC(Supervisor Call) +2
22       | EVBA+0x34              | Data Address (Read)         | CPU            | PC of offending instruction
23       | EVBA+0x38              | Data Address (Write)        | CPU            | PC of offending instruction
24       | EVBA+0x60              | DTLB Miss (Read)            | MPU            | PC of offending instruction
25       | EVBA+0x70              | DTLB Miss (Write)           | MPU            | PC of offending instruction
26       | EVBA+0x3C              | DTLB Protection (Read)      | MPU            | PC of offending instruction
27       | EVBA+0x40              | DTLB Protection (Write)     | MPU            | PC of offending instruction
28       | EVBA+0x44              | DTLB Modified               | UNUSED         |

3.9.1 Description of events

3.9.1.1 Reset Exception

The Reset exception is generated when the reset input line to the CPU is asserted. The Reset exception cannot be masked by any bit. The Reset exception resets all synchronous elements and registers in the CPU pipeline to their default value, and starts execution of instructions at address 0x8000_0000.

    SR = reset_value_of_SREG;
    PC = 0x8000_0000;

All other system registers are reset to their reset value, which may or may not be defined. Refer to the Programming Model chapter for details.

3.9.1.2 OCD Stop CPU Exception

The OCD Stop CPU exception is generated when the OCD Stop CPU input line to the CPU is asserted. The OCD Stop CPU exception cannot be masked by any bit. This exception is identical to a non-maskable, high priority breakpoint. Any subsequent operation is controlled by the OCD hardware. The OCD hardware will take control over the CPU and start to feed instructions directly into the pipeline.

    RSR_DBG = SR;
    RAR_DBG = PC;
    SR[M2:M0] = B'110;
    SR[D] = 1;
    SR[DM] = 1;
    SR[EM] = 1;
    SR[GM] = 1;

3.9.1.3 Unrecoverable Exception

The Unrecoverable Exception is generated when an exception request is issued while the Exception Mask (EM) bit in the status register is set. The Unrecoverable Exception cannot be masked by any bit. The Unrecoverable Exception is generated when a condition has occurred that the hardware cannot handle. The system will in most cases have to be restarted if this condition occurs.

    *(--SPSYS) = PC of offending instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B'110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x00;

3.9.1.4 TLB Multiple Hit Exception

The TLB Multiple Hit Exception is generated when an access hits in multiple MPU regions. This is usually caused by a programming error. Used only if an MPU is present.
    *(--SPSYS) = PC of offending instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B'110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x04;

3.9.1.5 Bus Error Exception on Data Access

The Bus Error on Data Access exception is generated when the data bus detects an error condition. This exception is caused by events unrelated to the instruction stream, or by data written to the cache write-buffers many cycles ago. Therefore, execution cannot be resumed in a safe way after this exception. The return address placed on stack is unrelated to the operation that caused the exception. The exception handler is responsible for performing the appropriate action.

    *(--SPSYS) = PC of first non-issued instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B'110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x08;
    BEAR = failing address;

3.9.1.6 Bus Error Exception on Instruction Fetch

The Bus Error on Instruction Fetch exception is generated when the data bus detects an error condition. This exception is caused by events related to the instruction stream. Therefore, execution can be restarted in a safe way after this exception, assuming that the condition that caused the bus error is dealt with.

    *(--SPSYS) = PC of first non-issued instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B'110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x0C;

3.9.1.7 NMI Exception

The NMI exception is generated when the NMI input line to the core is asserted. The NMI exception cannot be masked by the SR[GM] bit. However, the core ignores the NMI input line when processing an NMI Exception (the SR[M2:M0] bits are B'111). This guarantees serial execution of NMI Exceptions, and simplifies the NMI hardware and software mechanisms.

Since the NMI exception is unrelated to the instruction stream, the instructions in the pipeline are allowed to complete. After finishing the NMI exception routine, execution should continue at the instruction following the last completed instruction in the instruction stream.
    *(--SPSYS) = PC of first noncompleted instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’111;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x10;

3.9.1.8 INT3 Exception

The INT3 exception is generated when the INT3 input line to the core is asserted. The INT3 exception can be masked by the SR[GM] bit, and the SR[I3M] bit. Hardware automatically sets the SR[I3M] bit when accepting an INT3 exception, inhibiting new INT3 requests when processing an INT3 request.

The INT3 Exception handler address is calculated by adding EVBA to an interrupt vector offset specified by an interrupt controller outside the core. The interrupt controller is responsible for providing the correct offset.

Since the INT3 exception is unrelated to the instruction stream, the instructions in the pipeline are allowed to complete. After finishing the INT3 exception routine, execution should continue at the instruction following the last completed instruction in the instruction stream.

    *(--SPSYS) = R8;
    *(--SPSYS) = R9;
    *(--SPSYS) = R10;
    *(--SPSYS) = R11;
    *(--SPSYS) = R12;
    *(--SPSYS) = LR;
    *(--SPSYS) = PC of first noncompleted instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’101;
    SR[I3M] = 1;
    SR[I2M] = 1;
    SR[I1M] = 1;
    SR[I0M] = 1;
    PC = EVBA | INTERRUPT_VECTOR_OFFSET;

3.9.1.9 INT2 Exception

The INT2 exception is generated when the INT2 input line to the core is asserted. The INT2 exception can be masked by the SR[GM] bit, and the SR[I2M] bit. Hardware automatically sets the SR[I2M] bit when accepting an INT2 exception, inhibiting new INT2 requests when processing an INT2 request.

The INT2 Exception handler address is calculated by adding EVBA to an interrupt vector offset specified by an interrupt controller outside the core. The interrupt controller is responsible for providing the correct offset.

Since the INT2 exception is unrelated to the instruction stream, the instructions in the pipeline are allowed to complete.
After finishing the INT2 exception routine, execution should continue at the instruction following the last completed instruction in the instruction stream.

    *(--SPSYS) = R8;
    *(--SPSYS) = R9;
    *(--SPSYS) = R10;
    *(--SPSYS) = R11;
    *(--SPSYS) = R12;
    *(--SPSYS) = LR;
    *(--SPSYS) = PC of first noncompleted instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’100;
    SR[I2M] = 1;
    SR[I1M] = 1;
    SR[I0M] = 1;
    PC = EVBA | INTERRUPT_VECTOR_OFFSET;

3.9.1.10 INT1 Exception

The INT1 exception is generated when the INT1 input line to the core is asserted. The INT1 exception can be masked by the SR[GM] bit, and the SR[I1M] bit. Hardware automatically sets the SR[I1M] bit when accepting an INT1 exception, inhibiting new INT1 requests when processing an INT1 request.

The INT1 Exception handler address is calculated by adding EVBA to an interrupt vector offset specified by an interrupt controller outside the core. The interrupt controller is responsible for providing the correct offset.

Since the INT1 exception is unrelated to the instruction stream, the instructions in the pipeline are allowed to complete. After finishing the INT1 exception routine, execution should continue at the instruction following the last completed instruction in the instruction stream.

    *(--SPSYS) = R8;
    *(--SPSYS) = R9;
    *(--SPSYS) = R10;
    *(--SPSYS) = R11;
    *(--SPSYS) = R12;
    *(--SPSYS) = LR;
    *(--SPSYS) = PC of first noncompleted instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’011;
    SR[I1M] = 1;
    SR[I0M] = 1;
    PC = EVBA | INTERRUPT_VECTOR_OFFSET;

3.9.1.11 INT0 Exception

The INT0 exception is generated when the INT0 input line to the core is asserted. The INT0 exception can be masked by the SR[GM] bit, and the SR[I0M] bit. Hardware automatically sets the SR[I0M] bit when accepting an INT0 exception, inhibiting new INT0 requests when processing an INT0 request.

The INT0 Exception handler address is calculated by adding EVBA to an interrupt vector offset specified by an interrupt controller outside the core.
The interrupt controller is responsible for providing the correct offset.

Since the INT0 exception is unrelated to the instruction stream, the instructions in the pipeline are allowed to complete. After finishing the INT0 exception routine, execution should continue at the instruction following the last completed instruction in the instruction stream.

    *(--SPSYS) = R8;
    *(--SPSYS) = R9;
    *(--SPSYS) = R10;
    *(--SPSYS) = R11;
    *(--SPSYS) = R12;
    *(--SPSYS) = LR;
    *(--SPSYS) = PC of first noncompleted instruction;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’010;
    SR[I0M] = 1;
    PC = EVBA | INTERRUPT_VECTOR_OFFSET;

3.9.1.12 Instruction Address Exception

The Instruction Address Error exception is generated if the generated instruction memory address has an illegal alignment.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x14;

3.9.1.13 ITLB Miss Exception

The ITLB Miss exception is generated when the MPU is enabled and the instruction memory access does not hit in any regions. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x50;

3.9.1.14 ITLB Protection Exception

The ITLB Protection exception is generated when the instruction memory access violates the access rights specified by the protection region in which the address lies. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x18;

3.9.1.15 Breakpoint Exception

The Breakpoint exception is issued when the OCD breakpoint input line to the CPU is asserted, and SR[DM] is cleared. When entering the exception routine, RAR_DBG points to the breakpoint instruction, and the CPU will enter Debug mode. An external debugger can optionally assume control of the CPU when the Breakpoint Exception is executed. The debugger can then issue individual instructions to be executed in Debug mode. Debug mode is exited with the retd instruction.
This passes control from the debugger back to the CPU, resuming normal execution.

    RSR_DBG = SR;
    RAR_DBG = PC;
    SR[M2:M0] = B’110;
    SR[D] = 1;
    SR[DM] = 1;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x1C;

3.9.1.16 Illegal Opcode

This exception is issued when the core fetches an unknown instruction, or when a coprocessor instruction is not acknowledged. When entering the exception routine, the return address on stack points to the instruction that caused the exception.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x20;

3.9.1.17 Unimplemented Instruction

This exception is issued when the core fetches an instruction supported by the instruction set but not by the current implementation. This allows software implementations of unimplemented instructions. When entering the exception routine, the return address on stack points to the instruction that caused the exception.

Table 3-5. List of unimplemented instructions.

Unimplemented instructions | Comment
All SIMD instructions | No SIMD implemented
Coprocessor instructions addressing unimplemented coprocessors |
cache - perform cache operation | No cache implemented
incjosp - increment Java stack pointer | No Java implemented
popjc - pop Java context | No Java implemented
pushjc - push Java context | No Java implemented
retj - return from Java mode | No Java implemented
tlbr - read addressed TLB entry into TLBEHI and TLBELO | No MMU present
tlbw - write TLB entry registers into TLB | No MMU present
tlbs - search TLB for entry matching TLBEHI[VPN] | No MMU present

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x24;

3.9.1.18 Data Read Address Exception

The Data Read Address Error exception is generated if the address of a data memory read has an illegal alignment.
    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x34;

3.9.1.19 Data Write Address Exception

The Data Write Address Error exception is generated if the address of a data memory write has an illegal alignment.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x38;

3.9.1.20 DTLB Read Miss Exception

The DTLB Read Miss exception is generated when the MPU is enabled and the data memory read access does not hit in any regions. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x60;

3.9.1.21 DTLB Write Miss Exception

The DTLB Write Miss exception is generated when the MPU is enabled and the data memory write access does not hit in any regions. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x70;

3.9.1.22 DTLB Read Protection Exception

The DTLB Protection exception is generated when the data memory read violates the access rights specified by the protection region in which the address lies. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x3C;

3.9.1.23 DTLB Write Protection Exception

The DTLB Protection exception is generated when the data memory write violates the access rights specified by the protection region in which the address lies. Used only if an MPU is present.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x40;

3.9.1.24 Privilege Violation Exception

If the application tries to execute privileged instructions, this exception is issued. The complete list of privileged instructions is shown in Table 3-6. When entering the exception routine, the address of the instruction that caused the exception is stacked as the return address.
    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x28;

Table 3-6. List of instructions which can only execute in privileged modes.

Privileged instructions | Comment
csrf - clear status register flag | Privileged only when accessing upper half of status register
mtsr - move to system register |
mfsr - move from system register |
mtdr - move to debug register |
mfdr - move from debug register |
rete - return from exception |
rets - return from supervisor call |
retd - return from debug mode |
sleep - sleep |
ssrf - set status register flag | Privileged only when accessing upper half of status register

3.9.1.25 DTLB Modified Exception

Unused in AVR32UC, since it has no MMU.

3.9.1.26 Floating-point Exception

Unused in AVR32UC.

3.9.1.27 Coprocessor Absent Exception

The Coprocessor Absent exception is generated when a nonexisting coprocessor is addressed by a coprocessor instruction. Used only if one or more coprocessors are present. Executing coprocessor instructions in systems with no coprocessors results in an Unimplemented Instruction exception instead.

    *(--SPSYS) = PC;
    *(--SPSYS) = SR;
    SR[M2:M0] = B’110;
    SR[EM] = 1;
    SR[GM] = 1;
    PC = EVBA | 0x30;

3.9.1.28 Supervisor call

Supervisor calls are signalled by the application code executing a supervisor call (scall) instruction. The scall instruction behaves differently depending on which context it is called from. This allows scall to be called from contexts other than Application. When the exception routine is finished, execution continues at the instruction following scall. The rets instruction is used to return from supervisor calls.
    If ( SR[M2:M0] == {B’000 or B’001} )
        *(--SPSYS) = PC;
        *(--SPSYS) = SR;
        PC ← EVBA | 0x100;
        SR[M2:M0] ← B’001;
    else
        LR ← PC + 2;
        PC ← EVBA | 0x100;

3.10 Interrupt latencies

The following features in AVR32UC ensure low and deterministic interrupt latency:

• Four different interrupt levels and an NMI ensure that the user can efficiently prioritize the interrupt sources.
• Long-running instructions such as ldm, stm, pushm, popm, divs and divu will be aborted if an interrupt request is received. The slowest instruction that cannot be aborted by a pending interrupt has a worst case issue latency of 5 cycles. This implies that an interrupt request will need to wait at most 5 cycles for an instruction to complete. The fastest instructions need only a single cycle to complete.
• Interrupts are autovectored, allowing the CPU to jump directly to the interrupt handler.
• When an interrupt of level m is received, the CPU will start stacking register file registers, return address and status register. After this stacking is performed, the CPU will jump to the autovector address of the interrupt of level m. If an interrupt of level n, where n > m, is received during this stacking, the CPU will jump to the autovector address of the interrupt of level n, NOT the autovector address of the original interrupt.

Note that the overall system latency, from the time an interrupt request is signaled until the request is handled, depends on a number of things in addition to the latency through the CPU. The latency through the interrupt controller will affect interrupt latency for all peripheral interrupt requests, and the bus matrix, code and data memories will affect overall responsiveness.

3.10.1 Maximum interrupt latency

The maximum CPU interrupt latency can be calculated as follows:

Table 3-7. Maximum interrupt latency

Source | Delay (cycles)
Wait for the slowest instruction to complete | 6
Stack register file registers, return address and status register, and jump to autovector target | 10
Wait for autovector target instruction to be fetched | 1
TOTAL | 17

3.10.2 Minimum interrupt latency

The minimum CPU interrupt latency of an interrupt request of level m will occur when the CPU is in the process of stacking the registers and return address associated with an interrupt request of level n, where n < m. If the level m interrupt request arrives just as the CPU is about to jump to the autovector address for the interrupt of level n, the CPU will jump directly to the autovector address of the latest arriving interrupt. In this case, the minimum interrupt latency is as follows:

Table 3-8. Minimum interrupt latency - higher priority interrupt preempts lower priority interrupt

Source | Delay (cycles)
Jump to autovector target | 1
Wait for autovector target instruction to be fetched | 1
TOTAL | 2

Assuming that the interrupt request arrives when the CPU is in the process of executing program code, the minimum interrupt latency can be calculated as follows:

Table 3-9. Minimum interrupt latency - interrupt received when executing program code

Source | Delay (cycles)
Wait for the fastest instruction to complete | 1
Stack register file registers, return address and status register, and jump to autovector target | 10
Wait for autovector target instruction to be fetched | 1
TOTAL | 12

3.11 NMI latency

Non-maskable interrupts (NMI) behave similarly to interrupts, except that they do not automatically push register file registers on the stack. NMI can, similar to interrupts, abort long-running instructions.

The maximum NMI latency can be calculated as follows:

Table 3-10. Maximum NMI latency

Source | Delay (cycles)
Wait for the slowest instruction to complete | 6
Stack return address and status register, and jump to autovector target | 4
Wait for autovector target instruction to be fetched | 1
TOTAL | 11

4. Floating Point Hardware

Newer versions of the UC3 CPU introduced optional floating-point hardware performing 32-bit floating-point operations. Instructions controlling this hardware are mapped into the coprocessor instruction space, addressed as coprocessor 0. The CONFIG0 system register F bit indicates if floating-point hardware is present on a specific AVR32 device.

The floating point hardware reads operands and places results in the same register file as the traditional AVR32 instructions. Floating-point compare updates the flags in the AVR32 Status Register, so that the regular AVR32 branch instructions can be used directly after a floating-point compare.

The floating-point hardware consists of a fused multiply-accumulate unit, performing ±A ± (X × Y) as a single operation with no intermediate rounding, thereby resulting in greater precision than if separate multiplication and addition had been performed. Hardware is also provided to convert between integer and floating-point, to compare floating-point values, and to provide initial approximations for reciprocal and reciprocal square root.

4.1 Compliance

The floating point hardware conforms to the requirements of the C standard, which is based on the IEEE 754 floating point standard. The round-to-nearest, ties to even rounding mode is used for all instructions except float-to-integer conversions. Float-to-integer conversions use the round-to-zero mode. The hardware supports denormal numbers. Signalling NaNs are not provided; all NaNs are non-signalling (quiet). NaNs are not propagated; the default quiet NaN (0x7FC00000) is always returned. No floating-point exceptions are generated.
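The bit-level statements in the compliance section can be checked on any host whose C `float` is an IEEE 754 single. The following is a minimal sketch under that assumption, not AVR32-specific code: the macro name `AVR32_DEFAULT_QNAN` and all helper names are hypothetical, chosen only for illustration. It inspects the default quiet NaN pattern quoted above, recognizes denormals, and relies on the fact that a C float-to-integer cast truncates toward zero, which is the same rounding direction stated for float-to-integer conversions.

```c
#include <stdint.h>
#include <string.h>

/* Default quiet NaN pattern quoted in section 4.1 (macro name is hypothetical). */
#define AVR32_DEFAULT_QNAN 0x7FC00000u

/* Copy the raw bits of a float into a 32-bit word (avoids aliasing issues). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Copy a 32-bit word into a float. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* IEEE 754 single: 1 sign bit, 8 exponent bits, 23 fraction bits. */
static int is_nan(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) == 0xFFu && (bits & 0x007FFFFFu) != 0;
}

/* A quiet NaN has the most significant fraction bit set. */
static int is_quiet_nan(uint32_t bits)
{
    return is_nan(bits) && (bits & 0x00400000u) != 0;
}

/* A denormal has a zero exponent field and a non-zero fraction. */
static int is_denormal(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) == 0 && (bits & 0x007FFFFFu) != 0;
}

/* The C cast discards the fractional part, i.e. it rounds toward zero. */
static int32_t float_to_int_rtz(float f)
{
    return (int32_t)f;
}
```

For example, `is_quiet_nan(AVR32_DEFAULT_QNAN)` holds, and `float_to_int_rtz(-2.75f)` yields -2 rather than -3.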
4.2 Operations

The floating-point instructions are mapped into the coprocessor instruction space, but use the ordinary integer register file. The ordinary integer instructions such as memory accesses and logical operations can therefore be used on the same register data as the floating point hardware uses. Therefore, no special floating-point data transfer instructions are required.

All floating point instructions are mapped to coprocessor 0 cop instructions, i.e. they are aliases for cop instructions. Attempting to execute instructions on any other coprocessor than coprocessor 0 will result in a coprocessor absent exception. Attempting to execute coprocessor 0 instructions other than cop on a device with floating point hardware will result in an unimplemented instruction exception. Attempting to execute coprocessor 0 cop instructions on a device without floating point hardware will result in an unimplemented instruction exception.

4.2.1 Floating point compare (fcp.s)

The floating point compare instruction, fcp.s, updates the status register flags. Ordinary AVR32 branch instructions such as breq and conditional instructions such as retge and movls can use the condition flags set by fcmp.s directly. The following mapping from floating point compare results to AVR32 status register flags is used:

Table 4-1. Floating point compare flag setting

Compare result | SREG[C] | SREG[N] | SREG[V] | SREG[Z]
Less | 1 | 1 | 0 | 0
Greater | 0 | 0 | 0 | 0
Equal | 0 | 0 | 0 | 1
Unordered | 0 | 0 | 1 | 0

Table 4-2. Floating point branch conditions

Branch if: | AVR32 Branch condition mnemonic
Equal | eq
Not Equal | ne
Greater than or equal | ge
Greater than | gt
Less than | lo
Less than or equal | ls
Unordered | vs

4.2.2 Floating point check (fchk.s)

This instruction checks the operand for special values, such as Not-a-Number (NaN), infinity (inf) and denormal.
Status register flags are set according to the result of the fchk.s instruction. This instruction is useful since some algorithms require special treatment of these special values. The floating point approximation instructions update the status register flags in the same way as fchk.s, since iterative approximation algorithms require special handling of these special values. Ordinary AVR32 branch instructions such as breq and conditional instructions such as retge and movls can use the condition flags set by fchk.s directly.

4.3 Instruction set

The following instructions are provided:

Table 4-3. Floating point arithmetical instructions

Mnemonics | Operands | Description | Issue latency
fmac.s | Rd, Ra, Rx, Ry | Multiply accumulate. (Rd ← Ra + Rx*Ry) | 2
fnmac.s | Rd, Ra, Rx, Ry | Multiply accumulate. (Rd ← −Ra + Rx*Ry) | 2
fmsc.s | Rd, Ra, Rx, Ry | Multiply subtract. (Rd ← Ra − Rx*Ry) | 2
fnmsc.s | Rd, Ra, Rx, Ry | Multiply subtract. (Rd ← −Ra − Rx*Ry) | 2
fmul.s | Rd, Rx, Ry | Multiply. (Rd ← Rx*Ry) | 2
fnmul.s | Rd, Rx, Ry | Multiply. (Rd ← −Rx*Ry) | 2
fadd.s | Rd, Rx, Ry | Add. (Rd ← Rx + Ry) | 2
fsub.s | Rd, Rx, Ry | Subtract. (Rd ← Rx − Ry) | 2

Table 4-4. Floating point conversion instructions

Mnemonics | Operands | Description | Issue latency
fcastrs.sw | Rd, Ry | Convert float to signed word, round-to-zero. (Rd ← (signed int)Ry) | 1
fcastrs.uw | Rd, Ry | Convert float to unsigned word, round-to-zero. (Rd ← (unsigned int)Ry) | 1
fcastsw.s | Rd, Ry | Convert signed word to float, round-to-nearest. (Rd ← (float)Ry) | 1
fcastuw.s | Rd, Ry | Convert unsigned word to float, round-to-nearest. (Rd ← (float)Ry) | 1

Table 4-5. Floating point compare instructions

Mnemonics | Operands | Description | Issue latency
fcp.s | Rd, Rx | Compare floating point values in Rd and Rx, and set status register flags accordingly. | 1
fchk.s | Ry | Check floating point value in Ry for special values such as Inf, NaN and Denormal, and set status register flags accordingly. | 1

Table 4-6. Floating point approximation instructions

Mnemonics | Operands | Description | Issue latency
frcpa.s | Rd, Ry | (Rd ← approx(1/Ry)), set status flags as fchk.s | 1
frsqrta.s | Rd, Ry | (Rd ← approx(1/sqrt(Ry))), set status flags as fchk.s | 1

4.4 Detailed instruction description

FMAC.S – Floating Point Multiply-Accumulate

Description
Performs multiply-accumulate of the registers specified and stores the result in destination register.

Operation: I. Rd ← Ra + Rx*Ry;
Syntax: I. fmac.s Rd, Ra, Rx, Ry
Operands: I. {a, d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0000 | 11010 | Ra | 0000 Rd Rx Ry

FNMAC.S – Floating Point Negate-Multiply-Accumulate

Description
Performs negate-multiply-accumulate of the registers specified and stores the result in destination register.

Operation: I. Rd ← - Ra + Rx*Ry;
Syntax: I. fnmac.s Rd, Ra, Rx, Ry
Operands: I. {a, d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0000 | 11010 | Ra | 0001 Rd Rx Ry

FMSC.S – Floating Point Multiply-Subtract

Description
Performs multiply-subtract of the registers specified and stores the result in destination register.

Operation: I. Rd ← Ra - Rx*Ry;
Syntax: I. fmsc.s Rd, Ra, Rx, Ry
Operands: I. {a, d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0001 | 11010 | Ra | 0000 Rd Rx Ry

FNMSC.S – Floating Point Negate-Multiply-Subtract

Description
Performs negate-multiply-subtract of the registers specified and stores the result in destination register.

Operation: I. Rd ← - Ra - Rx*Ry;
Syntax: I. fnmsc.s Rd, Ra, Rx, Ry
Operands: I.
{a, d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0001 | 11010 | Ra | 0001 Rd Rx Ry

FADD.S – Floating Point Add

Description
Performs addition of the registers specified and stores the result in destination register.

Operation: I. Rd ← Rx + Ry;
Syntax: I. fadd.s Rd, Rx, Ry
Operands: I. {d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 0000 | 0000 Rd Rx Ry

FSUB.S – Floating Point Subtract

Description
Performs subtraction of the registers specified and stores the result in destination register.

Operation: I. Rd ← Rx - Ry;
Syntax: I. fsub.s Rd, Rx, Ry
Operands: I. {d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 0001 | 0000 Rd Rx Ry

FMUL.S – Floating Point Multiplication

Description
Performs multiplication of the registers specified and stores the result in destination register.

Operation: I. Rd ← Rx * Ry;
Syntax: I. fmul.s Rd, Rx, Ry
Operands: I. {d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 0010 | 0000 Rd Rx Ry

FNMUL.S – Floating Point Multiply-Negate

Description
Performs multiply-negate of the registers specified and stores the result in destination register.

Operation: I. Rd ← - Rx * Ry;
Syntax: I. fnmul.s Rd, Rx, Ry
Operands: I.
{d, x, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 0011 | 0000 Rd Rx Ry

FCAST{S,U}W.S – Convert from Integer to Floating Point

Description
Converts the signed or unsigned integer specified and stores the result in destination register. The conversion rounds to nearest, ties to even.

Operation:
    I. Rd ← (float)Ry; Ry is signed integer, round-to-nearest-even
    II. Rd ← (float)Ry; Ry is unsigned integer, round-to-nearest-even
Syntax:
    I. fcastsw.s Rd, Ry
    II. fcastuw.s Rd, Ry
Operands: I-II. {d, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode: S=0: Ry is an unsigned number, S=1: Ry is a signed number
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 01S0 | 0000 Rd 0000 Ry

FCASTRS.{S,U}W – Convert from Floating Point to Integer

Description
Converts the floating-point number in the specified register to a signed or unsigned integer and stores the result in destination register. Rounding used is towards zero.

Operation:
    I. Rd ← (signed int)Ry; Round towards zero
    II. Rd ← (unsigned int)Ry; Round towards zero
Syntax:
    I. fcastrs.sw Rd, Ry
    II. fcastrs.uw Rd, Ry
Operands: I-II. {d, y} ∈ {0, 1, …, 15}
Status Flags: Q, V, N, Z, C: Not affected
Opcode: S=0: unsigned conversion, S=1: signed conversion
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 10S1 | 0000 Rd 0000 Ry

FCP.S – Floating Point Compare

Description
Performs a compare between the two floating point operands specified. The operation is implemented by doing a floating-point subtraction without writeback of the difference. The operation sets the status flags according to the result of the subtraction, but does not affect the operand registers.
See Table 4-2, “Floating point branch conditions,” for branch condition mnemonics corresponding to different compare results.

Operation: I. Rx - Ry;
Syntax: I. fcmp.s Rx, Ry
Operands: I. {x, y} ∈ {0, 1, …, 15}
Status Flags: Q: Not affected

Compare result | Status register flags
Less | C←1 N←1 V←0 Z←0
Greater | C←0 N←0 V←0 Z←0
Equal | C←0 N←0 V←0 Z←1
Unordered | C←0 N←0 V←1 Z←0

Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 1100 | 0000 0000 Rx Ry

FCHK.S – Floating Point Check for Special Values

Description
Checks the floating point operand specified for the special values Infinity, Not-a-Number and Denormal. A check is also performed for values with the two biggest possible representable exponents, i.e. 0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations in certain iterative algorithms. The operation sets the status flags according to the result of the check, but does not affect the operand register.

Operation: I. Set flags depending on the value in the specified register
Syntax: I. fchk.s Ry
Operands: I. y ∈ {0, 1, …, 15}
Status Flags: Q: Not affected

Predicate | Status register flag values if predicate true | Condition for branch if predicate true | Condition for branch if predicate false
Operand == NaN | C←1 N←1 V←0 Z←0 | lo | gt
Operand == Infinity | C←0 N←0 V←0 Z←0 | gt | lo
Operand == (Denormal or (Exponent==0xFD) or (Exponent==0xFE)) | C←0 N←0 V←0 Z←1 | eq | ne
Operand == Normal | C←0 N←0 V←1 Z←0 | vs | vc

Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 1101 | 0000 Rd 0000 Ry

FRCPA.S – Floating Point Reciprocal Approximation

Description
Returns an approximation of the reciprocal of the operand. This can be used as a starting point for iterative approximation algorithms. Also checks the operand for the special values Infinity, Not-a-Number and Denormal. A check is also performed for values with the two biggest possible representable exponents, i.e.
0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations in certain iterative algorithms. The operation sets the status flags according to the result of this check.

Operation: I. Rd ← ApproximateReciprocal(Ry); Set flags depending on the value in Ry
Syntax: I. frcpa.s Rd, Ry
Operands: I. {d, y} ∈ {0, 1, …, 15}
Status Flags: Q: Not affected

Predicate | Status register flag values if predicate true | Condition for branch if predicate true | Condition for branch if predicate false
Operand == NaN | C←1 N←1 V←0 Z←0 | lo | gt
Operand == Infinity | C←0 N←0 V←0 Z←0 | gt | lo
Operand == (Denormal or (Exponent==0xFD) or (Exponent==0xFE)) | C←0 N←0 V←0 Z←1 | eq | ne

Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 1110 | 0000 Rd 0000 Ry

FRSQRTA.S – Floating Point Reciprocal Square Root Approximation

Description
Returns an approximation of the reciprocal of the square root of the operand. This can be used as a starting point for iterative approximation algorithms. Also checks the operand for the special values Infinity, Not-a-Number and Denormal. A check is also performed for values with the two biggest possible representable exponents, i.e. 0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations in certain iterative algorithms. The operation sets the status flags according to the result of this check.

Operation: I. Rd ← ApproximateSquareRootReciprocal(Ry); Set flags depending on the value in Ry
Syntax: I. frsqrta.s Rd, Ry
Operands: I. {d, y} ∈ {0, 1, …, 15}
Status Flags: Q: Not affected

Predicate | Status register flag values if predicate true | Condition for branch if predicate true | Condition for branch if predicate false
Operand == NaN | C←1 N←1 V←0 Z←0 | lo | gt
Operand == Infinity | C←0 N←0 V←0 Z←0 | gt | lo
Operand == (Denormal or (Exponent==0xFD) or (Exponent==0xFE)) | C←0 N←0 V←0 Z←1 | eq | ne

Opcode:
    Bits 31-29 | 28-25 | 24-20 | 19-16 | 15-0
    111 | 0010 | 11010 | 1111 | 0000 Rd 0000 Ry

5. Secure State

Revision 3 of the AVR32 architecture introduced a separate system state allowing execution of secure or secret code alongside nonsecure code on the same processor. The secret code will execute in the secure state, and therefore be protected from hacking or readout by the code executing in the nonsecure state. Customers not needing the secure state functionality can just leave the associated hardware disabled, as it is by default, and the device will behave as previous versions of the AVR32UC.

5.1 Basic concept

The secure state architecture extension divides the memory space into two sections, a secure section and a nonsecure section. The processor can be in one of two execution states, secure or nonsecure. The SS bit in the Status Register indicates which state the processor is in. If the processor is in the secure state, it can access both secure and nonsecure memory spaces, but if it is in the nonsecure state, only nonsecure memory sections can be accessed.

The SS_ADRR and SS_ADRF registers are used to configure the sizes of these secure sections. How the SS_ADR registers map secure sections of the associated memories is determined by the individual memories, but usually SS_ADR is programmed with a secure memory size starting from the first address in the associated memory, i.e. if SS_ADRF is programmed with the value 0x800, the secure section of the flash contains the addresses from 0x8000_0000 to 0x8000_07FF. Any sections of the RAM and Flash that are not in a secure section are considered nonsecure.

The processor can pass between the secure and nonsecure state by using dedicated sscall and retss instructions. If an access to secure memory is attempted from nonsecure space, a bus error exception is asserted and the access is aborted.

5.2 Typical use scenario

The secure state hardware support allows our customers to program their proprietary IP code, such as telecom stacks, DSP libraries etc., into the secure section of the memories.
This secret code must be placed in a special secure section of the flash program memory, and must locate its secret data structures in a special secure section of the RAM. Thereafter a dedicated fuse in non-volatile memory, called the Secure State Enable (SSE) fuse, is programmed. When set, this fuse blocks all external access to the secure memories, both from debuggers and from programs running in the nonsecure sections of the processor. The SSE fuse can only be erased by a full chip erase, which will also erase all data in the secure memory sections.

This partially programmed device can then be sold to customers who will program their software application into the nonsecure section of the memories. This software can communicate with the secret IP code through a secure API provided by the secret code. This allows the application to call routines in the secret software IP, while the IP itself remains protected from hacking or unauthorized copying. After the application has been programmed into the partially programmed device, the security fuse in the flash is set, protecting the entire application from unauthorized readout by any end user.

Figure 5-1. Typical secure state use scenario
ATMEL (empty device) → COMPANY A (Secret IP): secure memories programmed, SSE set → COMPANY B (Application): all memories programmed, SSE + flash security fuse set → END USER

5.3 Secure state boot sequence

At system boot time, hardware state machines preload the secure state address registers with an initial value programmed into a secure section in the flash. Also, the SS bit in the status register is preloaded with the value of the Secure State Enable (SSE) fuse from the flash. This preloading is done before the system has completed the boot sequence, so the secure state address registers and SR[SS] are initialized before code starts executing and before the debug system has been enabled.
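
The access rule from Section 5.1 can be sketched in a few lines of Python. This is an illustrative model only, not a description of the hardware: it assumes the usual case mentioned in that section, where SS_ADRF holds a secure section size counted from the first flash address, and the names `FLASH_BASE`, `secure_flash_range` and `access_allowed` are hypothetical.

```python
FLASH_BASE = 0x8000_0000  # first flash address, as in the Section 5.1 example


def secure_flash_range(ss_adrf):
    """Return (first, last) secure flash address, or None if the section is empty.

    Assumption: SS_ADRF is a size in bytes counted from FLASH_BASE. How a
    real memory interprets the register is device-specific.
    """
    if ss_adrf == 0:
        return None
    return FLASH_BASE, FLASH_BASE + ss_adrf - 1


def access_allowed(addr, ss_adrf, in_secure_state):
    """Secure state may access everything; nonsecure state may only access
    nonsecure memory (in hardware, a violation gives a bus error)."""
    rng = secure_flash_range(ss_adrf)
    is_secure_addr = rng is not None and rng[0] <= addr <= rng[1]
    return in_secure_state or not is_secure_addr


first, last = secure_flash_range(0x800)
print(hex(first), hex(last))  # 0x80000000 0x800007ff
```

With SS_ADRF = 0x800 the model reproduces the range quoted in Section 5.1, and a nonsecure access to any address inside that range is rejected.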
5.4 Secure state debugging

Normally, debugging when executing in secure state should be turned off to prevent compromising the secure code. However, it is useful to allow debugging of the secure state code during development of this code. A fuse in flash, called Secure State Debug Enable (SSDE), can be programmed to enable debugging of secure state code.

5.5 Events in secure state

Normal RISC state interrupt and exception handling has been described in Section 3.7 “Event handling” on page 22. This behavior is modified in the following way when interrupts and exceptions are received in secure state:
• A sscall instruction will set SR[GM]. In secure state, SR[GM] masks both INT0-INT3 and NMI. Clearing SR[GM] removes the mask of these event sources. INT0-INT3 are still additionally masked by the I0M-I3M bits in the status register.
• sscall has a handler address at 0x8000_0004.
• Exceptions have a handler address at 0x8000_0008.
• NMI has a handler address at 0x8000_000C.
• BREAKPOINT has a handler address at 0x8000_0010.
• INT0-INT3 are not autovectored, but have a common handler address at 0x8000_0014.

Note that in the secure state, all exception sources share the same handler address. It is therefore not possible to distinguish between different exception causes when in the secure world. The secure world system must be designed to support this. The most obvious solution is to design the secure software so that exceptions will not arise when executing in the secure world.

6. Memory System

AVR32UC implements a 32-bit unsegmented memory space. Regions of this memory space can be protected by an optional MPU. The memory map is as follows:

Figure 6-1. The AVR32UC memory map.

Address range              Size   Section   Contents
H'C0000000 - H'FFFFFFFF    1 GB   HSB       High Speed Bus space
H'80000000 - H'BFFFFFFF    1 GB   BOOT      Boot Program Memory
H'40000000 - H'7FFFFFFF    1 GB   LOCAL     CPU Local Bus Memory
H'00000000 - H'3FFFFFFF    1 GB   IRAM      Internal Data RAM

6.1 Memory sections

The memory map contains four sections, named IRAM, LOCAL, BOOT and HSB.
The IRAM section contains the internal EX stage memory, and this memory is mapped from address 0 and upwards.

The LOCAL section is mapped from address 0x4000_0000 and is designed for containing device-specific high-speed interfaces, such as floating-point units, encryption hardware or high-speed GPIO ports. Access to the LOCAL space is performed using ordinary load and store instructions, and is performed in a single clock cycle. Mapping timing-critical devices in the LOCAL section is beneficial as the interface operates with high clock frequency, and its timing is deterministic since it does not need to access a shared bus which may be heavily loaded.

The BOOT section starts at address 0x8000_0000, which is the reset address for AVR32UC. This section will typically contain an internal program FLASH, mapped from address 0x8000_0000 and upwards.

The HSB section contains the addresses of all modules mapped on the HSB bus. This may include peripherals such as USARTs and external memory interfaces.

The memory space is uniform, so program code can execute from the IRAM, BOOT and HSB sections, and data accesses can be performed to any of these sections. Note that implementations of AVR32UC may forbid certain accesses to certain memory sections, e.g. a write to program FLASH mapped into the BOOT section may be forbidden. The LOCAL section is only accessible by the Load-Store Unit in the CPU EX pipeline stage; therefore, code cannot be executed from addresses in the LOCAL space.

6.2 Memory interfaces

The AVR32UC CPU has three memory interfaces:
• IF stage HSB master interface for instruction fetches
• EX stage HSB master interface for data accesses into BOOT or HSB sections
• EX stage HSB slave interface enabling other parts of the system to access addresses in the IRAM section

6.3 IF stage interface

The single master interface in the IF stage performs instruction fetches.
All fetches are performed with word alignment, except for the first fetch after a change-of-flow, which may use halfword alignment. The IF stage cannot perform writes; only reads are possible. Reads can be performed from all addresses mapped on the HSB bus. Reads are performed as incrementing bursts of unspecified length. The IF stage master interface will stall appropriately to support slow slaves.

6.4 EX stage interfaces

The EX stage distinguishes between CPU accesses to the IRAM section and accesses to BOOT/HSB. Any access to the IRAM section is performed to dedicated, high-speed RAMs implemented inside the memory controller. These fast RAMs are able to read or write within the cycle in which the access is initiated. This means that a load instruction in EX will have the read data ready at the end of the clock cycle for writing into the register file.

6.4.1 EX stage HSB master interface

Any CPU access to the BOOT or HSB sections will use multiple clock cycles, as dictated by the HSB semantics. Writes to the BOOT or HSB sections can be pipelined, and are performed as a stream of nonsequential transfers, each taking one cycle unless stalled by the slave. If the slave stalls the transfer, the CPU will stall until the slave releases the stall. CPU reads from the BOOT or HSB sections are not pipelined, so each data transfer takes two clock cycles: one cycle for the address phase and one cycle for the data phase. The CPU will be stalled in the data phase.

6.4.2 EX stage HSB slave interface

The AVR32UC CPU provides a slave interface into the high-speed RAMs that are implemented inside the memory controller. This interface enables other parts of the system, like DMA controllers, to write or read data to or from the RAMs. The slave interface supports bursts for both reads and writes. If the high-speed RAMs for some reason cannot accept the transfer request, the interface will reply by stalling the request until it can be serviced.

The arbitration priorities between the CPU and the slave interface for the RAMs can be controlled by programming the CPU Control Register (CPUCR). The CPUCR is described in Section 2.5 on page 11. Arbitration is performed according to the following rules:
• Assume that the memory interface is idle and no memory transfers have been performed. Whoever requests access to the RAMs will win the arbitration and get access. If both the CPU and the slave interface request access, the CPU will win.
• The source that won the arbitration can use the RAMs for as long as it requires. If the other source also has a pending request for use of the RAM, this source will have to wait at most the number of cycles specified by the SPL or CPL fields of CPUCR. The pending source will gain access to the RAMs when the current owner voluntarily releases the RAMs, or after the SPL/CPL timeout period, whichever comes first.
• If the CPU wins arbitration for the RAMs, the CPU is guaranteed to own the RAM for the period specified by the COP field in CPUCR. Any slave request will be left pending during this period, even if the CPU is not using the RAMs.

The following state diagram shows the states in arbitration for the RAM.

Figure 6-2. Arbitration between CPU and slave interface for RAMs.
States: RAM is free; CPU owns the RAM; Slave I/F owns the RAM. The diagram connects these states with the transitions numbered 1 to 6.

The state transitions are as follows:
1: CPU_wants_to_perform_mem_access
2: CPU_access_complete && (been_in_state > CPUCR[COP])
3: (been_in_state > CPUCR[COP]) && slave_wants_to_perform_mem_access && (slave_been_pending > CPUCR[SPL])
4: CPU_wants_to_perform_mem_access && (CPU_been_pending > CPUCR[CPL])
5: slave_wants_to_perform_mem_access && !CPU_wants_to_perform_mem_access
6: slave_access_complete

6.4.3 EX stage local bus interface

Any CPU access to the LOCAL section is completed in a single clock cycle, both for reads and writes. Transfers on this bus cannot be stalled.
The CPU will never be stalled due to an access to the LOCAL section. Accesses to this section are performed using regular load-store instructions such as ldswp.w, ld.w, ld.ub, st.w, stswp.w, ldm or stm. Which devices are mapped in the LOCAL section, and their memory maps, is device-specific.

The LOCAL interface must be enabled by the user by programming the LOCEN bit in CPUCR. Accesses to LOCAL memory addresses without first enabling the section will result in a BUS ERROR exception. If the MPU is enabled, accesses to LOCAL will be subject to permission checking.

To ensure maximum transfer speed and cycle determinism, any slaves being addressed by the CPU on the local bus must be able to receive and transmit data on the bus at CPU clock speeds. The consequences of this may vary between different slave devices, but for some slave devices it may imply that the slaves have to run at the CPU clock frequency when local bus transfers are being performed. Refer to the device datasheet for information on any relationships between CPU and device clock frequencies imposed by the local bus.

6.5 IRAM Write buffer

The EX stage has a write buffer used to hold data to be written to the IRAM section. The operation of this buffer is usually transparent to the programmer. The programmer should be aware of the following:
• The IRAM has a single port, allowing either one read or one write per clock cycle.
• The write buffer is pipelined, allowing sequential writes to IRAM to be pipelined without any pipeline stalls. The previous contents of the write buffer are written to the RAM in parallel with the new store data being placed in the write buffer.
• Any read instruction to IRAM in EX will be performed immediately, even if a previous store instruction has placed data to store in the write buffer. In this case, the previous store data remains in the write buffer and will be written back to RAM in a later clock cycle.
• If a read instruction in EX accesses the same address as the one the data in the write buffer is to be stored to, the pipeline is stalled for one clock cycle while the write buffer is emptied to RAM. The read will be performed normally in the next clock cycle.
• The contents of the write buffer are written to the physical RAM as soon as the memory interface is not used by any instructions.
• The state of the write buffer may affect the timing of RMW instructions, see “Read-modify-write instructions” on page 84 for details.

6.6 Memory barriers

Memory barriers are constructs used to enforce memory consistency. Caches and self-modifying code may cause memory to become inconsistent. AVR32UC has a simple pipeline with no caches, so there is usually no need for memory barriers. Mechanisms for memory barriers are present to handle the cases where such barriers are needed.

6.6.1 Instruction memory barriers

An instruction memory barrier (IMB) is usually only needed when executing self-modifying code, for example when self-programming program flash. In this case, one must ensure that all levels in the memory hierarchy are consistent. Due to the simple non-cached memory system in AVR32UC, this is usually trivial. The programmer should make sure that an IMB is used if there is a possibility that an instruction to be modified by self-modifying code has already been prefetched by the instruction prefetch unit. In this case, an IMB should be inserted between the instruction modifying the code and the execution of the modified instruction.

To make sure that the modified version of the instruction is executed, the prefetch buffer should be flushed between changing the program memory and executing the new version of the program. Any instruction performing a change-of-flow, such as return from exception, conditional branches, unconditional branches, subprogram call or return, or instructions writing to PC, implements an IMB in AVR32UC.
6.6.2 Data memory barriers

A data memory barrier (DMB) is used to make sure that a data memory access, either a read or write, is actually performed before the rest of the code is executed. Caches, write buffers and bus latency may cause a memory access to be seen by a slave many cycles after it has been executed by the pipeline. In some cases, this may lead to UNPREDICTABLE behavior in the system.

One example of this is found in interrupt handlers. One usually wants to make sure that the interrupt request has been cleared before executing the rete instruction, otherwise the same interrupt may be serviced immediately after executing the rete instruction. In this case a DMB must be inserted between the code clearing the interrupt request and the rete.

All accesses to HSB space are strongly ordered. This is used to implement DMBs. A DMB after a store to an HSB slave is implemented by performing a dummy read from the same slave. Any critical code after the read will stall until the read has been performed.

Consider an interrupt request made by a peripheral. This peripheral will deassert the interrupt request as soon as the interrupt handler has written a specific bitmask to its PERIPH_INTCLEAR register. A read from the same peripheral performs a bus transfer that implements the DMB. The rete instruction can be executed after the DMB.

Code 6-1. Clearing IRQs using data memory barriers

    // Using data memory barriers in the IRQ handler to make sure that the
    // request has been deasserted before returning from the handler.
    // Assume that the IRQ is cleared by writing a bitmask to PERIPH_INTCLEAR.
    // r0 points to this register, r1 contains the correct bitmask.
    irq_handler:
        st.w    r0[0], r1
        ld.w    r12, r0[0]      // data memory barrier
        rete

7. Memory Protection Unit

The AVR32 architecture defines an optional Memory Protection Unit (MPU). This is a simpler alternative to a full MMU, while at the same time allowing memory protection.
The MPU allows the user to divide the memory space into different protection regions. These protection regions have a user-defined size, and start at a user-defined address. The different regions can have different cacheability attributes and bufferability attributes. Each region is divided into 16 subregions, and each of these subregions can have one of two possible sets of access permissions.

The MPU does not perform any address translation.

7.1 Memory map in systems with MPU

An AVR32 implementation with an MPU has a flat, unsegmented memory space. Access permissions are given only by the different protection regions.

7.2 Understanding the MPU

The AVR32 Memory Protection Unit (MPU) is responsible for checking that memory transfers have the correct permissions to complete. If a memory access with unsatisfactory privileges is attempted, an exception is generated and the access is aborted. If an access to a memory address that does not reside in any protection region is attempted, an exception is generated and the access is aborted.

The user is able to assign different privilege levels to different blocks of memory by configuring a set of registers. Each such block is called a protection region. Each region has a user-programmable start address and size. The MPU allows the user to program 8 different protection regions. Each of these regions has 16 sub-regions, which can have different access permissions, cacheability and bufferability.

The “DMMU SZ” field in the CONFIG1 system register identifies the number of implemented protection regions, and therefore also the number of MPU registers. An AVR32UC system with caches also has MPU cacheability and bufferability registers.

A protection region can be from 4 KB to 4 GB in size, and the size must be a power of two. All regions must have a start address that is aligned to an address corresponding to the size of the region. If the region has a size of 8 KB, the 13 lowest bits in the start address must be 0.
Failing to do so will result in UNDEFINED behaviour. Since each region is divided into 16 sub-regions, each sub-region is 256 B to 256 MB in size.

When an access hits into a memory region set up by the MPU, hardware proceeds to determine which subregion the access hits into. This information is used to determine whether the access permissions for the subregion are given in MPUAPRA/MPUBRA/MPUCRA or in MPUAPRB/MPUBRB/MPUCRB.

If an access does not hit in any region, the transfer is aborted and an exception is generated.

The MPU is enabled by setting the E bit in the MPUCR register. The E bit is cleared after reset. If the MPU is disabled, all accesses are treated as uncacheable and unbufferable, and will not generate any access violations. Before setting the E bit, at least one valid protection region must be defined.

7.2.1 MPU interface registers

The following registers are used to control the MPU, and provide the interface between the MPU and the operating system, see Figure 7-1 on page 67. All the registers are mapped into the System Register space; their addresses are presented in “System registers” on page 11. They are accessed with the mtsr and mfsr instructions.

The MPU interface registers are shown below. The suffix n can have the range 0-7, indicating which region the register is associated with.

Figure 7-1. The MPU interface registers

Register             Bit fields
MPUARn               [31:12] Base Address   [11:6] -   [5:1] Size   [0] V
MPUPSRn              [31:16] -   [15:0] P15 ... P0
MPUCRA / MPUCRB      [31:8] -    [7:0] C7 ... C0
MPUBRA / MPUBRB      [31:8] -    [7:0] B7 ... B0
MPUAPRA / MPUAPRB    [31:28] AP7   [27:24] AP6   [23:20] AP5   [19:16] AP4   [15:12] AP3   [11:8] AP2   [7:4] AP1   [3:0] AP0
MPUCR                [31:1] -    [0] E

7.2.1.1 MPU Address Register - MPUARn

An MPU Address register is implemented for each of the 8 protection regions. The MPUAR registers specify the start address and size of the regions.
The start address must be aligned so that its alignment corresponds to the size of the region. The minimum allowable size of a region is 4 KB, so only bits 31:12 in the base address need to be specified. The other bits are always 0. Each MPUAR also has a valid bit that specifies if the protection region is valid. Only valid regions are considered in the protection testing.

The MPUAR register consists of the following fields:
• Base address - The start address of the region. The minimum size of a region is 4 KB, so only the 20 most significant bits in the base address need to be specified. The 12 lowermost base address bits are implicitly set to 0. If protection regions larger than 4 KB are used, the user must write the appropriate bits in Base address to 0, so that the base address is aligned to the size of the region. Otherwise, the result is UNDEFINED.
• Size - Size of the protection region. The possible sizes are shown in Table 7-1 on page 68.
• V - Valid. Set if the protection region is valid, cleared otherwise. This bit is written to 0 by a reset. The region is not considered in the protection testing if the V bit is cleared.

Table 7-1. Protection region sizes implied by the Size field

Size                  Region size   Constraints on Base address
B’00000 to B’01010    UNDEFINED     -
B’01011               4 KB          None
B’01100               8 KB          Bit [12] in Base Address must be 0
B’01101               16 KB         Bit [13:12] in Base Address must be 0
B’01110               32 KB         Bit [14:12] in Base Address must be 0
B’01111               64 KB         Bit [15:12] in Base Address must be 0
B’10000               128 KB        Bit [16:12] in Base Address must be 0
B’10001               256 KB        Bit [17:12] in Base Address must be 0
B’10010               512 KB        Bit [18:12] in Base Address must be 0
B’10011               1 MB          Bit [19:12] in Base Address must be 0
B’10100               2 MB          Bit [20:12] in Base Address must be 0
B’10101               4 MB          Bit [21:12] in Base Address must be 0
B’10110               8 MB          Bit [22:12] in Base Address must be 0
B’10111               16 MB         Bit [23:12] in Base Address must be 0
B’11000               32 MB         Bit [24:12] in Base Address must be 0
B’11001               64 MB         Bit [25:12] in Base Address must be 0
B’11010               128 MB        Bit [26:12] in Base Address must be 0
B’11011               256 MB        Bit [27:12] in Base Address must be 0
B’11100               512 MB        Bit [28:12] in Base Address must be 0
B’11101               1 GB          Bit [29:12] in Base Address must be 0
B’11110               2 GB          Bit [30:12] in Base Address must be 0
B’11111               4 GB          Bit [31:12] in Base Address must be 0

7.2.1.2 MPU Permission Select Register - MPUPSRn

An MPU Permission Select register is implemented for each of the 8 protection regions. Each MPUPSR register divides the protection region into 16 subregions. The bitfields in MPUPSR specify whether each subregion has access permissions as specified by the region entry in either MPUAPRA or MPUAPRB.

Table 7-2. Subregion access permission implied by MPUPSR bitfields

MPUPSRn[P]   Access permission
0            MPUAPRA[APn]
1            MPUAPRB[APn]

7.2.1.3 MPU Cacheable Register A / B - MPUCRA / MPUCRB

The MPUCRA / MPUCRB registers have one bit per region, indicating if the region is cacheable. If the corresponding bit is set, the region is cacheable. The register is written to 0 upon reset. AVR32UC implementations may optionally choose not to implement the MPUCRA / MPUCRB registers.

7.2.1.4 MPU Bufferable Register A / B - MPUBRA / MPUBRB

The MPUBR registers have one bit per region, indicating if the region is bufferable. If the corresponding bit is set, the region is bufferable. The register is written to 0 upon reset. AVR32UC implementations may optionally choose not to implement the MPUBR registers.

7.2.1.5 MPU Access Permission Register A / B - MPUAPRA / MPUAPRB

The MPUAPR registers indicate the access permissions for each region. The MPUAPR is written to 0 upon reset. The possible access permissions are shown in Table 7-3 on page 69.

Table 7-3. Access permissions implied by the APn bits

AP        Privileged mode           Unprivileged mode
B’0000    Read                      None
B’0001    Read / Execute            None
B’0010    Read / Write              None
B’0011    Read / Write / Execute    None
B’0100    Read                      Read
B’0101    Read / Execute            Read / Execute
B’0110    Read / Write              Read / Write
B’0111    Read / Write / Execute    Read / Write / Execute
B’1000    Read / Write              Read
B’1001    Read / Write              Read / Execute
B’1010    None                      None
Other     UNDEFINED                 UNDEFINED

7.2.1.6 MPU Control Register - MPUCR

The MPUCR controls the operation of the MPU. The MPUCR has only one field:
• E - Enable. If set, the MPU address checking is enabled. If cleared, the MPU address checking is disabled and no exceptions will be generated by the MPU.

7.2.2 MPU exception handling

This chapter describes the exceptions that can be signalled by the MPU.

7.2.2.1 ITLB Protection Violation

An ITLB protection violation is issued if an instruction fetch violates access permissions. The violating instruction is not executed. The address of the failing instruction is placed on the system stack.

7.2.2.2 DTLB Protection Violation

A DTLB protection violation is issued if a data access violates access permissions. The violating access is not executed. The address of the failing instruction is placed on the system stack.

7.2.2.3 ITLB Miss Violation

An ITLB miss violation is issued if an instruction fetch does not hit in any region. The violating instruction is not executed. The address of the failing instruction is placed on the system stack.

7.2.2.4 DTLB Miss Violation

A DTLB miss violation is issued if a data access does not hit in any region. The violating access is not executed. The address of the failing instruction is placed on the system stack.

7.2.2.5 TLB Multiple Hit Violation

An access hit in multiple protection regions. The address of the failing instruction is placed on the system stack. This is a critical system error that should not occur.

7.3 Example of MPU functionality

As an example, consider region 0.
Let region 0 be of size 16 KB, thus each subregion is 1 KB. Subregion 0 has offset 0-1 KB from the base address, subregion 1 has offset 1 KB-2 KB and so on. MPUAPRA and MPUAPRB each have one field per region. Each subregion in region 0 can get its access permissions from either MPUAPRA[AP0] or MPUAPRB[AP0]; this is selected by the subregion’s bitfield in MPUPSR0.

Let:
MPUPSR0 = {0b0000_0000_0000_0000, 0b1010_0000_1111_0101}
MPUAPRA = {A, B, C, D, E, F, G, H}
MPUAPRB = {a, b, c, d, e, f, g, h}
where {A-H, a-h} have legal values as defined in Table 7-3. AP0 is the rightmost field, so MPUAPRA[AP0] = H and MPUAPRB[AP0] = h. Thus for region 0:

Table 7-4. Example of access rights for subregions

Subregion   Access permission   Subregion   Access permission
0           h                   8           H
1           H                   9           H
2           h                   10          H
3           H                   11          H
4           h                   12          H
5           h                   13          h
6           h                   14          H
7           h                   15          h

8. Instruction Cycle Summary

This chapter presents the instructions in the AVR32UC CPU, and the number of clock cycles they require to complete. All the instructions in each group behave similarly in the pipeline. The final subchapter presents code examples to illustrate the clock cycle requirements of various code constructs.

8.1 Definitions

The following definitions are used in the tables below:

8.1.1 Issue

An instruction is issued when it leaves the ID stage and enters the EX stage.

8.1.2 Issue latency

The issue latency represents the number of clock cycles required between the issue of the instruction and the issue of the following instruction. For some change-of-flow instructions, this includes the cycle penalty caused by the pipeline flush. The issue latency assumes, unless stated otherwise, that the instruction and data memories are able to return an instruction or data in a single cycle, which may not be true for slow program memories or data memories mapped on the HSB bus.

8.2 Special considerations

8.2.1 PC as destination register

Most instructions can take PC as destination register. This will result in a jump to the calculated address.
The jump is performed when the instruction writing to PC has completed, and all other effects of the instruction, like updating of pointer registers for load instructions with PC as the target, have been committed. Instructions writing to PC will have an additional issue latency of 2 cycles due to the pipeline flush.

8.2.2 Alignment of change-of-flow targets

The cycle count for change-of-flow instructions assumes that the target instruction is a compact instruction or a word-aligned extended instruction. An extra cycle will be required if the target instruction is a halfword-aligned extended instruction, since both halves of the instruction must be fetched before it can be issued.

8.2.3 Memory and bus timings

Performance of memory accesses and instruction fetching is affected by the performance of the system memories and system bus. The following are examples of factors that may affect the cycle count of such operations:
• Accesses to the IRAM section in parallel with another bus master, for example a DMA controller.
• Accesses to memories with wait states, for example flash or external memories.
• Using system buses with wait states or arbitration overhead.
• Accesses to memories that are simultaneously being accessed by other bus masters.

8.3 CPU revision

Revisions 1, 2 and 3 of the AVR32UC CPU have the same instruction timings, except that the divider in revision 2 and later is faster than in revision 1. Instructions only present in revision 2 or 3 of the CPU are explicitly noted.

8.4 ALU instructions

This group comprises simple single-cycle ALU instructions like add and sub. The conditional subtract and move instructions are also in this group. All instructions in this group, except ssrf to bits 15 to 31, take one cycle to execute, and the result is available for use by the following instruction.

Table 8-1.
Mnemonics abs acr adc add C C E C E add{cond4} addhh.w addabs cp.b cp.h E C E E E C cp.w C E C cpc E max min neg rsub E rsub{cond4} sbc scr E E C
Rd, Rs, k8 Rd, imm Rd, Rx, Ry Rd
Reverse subtract immediate if condition satisfied. CPU revision 2 and higher only. Subtract with carry. Subtract carry from register.
E E C C Rd, Rs Rd, Rx, Ry Rd, Rx, Ry Rd Rd, Rs
Reverse subtract. 1 1 1 1 Return signed maximum Return signed minimum Two’s Complement.
ALU instructions Operands Rd Rd Rd, Rx, Ry Rd, Rs Rd, Rx, (Ry sa Rd, Rx, Ry Rd, imm Rd, imm Rd, Rs Rd, Rx, Ry > sa Rd, Rx, Ry
Logical OR if condition satisfied. CPU revision 2 and higher only. Logical (Inclusive) OR. Logical EOR if condition satisfied. CPU revision 2 and higher only. Logical Exclusive OR (High Halfword). Logical Exclusive OR (Low Halfword). Logical Exclusive OR. Logical AND Low Halfword, clear other halfword. One’s Complement (NOT).
1 1 1 1 1 1 1 1 1 1 1 1
Rd, imm, COH Rd, imm
Logical AND High Halfword, clear other halfword. Logical AND Low Halfword, high halfword is unchanged.
1 1
Rd, imm Rd Rd, Rs Rd, Rx, Ry > sa Rd, Rx, Ry Rd, Rs Rd, imm
Logical AND if condition satisfied. CPU revision 2 and higher only. Logical AND NOT. Logical AND High Halfword, low halfword is unchanged. Logical AND.
Rd, Rs Rd, Rx, (Ry > sa, b5 Rd >> sa, b5 Rd >> sa, b5 Rd >> sa, b5
Signed saturate from bit given by sa after a right shift with rounding of b5 bit positions. Unsigned saturate from bit given by sa after a right shift with rounding of b5 bit positions. Shift sa positions and do signed saturate from bit given by b5. Shift sa positions and do unsigned saturate from bit given by b5.
E E E E
Saturate instructions
Operands Rd, Rx, Ry Rd, Rx, Ry Rd, Rx, Ry Rd, Rx, Ry
Saturated subtract. 1 2 2 1 1
Description Saturated add halfwords. Saturated add. Saturated subtract halfwords.
Issue latency 1 1 1 1

8.10 Load and store instructions

This group includes all the load and store instructions.
The address calculations are performed by the adder in the EX stage. The EX adder also performs the writeback address calculation for the autoincrement and autodecrement operations.

Loaded data are available at the end of the cycle in the EX stage. Byte and halfword data must be extended and rotated before they are valid. This is performed in the EX stage. The ldins and ldswp instructions also require modification in the EX stage before their results are valid.

The stswp instructions require modification before their data is output to the memory interface. This modification is performed in the EX stage.

The stcond instruction takes 2 cycles if the store is not performed, 3 cycles if the store is performed.

All issue latencies are given for accesses to IRAM or LOCAL. These timings must be modified as follows for accesses to BOOT or HSB sections:
• A byte, halfword or word load requires 1+w cycles in addition to the count listed in Table 8-7, where w is the number of wait states from the slave and bus system. The pipeline will stall during these cycles.
• A doubleword load performs two memory accesses, so 2(1+w) cycles are needed in addition to the count listed in Table 8-7. The pipeline will stall during these cycles.
• A byte, halfword or word store requires (1+w) cycles in addition to the count listed in Table 8-7, where w is the number of wait states from the slave and bus system. Stores to BOOT or HSB can be performed in the background, so the pipeline will only stall if another memory access is attempted during these w cycles. However, multiple stores to addresses in BOOT or HSB can be automatically combined by the memory interface to create bursts on the HSB bus. This means that any consecutive stores to BOOT or HSB sections will not stall the pipeline unless the bus itself inserts wait cycles, for example due to wait states or bus contention. Instructions not performing memory accesses will never stall the pipeline when executed after stores to BOOT or HSB.
• A doubleword store performs two memory accesses, but these will be pipelined. The last of these accesses will stall if the instruction following the doubleword store is a memory access instruction other than a store to BOOT or HSB. Therefore, a non-memory instruction or another store to BOOT or HSB should be scheduled after a doubleword store to BOOT or HSB for maximum performance.

Table 8-7. Load and store instructions
Issue latency IRAM 2 2 1 Load unsigned byte with displacement. E E Rd, Rp[disp] Rd, Rb[Ri
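
The BOOT/HSB timing adjustments listed in Section 8.10 reduce to simple arithmetic, sketched below in Python. This is only an illustration of the stated rules: the function names are hypothetical, the base issue latency from Table 8-7 is not included, and the store figure is the bus occupancy, which may overlap with later instructions that do not access memory.

```python
def load_extra_cycles(wait_states, doubleword=False):
    """Extra stall cycles for a load from BOOT or HSB.

    A byte, halfword or word load adds 1+w cycles; a doubleword load
    performs two accesses and therefore adds 2(1+w) cycles.
    """
    per_access = 1 + wait_states
    return 2 * per_access if doubleword else per_access


def store_extra_cycles(wait_states):
    """Extra cycles for a byte, halfword or word store to BOOT or HSB.

    The store completes in the background, so the pipeline only stalls
    if another memory access is attempted while it is in progress.
    """
    return 1 + wait_states


for w in (0, 1, 2):
    print(w, load_extra_cycles(w), load_extra_cycles(w, doubleword=True))
```

For example, with two wait states a word load adds 3 cycles and a doubleword load adds 6 cycles to the count listed for IRAM or LOCAL accesses.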
