INTEGRATED CIRCUITS
PNX1300 Series
Media Processors
Preliminary Specification
Supersedes PNX1300 data of 2001 Oct 12
File under INTEGRATED CIRCUITS, TR1
2002 Feb 15
Philips Semiconductors
PNX1300 Series Data Book
Foreword
Table of Contents
1 Pin List
2 Overview
3 DSPCPU Architecture
4 Custom Operations for Multimedia
5 Cache Architecture
6 Video In
7 Enhanced Video Out
8 Audio In
9 Audio Out
10 SPDIF Out
11 PCI Interface
12 SDRAM Memory System
13 System Boot
14 Image Coprocessor
15 Variable Length Decoder
16 I2C Interface
17 Synchronous Serial Interface
18 JTAG Functional Specification
19 On-Chip Semaphore Assist Device
20 Arbiter
21 Power Management
22 PCI-XIO Bus Functional Specification
A DSPCPU Operations
B MMIO Register Summary
C Endian-ness
Index
2001 Philips Electronics North America Corporation
All rights reserved.
See Terms and Conditions on the next page.
TERMS AND CONDITIONS
Philips Semiconductors and Philips Electronics North America Corporation reserve the right to make changes,
without notice, in the products, including circuits, standard cells, and/or software, described or contained
herein in order to improve design and/or performance. Philips Semiconductors assumes no responsibility or
liability for the use of any of these products, conveys no license or title under any patent, copyright, or mask
work right to these products, and makes no representations or warranties that these products are free from
patent, copyright, or mask work right infringement, unless otherwise specified. Applications that are described
herein for any of these products are for illustrative purposes only. Philips Semiconductors makes no
representation or warranty that such applications will be suitable for the specified use without further testing
or modification.
LIFE SUPPORT APPLICATIONS
Philips Semiconductors and Philips Electronics North America Corporation products are not designed for use
in life support appliances, devices, or systems where malfunction of a Philips Semiconductors and Philips
Electronics North America Corporation product can reasonably be expected to result in a personal injury.
Philips Semiconductors and Philips Electronics North America Corporation customers using or selling Philips
Semiconductors and Philips Electronics North America Corporation products for use in such applications do
so at their own risk and agree to fully indemnify Philips Semiconductors and Philips Electronics North America
Corporation for any damages resulting from improper use or sale.
Philips Semiconductors and Philips Electronics North America Corporation register eligible circuits under the
Semiconductor Chips Protection Act.
DEFINITIONS
Data Sheet Identification: Objective Specification
Product Status: Formative or in Design
Definition: This data sheet contains the design target or goal specifications for product development. Specifications may change in any manner without notice.

Data Sheet Identification: Preliminary Specification
Product Status: Preproduction Product
Definition: This data sheet contains preliminary data, and supplementary data will be published at a later date. Philips Semiconductors reserves the right to make changes at any time without notice in order to improve design and supply the best possible product.

Data Sheet Identification: Product Specification
Product Status: Full Production
Definition: This data sheet contains Final Specifications. Philips Semiconductors reserves the right to make changes at any time without notice, in order to improve the design and supply the best possible product.
2001, 2002 Philips Electronics North America Corporation
All rights reserved.
Printed in U.S.A.
Business Line Media Processing, 811 E. Arques Avenue, Sunnyvale, CA 94088
Foreword
The TriMedia PNX1300 Series is an enhanced version
of the TM-1300 family of media processors.
The PNX1300 Series contains an ultra-high performance
Very Long Instruction Word processor, as well as a complete intelligent video and audio input/output subsystem.
The processor has an instruction set that is optimized for
processing audio, video and graphics. It includes powerful SIMD multimedia operators for 8- and 16-bit signal
datatypes as well as a full complement of 32-bit IEEE-compatible
floating point operations.
The PNX1300 Series is intended as a multi-standard
programmable video, audio and graphics processor. It
can be used either standalone or as an accelerator to a
general purpose processor.
The architecture of the TriMedia family came about as
the result of many years of effort by many dedicated individuals. Going back in history, the foundation of TriMedia was
laid by the LIFE-1 VLIW processor, designed by Junien
Labrousse and myself in 1987. Work continued afterwards in Philips Research Labs, Palo Alto. My special
thanks go to the entire Palo Alto research team: Mike
Ang, Uzi Bar-Gadda, Peter Donovan, Martin Freeman,
Eino Jacobs, Beomsup Kim, Bob Law, Yen Lee, Vijay
Mehra, Pieter van der Meulen, Ross Morley, Mariette
Parekh, Bill Sommer, Artur Sorkin and Pierre Uszynski.
The Palo Alto period matured the architecture—we ported all video and audio algorithms that we could find to the
compiler/simulator and refined the operation set. In addition, we learned how to give the architecture a market direction. In May 1994, Philips management—in particular
Cees-Jan Koomen, Eddy Odijk, Theo Claasen and Doug
Dunn—decided to develop TriMedia into a major Philips
Semiconductors product line.
Under the guidance of Keith Flagler, the TriMedia team
was built. All of its members contributed to taking this from a set
of interesting ideas to a reliable and competitive product
in a short period of time. The initial TriMedia team included Fuad Abu Nofal, Karel Allen, Mike Ang, Robert Aquino, Manju Asthana, Patrick de Bakker, Shiv Balakrishnan, Jai Bannur, Marc Berger, Sunil Bhandari, Rusty
Biesele, Ahmet Bindal, David Blakely, Hans Bouwmeester, Steve Bowden, Robert Bradfield, Nancy
Breede, Shawn Brown, Sujay Chari, Catherine Chen,
Howen Chen, Yan-ming Chen, Yong Cho, Scott Clapper,
Matthew Clayson, Paul Coelho, Richard Dodds, Marc
Duranton, Darcia Eding, Aaron Emigh, Li Chi Feng, Keith
Flagler, Jean Gobert, Sergio Golombek, Mike Grimwood,
Yudi Halim, Hari Hampapuram, Carl Hartshorn, Judy
Heider, Laura Hrenko, Jim Hsu, Eino Jacobs, Marcel
Janssens, Patricia Jones, Hann-Hwan Ju, Jayne Keith,
Bhushan Kerur, Ayub Khan, Keith Knowles, Mike Kong,
Ashok Krishnamurti, Yen Lee, Patrick Leong, Bill Lin,
Laura Ling, Chialun Lu, Naeem Maan, Nahid Mansipur,
Mike Maynard, Vijay Mehra, Jun Mejia, Derek Meyer,
Prabir Mohanty, Saed Muhssin, Chris Nelson, Stephen
Ness, Keith Ngo, Francis Nguyen, Kathleen Nguyen,
Derek Noonburg, Ciaran O’Donnel, Sang-Ju Park,
Charles Peplinski, Gene Pinkston, Maryam Pirayou, Pardha Potana, Bill Price, Victor Ramamoorthy, Babu Rao
Kandamilla, Ehsan Rashid, Selliah Rathnam, Margaret
Redmond, Donna Richardson, Alan Rodgers, Tilakray
Roychoudhury, Hani Salloum, Chris Salzmann, Bob
Seltzer, Ravi Selvaraj, Jim Shimandle, Deepak Singh,
Bill Sommer, Juul van der Spek, Manoj Srivastava, Renga Sundararajan, Ken-Sue Tan, Ray Ton, Steve Tran,
Cynthia Tripp, Ching-Yih Tseng, Allan Tzeng, Barbara
Vendelin, John Vivit, Rudy Wang, Rogier Wester, Wayne
Wonchoba, Anthony Wong, Sara Wu, David Wyland,
Ken Xie, Vincent Xie, Bettina Yeung, Robert Yin, Charles
Young, Grace Yun, Elena Zelayeta and Vivian Zhu.
Expert help and feedback were received from many. In
particular, I’d like to mention Kees van Zon of Philips
Eindhoven for the help with filtering-related issues, and
Craig Clapp of PictureTel for excellent feedback on all
aspects of the architecture.
My special thanks go to Joe Kostelec. He made me understand that my ambitions could better be realized in
California than in Europe. Furthermore, his vision and his
wisdom are credited with keeping this project alive and
growing until the ‘investment decision.’
The vision of a universal media accelerator is credited to
Jaap de Hoog. Jaap, I wish you were here to see it come
to fruition.
–Gerrit Slavenburg
After the initial TM-1000 product, the TM-1100, TM-1300
and now PNX1300 Series chips have been successfully
integrated in many video and audio products. It has been
my pleasure to have been involved in these designs, and
I would like to thank the people involved in the TM-1300 and
PNX1300 Series projects under the guidance of Cees
Hartgring and Simon Wegerif. The team included Karel
Allen, Tien-Cheng Bau, Jim Campbell, Anitamk Chan,
John Chang, Roel Coppoolse, Taufik Dakhil, Mitch Daniil, Nam Dao, Patrick Debaumarche, Thuy Duong, Torsten Fink, Jan Grotenbreg, Mohammad Hafeez, Feng
Hao, Farah Jubran, Babu Rao Kandamalla, Aki Kaniel,
Yan-Ling Li, Ying-Chao Liu, Naeem Maan, Don Marshal,
Thomas Meyer, Javed Mukarram, Long Nguyen, Tu
Nghiem, Elaine Outler, Charles Peplinski, Duc T. Pham,
Thorwald Rabeler, Raquel Ruiz, Ensieh Saffari, Hani
Salloum, Wenyi Song, Stephen Tomasello, Tran Tung,
Maria F. Wangsahamidjaja, Chang-Ming Yang, Mohammed I. Yousuf, Hui Zhang and Gerrit Slavenburg.
–Luis Lucas
PRELIMINARY INFORMATION
PNX1300/01/02/11 Data Book
Table of Contents
Foreword
1 Pin List
1.1 PNX1300 Series versus TM-1300
1.2 Boundary Scan Notice
1.3 I/O Circuit Summary
1.4 Signal Pin List
1.5 Power Pin List
1.6 Pin Reference Voltage
1.7 Package
1.8 Ordering Information
1.9 Parametric Characteristics
1.9.1 PNX1300/01/02/11 Absolute Maximum Ratings
1.9.2 PNX1300/01/02 Operating Range and Thermal Characteristics
1.9.3 PNX1311 Operating Range and Thermal Characteristics
1.9.4 PNX1300/01/02/11 Power Supply Sequencing
1.9.5 PNX1300/01/02 DC/AC Characteristics
1.9.6 PNX1311 DC/AC Characteristics
1.9.7 PNX1300 Series Power Consumption
1.9.7.1 Power Consumption for Applications on PNX1300 Series
1.9.7.2 PNX1300/01/02 DSPCPU Core Current and Power Consumption
1.9.7.3 PNX1311 DSPCPU Core Current and Power Consumption Details
1.9.7.4 PNX1300/01/02 Current Consumption For On-Chip Peripherals
1.9.7.5 PNX1311 Current Consumption For On-Chip Peripherals
1.9.7.6 STRG3, STRG5 type I/O circuit
1.9.7.7 NORM3 type I/O circuit
1.9.7.8 WEAK5 type I/O circuit
1.9.7.9 IICOD (I2C) type I/O circuit
1.9.7.10 SDRAM interface timing for PNX1300/01/02/11 speed grades
1.9.7.11 PCI Bus timing
1.9.7.12 JTAG I/O timing
1.9.7.13 I2C I/O timing
1.9.7.14 Video In I/O Timing
1.9.7.15 Video Out I/O Timing
1.9.7.16 Audio In I/O timing
1.9.7.17 Audio Out I/O timing
1.9.7.18 SSI I/O timing
2 Overview
2.1 Introduction
2.2 PNX1300 Fundamentals
2.3 PNX1300 Chip Overview
2.4 Brief Examples of Operation
2.4.1 Video Decompression in a PC
2.4.2 Video Compression
2.5 Introduction to PNX1300 Blocks
2.5.1 Internal ‘Data Highway’ Bus
2.5.2 VLIW Processor Core
2.5.3 Video In Unit
2.5.4 Enhanced Video Out Unit
2.5.5 Image Coprocessor
2.5.6 Variable-Length Decoder (VLD)
2.5.7 Audio In and Audio Out Units
2.5.8 S/PDIF Out Unit
2.5.9 Synchronous Serial Interface
2.5.10 I2C Interface
2.6 New In PNX1300 (Versus TM-1300)
2.7 New In PNX1300 (Versus TM-1100)
2.8 New In PNX1300 (Versus TM-1000)
3 DSPCPU Architecture
3.1 Basic Architecture Concepts
3.1.1 Register Model
3.1.2 Basic DSPCPU Execution Model
3.1.3 PCSW Overview
3.1.4 SPC and DPC—Source and Destination Program Counter
3.1.5 CCCOUNT—Clock Cycle Counter
3.1.6 Boolean Representation
3.1.7 Integer Representation
3.1.8 Floating Point Representation
3.1.9 Addressing Modes
3.1.10 Software Compatibility
3.2 Instruction Set Overview
3.2.1 Guarding (Conditional Execution)
3.2.2 Load and Store Operations
3.2.3 Compute Operations
3.2.4 Special-Register Operations
3.2.5 Control-Flow Operations
3.3 PNX1300 Instruction Issue Rules
3.4 Memory and MMIO
3.4.1 Memory Map
3.4.2 The Memory Hole
3.4.3 MMIO Memory Map
3.5 Special Event Handling
3.5.1 RESET
3.5.2 EXC (Exceptions)
3.5.3 INT and NMI (Maskable and Non-Maskable Interrupts)
3.5.3.1 Interrupt vectors
3.5.3.2 Interrupt modes
3.5.3.3 Device interrupt acknowledge
3.5.3.4 Interrupt priorities
3.5.3.5 Interrupt masking
3.5.3.6 Software interrupts and acknowledgment
3.5.3.7 NMI sequentialization
3.5.3.8 Interrupt source assignment
3.6 PNX1300 to Host Interrupts
3.7 Host to PNX1300 Interrupts
3.8 Timers
3.9 Debug Support
3.9.1 Instruction Breakpoints
3.9.2 Data Breakpoints
4 Custom Operations for Multimedia
4.1 Custom Operations Overview
4.1.1 Custom Operation Motivation
4.1.2 Introduction to Custom Operations
4.1.3 Example Uses of Custom Ops
4.2 Example 1: Byte-Matrix Transposition
4.3 Example 2: MPEG Image Reconstruction
4.4 Example 3: Motion-Estimation Kernel
4.4.1 A Simple Transformation
4.4.2 More Unrolling
5 Cache Architecture
5.1 Memory System Overview
5.2 DRAM Aperture
5.3 Data Cache
5.3.1 General Cache Parameters
5.3.2 Address Mapping
5.3.3 Miss Processing Order
5.3.4 Replacement Policies, Coherency
5.3.5 Alignment, Partial-Word Transfers, Endian-ness
5.3.6 Dual Ports
5.3.7 Cache Locking
5.3.8 Memory Hole and PCI Aperture Disable
5.3.9 Non-cacheable Region
5.3.10 Special Data Cache Operations
5.3.10.1 Copyback and invalidate operations
5.3.10.2 Data cache tag and status operations
5.3.10.3 Data cache allocation operation
5.3.10.4 Data cache prefetch operation
5.3.11 Memory Operation Ordering
5.3.12 Operation Latency
5.3.13 MMIO Register References
5.3.14 PCI Bus References
5.3.15 CPU Stall Conditions
5.3.16 Data Cache Initialization
5.4 Instruction Cache
5.4.1 General Cache Parameters
5.4.2 Address Mapping
5.4.3 Miss Processing Order
5.4.4 Replacement Policy
5.4.5 Location of Program Code
5.4.6 Branch Units
5.4.7 Coherency: Special iclr Operation
5.4.8 Reading Tags and Cache Status
5.4.9 Cache Locking
5.4.10 Instruction Cache Initialization and Boot Sequence
5.5 LRU Algorithm
5.5.1 Two-Way Algorithm
5.6 Cache Coherency
5.6.1 Example 1: Data-Cache/Input-Unit Coherency
5.6.2 Example 2: Data-Cache/Output-Unit Coherency
5.6.3 Example 3: Instruction-Cache/Data-Cache Coherency
5.6.4 Example 4: Instruction-Cache/Input-Unit Coherency
5.6.5 Four-Way Algorithm
5.6.6 LRU Initialization
5.6.7 LRU Bit Definitions
5.6.8 LRU for the Dual-Ported Cache
5.7 Performance Evaluation Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5.8 MMIO Register Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
6 Video In
6.1 video in overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
6.1.1 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
6.1.2 Diagnostic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.1.3 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.1.4 Hardware and Software Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.2 Clock Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6.3 Fullres Capture Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6.4 Halfres Capture Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
6.5 Raw Capture Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
6.6 Message-Passing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
6.6.1 VI_DVALID in Message Passing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
6.7 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
7 Enhanced Video Out
7.1 Enhanced Video Out Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.2 About This Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.3 Backward Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.4 Function Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.4.1 Detailed Feature Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.4.2 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.5 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.6 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.7 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.8 Image Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
7.8.1 CCIR 656 Pixel Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
7.8.2 CCIR 656 Line Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
7.8.3 SAV and EAV Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7.8.4 Video Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.8.5 CCIR 656 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.9 Enhanced Video Out Timing Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.9.1 Active Video Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.9.2 SAV and EAV Overlap Period . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.9.3 Control of Frame and Image Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.9.4 Horizontal and Frame Timing Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.10 Genlock Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7.11 Data Transfer Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.12 Image Data Memory Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.12.1 Video Image Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.12.2 Planar Storage of Video Image Data in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.12.3 Graphics Overlay Image Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.13 Video Image Conversion Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.13.1 YUV 4:2:2 Interspersed to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.13.2 YUV 4:2:0 to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.13.3 YUV-2x Upscaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.13.4 Pixel Mirroring for Four-tap Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.14 EVO Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7.15 Video Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7.15.1 Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7.15.2 Chroma Keying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7.15.3 Programmable Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7.16 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7.16.1 VO Status Register (VO_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
7.16.2 VO Control Register (VO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17
7.16.3 VO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
7.16.4 EVO Control Register (EVO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
7.16.5 EVO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7.17 Enhanced Video Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7.17.1 Video Refresh Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7.18 Frame and field timing control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7.18.1 Recommended values for timing registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7.18.2 Data-transfer Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7.18.3 Interrupts and Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7.18.4 Latency and Bandwidth Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24
7.18.5 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24
7.19 DDS and PLL Filter Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25
8 Audio In
8.1 Audio In Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
8.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
8.3 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.3.1 PNX1300 Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.3.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.4 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.5 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
8.6 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
8.7 Audio In Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
8.8 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.9 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.10 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.11 Diagnostic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
9 Audio Out
9.1 Audio Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
9.4 Internal Clock Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
9.4.1 PNX1300 Standard Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3
9.4.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.5 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.6 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.6.1 Serial Frame Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5
9.6.2 I2S Serial Framing Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
9.7 Codec Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
9.8 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-7
9.9 Audio Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8
9.10 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-9
9.11 Timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10
9.12 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10
9.13 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10
9.14 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11
10 SPDIF Out
10.1 SPDIF Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.3.1 SPDIF Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.3.2 Transparent DMA Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.4 IEC-958 Serial Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10.5 IEC-958 Bit Cell and Pre-amble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10.6 IEC-958 Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.7 IEC-958 Memory Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.8 Sample Rate Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.9 Transparent Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.10 DMA Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.11 DMA Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.12 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.13 Timestamps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.14 MMIO Register Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
10.15 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10.16 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10.17 HBE and Highway Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10.18 Literature References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7
11 PCI Interface
11.1 PCI Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-1
11.2 PCI Interface as an Initiator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2.1 DSPCPU Single-Word Loads/Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2.2 I/O Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2.3 Configuration Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2.4 DMA Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.3 PCI Interface as a Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.4 Transaction Concurrency, Priorities, and Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5 Registers Addressed in PCI Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5.1 Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5.2 Device ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5.3 Command Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5.4 Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-5
11.5.5 Revision ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.5.6 Class Code Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.5.7 Cache Line Size Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.5.8 Latency Timer Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.5.9 Header Type Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.5.10 Built-In Self Test Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.5.11 Base Address Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.5.12 Subsystem ID, Subsystem Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.5.13 Expansion ROM Base Address Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.5.14 Interrupt Line Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.5.15 Interrupt Pin Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.5.16 Max_Lat, Min_Gnt Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.6 Registers in MMIO Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.6.1 DRAM_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.6.2 MMIO_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
11.6.3 MMIO/DRAM_BASE updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-10
11.6.4 BIU_STATUS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11
11.6.5 BIU_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11
11.6.6 PCI_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12
11.6.7 PCI_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12
11.6.8 CONFIG_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12
11.6.9 CONFIG_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13
11.6.10 CONFIG_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13
11.6.11 IO_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13
11.6.12 IO_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13
11.6.13 IO_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13
11.6.14 SRC_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14
11.6.15 DEST_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14
11.6.16 DMA_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14
11.6.17 INT_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15
11.7 PCI Bus Protocol Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15
11.7.1 Single-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16
11.7.2 Multi-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16
11.8 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.8.1 Bus Locking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.8.2 No Expansion ROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.8.3 No Cacheline Wrap Address Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.8.4 No Burst for I/O or Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
11.8.5 Word-Only MMIO Register Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17
12 SDRAM Memory System
12.1 New in PNX1300/01/02/11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.2 PNX1300 Main Memory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.3 Main-Memory Address Aperture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.4 Memory Devices Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.4.1 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.4.2 SGRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.5 Memory Granularity and Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.6 Memory System Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.6.1 MM_CONFIG Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.6.2 PLL_RATIOS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-4
12.7 Memory Interface Pin List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5
12.8 Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5
12.8.1 Address Mapping in 32-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5
12.8.2 Address Mapping in 16-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
12.9 Memory Interface and SDRAM Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
12.10 On-Chip SDRAM Interleaving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
12.11 Refresh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
12.12 Power-Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.13 Output Driver Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.14 Signal Propagation Delay Compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.15 Circuit Board Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.15.1 General Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.15.2 Specific Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.15.3 Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.16 Timing Budget . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.16.1 Main AC Parameter requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.17 Example Block Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.17.1 Block Diagrams for a 32-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.17.1.1 16-Mbit Devices or Less . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.17.1.2 64-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10
12.17.1.3 128-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-13
12.17.1.4 256-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16
12.17.2 Block Diagrams for a 16-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-17
13 System Boot
13.1 Boot Sequence Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-1
13.2 Boot Hardware Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
13.2.1 Boot Procedure Common to Both Autonomous and Host-Assisted Bootstrap . . . . . . . . . . . . 13-2
13.2.2 Initial DSPCPU Program Load for Autonomous Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
13.3 Host-Assisted Boot Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.3.1 Stage 1: PNX1300 System Boot Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.3.2 Stage 2: Host-System PCI Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.3.3 Stage 3: PNX1300 Driver Executing on the Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.4 Detailed EEPROM Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7
13.5 EEPROM Access Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-9
14 Image Coprocessor
14.1 Image Coprocessor Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1
14.2 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 14-1
14.2.1 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 14-1
14.2.2 Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1
14.2.3 Image Size and Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.3 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 14-3
14.4 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 14-3
14.4.1 Image Input Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.4.1.1 YUV 4:2:2 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.4.1.2 YUV 4:2:2 Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.4.1.3 YUV 4:2:0 XY Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.4.1.4 YUV 4:1:1 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3
14.4.2 Image Overlay Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5
14.4.3 Alpha Blending Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5
14.4.4 Output Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5
14.5 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
14.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
14.5.2 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
14.5.3 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
14.5.4 YUV to RGB Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9
14.5.5 Overlay and Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9
14.5.6 Dithering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10
14.5.7 Implementation Overview: Horizontal Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . 14-11
14.5.7.1 Loading the extra pixels in the filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12
14.5.7.2 Mirroring pixels at the ends of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12
14.5.7.3 Horizontal filter SDRAM timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12
14.5.8 Implementation Overview: Vertical Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
14.5.8.1 Mirroring lines at the ends of an image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15
14.5.8.2 Vertical filter SDRAM block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15
14.5.9 Horizontal Scaling and Filtering for RGB Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15
14.5.9.1 YUV sequence counter in YUV 4:2:2 output Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15
14.5.9.2 PCI output block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16
14.6 Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16
14.6.1 ICP Register Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17
14.6.2 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17
14.6.3 ICP Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18
14.6.4 ICP Microprogram Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18
14.6.5 ICP Processing Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18
14.6.6 Priority Delay and ICP Minimum Bus Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-21
14.6.7 ICP Parameter Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22
14.6.8 Load Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22
14.6.9 Horizontal Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22
14.6.9.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22
14.6.9.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22
14.6.9.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-23
14.6.10 Vertical Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24
14.6.10.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24
14.6.10.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24
14.6.10.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25
14.6.11 Horizontal Filter with RGB/YUV Conversion to PCI or SDRAM . . . . . . . . . . . . . . . . . . . . . . 14-25
14.6.11.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25
14.6.11.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-26
14.6.11.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-27
15 Variable Length Decoder
15.1 VLD Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1
15.2 VLD Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1
15.3 Decoding Up to a Slice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2
15.4 VLD Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2
15.5 VLD Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3
15.5.1 Macroblock Header Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3
15.5.2 Run-Level Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.6 VLD Time Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.7 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.7.1 VLD Status (VLD_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.7.2 VLD Interrupt Enable (VLD_IMASK) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.7.3 VLD Control (VLD_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.8 VLD DMA Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.8.1 DMA Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.8.2 Macroblock Header Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.8.3 Run-Level Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.9 VLD Operational Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
15.9.1 VLD Command (VLD_COMMAND) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
15.9.2 VLD Shift Register (VLD_SR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
15.9.3 VLD Quantizer Scale (VLD_QS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
15.9.4 VLD Picture Info (VLD_PI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.10 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.11 Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.12 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.13 Endian-ness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.14 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
15.15 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8
16 I2C Interface
16.1 I2C Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1
16.2 Compared to TM-1000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1
16.3 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1
16.4 I2C Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1
16.4.1 IIC_AR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1
16.4.2 IIC_DR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
16.4.3 IIC_SR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3
16.4.4 IIC_CR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-4
16.5 I2C Software Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5
16.6 I2C Hardware Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5
16.6.1 Slave NAK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-6
16.7 I2C Clock Rate Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7
17 Synchronous Serial Interface
17.1 Synchronous Serial Interface Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1
17.2 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1
17.3 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1
17.3.1 General Purpose I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-2
17.3.2 Frame Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3
17.3.3 SSI Transmit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3
17.3.4 SSI Receive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3
17.4 SSI Transmit Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5
17.4.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5
17.4.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5
17.4.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5
17.5 SSI Receive Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
17.5.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
17.5.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
17.5.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
17.6 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
17.7 Interrupt Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7
17.8 16-bit Endian-ness and Shift Direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7
17.9 SSI Test Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8
17.9.1 Remote Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8
17.9.2 Local Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8
17.10 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8
17.10.1 SSI Control Register (SSI_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-9
17.10.2 SSI Control/Status Register (SSI_CSR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-11
17.11 Timing Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12
17.12 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12
18 JTAG Functional Specification
18.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1
18.2 Test Access Port (TAP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1
18.2.1 TAP Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1
18.2.2 PNX1300 JTAG Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2
18.3 Using JTAG for PNX1300 Debug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3
18.3.1 JTAG Instruction and Data Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-4
18.3.2 JTAG Communication Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5
18.3.3 Example Data Transfer Via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5
18.3.3.1 Transferring data to TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5
18.3.3.2 Transferring data from TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6
18.3.4 JTAG Interface Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6
19 On-Chip Semaphore Assist Device
19.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1
19.2 SEM Device Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1
19.3 Constructing a 12-Bit ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1
19.4 Which SEM to Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1
19.5 Usage Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1
20 Arbiter
20.1 Arbiter Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-1
20.2 Dual Priorities with Priority Raising Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-1
20.3 Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2
20.3.1 Weighted Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2
20.3.2 Arbitration Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-3
20.4 Arbiter Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-4
20.5 Arbiter Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5
20.5.1 Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5
20.5.2 Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-6
20.6 Extended Behavior Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7
20.6.1 Extended Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7
20.6.2 Extended Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7
20.6.3 Raising Priority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8
20.6.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8
21 Power Management
21.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1
21.2 Entering and Exiting Global Power Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1
21.3 Effect Of Global Power Down On Peripherals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1
21.4 Detailed Sequence of Events For Global Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2
21.5 MMIO Register POWER_DOWN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2
21.6 Block Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2
22 PCI-XIO External I/O Bus
22.1 Summary Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1
22.1.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1
22.2 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-3
22.3 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5
22.4 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5
22.4.1 PCI-XIO Bus Interface Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5
22.4.1.1 Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6
22.4.1.2 68K Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6
22.4.1.3 x86/ISA Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6
22.4.1.4 Multiple Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6
22.5 XIO_CTL MMIO Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7
22.5.1 PCI_CLK Bus Clock Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7
22.5.2 Wait State Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8
22.6 PCI-XIO Bus Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8
22.7 PCI-XIO Bus Controller Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12
A DSPCPU Operations
A.1 Alphabetic Operation List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
A.2 Operation List By Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
alloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
allocd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-5
allocr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-6
allocx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-7
asl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-8
asli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-9
asr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-10
asri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-11
bitand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-12
bitandinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-13
bitinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14
bitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15
bitxor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-16
borrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18
curcycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-19
cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-20
dcb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-21
dinvalid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-22
dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-23
dspiadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-24
dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-25
dspidualadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-26
dspidualmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-27
dspidualsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-28
dspimul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-29
dspisub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-30
dspuadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-31
dspumul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-32
dspuquadaddui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-33
dspusub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-34
dualasr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-35
dualiclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-36
dualuclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-37
fabsval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-38
fabsvalflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-39
fadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-40
faddflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-41
fdiv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-42
fdivflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-43
feql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-44
feqlflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-45
fgeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-46
fgeqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-47
fgtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-48
fgtrflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-49
fleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-50
fleqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-51
fles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-52
flesflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-53
fmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-54
fmulflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-55
fneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-56
fneqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-57
fsign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-58
fsignflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-59
fsqrt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-60
fsqrtflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-61
fsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-62
fsubflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-63
funshift1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-64
funshift2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-65
funshift3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-66
h_dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-67
h_dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-68
h_iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-69
h_st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-70
h_st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-71
h_st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-72
hicycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-73
iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-74
iadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-75
iaddi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-76
iavgonep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-77
ibytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-78
iclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-79
iclr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-80
ident . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-81
ieql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-82
ieqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-83
ifir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-84
ifir8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-85
ifir8ui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-86
ifixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-87
ifixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-88
ifixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-89
ifixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-90
iflip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-91
ifloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-92
ifloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-93
ifloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-94
ifloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . A-95
igeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-96
igeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-97
igtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-98
igtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-99
iimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-100
ijmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-101
ijmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-102
ijmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-103
ild16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-104
ild16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-105
ild16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-106
ild16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-107
ild8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-108
ild8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-109
ild8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-110
ileq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-111
ileqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-112
iles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-113
ilesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-114
imax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-115
imin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-116
imul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-117
imulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-118
ineg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-119
ineq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-120
ineqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-121
inonzero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-122
isub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-123
isubi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-124
izero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-125
jmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-126
jmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-127
jmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-128
ld32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-129
ld32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-130
ld32r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-131
ld32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-132
lsl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-133
lsli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-134
lsr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-135
lsri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-136
mergedual16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. A-137
mergelsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-138
mergemsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . A-139
nop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-140
pack16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-141
pack16msb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . A-142
packbytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-143
pref . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-144
pref16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-145
pref32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-146
prefd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-147
prefr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-148
quadavg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-149
quadumax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . A-150
quadumin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-151
quadumulmsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . A-152
rdstatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-153
rdtag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-154
readdpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-155
readpcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-156
readspc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-157
rol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-158
roli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-159
sex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-160
sex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-161
st16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-162
st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-163
st32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-164
st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-165
st8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-166
st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-167
ubytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-168
uclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-169
uclipu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-170
ueql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-171
ueqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-172
ufir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-173
ufir8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-174
ufixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-175
ufixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-176
ufixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-177
ufixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-178
ufloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-179
ufloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-180
ufloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-181
ufloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-182
ugeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-183
ugeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-184
ugtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-185
ugtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-186
uimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-187
uld16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-188
uld16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-189
uld16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-190
uld16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-191
uld8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-192
uld8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-193
uld8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-194
uleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-195
uleqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-196
ules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-197
ulesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-198
ume8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-199
ume8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-200
umin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-201
umul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-202
umulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-203
uneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-204
uneqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-205
writedpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-206
writepcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-207
writespc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-208
zex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-209
zex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-210
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-212
B MMIO Register Summary
B.1 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
C Endian-ness
C.1 Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
C.2 Little and Big Endian Addressing Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
C.3 Test to Verify the Correct Operation of PNX1300 in Big and Little Endian Systems . . . . . . . . . . . . . . C-2
C.4 Requirement for the PNX1300 to Operate in Either Little Endian or Big Endian Mode . . . . . . . . . . . . C-2
C.4.1 Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
C.4.2 Instruction Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
C.4.3 PNX1300 PCI Interface Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
C.4.4 Image Coprocessor (ICP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
C.4.5 Video In (VI) and Video Out (VO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7
C.4.6 Audio In (AI), Audio-Out (AO), and SPDIF Out (SDO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . C-7
C.4.7 Variable Length Decoder (VLD) Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7
C.4.8 Synchronous Serial Interface (SSI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-8
C.4.9 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9
C.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . C-9
C.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9
Index
Pin List
Chapter 1
by John Chang, Wenyi Song, Thorwald Rabeler, Luis Lucas
1.1 PNX1300 SERIES VERSUS TM-1300
The following summarizes differences between TM-1300 and PNX1300/01/02/11:
• Lower core voltage for PNX1311 (2.2V core voltage) and therefore lower power consumption.
• DSPCPU speed of up to 200 MHz.
• SDRAM speed of up to 183 MHz.
• Support for 256 Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer to Chapter 12,
  “SDRAM Memory System” for details.
• Support for 16- and 32-bit Main Memory Interface.
• Simplified power supplies sequencing (see Section 1.9.4).
• Additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal.
• Bug fixed for PCI Special Cycles. PNX1300 Series discards PCI Special Cycles issued by some PCI chipsets.
• Autonomous boot bug in non-1:1 ratio is fixed, resulting in 2KB boot EEPROM size for all CPU:SDRAM ratios.
In this document, ‘PNX1300 Series’ is used interchangeably with ‘PNX1300/01/02/11’, and it always refers to the
PNX1300, PNX1301, PNX1302 and PNX1311 products. Any exception will be noted.
1.2 BOUNDARY SCAN NOTICE
PNX1300 Series implements full IEEE 1149.1 boundary scan. Any PNX1300 Series pin designated “IN” only (from a
functionality point of view) can become an output during boundary scan.
1.3 I/O CIRCUIT SUMMARY
PNX1300 Series has a total of 169 functional pins, excluding VDDQ, VSSQ, VREF_PCI, VREF_PERIPH and the digital
power/ground pins. PNX1300 Series uses the types of I/O circuits shown in the table below.

Pad Type  Pad Type Description
PCI       PCI2.1 compliant I/O, capable of using 3.3-V or 5-V PCI signaling conventions.
PCIOD     PCI2.1 compliant Open Drain I/O, capable of using 3.3-V or 5-V PCI signaling conventions.
IICOD     Open drain 3.3-V or 5-V I2C I/O (for I2C pins).
STRG3     3.3-V only low impedance I/O. Requires a board level 27-33 ohm series terminator resistor to match
          a 50-ohm PCB trace.
NORM3     3.3-V only I/O circuit with regular drive strength and board trace matched drive impedance.
STRG5     3.3-V low impedance output, combined with 5-V tolerant input. If used as output, it requires a board
          level 27-33 ohm series terminator resistor to match a 50-ohm PCB trace.
WEAK5     3.3-V regular impedance output, with slow rise/fall, combined with 5-V tolerant input.

For the pins with 5-V input capability, the special pins VREF_PCI or VREF_PERIPH determine 3.3- or 5-V input
tolerance, as per the table in Section 1.6. The above pad types are used in the modes listed in the following table.

Modes  Description
IN     Input only, except during boundary scan
OUT    Output only, except during boundary scan
OD     Open drain output - active pull low, no active drive high, requires external pull-up
I/O    Output or input
I/OD   Open drain output with input - active pull low, no active drive high, requires external pull-up

Unused pins may remain floating, i.e. unconnected.
Every pin that drives a clock should do so through a series resistor.
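The 27-33 ohm figure quoted for the STRG3 and STRG5 pads is consistent with the usual series (source) termination rule, in which the driver output impedance plus the series resistor should roughly equal the trace impedance. The Python sketch below only illustrates that arithmetic. The function name is hypothetical, and the driver impedance it prints is inferred from the rule, not a value specified in this data book.

```python
def implied_driver_impedance(trace_ohm: float, series_ohm: float) -> float:
    """Driver output impedance that, added to the series resistor,
    matches the trace impedance (series termination rule of thumb)."""
    return trace_ohm - series_ohm


if __name__ == "__main__":
    # 50-ohm trace with the two ends of the recommended resistor range.
    for series in (27.0, 33.0):
        driver = implied_driver_impedance(50.0, series)
        print(f"{series:.0f} ohm series resistor -> about {driver:.0f} ohm implied driver impedance")
```

Actual pad output impedance should be taken from the electrical characteristics of the part or from board-level simulation.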
1.4 SIGNAL PIN LIST
In the table below, a pin name ending in a ‘#’ designates an active-low signal (the active state of the signal is a low
voltage level). All other signals have active-high polarity.
Main Clock Interface
Pin Name        BGA Ball  Pad Type  Mode  Description
TRI_CLKIN       L20       NORM3     IN    Main input clock. The SDRAM clock outputs (MM_CLK0 and MM_CLK1) can be set
                                          to 2x or 3x this frequency. The on-chip DSPCPU clock (DSPCPU_CLK) can be
                                          set to 1x, 5/4, 4/3, 3/2 or 2x the SDRAM clock frequency. Maximum
                                          recommended ppm level is +/- 100 ppm or lower to improve jitter on
                                          generated clocks. Duty cycle should not exceed 30/70% asymmetry.
                                          The operating limits of the internal PLLs are:
                                          • 27 MHz < Output of the SDRAM PLL < 200 MHz
                                          • 33 MHz < Output of the CPU PLL < 266 MHz
                                          These are not the speed grades of the chips, just the PLL limits.
VDDQ            K20       N/A       PWR   Quiet VDD for the PLL subsystem. This pin should be supplied from VDD
                                          through a low-Q series inductor. It should be bypassed for AC to VSSQ,
                                          using a dual capacitor bypass (high and low frequency AC bypass).
VSSQ            L19       N/A       GND   Quiet VSS for the PLL subsystem. Should be AC bypassed to VDDQ, but should
                                          otherwise be left DC floating. It is connected on-chip to VSS. No external
                                          coil or other connection to board ground is needed; such a connection
                                          would create a ground loop.
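As a rough illustration of the clock relationships described for TRI_CLKIN, the following Python sketch derives the SDRAM and DSPCPU clock frequencies from an input clock and checks them against the PLL operating limits quoted above. The function name clock_plan and its return layout are hypothetical and not part of any Philips tool. The check covers only the PLL limits, not the speed grade of a particular part.

```python
from fractions import Fraction

# Ratios and limits taken from the TRI_CLKIN description.
SDRAM_MULTIPLIERS = (2, 3)
CPU_RATIOS = (Fraction(1), Fraction(5, 4), Fraction(4, 3), Fraction(3, 2), Fraction(2))
SDRAM_PLL_LIMITS_MHZ = (27, 200)  # strict bounds on the SDRAM PLL output
CPU_PLL_LIMITS_MHZ = (33, 266)    # strict bounds on the CPU PLL output


def clock_plan(clkin_mhz, sdram_mult, cpu_ratio):
    """Return (sdram_mhz, cpu_mhz, within_pll_limits) for one configuration."""
    if sdram_mult not in SDRAM_MULTIPLIERS:
        raise ValueError("SDRAM clock must be 2x or 3x TRI_CLKIN")
    cpu_ratio = Fraction(cpu_ratio)
    if cpu_ratio not in CPU_RATIOS:
        raise ValueError("CPU:SDRAM ratio must be 1, 5/4, 4/3, 3/2 or 2")
    sdram = Fraction(clkin_mhz) * sdram_mult
    cpu = sdram * cpu_ratio
    within = (SDRAM_PLL_LIMITS_MHZ[0] < sdram < SDRAM_PLL_LIMITS_MHZ[1]
              and CPU_PLL_LIMITS_MHZ[0] < cpu < CPU_PLL_LIMITS_MHZ[1])
    return float(sdram), float(cpu), within


if __name__ == "__main__":
    # 50 MHz input, SDRAM at 2x, CPU at 3/2 of the SDRAM clock.
    print(clock_plan(50, 2, Fraction(3, 2)))  # (100.0, 150.0, True)
    # 66 MHz input, SDRAM at 3x, CPU at 2x: the CPU PLL limit is exceeded.
    print(clock_plan(66, 3, 2))               # (198.0, 396.0, False)
```

A configuration that passes this check may still exceed the rated DSPCPU or SDRAM speed of a given device, so the result should be compared with the speed grade separately.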
Miscellaneous System Interface
Pin Name        BGA Ball  Pad Type  Mode  Description
TRI_RESET#      G19       WEAK5     IN    PNX1300/01/02/11 RESET input. This pin can be tied to the PCI RST# signal
                                          in PCI bus systems. Upon releasing RESET, PNX1300/01/02/11 initiates its
                                          boot protocol.
BOOT_CLK        T20       NORM3     IN    Used for testing purposes. Must be connected to TRI_CLKIN for normal
                                          operation.
TESTMODE        P19       NORM3     IN    Used for testing purposes. Must be connected to VSS for normal operation.
SCANCPU         D20       NORM3     IN    Used for testing purposes. Must be connected to VSS for normal operation.
RESERVED1       E19       NORM3     I/O   Reserved pin. Has to be left unconnected for normal operation.
RESERVED2       D19       STRG5     I/O   Reserved pin. Has to be left unconnected for normal operation.
VREF_PCI        F2        N/A       PWR   VREF_PCI determines the mode of operation of the PCI pins listed in
                                          Section 1.6. VREF_PCI must be connected to 5V for use in a 5-V PCI
                                          signaling environment or to VSS (0 V) for use in a 3.3-V PCI signaling
                                          environment. The supply to this pin should be AC bypassed and provide
                                          40 mA of DC sink or source capability. Note that this pin cannot be
                                          directly connected to the PCI ‘I/O designated power pins’ in a dual
                                          voltage PCI plug-in card. Board level conversion circuitry is required.
VREF_PERIPH     C18       N/A       PWR   VREF_PERIPH determines the mode of operation of the I/O pins listed in
                                          Section 1.6. VREF_PERIPH should be connected to 5V if any of the listed
                                          I/O pins must be 5-V input capable. VREF_PERIPH should be connected to
                                          VSS (0 V) if all listed I/O pins are 3.3-V only inputs. The supply to this
                                          pin should be AC bypassed and provide 40 mA of DC sink or source
                                          capability.
TRI_USERIRQ     G20       WEAK5     IN    General purpose level/edge interrupt input. Vectored interrupt source
                                          number 4.
TRI_TIMER_CLK   H19       WEAK5     IN    External general purpose clock source for timers. Max. 40 MHz.
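The VREF_PCI and VREF_PERIPH connection rules can be captured as a small board-design check. The Python sketch below is only an illustration of the rules stated in the pin descriptions. The function names and the pin labels in the example are hypothetical, and the actual set of pins governed by each reference is given in Section 1.6.

```python
def vref_pci_connection(pci_signaling_volts: float) -> str:
    """Where to tie VREF_PCI for a given PCI signaling environment."""
    if pci_signaling_volts == 5.0:
        return "5V"
    if pci_signaling_volts == 3.3:
        return "VSS"
    raise ValueError("PCI signaling must be 3.3 V or 5 V")


def vref_periph_connection(needs_5v_input: dict) -> str:
    """Where to tie VREF_PERIPH.

    needs_5v_input maps each governed pin (hypothetical labels here)
    to True if that pin must accept 5-V input levels.
    """
    return "5V" if any(needs_5v_input.values()) else "VSS"


if __name__ == "__main__":
    print(vref_pci_connection(5.0))
    # One governed pin needs 5-V tolerance, so the reference goes to 5V.
    print(vref_periph_connection({"pin_a": False, "pin_b": True}))
    # All governed pins are 3.3-V only inputs, so the reference goes to VSS.
    print(vref_periph_connection({"pin_a": False, "pin_b": False}))
```

Neither function models the 40 mA sink/source requirement or the AC bypassing, which remain board-level design items.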
BGA
Ball
Pad
Type
Mode
Description
MM_CLK0
MM_CLK1
Y10
W10
STRG3
OUT
SDRAM output clock at 2x or 3x TRI_CLKIN frequency. Two identical outputs are provided to reliably drive several small memory configurations without external glue.
A series terminating resistor close to PNX1300/01/02/11 is required to reduce ringing.
For driving a 50-ohm trace, a resistor of 27 to 33 ohm is recommended. Higher-impedance traces are not recommended for the SDRAM signals.
MM_A00
MM_A01
MM_A02
MM_A03
MM_A04
MM_A05
MM_A06
MM_A07
MM_A08
MM_A09
MM_A10
MM_A11
MM_A12
MM_A13
W12
Y12
W11
Y11
Y9
W9
V9
Y8
W8
Y7
V12
Y13
W13
Y14
NORM3
OUT
Main memory address bus; used for row and column addresses
MM_DQ00
MM_DQ01
MM_DQ02
MM_DQ03
MM_DQ04
MM_DQ05
MM_DQ06
MM_DQ07
MM_DQ08
MM_DQ09
MM_DQ10
MM_DQ11
MM_DQ12
MM_DQ13
MM_DQ14
MM_DQ15
MM_DQ16
MM_DQ17
MM_DQ18
MM_DQ19
MM_DQ20
MM_DQ21
MM_DQ22
MM_DQ23
MM_DQ24
MM_DQ25
MM_DQ26
MM_DQ27
MM_DQ28
MM_DQ29
MM_DQ30
MM_DQ31
Y20
V18
W19
W20
U18
V19
V20
T18
W18
V17
Y18
W17
Y17
W16
Y16
V15
W7
Y6
W6
V6
Y5
W5
Y4
W4
V2
V3
W1
W2
Y1
Y2
W3
Y3
NORM3
I/O
32-bit data I/O bus.
The Main Memory Interface unit also supports a 16-bit I/O interface. Refer to Chapter
12, “SDRAM Memory System.”
MM_CKE0
MM_CKE1
Y19
U1
NORM3
OUT
Clock enable output to SDRAMs. Two identical outputs are provided in order to reliably
drive several small memory configurations without external glue.
MM_CS0#
MM_CS1#
MM_CS2#
MM_CS3#
U2
U20
U3
U19
NORM3
OUT
Chip select for DRAM rank n; active low
In PNX1300/01/02/11 the chip select pins may be used as address pins to support
the 256 Mbit SDRAM device organized in x16. Refer to Chapter 12, “SDRAM Memory
System.”
MM_RAS#
W14
NORM3
OUT
Row address strobe; active low
MM_CAS#
Y15
NORM3
OUT
Column address strobe; active low
MM_WE#
W15
NORM3
OUT
Write enable; active low
Pin Name
Main Memory Interface
WARNING: MM_A[13:11] DO NOT CONNECT DIRECTLY TO SDRAM A[13:11] pins.
Refer to Chapter 12, “SDRAM Memory System ” for accurate connection diagrams.
PNX1300/01/02/11 Data Book
Pin Name
BGA
Ball
Pad
Type
Mode
Description
MM_DQM0
MM_DQM1
MM_DQM2
MM_DQM3
T19
R18
V1
V4
NORM3
OUT
MM_DQ Mask Enable; these are byte enable signals for the 32-bit MM_DQ bus
PCI Interface (Note: current buffer design allows drive/receive from either 3.3 or 5V PCI bus)
PCI_CLK
T2
PCI
IN
All PCI input signals are sampled with respect to the rising edge of this clock. All PCI
outputs are generated based on this clock. Clock is required for normal operation of
the PCI block.
PCI_AD00
PCI_AD01
PCI_AD02
PCI_AD03
PCI_AD04
PCI_AD05
PCI_AD06
PCI_AD07
PCI_AD08
PCI_AD09
PCI_AD10
PCI_AD11
PCI_AD12
PCI_AD13
PCI_AD14
PCI_AD15
PCI_AD16
PCI_AD17
PCI_AD18
PCI_AD19
PCI_AD20
PCI_AD21
PCI_AD22
PCI_AD23
PCI_AD24
PCI_AD25
PCI_AD26
PCI_AD27
PCI_AD28
PCI_AD29
PCI_AD30
PCI_AD31
T1
R3
R2
R1
P2
P1
N2
N1
M2
M1
L2
L1
K1
K2
J1
J2
D1
D3
C1
B2
B1
C2
C3
A1
A3
C4
B4
A4
A5
C6
B6
A6
PCI
I/O
Multiplexed address and data.
PCI_C/BE#0
PCI_C/BE#1
PCI_C/BE#2
PCI_C/BE#3
M3
J3
D2
B3
PCI
I/O
Multiplexed bus commands and byte enables. High for command, low for byte enable.
PCI_PAR
H1
PCI
I/O
Even parity across AD and C/BE lines.
PCI_FRAME#
E2
PCI
I/O
Sustained tri-state. Frame is driven by a master to indicate the beginning and duration
of an access.
PCI_IRDY#
E1
PCI
I/O
Sustained tri-state. Initiator Ready indicates that the bus master is ready to complete
the current data phase.
PCI_TRDY#
F3
PCI
I/O
Sustained tri-state. Target Ready indicates that the bus target is ready to complete the
current data phase.
PCI_STOP#
G2
PCI
I/O
Sustained tri-state. Indicates that the target is requesting that the master stop the current transaction.
PCI_IDSEL
A2
PCI
IN
Used as chip select during configuration read/write cycles.
PCI_DEVSEL#
F1
PCI
I/O
Sustained tri-state. Indicates whether any device on the bus has been selected.
PCI_REQ#
B7
PCI
I/O
Driven by PNX1300/01/02/11 as PCI bus master to request use of the PCI bus.
PCI_GNT#
B5
PCI
IN
Indicates to PNX1300/01/02/11 that access to the bus has been granted.
PCI_PERR#
G1
PCI
I/O
Sustained tri-state. Parity error generated/received by PNX1300/01/02/11.
PCI_SERR#
H2
PCI
OD
System error. This signal is asserted when operating as target and detecting an
address parity error.
Pin Name
BGA
Ball
Pad
Type
Mode
Description
PCI_INTA#
PCI_INTB#
PCI_INTC#
PCI_INTD#
C9
A8
B8
A7
PCIOD
PCI
PCIOD
PCIOD
I/OD
I/O/OD
I/OD
I/OD
• Can operate as input (power up default) or output, as determined by direction control bits in PCI MMIO register INT_CTL.
• As input, a PCI_INT# pin can be used to receive PCI interrupt requests (normal PCI use is active low, level sensitive mode, but the VIC can be set to treat these as positive edge triggered mode). As input, a PCI_INT# pin can also be used as a general interrupt request pin if not needed for PCI.
• As output, the value of a PCI_INT# can be programmed through PCI MMIO registers to generate interrupts for other PCI masters.
• Whenever XIO bus functionality is active, PCI_INTB# is a push-pull CMOS I/O pin. When the XIO bus is not active and regular PCI bus functionality is activated, then PCI_INTB# has a PCI compatible open drain output.
JTAG Interface (debug access port and 1149.1 boundary scan port)
JTAG_TDI
F20
WEAK5
IN
JTAG test data input
JTAG_TDO
F18
WEAK5
I/O
JTAG test data output. This pin can either drive active low, high or float.
JTAG_TCK
F19
WEAK5
IN
JTAG test clock input
JTAG_TMS
E20
WEAK5
IN
JTAG test mode select input
Video In
VI_CLK
C20
STRG5
I/O
• If configured as input (power up default): a positive transition on this incoming video
clock pin samples all other VI_DATA input signals below if VI_DVALID is HIGH. If
VI_DVALID is LOW, VI_DATA is ignored. Clock and data rates of up to 81 MHz are
supported. PNX1300 Series supports an additional mode where VI_DATA[9:8] in
message passing mode are not affected by the VI_DVALID signal; see Section 6.6.1 on
page 6-12.
• If configured as output: programmable output clock to drive an external video A/D
converter. Can be programmed to emit integral dividers of DSPCPU_CLK.
If used as output, a board level 27-33 ohm series resistor is recommended to reduce
ringing.
VI_DVALID
A17
WEAK5
IN
VI_DVALID indicates that valid data is present on the VI_DATA lines. If HIGH, VI_DATA
will be accepted on the next VI_CLK positive edge. If LOW, no VI_DATA will be sampled. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message
passing mode are not affected by the VI_DVALID signal; see Section 6.6.1 on page 6-12.
VI_DATA0
VI_DATA1
VI_DATA2
VI_DATA3
VI_DATA4
VI_DATA5
VI_DATA6
VI_DATA7
D18
C19
B20
B19
A20
A19
C17
B18
WEAK5
IN
CCIR656 style YUV 4:2:2 data from a digital camera, or general purpose high speed
data input pins. Sampled on VI_CLK if VI_DVALID HIGH.
VI_DATA8
VI_DATA9
A18
B17
WEAK5
IN
Extension high speed data input bits to allow use of 10-bit video A/D converters in
raw10 modes. VI_DATA[8] serves as START and VI_DATA[9] as END message input in
message passing mode. Sampled on positive transitions of VI_CLK if VI_DVALID is
HIGH. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message
passing mode are not affected by the VI_DVALID signal; see Section 6.6.1 on page 6-12.
I2C Interface
IIC_SDA
R19
IICOD
I/OD
I2C serial data
IIC_SCL
R20
IICOD
I/OD
I2C clock
Video Out
VO_DATA0
VO_DATA1
VO_DATA2
VO_DATA3
VO_DATA4
VO_DATA5
VO_DATA6
VO_DATA7
P20
N19
N20
M18
M19
M20
K19
J20
WEAK5
OUT
CCIR656 style YUV 4:2:2 digital output data, or general purpose high speed data output channel. Output changes on positive edge of VO_CLK.
BGA
Ball
Pad
Type
Mode
VO_IO1
J18
WEAK5
I/O
This pin can function as HS output or as STMSG (Start Message) output.
• If set as HS output, it outputs the horizontal sync signal
• In message passing mode, this pin acts as STMSG output.
VO_IO2
H20
WEAK5
I/O
This pin can function as FS (frame sync) input, FS output or as ENDMSG output.
• If set as FS input, it can be set to respond to positive or negative edge transitions.
• If the Video Out (VO) unit operates in external sync mode and the selected transition
occurs, the VO unit sends two fields of video data. Note: this works only once after a
reset.
• In message passing mode, this pin acts as ENDMSG output.
VO_CLK
J19
STRG5
I/O
The VO unit emits VO_DATA on a positive edge of VO_CLK. VO_CLK can be configured as input (reset default) or output.
• If configured as input: VO_CLK is received from external display clock master circuitry.
• If configured as output, PNX1300/01/02/11 emits a programmable clock frequency.
The emitted frequency can be set between approx. 4 and 81 MHz with a sub-Hertz
resolution. The clock generated is frequency accurate and has low jitter properties
due to a combination of an on-chip DDS (Direct Digital Synthesizer) and VCO/PLL.
If used as output, a board level 27-33 ohm series resistor is recommended to reduce
ringing.
Pin Name
Description
Audio In (always acts as receiver, but can be master or slave for A/D timing)
AI_OSCLK
B15
STRG3
OUT
Oversampling clock. This output can be programmed to emit any frequency up to 40
MHz with a sub-Hertz resolution. It is intended for use as the 256fs or 384fs oversampling clock by the external A/D subsystem. A board level 27-33 ohm series resistor is recommended to reduce ringing.
AI_SCK
A16
STRG5
I/O
• When the Audio In (AI) unit is programmed as a serial-interface timing slave
(power-up default), AI_SCK is an input. AI_SCK receives the serial bit clock from
the external A/D subsystem. This clock is treated as fully asynchronous to the
PNX1300/01/02/11 main clock.
• When the AI unit is programmed as the serial-interface timing master, AI_SCK is an
output. AI_SCK drives the serial clock for the external A/D subsystem. The frequency is a programmable integral divisor of the AI_OSCLK frequency.
AI_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the
serial stream is variable. If used as output, a board level 27-33 ohm series resistor is
recommended to reduce ringing.
AI_SD
C15
WEAK5
IN
Serial data from external A/D subsystem. Data on this pin is sampled on positive or
negative edges of AI_SCK as determined by the CLOCK_EDGE bit in the AI_SERIAL
register.
AI_WS
B16
WEAK5
I/O
• When the AI unit is programmed as the serial-interface timing slave (power-up
default), AI_WS acts as an input. AI_WS is sampled on the same edge as selected
for AI_SD.
• When Audio In is programmed as the serial-interface timing master, AI_WS acts as
an output. It is asserted on the opposite edge of the AI_SD sampling edge.
AI_WS is the word-select or frame-synchronization signal from/to the external A/D
subsystem.
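As a rough illustration of the clock relationships described for AI_OSCLK and AI_SCK, the sketch below derives a 256fs or 384fs oversampling clock and the integral divisor that would produce the serial bit clock, while checking the 40 MHz and 22 MHz limits quoted above. The function names, the 48 kHz sample rate and the 64-bit frame length are illustrative assumptions, not values taken from this data book.

```python
AI_OSCLK_MAX_HZ = 40_000_000  # AI_OSCLK upper limit quoted in the pin description
AI_SCK_MAX_HZ = 22_000_000    # AI_SCK upper limit quoted in the pin description


def oversampling_clock_hz(sample_rate_hz, multiple=256):
    """Return the oversampling clock for a 256fs or 384fs converter."""
    if multiple not in (256, 384):
        raise ValueError("multiple must be 256 or 384")
    osclk_hz = sample_rate_hz * multiple
    if osclk_hz > AI_OSCLK_MAX_HZ:
        raise ValueError("requested AI_OSCLK exceeds 40 MHz")
    return osclk_hz


def sck_divisor(osclk_hz, sample_rate_hz, bits_per_frame=64):
    """Return the integral AI_OSCLK divisor for the serial bit clock.

    bits_per_frame=64 (two 32-bit slots per sample period) is an assumed
    framing used only for this example.
    """
    sck_hz = sample_rate_hz * bits_per_frame
    if sck_hz > AI_SCK_MAX_HZ:
        raise ValueError("requested AI_SCK exceeds 22 MHz")
    divisor, remainder = divmod(osclk_hz, sck_hz)
    if remainder != 0:
        raise ValueError("AI_SCK must be an integral divisor of AI_OSCLK")
    return divisor


osclk_256 = oversampling_clock_hz(48_000, 256)
osclk_384 = oversampling_clock_hz(48_000, 384)
divisor_256 = sck_divisor(osclk_256, 48_000)
divisor_384 = sck_divisor(osclk_384, 48_000)
```

The same arithmetic applies to the AO_OSCLK and AO_SCK pair, which the Audio Out entries describe with identical limits.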
Pin Name
BGA
Ball
Pad
Type
Mode
Description
Audio Out (always acts as sender, but can be master or slave for D/A timing)
AO_OSCLK
B14
STRG3
OUT
Oversampling clock. This output can be programmed to emit any frequency up to 40
MHz, with a sub-Hertz resolution. It is intended for use as the 256fs or 384fs oversampling clock by the external D/A conversion subsystem. A board level 27-33 ohm series
resistor is recommended to reduce ringing.
AO_SCK
A14
STRG5
I/O
• When the Audio Out (AO) unit is programmed to act as the serial interface timing
slave (power up default), AO_SCK acts as input. It receives the Serial Clock from
the external audio D/A subsystem. The clock is treated as fully asynchronous to the
PNX1300/01/02/11 main clock.
• When the AO unit is programmed to act as serial interface timing master, AO_SCK
acts as output. It drives the serial clock for the external audio D/A subsystem. The
clock frequency is a programmable integral divisor of the AO_OSCLK frequency.
AO_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the
serial stream is variable. If used as output, a board level 27-33 ohm series resistor is
recommended to reduce ringing.
AO_SD1
B13
WEAK5
OUT
Serial data to external stereo audio D/A subsystem for first 2 of 8 channels. The timing
of transitions on this output is determined by the CLOCK_EDGE bit in the AO_SERIAL
register, and can be on positive or negative AO_SCK edges.
AO_SD2
A13
WEAK5
OUT
Serial data.
AO_SD3
C12
WEAK5
OUT
Serial data.
AO_SD4
B12
WEAK5
OUT
Serial data.
AO_WS
A15
WEAK5
I/O
• When the AO unit is programmed as the serial-interface timing slave (power-up
default), AO_WS acts as an input. AO_WS is sampled on the opposite AO_SCK
edge at which AO_SDx are asserted.
• When the AO unit is programmed as serial-interface timing master, AO_WS acts as
an output. AO_WS is asserted on the same AO_SCK edge as AO_SDx.
AO_WS is the word-select or frame-synchronization signal from/to the external D/A
subsystem. Each audio channel receives 1 sample for every WS period.
S/PDIF Output (Output)
SPDO
A12
STRG3
OUT
Self clocking serial data stream as per IEC958, with 1937 extensions. Note that the
low impedance output buffer requires a 27 to 33 ohm series terminator close to
PNX1300/01/02/11 in order to match the board trace impedance. This series terminator can be/must be part of the voltage divider needed to create the coaxial output
through the AC isolation transformer.
Synchronous Serial Interface (SSI) to an off-chip modem front-end
SSI_CLK
B11
WEAK5
IN
Clock signal of the synchronous serial interface to an off-chip modem analog frontend
or ISDN terminal adapter; provided by the receive channel of an external communication device.
SSI_RXFSX
A11
WEAK5
IN
Receive frame sync reference of the synchronous serial interface, provided by the
receive channel of an external communication device.
SSI_RXDATA
A10
WEAK5
IN
Receive serial data input; provided by the receive channel of an external communication device.
SSI_TXDATA
B10
WEAK5
OUT
Transmit serial data output; sent to the transmit channel of the external communication
device.
SSI_IO1
A9
WEAK5
I/O
General purpose programmable I/O. Set to input on power up.
SSI_IO2
B9
WEAK5
I/O
General purpose programmable I/O. Set to input on power up. Can also be programmed to function as the transmit channel frame synchronization reference output.
1.5
POWER PIN LIST
VSS (ground)
C5
C16
D4
D5
D16
D17
E3
E4
E17
E18
T3
T4
T17
U4
U5
U16
U17
V5
V16
H8
H9
H10
H11
H12
H13
J8
J9
J10
J11
J12
J13
K8
K9
K10
K11
K12
K13
L8
L9
L10
L11
L12
L13
M8
M9
M10
M11
M12
M13
N8
N9
N10
N11
N12
N13
VCC (3.3V I/O supply)
C7
C10
C11
C14
D6
D7
D10
D11
D14
D15
F4
F17
G3
G4
G17
G18
K3
K4
K17
K18
L3
L4
L17
L18
P3
P4
P17
P18
R4
R17
U6
U7
U10
U11
U14
U15
V7
V10
V11
V14
VDD (2.5V core supply)
C8
C13
D8
D9
D12
D13
H3
H4
H17
H18
J4
J17
M4
M17
N3
N4
N17
N18
U8
U9
U12
U13
V8
V13
1.6
PIN REFERENCE VOLTAGE
With the exception of Open Drain mode outputs, outputs always drive to a level determined by the 3.3-V I/O voltage.
VREF_PERIPH and VREF_PCI purely determine input voltage clamping, not input signal thresholds or output levels.
Inputs always in 3.3-V mode
TRI_CLKIN
BOOT_CLK
TESTMODE
SCANCPU
RESERVED1
VREF_PCI determined mode
PCI_AD00
PCI_AD01
PCI_AD02
PCI_AD03
PCI_AD04
PCI_AD05
PCI_AD06
PCI_AD07
PCI_AD08
PCI_AD09
PCI_AD10
PCI_AD11
PCI_AD12
PCI_AD13
PCI_AD14
PCI_AD15
PCI_AD16
PCI_AD17
PCI_AD18
PCI_AD19
PCI_AD20
PCI_AD21
PCI_AD22
PCI_AD23
PCI_AD24
PCI_AD25
PCI_AD26
PCI_AD27
PCI_AD28
PCI_AD29
PCI_AD30
PCI_AD31
PCI_CLK
PCI_C/BE#0
PCI_C/BE#1
PCI_C/BE#2
PCI_C/BE#3
PCI_PAR
PCI_FRAME#
PCI_IRDY#
PCI_TRDY#
PCI_STOP#
PCI_IDSEL
PCI_DEVSEL#
PCI_REQ#
PCI_GNT#
PCI_PERR#
PCI_SERR#
PCI_INTA#
PCI_INTB#
PCI_INTC#
PCI_INTD#
TRI_RESET#
VREF_PERIPH determined mode
TRI_USERIRQ
TRI_TIMER_CLK
JTAG_TDI
JTAG_TDO
JTAG_TCK
JTAG_TMS
VI_CLK
VI_DVALID
VI_DATA0
VI_DATA1
VI_DATA2
VI_DATA3
VI_DATA4
VI_DATA5
VI_DATA6
VI_DATA7
VI_DATA8
VI_DATA9
IIC_SDA
IIC_SCL
VO_IO1
VO_IO2
VO_CLK
AI_SCK
AI_SD
AI_WS
AO_SCK
AO_WS
SSI_CLK
SSI_RXFSX
SSI_RXDATA
SSI_IO1
SSI_IO2
RESERVED2
Output only pins
VO_DATA0
VO_DATA1
VO_DATA2
VO_DATA3
VO_DATA4
VO_DATA5
VO_DATA6
VO_DATA7
AI_OSCLK
AO_OSCLK
AO_SD1
AO_SD2
AO_SD3
AO_SD4
SSI_TXDATA
SPDO
SDRAM i/f (always 3.3-Volt mode)
MM_CLK0
MM_CLK1
MM_A00
MM_A01
MM_A02
MM_A03
MM_A04
MM_A05
MM_A06
MM_A07
MM_A08
MM_A09
MM_A10
MM_A11
MM_A12
MM_A13
MM_DQ00
MM_DQ01
MM_DQ02
MM_DQ03
MM_DQ04
MM_DQ05
MM_DQ06
MM_DQ07
MM_DQ08
MM_DQ09
MM_DQ10
MM_DQ11
MM_DQ12
MM_DQ13
MM_DQ14
MM_DQ15
MM_DQ16
MM_DQ17
MM_DQ18
MM_DQ19
MM_DQ20
MM_DQ21
MM_DQ22
MM_DQ23
MM_DQ24
MM_DQ25
MM_DQ26
MM_DQ27
MM_DQ28
MM_DQ29
MM_DQ30
MM_DQ31
MM_CKE0
MM_CKE1
MM_CS0#
MM_CS1#
MM_CS2#
MM_CS3#
MM_RAS#
MM_CAS#
MM_WE#
MM_DQM0
MM_DQM1
MM_DQM2
MM_DQM3
1.7
PACKAGE
HBGA292: plastic, heatsink ball grid array package; 292 balls; body 27 x 27 x 1.75 mm
SOT553-1
[Figure: SOT553-1 package outline drawing with detail X, showing the ball A1 index area and the ball grid (rows A to Y, columns 1 to 20); scale bar 0 to 20 mm.]
DIMENSIONS (mm are the original dimensions)
Symbol | mm
A max. | 2.51
A1 | 0.70 / 0.50
A2 | 1.83 / 1.63
b | 0.90 / 0.60
D | 27.2 / 26.8
D1 | 24.1 / 23.9
E | 27.2 / 26.8
E1 | 24.1 / 23.9
e | 1.27
e1 | 24.13
∅j | 21.0 / 15.4
k | 4.2 / 3.8
v | 0.2
w | 0.2
y | 0.15
y1 | 0.25
1.8
ORDERING INFORMATION
To order 143-MHz/2.5V product, part number is ‘PNX1300’, 12 nc product code 9352 7097 6557.
To order 180-MHz/2.5V product, part number is ‘PNX1301’, 12 nc product code 9352 7097 9557.
To order 200-MHz/2.5V product, part number is ‘PNX1302’, 12 nc product code 9352 7098 2557.
To order 166-MHz/2.2V product, part number is ‘PNX1311’, 12 nc product code 9352 7098 5557.
1.9
PARAMETRIC CHARACTERISTICS
1.9.1
PNX1300/01/02/11 Absolute Maximum Ratings
Permanent damage may occur if these conditions are exceeded
Symbol | Parameter | Min. | Max | Units | Notes
VDDMAX | 2.5-V core supply voltage (PNX1300/01/02/11) | -0.5 | 3.5 | V |
VCCMAX | 3.3-V I/O supply voltage | -0.5 | 4.6 | V |
VI-5V | DC input voltage on all 5-V pins | -0.5 | VX+0.5 | V | 1
VI-3.3V | DC input voltage on all 3.3-V pins | -0.5 | VCC+0.3 | V |
Tstg | Storage temperature range | -65 | 150 | Deg. C |
Tcase | Operating case temperature range | 0 | 120 | Deg. C |
HBMESD | Human Body Model Electrostatic handling for all pins | - | - | CLASS 1C | 2
MMESD | Machine Model Electrostatic handling for all pins | - | - | CLASS A | 3
Notes: 1. VX in the 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.
2. JEDEC Standard, June 2000
3. JEDEC Standard, October 1997
1.9.2
PNX1300/01/02 Operating Range and Thermal Characteristics
Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below.
Symbol | Parameter | Minimum | Typical | Maximum | Units
VDD | PNX1300/01/02 Core supply voltage | 2.375 | 2.50 | 2.625 | V
VCC | I/O supply voltage | 3.135 | 3.30 | 3.465 | V
Tcase | Operating case temperature range | 0 | | 85 | °C
Ψjt | junction to case thermal resistance | | 3.8 | | °C/W
ϑja | junction to ambient thermal resistance (natural convection) | | 15 | | °C/W
1.9.3
PNX1311 Operating Range and Thermal Characteristics
Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below.
Symbol | Parameter | Minimum | Typical | Maximum | Units
VDD | PNX1311 Core supply voltage | 2.090 | 2.20 | 2.310 | V
VCC | I/O supply voltage | 3.135 | 3.30 | 3.465 | V
Tcase | Operating case temperature range | 0 | | 85 | °C
Ψjt | junction to case thermal resistance | | 3.8 | | °C/W
ϑja | junction to ambient thermal resistance (natural convection) | | 15 | | °C/W
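The thermal figures above can be turned into a first-order estimate of case temperature for a given power dissipation. The sketch below is a simplified steady-state model under stated assumptions: natural convection, all dissipated power flowing through the quoted resistances, and junction temperature computed as ambient plus power times ϑja, with the case temperature then obtained by subtracting power times Ψjt. The function names and the 2.9 W and 25 °C inputs are illustrative, and a real design should be verified by measurement.

```python
THETA_JA_C_PER_W = 15.0   # junction to ambient, natural convection (table above)
PSI_JT_C_PER_W = 3.8      # junction to case (table above)
T_CASE_MAX_C = 85.0       # maximum operating case temperature (table above)


def junction_temp_c(ambient_c, power_w):
    """First-order junction temperature estimate."""
    return ambient_c + power_w * THETA_JA_C_PER_W


def case_temp_c(ambient_c, power_w):
    """First-order case temperature estimate derived from the junction estimate."""
    return junction_temp_c(ambient_c, power_w) - power_w * PSI_JT_C_PER_W


def max_ambient_c(power_w):
    """Highest ambient temperature that keeps the estimated case at or below 85 C."""
    return T_CASE_MAX_C - power_w * (THETA_JA_C_PER_W - PSI_JT_C_PER_W)


tj = junction_temp_c(25.0, 2.9)
tc = case_temp_c(25.0, 2.9)
ta_limit = max_ambient_c(2.9)
```

If the estimated case temperature exceeds the 85 °C limit, the model suggests that airflow, a heat sink, or one of the power reduction techniques in Section 1.9.7 would be needed.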
1.9.4
PNX1300/01/02/11 Power Supply Sequencing
Power application and power removal should obey the following rule:
VDD should never exceed VCC by more than 0.5 V
Permanent damage may occur if this rule is not observed.
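The sequencing rule can be checked against logged supply ramps. The following sketch, offered as an assumption-laden illustration, scans a list of (VDD, VCC) voltage samples and reports the indices at which VDD exceeds VCC by more than 0.5 V. The function name and the sample ramps are hypothetical and are not measured data.

```python
MAX_VDD_OVER_VCC_V = 0.5  # limit taken from the sequencing rule above


def sequencing_violations(samples):
    """Return indices of (vdd, vcc) samples where VDD exceeds VCC by more than 0.5 V."""
    return [
        index
        for index, (vdd, vcc) in enumerate(samples)
        if vdd - vcc > MAX_VDD_OVER_VCC_V
    ]


# Hypothetical ramps: VCC rising first obeys the rule, VDD rising first does not.
good_ramp = [(0.0, 0.0), (0.0, 1.5), (1.0, 3.3), (2.5, 3.3)]
bad_ramp = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (2.5, 3.3)]

good_result = sequencing_violations(good_ramp)
bad_result = sequencing_violations(bad_ramp)
```

The same check applies to power removal: VCC must not collapse while VDD is still more than 0.5 V above it.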
1.9.5
PNX1300/01/02 DC/AC Characteristics
Symbol | Parameter | Condition/Notes | Min. | Max | Units
VDD | Core supply voltage | | 2.375 | 2.625 | V
VCC | I/O supply voltage | | 3.135 | 3.465 | V
IDD-typ | Core supply current | 200 MHz CPU operation (Max. application) | | 1400 | mA
ICC-typ | I/O supply current | 183 MHz SDRAM operation (Max. application) | | 160 | mA
IDD-pdn | Core supply current | CPU power down mode; 200 MHz | | 300 | mA
ICC-pdn | I/O supply current | CPU power down mode; 183 MHz | | 50 | mA
VIH-5v | Input HIGH voltage for I/O-5 V | Note 1. All I/Os except IICOD | 2.0 | VX+0.5 | V
VIH-3.3v | Input HIGH voltage for I/O-3.3 V | All I/Os except IICOD | 2.0 | VCC+0.3 | V
VIL-5v | Input LOW voltage for I/O-5 V | All I/Os except IICOD | -0.5 | 0.8 | V
VIL-3.3v | Input LOW voltage for I/O-3.3 V | All I/Os except IICOD | -0.3 | 0.8 | V
IIL-5v | Input leakage current for I/O-5 V | 0 < VIN < 2.7V | -70 | 70 | uA
IIL-3.3v | Input leakage current for I/O-3.3 V | 0 < VIN < 2.7V | -0 | 10 | uA
CIN | Input pin capacitance | | | 8 | pF
Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.
1.9.6
PNX1311 DC/AC Characteristics
Symbol | Parameter | Condition/Notes | Min. | Max | Units
VDD | Core supply voltage | | 2.090 | 2.310 | V
VCC | I/O supply voltage | | 3.135 | 3.465 | V
IDD-typ | Core supply current | 166 MHz CPU operation (Max. application) | | 1110 | mA
ICC-typ | I/O supply current | 166 MHz SDRAM operation (Max. application) | | 145 | mA
IDD-pdn | Core supply current | CPU power down mode; 166 MHz | | 215 | mA
ICC-pdn | I/O supply current | CPU power down mode; 166 MHz | | 46 | mA
VIH-5v | Input HIGH voltage for I/O-5 V | Note 1. All I/Os except IICOD | 2.0 | VX+0.5 | V
VIH-3.3v | Input HIGH voltage for I/O-3.3 V | All I/Os except IICOD | 2.0 | VCC+0.3 | V
VIL-5v | Input LOW voltage for I/O-5 V | All I/Os except IICOD | -0.5 | 0.8 | V
VIL-3.3v | Input LOW voltage for I/O-3.3 V | All I/Os except IICOD | -0.3 | 0.8 | V
IIL-5v | Input leakage current for I/O-5 V | 0 < VIN < 2.7V | -70 | 70 | uA
IIL-3.3v | Input leakage current for I/O-3.3 V | 0 < VIN < 2.7V | -0 | 10 | uA
CIN | Input pin capacitance | | | 8 | pF
Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6.
1.9.7
PNX1300 Series Power Consumption
The power consumption of PNX1300 Series depends on the activity of the DSPCPU, the number of peripherals being used, the frequency at which the system is running, and the loads on the pins.
The first section presents the power consumption for known applications. The other power related sections present the maximum power consumption. These maximum values are obtained with a ‘fake’ application that turns on all the peripherals and runs compute-intensive code on the CPU.
1.9.7.1
Power Consumption for Applications on PNX1300 Series
Table 1-1 and Table 1-2 each present the power consumption for two typical applications:
• The DVD playback includes video display using the VO peripheral and audio streaming using the AO peripheral. The bitstream is brought into the PNX1300 system over the PCI peripheral. The VLD co-processor is used to perform the bitstream parsing. The bitstream is not scrambled, therefore the DVDD co-processor is not used and is turned off.
• The MPEG4 application includes video and audio playback of an encoded CIF stream. The bitstream is brought into the PNX1300 system over the PCI peripheral. The Video and Audio subsystems of the PNX1300 are used to render the video and sound from the decoded stream to the video monitor and speakers.
• The H263 video conferencing application includes the following steps. It captures a CCIR656 video stream at 30 frames/second using the VI peripheral. The incoming video stream is downscaled, on the fly, to SIF resolution by VI. The captured frames are then downscaled to a QSIF resolution using the ICP co-processor. The resulting QSIF image is sent over the PCI bus via the ICP co-processor to an SVGA card (PC monitor display) and encoded by the DSPCPU. The resulting bitstream is then decoded by the DSPCPU and displayed as a SIF image on the same PC monitor (also using the ICP co-processor). All encoding and decoding is done in the YUV color space. The display is in the RGB16 color space. Software is not optimized.
Three main techniques may be applied to reduce the ‘Out of the Box’ power consumption.
• Turn off the unused peripherals. Refer to Section 21.6 on page 21-2.
• Run the system at the required speed, i.e. some applications may not need to run at the full speed grade of the chip.
• Power down the system or the DSPCPU each time the DSPCPU reaches the Idle task.
A more detailed description can be found in the application note ‘TM-1300 Power Saving Features’ available at the following website:
http://www.semiconductors.philips.com/trimedia/
Table 1-1. Power Consumption of Example Applications for PNX1300/01/02 (Vdd = 2.5V)
APPLICATIONS | AFTER POWER OPTIMIZATIONS | WITHOUT POWER OPTIMIZATIONS | Optimizations: Unused Peripherals Turned Off | Optimizations: System Speed Adjustment | Optimizations: Idle task power management
DVD Playback | 2.2 W | 3.0 W @ 180 MHz | 2.6 W @ 180 MHz | 2.6 W @ 180 MHz | 2.2 W @ 180 MHz
H.263 Vconf | 1.7 W | 2.9 W @ 166 MHz | 2.7 W @ 166 MHz | 1.9 W @ 111 MHz | 1.7 W @ 111 MHz
Table 1-2. Power Consumption of Example Applications for PNX1311 (Vdd = 2.2V)
APPLICATIONS | AFTER POWER OPTIMIZATIONS | WITHOUT POWER OPTIMIZATIONS | Optimizations: Unused Peripherals Turned Off | Optimizations: System Speed Adjustment | Optimizations: Idle task power management
MPEG4 (CIF) A/V Playback | 1.2 W | 2.5 W @ 166 MHz | 2.1 W @ 166 MHz | 1.3 W @ 70 MHz | 1.2 W @ 70 MHz
H.263 Vconf | 1.5 W | 2.4 W @ 166 MHz | 2.2 W @ 166 MHz | 1.7 W @ 111 MHz | 1.5 W @ 111 MHz
As previously mentioned, Table 1-1 and Table 1-2 show that the final power consumption for a realistic application may be lower than the values reported in the next section.
Based on these results and the following section, the power consumption of PNX1300 Series, using an artificial scenario depicting an extremely demanding application, for commonly used speeds, is as follows:
• PNX1300/01/02 is < 3.4 W @ 166:133 MHz
• PNX1311 is < 2.9 W @ 166:133 MHz
• PNX1302 is < 4.0 W @ 200:133 MHz
1.9.7.2
PNX1300/01/02 DSPCPU Core Current and Power Consumption
Each device column lists Pwd / Typ / Max.
Symbol | Current/Notes | PNX1300 143:143 | PNX1301 166:133 | PNX1302 192:144 | PNX1302 200:133 | Units
PNX130x (note 1) | IDD | 225 / 1125 / 1200 | 250 / 1200 / 1300 | 300 / 1380 / 1475 | 300 / 1400 / 1525 | mA
 | ICC | 40 / 125 / 135 | 40 / 120 / 135 | 40 / 130 / 135 | 36 / 125 / 130 | mA
 | Total Power Dissipation | 0.8 / 3.2 / 3.5 | 0.8 / 3.4 / 3.7 | 0.9 / 3.9 / 4.1 | 0.9 / 4.0 / 4.2 | W
 | IDD, DSPCPU Only | - / 820 / 920 | - / 900 / 1030 | - / 1030 / 1200 | - / 1050 / 1250 | mA
 | ICC, DSPCPU Only | - / 55 / 45 | - / 50 / 45 | - / 55 / 45 | - / 55 / 45 | mA
 | Power DSPCPU Only | - / 2.2 / 2.5 | - / 2.4 / 2.7 | - / 2.8 / 3.1 | - / 2.8 / 3.3 | W
PNX130x (note 1,2) | IDD, Standby | - / 550 / - | - / 615 / - | - / 720 / - | - / 740 / - | mA
 | Power Standby | - / 1.5 / - | - / 1.7 / - | - / 1.9 / - | - / 2.0 / - | W
 | IDD, Standby + bpwd | - / 405 / - | - / 450 / - | - / 525 / - | - / 540 / - | mA
 | Power Standby + bpwd | - / 1.1 / - | - / 1.2 / - | - / 1.4 / - | - / 1.5 / - | W
Notes: 1. Consumption for PNX1300/01/02 is organized in several categories. The “Typ” column shows current consumption for a typical application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application with a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data pattern at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represents an application that heavily uses the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ” measurements reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is activated. See Chapter 21, “Power Management.”
2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL Register”), all peripherals turned off (i.e. not enabled) and all peripherals powered down (+ bpwd row).
3. Measurement accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V.
4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained. As an example, the data for CPU to SDRAM ratio 1:1 for 183:183 MHz can be calculated by using the data from the 143:143 MHz column, and scaling the currents by a factor of 1.279.
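The scaling described in note 4 is a simple frequency ratio. As a minimal sketch, assuming linear scaling holds only when the CPU to SDRAM ratio is unchanged, the helper below scales a current from the 143:143 MHz column to 183:183 MHz. The function name is illustrative, and the factor it computes (183/143) is approximately 1.28, in line with the figure quoted in the note.

```python
def scale_current_ma(current_ma, from_cpu_mhz, to_cpu_mhz):
    """Scale a current linearly with CPU frequency.

    Valid only when the CPU to SDRAM ratio is the same at both operating
    points, as note 4 requires.
    """
    return current_ma * to_cpu_mhz / from_cpu_mhz


factor = 183 / 143
# Typical IDD at 143:143 MHz is 1125 mA in the table above.
idd_typ_183 = scale_current_ma(1125, 143, 183)
```

Because the ratio condition matters, the 166:133 and 200:133 columns should not be derived from the 143:143 column in this way.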
1.9.7.3
PNX1311 DSPCPU Core Current and Power Consumption Details
Each device column lists Pwd / Typ / Max.
Symbol | Current/Notes | PNX1311 100:100 | PNX1311 143:143 | PNX1311 166:166 | PNX1311 166:133 | Units
PNX131x (note 1) | IDD | 129 / 670 / 720 | 185 / 955 / 1025 | 215 / 1110 / 1200 | 200 / 1032 / 1100 | mA
 | ICC | 28 / 87 / 100 | 40 / 125 / 140 | 46 / 145 / 170 | 37 / 123 / 130 | mA
 | Total Power Dissipation | 0.4 / 1.8 / 1.9 | 0.5 / 2.5 / 2.7 | 0.6 / 2.9 / 3.2 | 0.6 / 2.7 / 2.9 | W
 | IDD, DSPCPU Only | - / 490 / 550 | - / 700 / 785 | - / 815 / 915 | - / 756 / 880 | mA
 | ICC, DSPCPU Only | - / 38 / 31 | - / 55 / 45 | - / 65 / 55 | - / 50 / 45 | mA
 | Power DSPCPU Only | - / 1.2 / 1.3 | - / 1.7 / 1.9 | - / 2.0 / 2.2 | - / 1.8 / 2.1 | W
PNX131x (note 1,2) | IDD, Standby | - / 325 / - | - / 460 / - | - / 535 / - | - / 518 / - | mA
 | Power Standby | - / 0.8 / - | - / 1.1 / - | - / 1.3 / - | - / 1.3 / - | W
 | IDD, Standby + bpwd | - / 240 / - | - / 340 / - | - / 395 / - | - / 375 / - | mA
 | Power Standby + bpwd | - / 0.6 / - | - / 0.9 / - | - / 1.0 / - | - / 0.9 / - | W
Notes: 1. Consumption for PNX1311 is organized in several categories. The “Typ” column shows current consumption for a typical application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application with a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data pattern at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represents an application that heavily uses the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ” measurements reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is activated. See Chapter 21, “Power Management.”
2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL Register”), all peripherals turned off (i.e. not enabled) and all peripherals powered down (+ bpwd row).
3. Measurement accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V.
4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained.
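The power figures in this section appear consistent with the sum of the core and I/O supply contributions at the measurement voltages given in note 3. The sketch below, a plausibility check and not a statement of how the figures were produced, computes total power from IDD and ICC at Vdd = 2.2 V and Vcc = 3.3 V. The function name is an assumption made for illustration.

```python
def total_power_w(idd_ma, icc_ma, vdd_v=2.2, vcc_v=3.3):
    """Return total power in watts from core and I/O supply currents in mA."""
    return (vdd_v * idd_ma + vcc_v * icc_ma) / 1000.0


# PNX1311 at 166:166 MHz: IDD and ICC from the table above.
typ_power = total_power_w(1110, 145)
max_power = total_power_w(1200, 170)
```

Rounded to one decimal place, these two results agree with the Total Power Dissipation entries of 2.9 W and 3.2 W for the 166:166 MHz column. For PNX1300/01/02 the same calculation would use Vdd = 2.5 V.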
1.9.7.4
PNX1300/01/02 Current Consumption For On-Chip Peripherals
Each device column lists Pwd / Typ / Max.
Symbol | Current/Notes | PNX1300 143:143 | PNX1301 166:133 | PNX1302 192:144 | PNX1302 200:133 | Units
VO 27 MHz | IDD, running raw mode | 50 / 28 / 39 | 55 / 29 / 38 | 65 / 16 / 26 | 72 / 27 / 36 | mA
 | ICC, running raw mode | - / 9 / 17 | - / 12 / 17 | - / 12 / 17 | - / 12 / 17 | mA
VO 81 MHz | IDD, running raw mode | - / 23 / 75 | - / 33 / 54 | - / 30 / 58 | - / 47 / 72 | mA
 | ICC, running raw mode | - / 33 / 51 | - / 37 / 51 | - / 36 / 52 | - / 36 / 52 | mA
VI 27 MHz | IDD, running raw mode | 6 / 8 / 18 | 6 / 6 / 18 | 7 / 8 / 18 | 7 / 6 / 18 | mA
 | ICC, running raw mode | - / 7 / 14 | - / 6 / 14 | - / 8 / 15 | - / 9 / 15 | mA
AO 44 KHz | IDD, stereo 16-bit | 2 / 3 / 1 | 1 / 3 / 1 | 1 / 3 / 4 | 5 / 3 / 3 | mA
 | ICC, stereo 16-bit | - / 2 / 1 | - / 1 / 1 | - / 1 / 1 | - / 1 / 1 | mA
AI 44 KHz | IDD, stereo 16-bit | 1 / 2 / 2 | 1 / 3 / 3 | 1 / 3 / 2 | 1 / 3 / 3 | mA
 | ICC, stereo 16-bit | - / 1 / 1 | - / 1 / 1 | - / 1 / 1 | - / 1 / 1 | mA
SPDIF 48 KHz | IDD running PCM audio | 2 / 3 / 2 | 2 / 3 / 1 | 3 / 3 / 3 | 4 / 2 / 2 | mA
 | ICC running PCM audio | - / 3 / 3 | - / 2 / 2 | - / 2 / 2 | - / 2 / 2 | mA
ICP | IDD, mem. block move | 61 / 95 / 176 | 67 / 95 / 170 | 80 / 105 / 188 | 86 / 106 / 184 | mA
 | ICC, mem. block move | - / 28 / 28 | - / 27 / 54 | - / 30 / 61 | - / 29 / 59 | mA
PCI 33 MHz | IDD, DMA transfer | - / 37 / 83 | - / 34 / 80 | - / 32 / 83 | - / 40 / 53 | mA
 | ICC, DMA transfer | - / 58 / 102 | - / 58 / 102 | - / 58 / 104 | - / 58 / 82 | mA
VLD | IDD | 3 / - / - | 5 / - / - | 6 / - / - | 6 / - / - | mA
 | ICC | - / - / - | - / - / - | - / - / - | - / - / - | mA
SSI 10 MHz | IDD | 4 / - / - | 5 / - / - | 6 / - / - | 6 / - / - | mA
 | ICC | - / - / - | - / - / - | - / - / - | - / - / - | mA
DVDD | IDD | 18 / - / - | 21 / - / - | 24 / - / - | 24 / - / - | mA
 | ICC | - / - / - | - / - / - | - / - / - | - / - / - | mA
Notes: 1. The “Pwd” column for peripheral units indicates current savings when block powerdown is activated, compared to when it is idle. See Chapter 21, “Power Management” for block powerdown activation.
2. The “Typ” column for peripheral units indicates current required when the data pattern is random. The “Max” column indicates current ratings when data is switching from high to low level each cycle. Again, the “Max” column shows peak current and does not represent a real application. For both columns the current reported is the current required by the peripheral as well as the internal bus and MMI to transfer the data to/from the peripheral unit.
3. Some currents are not reported because they are difficult to measure or are not relevant. For example, SSI current is difficult to measure because it heavily involves the DSPCPU, which makes it almost impossible to separate the current consumed by the SSI from that consumed by the DSPCPU.
4. Measurement accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V.
5. Currents do not scale with frequency if the CPU:SDRAM ratios differ. The same ratio must be used.
1.9.7.5
PNX1311 Current Consumption For On-Chip Peripherals
Symbol        Current/Notes            PNX1311-100:100  PNX1311-143:143  PNX1311-166:166  PNX1311-166:133  Units
                                         Pwd  Typ  Max    Pwd  Typ  Max    Pwd  Typ  Max    Pwd  Typ  Max
VO 27 MHz     IDDL, running raw mode      33   17   23     47   25   33     56   29   38     48   24   31  mA
              ICC, running raw mode        -    8   12      -   12   17      -   14   20      -   25   17  mA
VO 81 MHz     IDDL, running raw mode       -   14   31      -   20   44      -   23   51      -   33   54  mA
              ICC, running raw mode        -   25   36      -   36   52      -   42   60      -   37   51  mA
VI 27 MHz     IDDL, running raw mode       3    5    8      5    7   11      6    8   13      5    7   15  mA
              ICC, running raw mode        -    6   10      -    9   15      -   10   17      -    8   15  mA
AO 44 kHz     IDDL, stereo 16-bit          4    2    1      6    3    2      7    3    2      1    2    2  mA
              ICC, stereo 16-bit           -    1    1      -    1    1      -    1    1      -    1    1  mA
AI 44 kHz     IDDL, stereo 16-bit          1    1    1      1    2    2      1    2    2      1    2    3  mA
              ICC, stereo 16-bit           -    1    1      -    1    1      -    1    1      -    1    1  mA
SPDIF 48 kHz  IDDL, running PCM audio      2    2    1      3    3    2      3    3    2      2    2    2  mA
              ICC, running PCM audio       -    1    1      -    2    2      -    2    2      -    2    2  mA
ICP           IDDL, mem. block move       40   55  101     57   79  144     66   92  167     60   76  136  mA
              ICC, mem. block move         -   19   38      -   27   55      -   31   64      -   26   54  mA
PCI 33 MHz    IDDL, DMA transfer           -   17   36      -   25   51      -   29   59      -   20   50  mA
              ICC, DMA transfer            -   41   57      -   58   82      -   67   95      -   45   81  mA
VLD           IDDL                         3    -    -      4    -    -      5    -    -      4    -    -  mA
              ICC                          -    -    -      -    -    -      -    -    -      -    -    -  mA
SSI 10 MHz    IDDL                         2    -    -      3    -    -      3    -    -      4    -    -  mA
              ICC                          -    -    -      -    -    -      -    -    -      -    -    -  mA
DVDD          IDDL                        11    -    -     16    -    -     19    -    -     18    -    -  mA
              ICC                          -    -    -      -    -    -      -    -    -      -    -    -  mA
Notes: 1. The “Pwd” column for peripheral units indicates the current saved when block powerdown is activated, compared to when the
block is idle. See Chapter 21, “Power Management” for block powerdown activation.
2. The “Typ” column for peripheral units indicates the current required when the data pattern is random. The “Max” column
indicates the current when data switches from high to low level each cycle. The “Max” column shows peak current
and does not represent a real application. For both columns, the current reported includes the current required by the
peripheral as well as by the internal bus and MMI to transfer the data to/from the peripheral unit.
3. Some currents are not reported, either because they are difficult to measure or because they are not relevant. For example,
SSI current is difficult to measure because SSI operation heavily involves the DSPCPU, which makes it almost impossible to
separate the current consumed by the SSI from that consumed by the DSPCPU.
4. Measurement accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V.
5. Currents do not scale with frequency if the CPU:SDRAM ratios differ. The same ratio must be used.
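As a rough illustration of how the Typ. figures might be used, the following sketch adds up the PNX1311-166:166 typical currents for a set of active peripherals. The choice of active units and the helper name peripheral_budget are hypothetical. Because each table entry already includes internal bus and MMI current (note 2), a plain sum is only a first-order estimate, not a guaranteed figure.

```python
# Typ. currents in mA for PNX1311-166:166, copied from the table above.
# Each entry is (IDDL, ICC). Which units are active is a made-up example.
TYP_MA_166_166 = {
    "VO 27 MHz": (29, 14),
    "VI 27 MHz": (8, 10),
    "AO 44 kHz": (3, 1),
    "AI 44 kHz": (2, 1),
    "PCI 33 MHz": (29, 67),
}


def peripheral_budget(active_units, table=TYP_MA_166_166):
    """Return (IDDL, ICC) in mA summed over the named peripheral units."""
    iddl = sum(table[unit][0] for unit in active_units)
    icc = sum(table[unit][1] for unit in active_units)
    return iddl, icc


if __name__ == "__main__":
    units = ["VO 27 MHz", "VI 27 MHz", "AO 44 kHz", "AI 44 kHz"]
    iddl_ma, icc_ma = peripheral_budget(units)
    print(f"IDDL = {iddl_ma} mA, ICC = {icc_ma} mA (typical, +/- 5%)")
```

Figures from different CPU:SDRAM columns should not be mixed or scaled, in line with note 5.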
1.9.7.6
STRG3, STRG5 type I/O circuit
PNX1300/01/02/11
Symbol  Parameter            Condition/Notes           Value (Min./Nominal/Max.)  Units
VOH     Output HIGH voltage  IOUT = 16.0 mA            0.9VCC                     V
VOL     Output LOW voltage   IOUT = -16.0 mA           0.1VCC                     V
ZOH     Output AC impedance  HIGH level output state   11                         ohm
ZOL     Output AC impedance  LOW level output state    11                         ohm
tr      Output rise time     Test load of Figure 1-1.  2.0                        ns
tf      Output fall time     Test load of Figure 1-1.  2.0                        ns
1.9.7.7
NORM3 type I/O circuit
PNX1300/01/02/11
Symbol  Parameter            Condition/Notes           Value (Min./Nominal/Max.)  Units
VOH     Output HIGH voltage  IOUT = 8.0 mA             0.9VCC                     V
VOL     Output LOW voltage   IOUT = -8.0 mA            0.1VCC                     V
ZOH     Output AC impedance  HIGH level output state   23                         ohm
ZOL     Output AC impedance  LOW level output state    23                         ohm
tr      Output rise time     Test load of Figure 1-2.  4.0                        ns
tf      Output fall time     Test load of Figure 1-2.  4.0                        ns
1.9.7.8
WEAK5 type I/O circuit
PNX1300/01/02/11
Symbol  Parameter            Condition/Notes           Value (Min./Nominal/Max.)  Units
VOH     Output HIGH voltage  IOUT = 6.0 mA             0.9VCC                     V
VOL     Output LOW voltage   IOUT = -6.0 mA            0.1VCC                     V
ZOH     Output AC impedance  HIGH level output state   33                         ohm
ZOL     Output AC impedance  LOW level output state    33                         ohm
tr      Output rise time     Test load of Figure 1-3.  4.0                        ns
tf      Output fall time     Test load of Figure 1-3.  4.0                        ns
1.9.7.9
IICOD (I2C) type I/O circuit
Symbol
V
V
V
IL-IIC
IH-IIC
HYS
Parameter
Condition/Notes
Input LOW voltage
Input HIGH voltage
VX is 3.3V or 5V depending
on VREF_PERIPH value
Input Schmitt trigger hysteresis
Min.
Nominal
Max.
1.0
V
2.3
VX+0.5
V
0.25
V
OL
Output LOW voltage
I
OUT
tf
Output fall time
10 - 400 pF load
Units
-0.5
= -6.0 mA
1.5
V
0.6
V
250
ns
1.9.7.10
SDRAM interface timing for PNX1300/01/02/11 speed grades.
Symbol  Parameter                                      PNX1300 143   PNX1301 166   PNX1301 180   PNX1311 166   PNX1302 200   Units  Notes
                                                         Min   Max     Min   Max     Min   Max     Min   Max     Min   Max
fSDRAM  MM_CLK frequency                                       143           166           166           166           183   MHz    1
TCS     Skew between MM_CLK0, CLK1                            0.05          0.05          0.05          0.05          0.05   ns     2
TPD     Propagation delay of data, address, control            4.7           4.2           4.2           4.2           3.7   ns     3
TOH     Output hold time of data, address and control    1.5           1.5           1.5           1.5           1.5         ns     3
TSU     Input data setup time                              0             0             0             0             0         ns     4
TIH     Input data hold time                             2.0           1.5           1.5           1.5           1.5         ns     4
Notes: 1. For best high speed SDRAM operation, 50-ohm matched PCB traces are recommended for all MM_xxx signals.
Use 27-33 ohm series terminator resistors close to PNX1300/01/02/11 in the MM_CLK0 and MM_CLK1 line only.
2. Equal load circuit. MM_CLK0 and MM_CLK1 are matched output buffers.
3. The center of the two rising edges on MM_CLK0, MM_CLK1 are used as the clock reference point.
Propagation delay guarantee is defined from 50% point of clock edge to 50% level on D/A/C.
Output hold time guarantee is defined from 50% point of clock edge to 50% level on D/A/C.
4. MM_CLK0 is used as a reference clock.
Input setup time requirement is defined as data value 50% complete to 50% level on clock.
Input hold time requirement is defined as minimum time from 50% level on clock to 50% change on data.
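The output timing above can be turned into a simple write-timing budget. The sketch below is only an illustration under stated assumptions: the SDRAM setup and hold requirements (1.5 ns and 1.0 ns) are hypothetical placeholders that must be taken from the actual SDRAM data sheet, and PCB trace delay and clock loading are ignored.

```python
def setup_margin_ns(f_mhz, t_pd_max_ns, t_skew_ns, sdram_setup_ns):
    """Slack left in one MM_CLK period after the worst-case output delay."""
    period_ns = 1000.0 / f_mhz
    return period_ns - t_pd_max_ns - t_skew_ns - sdram_setup_ns


def hold_margin_ns(t_oh_min_ns, t_skew_ns, sdram_hold_ns):
    """Slack between the guaranteed output hold time and the SDRAM hold need."""
    return t_oh_min_ns - t_skew_ns - sdram_hold_ns


if __name__ == "__main__":
    # PNX1300 143 column: fSDRAM = 143 MHz, TPD = 4.7 ns, TOH = 1.5 ns, TCS = 0.05 ns.
    # SDRAM setup/hold values are assumed, not taken from this data book.
    print(round(setup_margin_ns(143, 4.7, 0.05, sdram_setup_ns=1.5), 3))
    print(round(hold_margin_ns(1.5, 0.05, sdram_hold_ns=1.0), 3))
```

A negative result from either function would indicate that the chosen SDRAM part cannot be used at that speed grade without revisiting the assumptions.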
1.9.7.11
PCI Bus timing
The following specifications meet the PCI Specifications, Rev. 2.1 for 33-MHz bus operation.
Min.
Max
Units
Notes
Tval-PCI (Bus)
Symbol
Clk to signal valid delay, bused signals
Parameter
2
11
ns
1,2,3
Tval-PCI (ptp)
Clk to signal valid delay, point-to-point signals
2
12
ns
1,2,3
Ton-PCI
Float to active delay
2
TOff-PCI
Active to float delay
Tsu-PCI
Input setup time to CLK - bused signals
Tsu-PCI (ptp)
Input setup time to CLK - point-to-point signals
Th-PCI
Input hold time from CLK
Trst-PCI
Reset active time after power stable
Trst-clk-PCI
Reset active time after CLK stable
Trst-off-PCI
Reset active to output float delay
ns
1
ns
1,7
7
ns
3,4
12
ns
3,4
ns
4
28
0.2
1
1
ms
5
100
µs
5
ns
5,6,7
40
1. PCI Clock skew between two PCI devices must be lower than 1.8ns instead of the 2 ns as specified in PCI
2.1 specification
Notes: 1. See the timing measurement conditions in Figure 1-4.
2. Minimum times are measured at the package pin with the load circuit shown in Figure 1-8. Maximum times are measured
with the load circuit shown in Figure 1-6 and Figure 1-7.
3. REQ# and GNT# are point-to-point signals and have different input setup times. All other signals are bused.
4. See the timing measurement conditions in Figure 1-5.
5. RST# is asserted and de-asserted asynchronously with respect to CLK.
6. All output drivers are floated when RST# is active.
7. For the purpose of Active/Float timing measurements, the Hi-Z or ‘off’ state is defined to be when the total current delivered
through the component pin is less than or equal to the leakage current specification.
1.9.7.12
JTAG I/O timing
Symbol
Parameter
Min.
Max
Units
20
MHz
Notes
f JTAG-CLK
JTAG clock frequency
Tclk-TDO
JTAG_TCK to JTAG_TDO valid delay
2
ns
1
Tsu-TCK
Input setup time to JTAG_TCK
3
ns
2
Th-TCK
Input hold time from JTAG_TCK
7
ns
2
Max
Units
400
kHz
1
10
Notes: 1. See the timing measurement conditions in Figure 1-10.
2. See the timing measurement conditions in Figure 1-9.
1.9.7.13
I2C I/O timing
Symbol
Parameter
Min.
Notes
f SCL
SCL clock frequency
TBUF
Bus free time
1
µs
2
Tsu-STA
Start condition set up time
1
µs
3
Th-STA
Start condition hold time
1
µs
3
TLOW
SCL LOW time
1
µs
1
THIGH
SCL HIGH time
1
µs
1
Tf
SCL and SDA fall time (Cb = 10-400 pF, from VIH-IIC to VIL-IIC)
ns
1
Tsu-SDA
Data setup time
100
ns
4
Th-SDA
Data hold time
0
ns
4
Tdv-SDA
SCL LOW to data out valid
Tdv-STO
SCL HIGH to data out
20+0.1Cb
250
0.5
1
µs
5
ns
5
Notes: 1. See the timing measurement conditions in Figure 1-11.
2. See the timing measurement conditions in Figure 1-12.
3. See the timing measurement conditions in Figure 1-13.
4. See the timing measurement conditions in Figure 1-14.
5. See the timing measurement conditions in Figure 1-15.
1.9.7.14
Video In I/O Timing
Symbol
Parameter
Min.
Max
Units
81
MHz
Notes
f VI-CLK
Video In clock frequency
Tsu-CLK
Input setup time to VI_CLK
2
ns
1
Th-CLK
Input hold time from VI_CLK
2
ns
1
Max
Units
Notes
81
MHz
Notes: 1. See the timing measurement conditions in Figure 1-16.
1.9.7.15
Video Out I/O Timing
Symbol
Parameter
Min.
f VO-CLK
Video Out clock frequency
TCLK-DV
VO_CLK to VO_DATA (or VO_IO*) out
3
7.5
ns
1,3
TCLK-DV
VO_CLK to VO_DATA (or VO_IO*) out
3
7.5
ns
1,4
Tsu-CLK
VO_IO* setup time to VO_CLK
10
ns
2
Th-CLK
VO_IO* hold time from VO_CLK
3
ns
2
Notes: 1. See the timing measurement conditions in Figure 1-17.
2. See the timing measurement conditions in Figure 1-18.
3. CLKOUT asserted, i.e. the VO unit is the source of VO_CLK
4. CLKOUT negated, i.e. the external world is the source of VO_CLK
1.9.7.16
Audio In I/O timing
Symbol
Parameter
Min.
Max
Units
22
MHz
Notes
f AI-SCK
Audio In AI_SCK clock frequency
Tsu-SCK
Input setup time to AI_SCK
3
ns
1,2
Th-SCK
Input hold time from AI_SCK
2
ns
1,2
TSCK-WS
AI_SCK to AI_WS
ns
3
10
Notes: 1. See the timing measurement conditions in Figure 1-19.
2. The timing measurements are done with respect to the clock edge according to CLOCK_EDGE
3. SER_MASTER asserted, i.e. Audio In is the source of AI_WS. See the timing measurement condition in Figure 1-20.
1.9.7.17
Audio Out I/O timing
Symbol
Parameter
f AO-SCK
Audio Out AO_SCK clock frequency
TSCK-DV
AO_SCK to AO_SDx valid
TSCK-DV
AO_SCK to AO_SDx valid
Tsu-SCK
Input setup time to AO_SCK
Th-SCK
Input hold time from AO_SCK
TSCK-WS
AO_SCK to AO_WS
Min.
Max
Units
22
MHz
2
12
ns
1,3,4
2
12
ns
1,3,5
4
ns
2,3,5
2
ns
2,3,5
ns
3,4,6
10
Notes: 1. See the timing measurement conditions in Figure 1-21.
2. See the timing measurement conditions in Figure 1-23.
3. The timing measurements are done with respect to the AO_SCK clock edge according to CLOCK_EDGE
4. PNX1300/01/02/11 is the serial interface master, i.e. AO_SCK, AO_WS are outputs
5. PNX1300/01/02/11 is serial interface slave, i.e. AO_SCK, AO_WS are inputs
6. See the timing measurement conditions in Figure 1-22.
1.9.7.18
SSI I/O timing
Symbol
Parameter
Min.
Max
Units
Notes
20
MHz
1
12
ns
2
3
ns
3
2
ns
3
f SSI-CLK
SSI_CLK clock frequency
TCLK-DV
SSI_CLK to data valid
2
Tsu-CLK
Input setup time to SSI_CLK
Th-CLK
Input hold time from SSI_CLK
Notes: 1. Interrupt latency limits SSI to a practical use at a bit rate of 1.5 Mbit/sec.
2. See the timing measurement conditions in Figure 1-24.
3. See the timing measurement conditions in Figure 1-25.
Notes
Figure 1-1. STRG3, STRG5 test load circuit
Figure 1-2. NORM3 test load circuit
Figure 1-3. WEAK5 test load circuit
Figure 1-4. PCI Output Timing Measurement Conditions
Figure 1-5. PCI Input Timing Measurement Conditions
Figure 1-6. PCI Tval(max) Rising Edge
Figure 1-7. PCI Tval(max) Falling Edge
Figure 1-8. PCI Tval(min) and Slew Rate
Figure 1-9. JTAG Input Timing
Figure 1-10. JTAG Output Timing
Figure 1-11. I2C I/O Timing
Figure 1-12. I2C I/O Timing
Figure 1-13. I2C I/O Timing
Figure 1-14. I2C I/O Timing
Figure 1-15. I2C I/O Timing
Figure 1-16. Video In I/O Timing
Figure 1-17. Video Out I/O Timing
Figure 1-18. Video Out I/O Timing
Figure 1-19. Audio In I/O Timing
Figure 1-20. Audio In I/O Timing
Figure 1-21. Audio Out I/O Timing
Figure 1-22. Audio Out I/O Timing
Figure 1-23. Audio Out I/O Timing
Figure 1-24. SSI I/O Timing
Figure 1-25. SSI I/O Timing
Overview
Chapter 2
by Gert Slavenburg
2.1
INTRODUCTION
In this document, the generic PNX1300 name refers
to the PNX1300 Series, or the PNX1300/01/02/11
products.
PNX1300 is a successor to the TM-1300, TM-1100 and
TM-1000 media processors. For those familiar with the
TM-1300, the new features specific to the PNX1300 are
summarized in Section 2.6. For those familiar with the
TM-1100, the new features specific to the PNX1300 are
summarized in Section 2.7. For those familiar with the
TM-1000, new features for the PNX1300 are summarized in Section 2.8.
2.2
PNX1300 FUNDAMENTALS
PNX1300 is a media processor for high-performance
multimedia applications that deal with high-quality video
and audio. These applications can range from low-cost,
dedicated systems such as video phones, video editing,
digital television, security systems or set-top boxes to reprogrammable, multipurpose plug-in cards for personal
computers. PNX1300 easily implements popular multimedia standards such as MPEG-1 and MPEG-2, but its
orientation around a powerful general-purpose CPU
(called the DSPCPU) makes it capable of implementing
a variety of multimedia algorithms, both open and proprietary. PNX1300 is also easily configured in multiple processor configurations for very high-end applications.
More than just an integrated microprocessor with unusual peripherals, the PNX1300 is a fluid computer system
controlled by a small real-time OS kernel running on a
very-long instruction word (VLIW) processor core.
PNX1300 contains a DSPCPU, a high-bandwidth internal bus, and internal bus-mastering DMA peripherals.
Software compatibility between current and future TriMedia processor family members is at the source-code and
library API level; binary compatibility between family
members is not guaranteed.
Defining software compatibility at the source-code level
gives Philips the freedom to strike the optimum balance
between cost and performance for all chips in the family.
A powerful compiler and software development environment ensure that programmers never need to resort to
non-portable assembler programming. Programmers
use the library APIs and multimedia operations from C
and C++ source code.
PNX1300 is designed for use either as an accelerator in a
PC environment or as the sole CPU in cost-effective
standalone systems. In standalone system applications,
the PNX1300 external bus allows for glueless connection
of 8-bit wide ROM, EEPROM, or Flash memory for code
storage. The external bus also allows intermixing of
PCI 2.1 master/slave peripherals and 8-bit simple peripherals, such as UARTs and other 8-bit microprocessor peripherals. This powerful external bus architecture gives
system designers a variety of options to configure low-cost, high-performance system solutions.
Because it is based on a general-purpose CPU,
PNX1300 can also serve as a multifunctional PC enhancement vehicle. Typically, a PC must deal with
multi-standard video and audio streams, and applications require both decompression and compression. While the
CPU chips used in PCs are becoming capable of low-resolution, real-time video decompression, high-quality
decompression (not to mention compression) of studio-resolution video is still out of reach. Further, users
expect their systems to handle live video and audio without sacrificing system responsiveness.
PNX1300 enhances a PC system by providing real-time
multimedia with the advantages of a special-purpose,
embedded solution—low cost and chip count—and the
advantages of a general-purpose processor—reprogrammability. For PC applications, PNX1300 far surpasses the capabilities of fixed-function multimedia
chips.
Future media processor family members will have different sets of interfaces appropriate for their intended use.
2.3
PNX1300 CHIP OVERVIEW
Key features of PNX1300 include:
•   A very powerful, general-purpose VLIW processor core (the DSPCPU) that coordinates all on-chip
    activities. In addition to implementing the non-trivial parts of multimedia algorithms, the DSPCPU runs a
    small real-time operating system driven by interrupts from the other units.
•   Independent DMA-driven multimedia I/O units that properly format data to make software media
    processing efficient.
•   DMA-driven multimedia coprocessors that operate independently and in parallel with the DSPCPU to
    perform operations specific to important multimedia algorithms.
•   A high-performance bus and memory system that provides communication between PNX1300’s
    processing units.
•   A flexible external bus interface.
Figure 2-1 shows a PNX1300 block diagram. The bulk of
a PNX1300 system consists of the PNX1300 microprocessor itself, external synchronous DRAM (SDRAM),
and the external circuitry needed to interface to incoming
and/or outgoing video and audio data streams and communication lines. PNX1300’s external peripheral bus can
gluelessly interface to PCI 2.1 components and/or 8-bit
microprocessor peripherals.
Figure 2-2 shows a possible minimally configured
PNX1300 system. In this case, a video input stream might come directly from a CCIR 656-compliant video camera chip in
YUV 4:2:2 format through a glueless interface. An analog camera can be connected via a CCIR
656 interface chip (such as the Philips SAA7113H).
PNX1300 outputs a CCIR 656 video stream to drive a
dedicated video monitor. Stereo audio input and up to 8-channel audio output require only low-cost external ADC
and DAC. The operation of the video and audio interface
units is highly customizable through programmable parameters.
The glueless PCI interface allows the PNX1300 to display video on a host PC’s video card. The Image Coprocessor (ICP) provides display support for live video in
an arbitrary number of arbitrarily overlapped windows.
Figure 2-1. PNX1300 block diagram. Blocks shown:
    Main Memory Interface (SDRAM): 32-bit data, up to 572 MB/sec
    VLIW CPU: 32K I$, 16K D$
    Video In: CCIR656 digital video, YUV 4:2:2, up to 81 MHz (40 Mpix/sec)
    Video Out: CCIR656 digital video, YUV 4:2:2, up to 81 MHz (40 Mpix/sec)
    Audio In: stereo digital audio, 8 and 16-bit data, I2S DC, up to 22 MHz AI_SCK
    Audio Out: 2/4/6/8 ch. digital audio, 16 and 32-bit data, I2S DC, up to 22 MHz AO_SCK
    SPDIF Out: IEC958, up to 40 Mbit/sec
    Image Coprocessor: down & up scaling, YUV → RGB, 50 Mpix/sec
    VLD Coprocessor: Huffman decoder, slice-at-a-time MPEG-1 & 2
    Synchronous Serial Interface: analog modem or ISDN front end
    PCI-XIO Interface: external bus, PCI 2.1 (32 bits, 33 MHz) + glueless 24A/8D slaves
    Timers, I2C Interface, DVDD, JTAG
Figure 2-2. PNX1300 system connections. A minimal PNX1300 system requires few supporting components.
Finally, the Synchronous Serial Interface (SSI) requires
only an external ISDN or analog modem front-end chip
and phone line interface to provide remote communication support. It can be used to connect PNX1300-based
systems for video phone or videoconferencing applications, or it can be used for general-purpose data communication in PC systems.
The PNX1300 JTAG port allows a debugger on a host
system to access and control the state of a PNX1300 in
a target system. It also implements 1149.1 boundary
scan functionality.
2.4
BRIEF EXAMPLES OF OPERATION
The key to understanding PNX1300 operation is observing that the DSPCPU and peripherals are time-shared
and that communication between units is through
SDRAM memory. The DSPCPU switches from one task
to the next; first it decompresses a video frame, then it
decompresses a slice of the audio stream, then back to
video, etc. As necessary, the DSPCPU issues commands to the peripheral function units to orchestrate their
operation.
The DSPCPU can enlist the ICP and other coprocessors
to help with some of the straightforward, tedious tasks
associated with video processing. The ICP is very well
suited for arbitrary size horizontal and vertical video resizing and color space conversion.
The DSPCPU can enlist the input/output peripherals to
autonomously receive or transmit digital video and audio
data with minimal CPU supervision. The I/O units have
been designed to interface to the outside world through
industry standard audio and video interfaces, while delivering or taking data in memory in formats suitable for
software processing.
2.4.1
Video Decompression in a PC
An example PNX1300 implementation is as a video-decompression engine on a PCI card in a PC. In this case,
the PC does not need to know the PNX1300 has a powerful, general-purpose CPU; rather, the PC just treats the
hardware on the PCI card as a ‘black-box’ engine.
Video decompression begins when the PC operating
system hands the PNX1300 a pointer to compressed video data in the PC’s memory (the details of the communication protocol are handled by the software driver installed in the PC’s operating system).
The DSPCPU fetches data from the compressed video
stream via the PCI bus, decompresses frames from the
video stream, and places them into local SDRAM. Decompression may be aided by the VLD (variable-length
decoder) coprocessor unit, which implements Huffman
decoding and is controlled by the DSPCPU.
When a frame is ready for display, the DSPCPU gives
the ICP a display command. The ICP then autonomously
fetches the decompressed frame data from SDRAM and
transfers it over the PCI bus to the frame buffer in the
PC’s video display card. Alternatively, video can be sent to
the graphics card using the VO unit.
2.4.2
Video Compression
Another typical application for PNX1300 is in video compression. In this case, uncompressed video is usually
supplied directly to the PNX1300 system via the Video In
(VI) unit. A camera chip connected directly to the VI unit
supplies YUV data in 8-bit, 4:2:2 format. The VI unit samples the data from the camera chip and demultiplexes
the raw video to SDRAM in three separate areas, one
each for Y, U, and V.
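The demultiplexing step can be pictured in software. The sketch below is not a description of the VI hardware; it only assumes the standard CCIR 656 multiplex order (Cb, Y, Cr, Y) for an 8-bit 4:2:2 stream, and the function name demux_yuv422 is hypothetical.

```python
def demux_yuv422(line: bytes):
    """Split one multiplexed 4:2:2 line (Cb0 Y0 Cr0 Y1 ...) into Y, U, V planes."""
    if len(line) % 4 != 0:
        raise ValueError("a 4:2:2 line must hold whole Cb-Y-Cr-Y groups")
    u_plane = line[0::4]   # Cb samples: one per pixel pair
    y_plane = line[1::2]   # Y samples: one per pixel
    v_plane = line[2::4]   # Cr samples: one per pixel pair
    return y_plane, u_plane, v_plane


if __name__ == "__main__":
    # Four pixels: Y = 1, 2, 3, 4; U = 10, 11; V = 20, 21.
    stream = bytes([10, 1, 20, 2, 11, 3, 21, 4])
    y, u, v = demux_yuv422(stream)
    print(list(y), list(u), list(v))
```

Each pixel pair occupies four bytes, so a stream clocked at 81 MHz carries about 40 Mpix/sec, which matches the rates quoted for the video units.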
When a complete video frame has been read from the
camera chip by the VI unit, it interrupts the DSPCPU. The
DSPCPU compresses the video data in software (using
a set of powerful data-parallel multimedia operations)
and writes the compressed data to a separate area of
SDRAM.
The compressed video data can now be transmitted or
stored in any of several ways. It can be sent to a host
system over the PCI bus for archival on local mass storage, or the host can transfer the compressed video over
a network. The data can also be sent to a remote system
using the modem/ISDN interface to create, for example,
a video phone or videoconferencing system.
Since the powerful, general-purpose DSPCPU is available, the compressed data can be encrypted before being transferred for security.
2.5
INTRODUCTION TO PNX1300 BLOCKS
The remainder of this chapter provides a brief introduction to the internal components of PNX1300.
2.5.1
Internal ‘Data Highway’ Bus
The internal bus (or data highway) connects all internal
blocks together and provides access to internal control/
status registers of each block, external SDRAM, and the
external bus peripheral chips. The internal bus consists
of separate 32-bit data and address buses. Transactions
on the bus use a block-transfer protocol. On-chip peripheral units and coprocessors can be masters or slaves on
the bus.
Access to the internal bus is controlled by a central arbiter, which has a request line from each potential bus
master. The arbiter is programmable so that the arbitration algorithm can be tailored for different applications.
Peripheral units make requests to the arbiter for bus access and, depending on the arbitration mode, bus bandwidth is allocated to the units in different amounts. Each
mode allocates bandwidth differently, but each mode
guarantees each unit a minimum bandwidth and maximum service latency. All unused bandwidth is allocated
to the DSPCPU.
The bus allocation mechanism is one of the features of
PNX1300 that makes it a true real-time system instead of
just a highly integrated microprocessor with unusual peripherals.
2.5.2
VLIW Processor Core
The heart of PNX1300 is a powerful 32-bit DSPCPU
core. The DSPCPU implements a 32-bit linear address
space and 128 fully general-purpose 32-bit registers.
The registers are not separated into banks; any operation can use any register for any operand.
The PNX1300 core uses a VLIW instruction-set architecture and is fully general-purpose. The VLIW instruction
length allows five simultaneous operations to be issued
every clock cycle. These operations can target any five
of the 27 functional units in the DSPCPU, including integer and floating-point arithmetic units and data-parallel
multimedia operation units.
Although the processor core runs a real-time operating
system to coordinate all activities in the PNX1300 system, the core is not intended for true general-purpose
computer use. For example, the PNX1300 processor
core does not implement demand-paged virtual memory,
memory address translation, or 64-bit floating point - all
essential features in a general-purpose computer system.
PNX1300 uses a VLIW architecture to maximize processor throughput at the lowest possible cost. VLIW architectures have performance exceeding that of superscalar general-purpose CPUs without the cost and
complexity of a superscalar CPU implementation. The
hardware saved by eliminating superscalar logic reduces
cost and allows the integration of multimedia-specific
features that enhance the power of the processor core.
The PNX1300 operation set includes all traditional microprocessor operations. In addition, multimedia operations
are included that dramatically accelerate standard video
and audio compression and decompression algorithms.
As just one of the five operations issued in a single
PNX1300 instruction, a single ‘custom’ or ‘media’ operation can implement up to 11 traditional microprocessor
operations. These multimedia operations combined with
the VLIW architecture result in tremendous throughput
for multimedia applications.
The DSPCPU core is supported by separate 16-KB data
and 32-KB instruction caches. The data cache is dualported to allow two simultaneous accesses; both caches
are 8-way set-associative with a 64-byte block size.
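The cache parameters above fix the number of sets in each cache. The sketch below works through that arithmetic and shows a conventional tag/index/offset split of a 32-bit address. The split is the textbook arrangement and is an assumption here; Chapter 5 describes the actual cache organization.

```python
def cache_geometry(size_bytes, ways, block_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = size_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset within the block)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    print(cache_geometry(16 * 1024, 8, 64))   # data cache: 32 sets
    print(cache_geometry(32 * 1024, 8, 64))   # instruction cache: 64 sets
    print(split_address(0x12345678, 6, 5))
```

With 64-byte blocks and 8 ways, the 16-KB data cache has 32 sets and the 32-KB instruction cache has 64 sets.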
2.5.3
Video In Unit
The Video In (VI) unit interfaces directly to any CCIR 601/
656-compliant device that outputs 8-bit parallel, 4:2:2
YUV time-multiplexed data. Such devices include direct
digital camera systems, which can connect gluelessly to
PNX1300 or through the standard CCIR 656 connector
with only the addition of ECL level converters. A single-chip
external device can be used to convert to/from serial
D1 professional video. Non-CCIR-compliant devices can
use a digital video decoder chip, such as the Philips
SAA7113H, to interface to PNX1300.
The VI unit demultiplexes the captured YUV data before
writing it into local PNX1300 SDRAM. Separate planar
data structures are maintained for Y, U, and V.
The VI unit can be programmed to perform on-the-fly
horizontal resolution subsampling by a factor of two if
needed. Many camera systems capture a 640-pixel/line
or 720-pixel/line image. With subsampling, direct conversion to a 320-pixel/line or a 360-pixel/line image can be
performed with no DSPCPU intervention. Performing this
function during video input reduces initial storage and
bus bandwidth requirements for applications requiring
reduced resolution.
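The effect of subsampling by two on one line of samples can be sketched as follows. The averaging of adjacent pairs is an assumption made for illustration only; the filter actually used by the VI unit is described in the Video In chapter and may differ.

```python
def halve_line(samples):
    """Reduce horizontal resolution by two by averaging adjacent sample pairs."""
    return [
        (samples[i] + samples[i + 1] + 1) // 2   # rounded average of a pair
        for i in range(0, len(samples) - 1, 2)
    ]


if __name__ == "__main__":
    line_640 = list(range(640))
    print(len(halve_line(line_640)))        # a 640-sample line becomes 320 samples
    print(halve_line([10, 20, 30, 50]))
```

Halving each line during capture also halves the SDRAM storage and bus traffic needed per frame, which is the saving the text refers to.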
2.5.4
Enhanced Video Out Unit
The Enhanced Video Out (EVO) unit essentially performs the inverse function of the VI unit. EVO generates
an 8-bit, CCIR656 digital video data stream that contains
a composited video and graphics overlay image. The video image is taken from separate Y, U, and V planar data
structures in SDRAM. The graphics overlay is taken from
a pixel-packed YUV data structure in SDRAM. Compositing allows both alpha-blending and chroma keying.
The EVO unit can also upscale the video image horizontally by a factor of two to convert from CIF/SIF to CCIR
601 resolution. The overlay image, if enabled, is always
in full-pixel resolution.
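Alpha-blending and chroma keying can be summarized per pixel as below. This is a generic sketch, not the EVO datapath: the 8-bit alpha range, the rounding, and the function name composite_pixel are assumptions, and the blend levels supported by the hardware are defined in the Enhanced Video Out chapter.

```python
def composite_pixel(video, overlay, alpha, key=None):
    """Blend one overlay (Y, U, V) pixel over one video pixel.

    alpha is the overlay weight, 0 (video only) to 255 (overlay only).
    If the overlay pixel equals the chroma key, the video pixel shows through.
    """
    if key is not None and overlay == key:
        return video
    return tuple(
        (alpha * o + (255 - alpha) * v + 127) // 255
        for o, v in zip(overlay, video)
    )


if __name__ == "__main__":
    video = (100, 128, 128)
    overlay = (200, 128, 128)
    print(composite_pixel(video, overlay, alpha=128))
    print(composite_pixel(video, overlay, alpha=255, key=(200, 128, 128)))
```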
The EVO unit is capable of pixel emission rates up to 40
Mpix/sec and allows full programming of a horizontal and
vertical frame/field structure. It is thus capable of refreshing both interlaced and non-interlaced (‘two fh’) video displays with 4:3 or 16:9 or other aspect ratios.
The sample rate for EVO unit pixels is independently and
dynamically programmable. The high-quality, on-chip
sample clock generator circuit allows the programmer
subtle control over the sampling frequency so that audio
and video synchronization can be achieved in any system configuration. When changing the sample frequency, the instantaneous phase does not change, which allows sample frequency manipulation without introducing
audio or video distortion.
2.5.5
Image Coprocessor
The ICP off-loads common image scaling or filtering
tasks from the DSPCPU. Although these tasks can be
easily performed by the DSPCPU, they are a poor use of
the relatively expensive CPU resource. When performed
in parallel by the ICP, these tasks are performed efficiently by simple hardware, which allows the DSPCPU to
continue with more complex tasks.
The ICP can operate as either a memory-to-memory or a
memory-to-PCI coprocessor device.
In memory-to-memory mode, the ICP can perform either
horizontal or vertical image filtering and resizing. A high
quality algorithm is used (5-tap polyphase filter in each
direction). Filtering or scaling is done in either the horizontal or vertical direction in one pass. Two invocations
of the ICP are required to filter or resize in both directions.
In memory-to-PCI mode, the ICP can perform horizontal
resizing followed by color-space conversion. For example, assume an n × m pixel array is to be displayed in a
window on the PC video screen while the PC is running
a graphical user interface. The first step (if necessary)
would use the ICP in memory-to-memory mode to perform a vertical resizing. The second step would use the
ICP in memory-to-PCI mode to perform horizontal resizing and optional color-space conversion from YUV to
RGB.
Figure 2-3. ICP - Windows on the PC screen and data structures in SDRAM for two live video windows.
While sending the final, resampled and converted pixels
over the PCI bus to the video frame buffer, the ICP uses
a full, per-pixel occlusion bit mask—accessed in destination coordinates—to determine which pixels are actually
written to the graphics card frame buffer for display. Conditioning the transfer with the bit mask allows PNX1300
to accommodate an arbitrary arrangement of overlapping windows on the PC video screen.
Figure 2-3 illustrates a possible display situation and the
data structures in SDRAM that support ICP operation.
On the left, the PC video screen has four overlapping
windows. Two, Image 1 and Image 2, are being used to
display video generated by PNX1300. The right side
shows a conceptual view of SDRAM contents. Two data
structures are present, one for Image 1 and the other for
Image 2. Figure 2-3 represents a point in time during
which the ICP is displaying Image 2.
When the ICP is displaying an image (i.e., copying it from
SDRAM to a frame buffer), it maintains four pointers to
the SDRAM data structures. Three pointers locate the Y,
U, and V data arrays, the fourth locates the per-pixel occlusion bit map. The Y, U, and V arrays are indexed by
source coordinates while the occlusion bit map is accessed with screen coordinates.
As the ICP generates pixels for display, it performs horizontal scaling and colorspace conversion. The final RGB
pixel value is then copied to the destination address in
the screen’s frame buffer only if the corresponding bit in
the occlusion bit map is a ‘1’.
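The conditional write described above can be sketched in C as follows. This is only an illustration of the idea, not the ICP's actual implementation: the bit-map layout (one bit per pixel, MSB first, rows padded to whole bytes), the 32-bit pixel type, and the names occlusion_bit and masked_copy_line are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the occlusion bit for destination pixel (x, y) is set.
 * Assumed layout (hypothetical): one bit per pixel, packed MSB-first,
 * each row starting on a byte boundary, stride_bytes bytes per row. */
static int occlusion_bit(const uint8_t *mask, size_t stride_bytes,
                         size_t x, size_t y)
{
    uint8_t byte = mask[y * stride_bytes + (x >> 3)];
    return (byte >> (7 - (x & 7))) & 1;
}

/* Copies one scanline of already scaled and converted RGB pixels into
 * the frame buffer, writing only pixels whose occlusion bit is 1.
 * Returns the number of pixels actually written. */
static size_t masked_copy_line(uint32_t *frame_line, const uint32_t *rgb_line,
                               const uint8_t *mask, size_t stride_bytes,
                               size_t y, size_t width)
{
    size_t written = 0;
    for (size_t x = 0; x < width; x++) {
        if (occlusion_bit(mask, stride_bytes, x, y)) {
            frame_line[x] = rgb_line[x];
            written++;
        }
    }
    return written;
}
```

Pixels whose bit is 0 are simply skipped, so whatever another window has drawn at that frame buffer location is left untouched.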
As shown in the conceptual diagram, the occlusion bit
map has a pattern of 1s and 0s corresponding to the
shape of the visible area of the destination window in the
frame buffer. When the arrangement of windows on the
PC screen changes, modifications to the occlusion bit
map are performed by PNX1300 or host resident software.
It is important to note that there is no preset limit on the
number and sizes of windows that can be handled by the
ICP. The only limit is the available bandwidth. Thus, the
ICP can handle a few large windows or many small windows. The ICP can sustain a transfer rate of 50 megapixels per second, which is more than enough to saturate
PCI when transferring images to video frame buffers.
2.5.6
Variable-Length Decoder (VLD)
The variable-length decoder (VLD) relieves the DSPCPU
of decoding Huffman-encoded video data streams. It can
be used to help decode high bitrate MPEG-1 and MPEG-2 video streams. The lower bitrate of videoconferencing
can be adequately handled by DSPCPU software without a coprocessor.
The VLD is a memory-to-memory coprocessor. The
DSPCPU hands the VLD a pointer to a Huffman-encoded bit stream, and the VLD produces a tokenized bit
stream that is very convenient for the PNX1300 image
decompression software to use. The format of the output
token stream is optimized for the MPEG-2 decompression software so that communication between the
DSPCPU and VLD is minimized.
2.5.7
Audio In and Audio Out Units
The Audio In (AI) and Audio Out (AO) units are similar to
the video units. They connect to most serial ADC and
DAC chips, and are programmable enough to handle
most serial bit protocols. These units can transfer MSB
or LSB first and left or right channel first.
The audio sampling clock is driven by PNX1300 and is
software programmable within a wide range. Like the VO
unit, AI and AO sample rates are separately and dynamically programmable. The high-quality on-chip sample
clock generator circuit allows the programmer subtle
control over the sampling frequency so that audio and
video synchronization can be achieved in any system
configuration. When changing the sample frequency, the
instantaneous phase does not change, which allows
sample frequency manipulation without introducing audio or video distortion.
As with the video units, the audio-in and audio-out units
buffer incoming and outgoing audio data in SDRAM. The
audio-in unit buffers samples in either 8- or 16-bit format,
mono or stereo. The audio-out unit transfers 16- or 32-bit
sample data for mono, stereo or up to 8 audio channels
from memory to the external DACs. Any manipulation or
mixing of sound data is performed by the DSPCPU since
this processing will require only a small fraction of its processing capacity.
2.5.8
S/PDIF Out Unit
The Sony/Philips Digital Interface Out (SPDO) unit allows output of a 1-bit high-speed serial data stream. The
primary application is output of digital audio data in Sony/
Philips Digital Interface (S/PDIF) format to an external
electrically isolated transformer. The SPDO unit can also
be used as a general purpose high-speed data stream
output device such as a UART.
The SPDO unit supports 2-channel PCM audio, one or
more Dolby Digital six-channel data streams, or one or
more MPEG-1 or MPEG-2 audio streams (embedded
per Project 1937). It supports arbitrary programmable
sample rates independent of and asynchronous to the
AO unit sample rate.
2.5.9
Synchronous Serial Interface
The on-chip synchronous serial interface (SSI) is specially designed to interface to high integration analog modem frontends or ISDN frontend devices. In the analog
modem case, all of the modem signal processing is performed in the PNX1300 DSPCPU.
2.5.10
I2C Interface
The I2C bus is a 2-wire multi-master, multi-slave interface capable of transmitting up to 400kbit/sec. PNX1300
implements an I2C master for use in single master environments only. This interface allows PNX1300 to configure and inspect the status of I2C peripheral devices, such
as video decoders, video encoders and some camera
types.
2.6 NEW IN PNX1300 (VERSUS TM-1300)
PNX1300/01/02/11 offers the following improvements
over the TM-1300:
• Lower core voltage for PNX1311 (2.2V core voltage) and therefore lower power consumption.
• DSPCPU speed of up to 200 MHz for PNX1302.
• Support for 256 Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer to Section 12.11, “Refresh” in Chapter 12, “SDRAM Memory System” for details.
• Support for 16 and 32-bit Main Memory Interface.
• Bug fixes in VI message passing mode.
• Additional VI mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal.
• PCI bug fix on PCI Special Cycles.
• Autonomous boot in non 1:1 ratio is fixed.
2.7 NEW IN PNX1300 (VERSUS TM-1100)
In addition to the features described in Section 2.6,
PNX1300 also offers the following improvements over
the TM-1100:
• ... no external MATCHOUT to MATCHIN delay line.
• Video output speed improvement: up to 81 MHz.
• Video input speed improvement: up to 81 MHz.
• Prefetchable SDRAM aperture to increase performance. See Chapter 11, “PCI Interface.”
• Individual powerdown capability for each coprocessor (e.g. ICP, EVO, etc.).
• New AO coprocessor with four separate channels and support of 16 or 32-bit samples. 8-bit samples are no longer supported.
• New SPDO coprocessor (for output of SPDIF and other 1-bit high-speed serial data streams).
2.8 NEW IN PNX1300 (VERSUS TM-1000)
In addition to the features described in Section 2.7,
PNX1300 also offers the following improvements over
the TM-1000:
• New DSPCPU instructions. See Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”
• Video Output unit improvements (8-bit alpha blending, chroma keying, genlock). See Chapter 7, “Enhanced Video Out.”
• Capability to intermix PCI2.1 and 8-bit peripherals or ROM/Flash memories on the external bus. See Chapter 22, “PCI-XIO External I/O Bus.”
• An on-chip DVD authentication/descrambling coprocessor. Information available to DVD product developers on special request.
• Full 1149.1 boundary scan.
• Improved PCI DMA read performance. See Chapter 11, “PCI Interface.”
• Improved clock generation with new DDS blocks.
DSPCPU Architecture
Chapter 3
by Gert Slavenburg, Marcel Janssens
3.1
BASIC ARCHITECTURE CONCEPTS
In this document, the generic PNX1300 product name
refers to the PNX1300 Series, that is, the PNX1300/01/02/11
products.
This section documents the system programmer or
‘bare-machine’ view of the PNX1300 CPU (or DSPCPU).
3.1.1
Register Model
Figure 3-1 shows the DSPCPU’s 128 general purpose
registers, r0...r127. In addition to the hardware program
counter, PC, there are 4 user-accessible special purpose
registers, PCSW, DPC (destination program counter),
SPC (source program counter), and CCCOUNT.
Table 3-1 lists the registers and their purposes.
Register r0 always contains the integer value '0', corresponding to the boolean value 'FALSE' or the single-precision floating point value +0.0. Register r1 always contains the integer value '1' ('TRUE'). The programmer is
NOT allowed to write to r0 or r1.
Note: Writing to r0 or r1 may cause reads from r0 or
r1 scheduled in adjacent clock cycles to return unpredictable values. The standard assembler prevents/
forbids the use of r0 or r1 as a destination register.
Registers r2 through r127 are true general purpose registers; the hardware does not imply their use in any way,
though compiler or programmer conventions may assign
particular roles to particular registers. The DPC and SPC
relate to interrupt and exception handling and are treated
in Section 3.1.4, “SPC and DPC—Source and Destination Program Counter.” The PCSW (Program Control
and Status Word) register is treated in Section 3.1.3,
“PCSW Overview.” CCCOUNT, the 64-bit clock cycle
counter is treated in Section 3.1.5, “CCCOUNT—Clock
Cycle Counter.”
Table 3-1. DSPCPU registers
Register   Size      Details
r0         32 bits   Always reads as 0x0; must not be used as destination of operations
r1         32 bits   Always reads as 0x1; must not be used as destination of operations
r2–r127    32 bits   126 general-purpose registers
PC         32 bits   Program counter
PCSW       32 bits   Program control & status word
DPC        32 bits   Destination program counter; latches target of taken branch that is interrupted
SPC        32 bits   Source program counter; latches target of taken branch that is not interrupted
CCCOUNT    64 bits   Counts clock cycles since reset
Figure 3-1. PNX1300 registers.
3.1.2
Basic DSPCPU Execution Model
The DSPCPU issues one ‘long instruction’ every clock
cycle. Each instruction consists of several operations
(five operations for the PNX1300 microprocessor). Each
operation is comparable to a RISC machine instruction,
except that the execution of an operation is conditional
upon the content of a general purpose register. Examples of operations are:
IF r10 iadd r11 r12 → r13
(if r10 true, add r11 and r12 and write sum in r13)
IF r10 ld32d(4) r15 → r16
(if r10 true, load 32 bits from mem[r15+4] into r16)
IF r20 jmpf r21 r22
(if r20 true and r21 false, jump to address in r22)
Each operation has a specific, known execution latency
in clock cycles. For example, iadd takes 1 cycle; thus the
result of an iadd operation started in clock cycle i is available for use as an argument to operations issued in cycle
i+1 or later. The other operations issued in cycle i cannot
use the result of iadd. The ld32d operation has a latency
of 3 cycles. The result of an ld32d operation started in cycle j is available for use by other operations issued in cycle j+3 or later. Branches, such as the jmpf example
above, have three delay slots. This means that if a branch
operation in cycle k is taken, all operations in the instructions in cycle k+1, k+2 and k+3 are still executed.
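The timing rules above can be written down as a small C sketch. It only models the latencies quoted in this section (iadd 1 cycle, ld32d 3 cycles, three branch delay slots); the function and constant names are illustrative assumptions and are not part of any PNX1300 tool.

```c
/* Latencies quoted in the text above; other operations are not modeled. */
enum { IADD_LATENCY = 1, LD32D_LATENCY = 3, BRANCH_DELAY_SLOTS = 3 };

/* First cycle in which another operation may consume the result of an
 * operation issued in issue_cycle with the given latency. */
static unsigned first_use_cycle(unsigned issue_cycle, unsigned latency)
{
    return issue_cycle + latency;
}

/* For a taken branch issued in branch_cycle: returns 1 if the instruction
 * issued in cycle still comes from the fall-through path (the branch
 * instruction itself or one of its delay slots), 0 once execution has
 * moved to the branch target. */
static int executes_before_target(unsigned branch_cycle, unsigned cycle)
{
    return cycle >= branch_cycle &&
           cycle <= branch_cycle + BRANCH_DELAY_SLOTS;
}
```

For example, an iadd issued in cycle 10 can feed an operation in cycle 11, while a taken branch issued in cycle 20 still lets the instructions in cycles 21, 22 and 23 execute.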
In the above examples, r10 and r20 control conditional
execution of the operations. Also known as ‘guarding’,
here r10 and r20 contain the operation ‘guard’. See Section 3.2.1, “Guarding (Conditional Execution).”
Certain restrictions exist in the choice of what operations
can be packed into an instruction. For example, the
DSPCPU in PNX1300 allows no more than two load/
store class operations to be packed into a single instruction. Also, no more than five results (of previously started
operations) can be written during any one cycle. The
packing of operations is not normally done by the programmer. Instead, the instruction scheduler (See Philips
TriMedia SDE Reference Manual) takes care of converting the parallel intermediate format code into packed instructions ready for the assembler. The rules are formally
described in the machine description file used by the instruction scheduler and other tools.
3.1.3
PCSW Overview
Figure 3-2 shows the PCSW register. The PNX1300 value of PCSW on reset is 0x800. For compatibility, any undefined PCSW fields should never be modified.
Note that the DSPCPU architecture has no condition
codes or integer arithmetic status flags. Integer operations that generate out-of-range results deliver an operation specific bit pattern. For examples, see dspiadd in
Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”
Predicate operations exist that take the place of integer
status flags in a classical architecture. Multiword arithmetic is supported by the ‘carry’ operation which generates a ‘0’ or ‘1’ depending on the carry that would be generated if its arguments were summed.
FP-Related Fields. The IEEE mode field determines the
IEEE rounding mode of all floating point operations, with
the exception of a few floating point conversion operations that use a fixed rounding mode. For examples, see ifixrz, ifloatrz, ufixrz and ufloatrz in Appendix A, “PNX1300/01/
02/11 DSPCPU Operations.”
The FP exception flags are ‘sticky bits’ that are set as a
side effect of floating-point computations. Each floating
point operation can set one or more of the flags if it incurs
the corresponding exception. The flags can only be reset
by direct software manipulation of the PCSW (using the
writepcsw operation). The bits have the meanings shown
in Table 3-2.
The FP exception trap enable bits determine which FP
exception flags invoke CPU exception handling. An exception is requested if the intersection of the exception
flags and trap enable flags is non-zero. The acceptance
and handling of exceptions is described in Section 3.5,
“Special Event Handling.”
BSX (Bytesex). The DSPCPU has a switchable bytesex.
The BSX flag in the PCSW can be written by software.
Load/store operations observe little- or big-endian byte
ordering based on the current setting of BSX.
IEN (Interrupt Enable). The IEN flag disables or enables
interrupt processing for most interrupt sources. Only NMI
(non-maskable interrupt) bypasses IEN. The acceptance
and handling of interrupts is described in Section 3.5.3,
“INT and NMI (Maskable and Non-Maskable Interrupts).”
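Figure 3-2. PNX1300 PCSW (Program Control and Status Word) register format.
PCSW = 0x800 after RESET.
Bits    Field       Description
31      TRPMSE      Misaligned store exception trap enable
30      TRPWBE      Write back error trap enable
29      TRPRSE      Reserved exception trap enable
28:27   UNDEF       Undefined
26      TFE         Trap on first exit
25:23   UNDEFINED   Undefined
22      TRPOFZ      FP exception trap-enable bit
21      TRPIFZ      FP exception trap-enable bit
20      TRPINV      FP exception trap-enable bit
19      TRPOVF      FP exception trap-enable bit
18      TRPUNF      FP exception trap-enable bit
17      TRPINX      FP exception trap-enable bit
16      TRPDBZ      FP exception trap-enable bit
15      MSE         Misaligned store exception
14      WBE         Write back error
13      RSE         Reserved exception
12      UNDEF       Undefined
11      CS          Count stalls (1 ⇒ Yes)
10      IEN         Interrupt enable (1 ⇒ allow interrupts)
9       BSX         Byte sex (1 ⇒ little endian)
8:7     IEEE MODE   IEEE rounding mode (0 ⇒ to nearest, 1 ⇒ to zero, 2 ⇒ to positive, 3 ⇒ to negative)
6       OFZ         FP exception flag
5       IFZ         FP exception flag
4       INV         FP exception flag
3       OVF         FP exception flag
2       UNF         FP exception flag
1       INX         FP exception flag
0       DBZ         FP exception flag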
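As an illustration of how software might interpret a PCSW value, the following C sketch decodes a few fields. The bit positions are those read from Figure 3-2; the macro and function names are hypothetical, and the code works on a plain 32-bit value rather than on the register itself (which is accessed through special-register operations such as writepcsw).

```c
#include <stdint.h>

/* Bit positions as read from Figure 3-2 (names are illustrative). */
#define PCSW_MSE        (1u << 15)
#define PCSW_WBE        (1u << 14)
#define PCSW_RSE        (1u << 13)
#define PCSW_CS         (1u << 11)
#define PCSW_IEN        (1u << 10)
#define PCSW_BSX        (1u << 9)
#define PCSW_IEEE_SHIFT 7
#define PCSW_IEEE_MASK  (3u << PCSW_IEEE_SHIFT)
#define PCSW_FP_FLAGS   0x7Fu  /* OFZ IFZ INV OVF UNF INX DBZ, bits 6..0 */
#define PCSW_TRAP_SHIFT 16     /* trap enables sit 16 bits above the flags */

/* IEEE rounding mode: 0 nearest, 1 zero, 2 positive, 3 negative. */
static unsigned pcsw_rounding_mode(uint32_t pcsw)
{
    return (pcsw & PCSW_IEEE_MASK) >> PCSW_IEEE_SHIFT;
}

/* An FP exception is requested when the intersection of the sticky
 * exception flags and their trap-enable bits is non-zero. */
static int pcsw_fp_exception_requested(uint32_t pcsw)
{
    uint32_t flags = pcsw & PCSW_FP_FLAGS;
    uint32_t traps = (pcsw >> PCSW_TRAP_SHIFT) & PCSW_FP_FLAGS;
    return (flags & traps) != 0;
}
```

With the reset value 0x800, only CS is set: interrupts are disabled, the rounding mode is round-to-nearest and no FP exception is pending.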
Table 3-2. PCSW FP exception flag definitions
Flag   Function
INV    Standard IEEE invalid flag
OVF    Standard IEEE overflow flag
UNF    Standard IEEE underflow flag
INX    Standard IEEE inexact flag
DBZ    Standard IEEE divide-by-zero flag
OFZ    ‘Output flushed to zero’ set if an operation caused a denormalized result
IFZ    ‘Input flushed to zero’ set if an operation was applied to one or more denormalized operands
CS (Count Stalls). The CS flag determines the mode of
CCCOUNT, the 64-bit clock cycle counter. If CS = ‘1’, the
cycle counter increments on all clock cycles. If CS = ‘0’,
the clock cycle counter only increments on non-stall cycles. See also Section 3.1.5, “CCCOUNT—Clock Cycle
Counter.” After RESET, CS is set to ‘1’.
MSE and TRPMSE (Misaligned-Store Exception). The
MSE bit will be set when the processor detects a store
operation to an address that is not aligned. For example,
a 32-bit store executed with an address that is not a multiple of four will cause MSE to be set. The TRPMSE bit
enables the DSPCPU to raise misaligned address exceptions. An exception is requested if the intersection of
MSE and TRPMSE is non-zero. The acceptance and
handling of exceptions is described in Section 3.5, “Special Event Handling.”
Unaligned load operations do not cause an exception,
because load operations can be speculative (i.e. their result is thrown away).
When the DSPCPU generates an unaligned address, the
low order address bit(s) (one bit in the case of a 16-bit
load, two bits for a 32-bit load) are forced to zero and the
load/store is executed from this aligned address.
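The address handling described above amounts to masking off the low-order address bits. A minimal C sketch, assuming access sizes of 1, 2 or 4 bytes (the function names are illustrative only):

```c
#include <stdint.h>

/* Forces the low-order address bit(s) to zero: one bit for a 16-bit
 * access, two bits for a 32-bit access. size_bytes must be 1, 2 or 4. */
static uint32_t force_aligned(uint32_t addr, uint32_t size_bytes)
{
    return addr & ~(size_bytes - 1u);
}

/* A store to an address that is not a multiple of the access size is
 * the condition under which MSE is set. */
static int store_is_misaligned(uint32_t addr, uint32_t size_bytes)
{
    return (addr & (size_bytes - 1u)) != 0;
}
```

For instance, a 32-bit access to 0x1003 is executed at 0x1000, and a 16-bit access to the same address at 0x1002.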
WBE and TRPWBE (Write Back Error). The WBE flag
will be set whenever a program attempts to write back
more than 5 results simultaneously. This is indicative of
a programming error, likely caused by the scheduler or
assembler. The TRPWBE bit enables the corresponding
exception.
RSE, TRPRSE (Reserved Exception). RSE and TRPRSE are reserved for diagnostic purposes and not described here.
TFE (Trap on First Exit). The TFE bit is a support bit for
the debugger. The TFE bit is set by the debugger prior to
taking a (non-interruptible) jump to the application program. On the next interruptible jump (the first interruptible jump in the application being debugged), an exception is requested because the TFE bit is set. The
acceptance and handling of exception processing is described in Section 3.5, “Special Event Handling.” It is the
responsibility of the exception handler software to clear
the TFE bit. The hardware does not clear or set TFE.
Corner-case note: Whenever a hardware update (e.g. an
exception being raised) and a software update (through
writepcsw) of the PCSW coincide, the new value of the
PCSW will be the value that is written by the writepcsw
instruction, except for those bits that the hardware is currently updating (which will reflect the hardware value).
3.1.4
SPC and DPC—Source and
Destination Program Counter
The SPC and DPC registers are support registers for exception processing. The DPC is updated during every interruptible jump with the target address of that interruptible jump. If an exception is taken at an interruptible
jump, the value in the DPC register can be used by the
exception handling routine as the return address to resume the program at the place of interruption.
The SPC register is updated during every interruptible
jump that is not interrupted by an exception. Thus on an
interrupted interruptible jump, the SPC register is not updated. The SPC register allows the exception handling
routine to determine the start address of the decision tree
(a block of uninterruptible, scheduled PNX1300 code)
that was executing when the exception was taken (see
also Section 3.5, “Special Event Handling”).
Corner-case note: Whenever a hardware update (during
an interruptible jump) and a software update (through
writedpc or writespc) coincide, the software update takes
precedence.
3.1.5
CCCOUNT—Clock Cycle Counter
CCCOUNT is a 64-bit counter that counts clock cycles
since RESET. Cycle counting can occur in two modes,
depending on PCSW.CS. If PCSW.CS = ‘1’, the cycle
count increments on every CPU clock cycle. If PCSW.CS
= ‘0’, the clock cycle count only increments on non-stall
CPU cycles.
CCCOUNT is implemented as a master counter/slave
register pair. The master 64-bit counter gets updated
continuously. The value of the CCCOUNT slave register
is updated with the current master cycle count during
successful interruptible jumps only. The cycles and hicycles DSPCPU operations return the content of the 32
LSBs and 32 MSBs, respectively, of the slave register.
This ensures that the value returned by hicycles and cycles is coherent, as long as there is no intervening interruptible jump, which makes these operations suitable for
64-bit high resolution timing from C source code programs. The curcycles DSPCPU operation returns the 32
LSBs of the master counter. The latter operation can be
used for instruction cycle precise timing. When used, it
must be precisely placed, probably at the assembly code
level.
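The master/slave arrangement can be modeled in C as shown below. This is a behavioral sketch under the rules stated in this section, not a description of the hardware; the structure and the cc_* function names are assumptions chosen to mirror the cycles, hicycles and curcycles operations.

```c
#include <stdint.h>

typedef struct {
    uint64_t master;  /* updated continuously */
    uint64_t slave;   /* snapshot taken at successful interruptible jumps */
} cccount_model;

/* One CPU clock: counts always if PCSW.CS = 1, else only on non-stall cycles. */
static void cc_tick(cccount_model *c, int stall, int pcsw_cs)
{
    if (pcsw_cs || !stall)
        c->master++;
}

/* A successful interruptible jump copies the master count to the slave. */
static void cc_interruptible_jump(cccount_model *c) { c->slave = c->master; }

static uint32_t cc_cycles(const cccount_model *c)    { return (uint32_t)c->slave; }
static uint32_t cc_hicycles(const cccount_model *c)  { return (uint32_t)(c->slave >> 32); }
static uint32_t cc_curcycles(const cccount_model *c) { return (uint32_t)c->master; }

/* Coherent 64-bit time stamp, valid as long as no interruptible jump
 * occurs between the two reads. */
static uint64_t cc_read64(const cccount_model *c)
{
    return ((uint64_t)cc_hicycles(c) << 32) | cc_cycles(c);
}
```

Because both halves come from the slave register, a carry from the low word into the high word of the master counter between the two reads cannot produce an inconsistent 64-bit value.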
3.1.6
Boolean Representation
The bit pattern generated by boolean valued operations
(ileq, fleq etc.) is '00...00' (FALSE) or '00...01' (TRUE).
When interpreting a bit pattern as a boolean value, only
the LSB is taken into account, i.e. 'xx..x0' is interpreted
as FALSE and 'xx..x1' is interpreted as TRUE. In particular, wherever a general purpose register is used as a
‘guard’, the LSB determines whether execution of the
guarded operation takes place.
3.1.7
Integer Representation
The architecture supports the notion of 'unsigned integers' and 'signed integers.' Signed integers use the standard two’s-complement representation.
Arithmetic on integers does not generate traps. If a result
is not representable, the bit pattern returned is operation
specific, as defined in the individual operation description
section. The typical cases are:
• Wrap around for regular add- and subtract-type operations.
• Clamping against the minimum or maximum representable value for DSP-type operations.
• Returning the least significant 32-bit value of a 64-bit result (e.g., integer/unsigned multiply).
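The three typical cases can be illustrated with portable C. The function names below are hypothetical and do not correspond to DSPCPU operation mnemonics; the exact behavior of each operation is defined in Appendix A.

```c
#include <stdint.h>

/* Wrap-around, as for regular add-type operations. The arithmetic is done
 * on unsigned values to avoid signed overflow in C; the final conversion
 * back to int32_t is two's-complement on all common compilers. */
static int32_t add_wrap(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Clamping against the minimum or maximum representable value, as for
 * DSP-type operations. */
static int32_t add_clamp(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* Least significant 32 bits of a 64-bit product. */
static uint32_t mul_low32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}
```

Adding 1 to the largest positive value therefore yields the most negative value in the wrap-around case and stays at the maximum in the clamping case.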
3.1.8
Floating Point Representation
The PNX1300 architecture supports single precision (32-bit) IEEE-754 floating point arithmetic.
All arithmetic conforms to the IEEE-754 standard in
flush-to-zero mode.
All floating point compute operations round according to
the current setting of the PCSW IEEE mode field. The
current setting of the field determines result rounding (to
nearest, to zero, to positive infinity, to negative infinity).
Conversions from float to integer/unsigned are available
in two forms: a PCSW rounding-mode-observing form
and an ANSI-C-specific-rounding form. The ANSI-C-specific form forces round to zero regardless of the
PCSW IEEE rounding mode. Conversion from integer/
unsigned to float always observes the IEEE rounding
mode.
Floating point exceptions are supported with two mechanisms. Each individual floating point operation (e.g. fadd)
has a counterpart operation (faddflags) that computes
the exception flag values. These operations can be used
for precise exception identification (see footnote 1). The second mechanism uses the ‘sticky’ exception bits in the PCSW that
collect aggregate exception events. The PCSW exception bits can selectively invoke CPU exception handling.
See Section 3.5.2, “EXC (Exceptions).”
Table 3-3 shows the representation choices that were
made in PNX1300’s floating point implementation.
3.1.9
Addressing Modes
The addressing modes shown in Table 3-4 are supported by the DSPCPU architecture (store operations allow
only displacement mode).
1. This mechanism allows precise exception identification in the context of our multi-issue microprocessor core, where many floating point operations may issue simultaneously, at the expense of additional operations generated by the compiler. It also allows the compiler to issue compute operations speculatively and compute exceptions precisely.
Table 3-3. Special Float Value Representation
Item                                         Representation
+inf                                         0x7f800000
-inf                                         0xff800000
self generated qNaN                          0xffffffff
result of operation on any NaN argument      argument | 0x00400000 (forcing the NaN to be quiet)
signalling NaN                               never generated by PNX1300, accepted as per IEEE-754
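The bit patterns in Table 3-3 can be exercised with a few lines of C that operate on the raw 32-bit encoding. The constant and function names are illustrative assumptions; only the numeric values come from the table.

```c
#include <stdint.h>

#define F32_POS_INF   0x7f800000u
#define F32_NEG_INF   0xff800000u
#define F32_SELF_QNAN 0xffffffffu  /* qNaN generated by the hardware itself */
#define F32_QUIET_BIT 0x00400000u  /* most significant mantissa bit */

/* IEEE-754 single precision: NaN has all exponent bits set and a
 * non-zero mantissa. */
static int f32_is_nan(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0x7f800000u &&
           (bits & 0x007fffffu) != 0;
}

/* Result pattern for an operation on a NaN argument: the argument with
 * its quiet bit forced to 1. */
static uint32_t f32_propagate_nan(uint32_t arg_bits)
{
    return arg_bits | F32_QUIET_BIT;
}
```

A signalling NaN such as 0x7f800001 is thus returned as the quiet NaN 0x7fc00001.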
Table 3-4. Addressing Modes
Mode                   Suffix   Applies to     Name
R[i] + scaled(#j)      d        Load & Store   Displacement
R[i] + R[k]            r        Load only      Index
R[i] + scaled(R[k])    x        Load only      Scaled index
In these addressing modes, R[i] indicates one of the general purpose registers. The scale factor applied (1/2/4) is
equal to the size of the item loaded or stored, i.e. 1 for a
byte operation, two for a 16-bit operation and four for a
32-bit operation. The range of valid 'i', 'j' and 'k' values
may differ between implementations of the architecture;
the minimum values for implementation-dependent characteristics are shown in Table 3-5.
Table 3-5. Minimum values for implementation-dependent addressing mode components
Parameter     Minimum Range
‘i’ and ‘k’   0..127 (i.e., each implementation has at least 128 registers)
‘j’           -64..63 (i.e., displacements will be at least 7 bits long and signed)
Note that the assembly code specifies the true displacement, and not the value to be scaled. For example,
‘ld32d(–8) r3’ loads a 32-bit value from address (r3 – 8).
This is encoded in the binary operation pattern as a –2 in
the seven-bit field by the assembler. At runtime, the
scale factor four is applied to reconstruct the intended
displacement of –8.
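The relation between the displacement written in assembly and the value held in the operation encoding can be sketched as follows. The code assumes the seven-bit signed field (range -64..63) mentioned above; the function names and the error convention are hypothetical and do not describe the actual assembler.

```c
#include <stdint.h>

/* Converts a true byte displacement into the scaled field value.
 * Returns 0 on success, -1 if the displacement is not a multiple of the
 * access size or does not fit the seven-bit signed field. */
static int encode_disp(int32_t disp, int32_t size_bytes, int8_t *field)
{
    if (disp % size_bytes != 0)
        return -1;
    int32_t scaled = disp / size_bytes;
    if (scaled < -64 || scaled > 63)
        return -1;
    *field = (int8_t)scaled;
    return 0;
}

/* Applies the scale factor at run time to recover the displacement. */
static int32_t decode_disp(int8_t field, int32_t size_bytes)
{
    return (int32_t)field * size_bytes;
}
```

With a size of four bytes, a displacement of -8 is stored as -2 and multiplied by four again when the load executes, which matches the ld32d example.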
3.1.10
Software Compatibility
The DSPCPU architecture expressly does not support
binary compatibility between family members. The ANSI
C compiler ensures that all family members are compatible at the source-code level.
3.2
INSTRUCTION SET OVERVIEW
3.2.1
Guarding (Conditional Execution)
In the PNX1300 architecture, all operations can be optionally 'guarded'. A guarded operation executes conditionally, depending on the value in the ‘guard' register.
For example, a guarded add is written as:
IF R23 iadd R14 R10 → R13
This should be taken to mean
if R23 then R13 ← R14 + R10.
The 'if R23' clause controls the execution of the operation based on the LSB of R23. Hence, depending on the
LSB of R23, R13 is either unchanged or set to contain
the integer sum of R14 and R10.
Guarding applies to all DSPCPU operations, except iimm
and uimm (load-immediate). It controls the effect on all
programmer-visible states of the system, i.e. register values, memory content, exception raising and device state.
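The effect of a guard on a simple operation can be modeled in C. The register file is represented by a plain array and the function name guarded_iadd is an assumption for illustration; only the LSB of the guard register is examined.

```c
#include <stdint.h>

/* Model of "IF Rg iadd Ra Rb -> Rd": the destination is written only
 * when the least significant bit of the guard register is 1. */
static void guarded_iadd(uint32_t regs[128], unsigned guard,
                         unsigned src1, unsigned src2, unsigned dst)
{
    if (regs[guard] & 1u)
        regs[dst] = regs[src1] + regs[src2];
}
```

A guard value of 2 (LSB clear) leaves the destination unchanged, while a guard value of 3 (LSB set) lets the addition take effect.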
3.2.2
Load and Store Operations
Memory is byte addressable. Loads and stores must be
‘naturally aligned’, i.e. a 16-bit load or store must target
an address that is a multiple of 2. A 32-bit load or store
must target an address that is a multiple of 4. The BSX
bit in the PCSW determines the byte order of loads and
stores. For example, see ld32 and st32 in Appendix A,
“PNX1300/01/02/11 DSPCPU Operations.”
Only 32-bit load and store operations are allowed to access MMIO registers in the MMIO address aperture (see
Section 3.4, “Memory and MMIO”). The results are undefined for other loads and stores. A load from a non-existent MMIO register returns an undefined result. A store to
a non-existent MMIO register times out and then does
not happen. There are no other side effects of an access
to a nonexistent MMIO register. The state of the BSX bit
has no effect on the result of MMIO accesses.
Loads are allowed to be issued speculatively. Loads outside the range of valid data memory addresses for the
active process return an implementation-dependent value and do not generate an exception. Misaligned loads
also return an implementation dependent value and do
not generate an exception.
If a pair of memory operations involves one or more common bytes in memory, the effect on the common bytes is
as defined in Table 3-6.
Table 3-4 shows the supported addressing modes. The
minimum values of implementation-dependent addressing-mode components are shown in Table 3-5.
Note: The index and scaled-index modes are not
allowed with store opcodes, due to the hardware
restriction that each operation have at most 2 source
operand registers and 1 condition register. Stores
use 1 operand register for the value to be stored
leaving only 1 register to form an address.
Table 3-6. Behavior of loads and stores with coincident addresses
Condition           Behavior
Tstore < Tload      If a store is issued before a load, the value loaded contains the new bytes.
Tload < Tstore      If a load is issued before a store, the value loaded contains the old bytes.
Tstore1 < Tstore2   If store1 is issued before store2, the resulting value contains the bytes of store2.
Tstore = Tload      If a load and store are issued in the same clock cycle, the result is UNDEFINED.
Tstore1 = Tstore2   If two stores are issued in the same clock cycle, the resulting stored value is undefined.
The scale factor applied (1/2/4) in the scaled addressing
modes is equal to the size of the item loaded or stored,
i.e. 1 for a byte operation, 2 for a 16-bit operation and 4
for a 32-bit operation.
Table 3-7 lists the available load and store mnemonics
for the three addressing modes.
Table 3-7. Load and store mnemonics
Operation              Displacement   Index    Scaled-Index
8-bit signed load      ild8d          ild8r    —
8-bit unsigned load    uld8d          uld8r    —
16-bit signed load     ild16d         ild16r   ild16x
16-bit unsigned load   uld16d         uld16r   uld16x
32-bit load            ld32d          ld32r    ld32x
8-bit store            st8d           —        —
16-bit store           st16d          —        —
32-bit store           st32d          —        —
Example usage of load and store operations:
IF r10 ild16d(12) r12 → r13
If the LSB of r10 is set, load 16 bits starting at
address (r12+12) using the byte ordering indicated
in PCSW.BSX, sign-extend the value to 32 bits and
store the result in r13.
IF r10 st32d(40) r12 r13
If the LSB of r10 is set, store the 32-bit value from
r13 to the address (r12+40) using the byte ordering
indicated in PCSW.BSX.
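The two examples can be mirrored by a small C model that makes the role of PCSW.BSX and of sign extension explicit. Memory is a byte array, addresses are assumed to be naturally aligned, the guard is left out, and the function names are illustrative; BSX = 1 is taken to mean little endian, as shown in Figure 3-2.

```c
#include <stdint.h>

/* Model of "ild16d(disp) base": 16-bit signed load in displacement mode.
 * bsx = 1 selects little-endian byte order, bsx = 0 big-endian. */
static int32_t model_ild16d(const uint8_t *mem, uint32_t base,
                            int32_t disp, int bsx)
{
    uint32_t addr = base + (uint32_t)disp;
    uint16_t raw = bsx
        ? (uint16_t)(mem[addr] | (mem[addr + 1] << 8))
        : (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
    return (int32_t)(int16_t)raw;   /* sign-extend to 32 bits */
}

/* Model of "st32d(disp) base value": 32-bit store in displacement mode. */
static void model_st32d(uint8_t *mem, uint32_t base, int32_t disp,
                        uint32_t value, int bsx)
{
    uint32_t addr = base + (uint32_t)disp;
    for (unsigned i = 0; i < 4; i++) {
        unsigned shift = bsx ? 8u * i : 8u * (3u - i);
        mem[addr + i] = (uint8_t)(value >> shift);
    }
}
```

The same two bytes in memory therefore yield different register values depending on BSX, which is why software that shares data with a host of the other byte order has to agree on the setting.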
3.2.3
Compute Operations
Compute operations are register-to-register operations.
The specified operation is performed on one or two
source registers and the result is written to the destination register.
Immediate Operations. Immediate operations load an
immediate constant (specified in the opcode) and produce a result in the destination register.
Floating-Point Compute Operations. Floating-point
compute operations are register-to-register operations.
The specified operation is performed on one or two
source registers and the result is written to the destination register. Unless otherwise mentioned all floating
point operations observe the rounding mode bits defined
in the PCSW register. All floating-point operations not
ending in ‘flags’ update the PCSW exception flags. All
operations ending in ‘flags’ compute the exception flags
as if the operation were executed and return the flag values (in the same format as in the PCSW); the exception
flags in the PCSW itself remain unchanged.
Multimedia Operations. These special compute operations are like normal compute operations, but the specified operations are not usually found in general purpose
CPUs. These operations provide special support for multimedia applications.
3.2.4
Control-Flow Operations
Control-flow operations change the value of the program
counter. Conditional jumps test the value in a register
and, based on this value, change the program counter to
the address contained in a second register or continue
execution with the next instruction. Unconditional jumps
always change the program counter to the specified immediate address.
Control-flow operations can be interruptible or non-interruptible. Execution of an interruptible jump is the only occasion where PNX1300 allows special event handling to
take place (see Section 3.5, “Special Event Handling”).
3.2.5
Special-Register Operations
Special register operations operate on the special registers: PCSW, DPC, SPC and CCCOUNT.
3.3
PNX1300 INSTRUCTION ISSUE RULES
The PNX1300 VLIW CPU allows issue of 5 operations in
each clock cycle according to a set of specific issue
rules. The issue rules impose issue time constraints and
a result writeback constraint. Any set of operations that
meets all constraints constitutes a legal PNX1300 instruction. A more extensive description and a few special
case issue rules and limitations can be found in the Philips TriMedia SDE documentation.
Issue time constraints:
• an operation implies a need for a functional unit type (as documented in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”)
• each operation requires an issue slot that has an instance of the appropriate functional unit type attached
• functional units should be ‘recovered’ from any prior operation issues
Writeback constraint:
• No more than 5 results should be simultaneously written to the register file at any point in time (writeback occurs ‘latency’ cycles after issue)
Figure 3-3. PNX1300 issue slots, functional units, and latency.
Figure 3-3 shows all functional units of PNX1300, including the relation to issue slots, and each functional unit’s
latency (e.g. 1 for CONST, 3 for FALU, etc.). With the exception of FTOUGH, each functional unit can accept an
operation every clock cycle, i.e. has a recovery time of 1.
The binding of operations to functional unit types is summarized in Table 3-8. In Appendix A, “PNX1300/01/02/
11 DSPCPU Operations”, each operation lists the precise functional unit and unit latency.
Table 3-8. Functional unit operations
unit type    operation category
const        immediate operations
alu          32-bit arithmetic, logical, pack/unpack
dspalu       dual 16-bit, quad 8-bit multimedia arithmetic
dspmul       dual 16-bit and quad 8-bit multimedia multiplies
dmem         loads/stores
dmemspec     cache coherency, cache control, prefetch
shifter      multi-bit shift
branch       control flow
falu         floating point arithmetic & conversions
ifmul        32-bit integer and floating point multiplies
fcomp        single cycle floating point compares
ftough       iterative floating point square root and division
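The writeback constraint of Section 3.3 lends itself to a simple check. The following C sketch is only an illustration under stated assumptions: the `issued_op` structure, the function name and the 64-cycle window are hypothetical, and the real scheduler in the TriMedia SDE applies further rules that are not modelled here.

```c
#include <stddef.h>

#define MAX_WRITEBACKS_PER_CYCLE 5   /* from the writeback constraint */
#define WINDOW 64                    /* cycles modelled by this sketch */

/* One issued operation: issue cycle and functional-unit latency. */
struct issued_op {
    unsigned issue_cycle;
    unsigned latency;
};

/* Returns 1 if the schedule meets the writeback constraint, 0 if some
 * cycle receives more than five results, and -1 if an operation writes
 * back outside the modelled window. */
static int writeback_constraint_ok(const struct issued_op *ops, size_t n)
{
    unsigned per_cycle[WINDOW] = {0};

    for (size_t i = 0; i < n; i++) {
        /* Writeback occurs 'latency' cycles after issue. */
        unsigned wb = ops[i].issue_cycle + ops[i].latency;
        if (wb >= WINDOW)
            return -1;
        if (++per_cycle[wb] > MAX_WRITEBACKS_PER_CYCLE)
            return 0;
    }
    return 1;
}
```

For example, one FALU operation (latency 3) issued in cycle 0 and five CONST operations (latency 1) issued in cycle 2 would all write back in cycle 3, which the check rejects.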
3.4 MEMORY AND MMIO
PNX1300 defines four apertures in its 32-bit address space: the memory hole, the DRAM aperture, the MMIO aperture and the PCI apertures (see Figure 3-4). The memory hole covers addresses 0..0xff. The DRAM and MMIO apertures are defined by the values in MMIO registers; the PCI apertures consist of every address that does not fall in the other three apertures.
3.4.1 Memory Map
DRAM is mapped into an aperture extending from the address in DRAM_BASE to the address in DRAM_LIMIT. The maximum DRAM aperture size is 64 MB.
The MMIO aperture is located at address MMIO_BASE and is a fixed 2-MB size.
In the default operating mode, all memory accesses not going to either the hole, DRAM or MMIO space are interpreted as PCI accesses. This behavior can be overridden as described in Section 5.3.8, “Memory Hole and PCI Aperture Disable.”
The MMIO aperture and the DRAM aperture can be at any naturally aligned location, in any order, but should not overlap; if they do, the consequences are undefined.
The values of DRAM_BASE, DRAM_LIMIT, and MMIO_BASE are set during the boot process. In the case of a PCI host assisted boot, the values are determined by the host BIOS. In case of standalone boot (i.e., PNX1300 is the PCI host), the values are taken from the boot ROM. Refer to Chapter 13, “System Boot” for details. DSPCPU update of DRAM_BASE and MMIO_BASE is possible, but not recommended, see Section 11.6.3, “MMIO/DRAM_BASE updates.”
[Figure 3-4: 32-bit address space from 0x0000 0000 to 0xFFFF FFFF. A 256-byte hole sits at 0x0000 0000; the DRAM aperture (1 MB - 64 MB) extends from DRAM_BASE to DRAM_LIMIT; the 2-MB MMIO aperture starts at MMIO_BASE; all remaining addresses are PCI.]
Figure 3-4. PNX1300 memory map.
3.4.2 The Memory Hole
The memory hole from address 0 to 0xff serves to protect the system from performance loss due to speculative loads. Due to the nature of C program references, most speculative loads issued by the DSPCPU fall in the range covered by the hole. Activated by default upon RESET, the hole serves to ensure that these speculative loads do NOT cause PCI read accesses and slow down the system. The value returned by any data load from the hole is 0. The hole only protects loads. Store operations in the hole do cause writes to PCI, SDRAM or MMIO as determined by the aperture base address values. If the SDRAM aperture overlaps the memory hole, the memory hole is ignored.
The hole can be temporarily disabled through the DC_LOCK_CTL register. This is described in Section 5.3.8, “Memory Hole and PCI Aperture Disable.”
3.4.3 MMIO Memory Map
Devices are controlled through memory-mapped device registers, referred to as MMIO registers. To ensure compatibility with future devices, any undefined MMIO bits should be ignored when read, and written as ‘0’s. Some devices can autonomously access data memory (DMA) and most devices can cause CPU interrupts.
The 2-MB MMIO aperture is initially located at address 0xEFE00000 on RESET; it is relocated by the PCI BIOS for PC-hosted PNX1300 boards; its final location is determined by the boot EEPROM for standalone systems. See Chapter 13, “System Boot” for more information.
Figure 3-5 gives a detailed overview of the MMIO memory map (addresses used are offsets with respect to the MMIO base). The operating system on PNX1300 can change MMIO_BASE by writing to the MMIO_BASE MMIO location. User programs should not attempt this. Refer to the TriMedia SDE Reference Manual for the standard method to access the device registers from C language device drivers.
The Icache tag and LRU bit access aperture gives the DSPCPU read-only access to the Icache status. Refer to Section 5.4.8, “Reading Tags and Cache Status” for details.
The EXCVEC MMIO location is explained in Section 3.5.2, “EXC (Exceptions).” Section 3.5.3, “INT and NMI (Maskable and Non-Maskable Interrupts),” describes the locations that deal with the setup and handling of interrupts: ISETTING, IPENDING, ICLEAR, IMASK and the interrupt vectors. The timer MMIO locations are described in Section 3.8, “Timers.” The instruction and data breakpoint are described in Section 3.9, “Debug Support.” The MMIO locations of each device are treated in the respective device chapters.
Only 32-bit load and store operations are allowed to access MMIO registers in the MMIO address aperture. The results are undefined for other loads and stores. Reads from non-existent MMIO registers return undefined values. Writes to nonexistent MMIO registers time out. There are no side effects of accesses to nonexistent MMIO registers. The state of the PCSW BSX bit has no effect on the result of MMIO accesses.
Offset       Block
0x00 0000    Icache tags & LRU (r/o), up to 0x01 0000
0x01 0000    Reserved for Future Use
0x10 0000    Main memory, cache control
0x10 0400    MMIO base
0x10 0800    Vectored interrupt controller
0x10 0C00    Timers
0x10 1000    Debug support
0x10 1400    Video In
0x10 1800    Video Out
0x10 1C00    Audio In
0x10 2000    Audio Out
0x10 2400    Image coprocessor
0x10 2800    VLD coprocessor
0x10 2C00    SSI interface
0x10 3000    PCI interface
0x10 3400    I2C interface
0x10 3800    JTAG interface
Remaining offsets up to 0x1F FFFF: Reserved for Future Use
Detail of the first blocks:
Offset       Register
0x10 0000    DRAM_BASE
0x10 0004    DRAM_LIMIT
0x10 0400    MMIO_BASE
0x10 0800    excvec
0x10 0810    isetting0
0x10 0814    isetting1
0x10 0818    isetting2
0x10 081C    isetting3
0x10 0820    ipending
0x10 0824    iclear
0x10 0828    imask
0x10 0880    intvec0
0x10 0884    intvec1
0x10 0888    intvec2
...
0x10 08F8    intvec30
0x10 08FC    intvec31
0x10 0C00    timer1
0x10 0C20    timer2
0x10 0C40    timer3
0x10 0C60    systimer
0x10 1000    instruction breakpoints
0x10 1200    data breakpoints
Figure 3-5. Memory map of MMIO address space (addresses are offset from MMIO_BASE).
3.5 SPECIAL EVENT HANDLING
The PNX1300 microprocessor responds to the special events shown in Table 3-9, ordered by priority.
Table 3-9. Special Events and Event Vectors
Event       Vector
RESET       (Highest priority) vector to DRAM_BASE
EXC         (All exceptions) vector to EXCVEC (programmable)
NMI, INT    (Non-maskable interrupt, maskable interrupt) use the programmed vector (one of 32 vectors depending on the interrupt source)
With the exception of RESET, which is enabled at all times, the architecture of the DSPCPU allows special event handling to begin only during an interruptible jump operation (ijmpt, ijmpf or ijmpi) that succeeds (i.e., is a taken jump). EXC, NMI and INT handling can be initiated during handling of an EXC or an INT, but only during successful interruptible jumps.
The instruction scheduler uses interruptible jumps exclusively for inter-decision tree jumps. Hence, within a decision tree, no special-event processing can be initiated. If
a tree-to-tree jump is taken, special-event processing is
allowed. Since the only registers live at this point (i.e.,
that contain useful data) are the global registers allocated by the ANSI C compiler, only a subset of the registers
needs to be preserved by the event handlers. Refer to
the TriMedia SDE Reference Manual for details on which
registers can be in use. The DSPCPU register state can
be described by the contents of this subset of general
purpose registers and the contents of the PCSW and the
DPC value (the target of the inter-tree jump).
The priority resolution mechanism built into the DSPCPU hardware dispatches the highest-priority, non-masked special-event request at the time of a successful interruptible jump operation. In view of the simple, real-time-oriented nature of the mechanisms provided, only limited nesting of events should be allowed.
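The priority resolution can be summarized in a few lines of C, using the entry conditions listed in Sections 3.5.2 and 3.5.3. This is a hedged sketch, not a description of the hardware: the structure, the enum and the function name are hypothetical, PCSW.TFE and PCSW.IEN are passed as separate booleans because their bit positions are not given in this section, and `nmi_requested` is assumed to be precomputed from the ISETTING priorities.

```c
#include <stdbool.h>
#include <stdint.h>

enum special_event { EVENT_NONE, EVENT_RESET, EVENT_EXC, EVENT_INT };

/* Inputs sampled by this hypothetical model. */
struct event_inputs {
    bool reset_asserted;
    bool ijump_taken;     /* successful interruptible jump in final stage */
    uint32_t pcsw;        /* carries PCSW[15,6:0] and PCSW[31,22:16] */
    bool tfe;             /* PCSW.TFE */
    bool ien;             /* PCSW.IEN */
    uint32_t ipending;
    uint32_t imask;
    bool nmi_requested;   /* a pending, unmasked source is at level NMI */
};

static enum special_event select_event(const struct event_inputs *in)
{
    const uint32_t field = 0x807Fu;   /* bits 15 and 6:0 */

    if (in->reset_asserted)
        return EVENT_RESET;           /* RESET is honored at any time */
    if (!in->ijump_taken)
        return EVENT_NONE;            /* nothing starts outside a taken ijump */

    /* Intersection of PCSW[15,6:0] with PCSW[31,22:16]. */
    if ((in->pcsw & (in->pcsw >> 16) & field) != 0u || in->tfe)
        return EVENT_EXC;

    if ((in->ipending & in->imask) != 0u && (in->nmi_requested || in->ien))
        return EVENT_INT;

    return EVENT_NONE;
}
```

The order of the tests mirrors Table 3-9: RESET first, then EXC, then NMI and INT, which share the interrupt vectors.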
3.5.1 RESET
RESET is the highest priority special event. It is asserted by external hardware or by the host CPU. PNX1300 will respond to it at any time.
External hardware reset through the TRI_RESET# pin initiates boot protocol execution as described in Chapter 13, “System Boot.” This causes the current PC value to be lost and instruction execution to start from address DRAM_BASE.
A PCI host CPU can perform a PNX1300 DSPCPU-only reset by an MMIO write to the BIU_CTL.SR and CR bits. Such a reset does not cause a full boot; instead the DSPCPU resumes execution from DRAM_BASE.
3.5.2 EXC (Exceptions)
The DSPCPU enters EXC special-event processing under the following conditions:
1. RESET is de-asserted.
2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is non-empty or PCSW.TFE is set.
3. A successful interruptible jump is in the final jump execution stage.
DSPCPU hardware takes the following actions on the initiation of EXC processing:
1. DPC is assigned the intended destination address of the successful jump.
2. Instruction processing starts at EXCVEC.
All other actions are the responsibility of the EXC handler software. Note that no other special event processing will take place until the handler decides to execute an interruptible jump that succeeds.
3.5.3 INT and NMI (Maskable and Non-Maskable Interrupts)
The on-chip Vectored Interrupt Controller (VIC) provides 32 INT request input hardware lines. The interrupt controller prioritizes and maps attention requests from several different peripherals onto successive INT requests to the DSPCPU.
INT special event processing will occur under the following conditions:
1. RESET is de-asserted.
2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is empty and PCSW.TFE is not set.
3. The intersection of IPENDING and IMASK is non-empty.
4. The interrupt is at level NMI or PCSW.IEN = 1.
5. A successful interruptible jump is in the final jump execution stage.
DSPCPU hardware takes the following actions on the initiation of NMI or INT processing:
1. DPC gets assigned the intended destination address of the successful jump.
2. Instruction processing starts at the appropriate interrupt vector.
All other actions are the responsibility of the INT handler software. Note that no other special event processing will take place until the handler decides to execute an interruptible jump that succeeds.
3.5.3.1 Interrupt vectors
Each of the 32 interrupt sources can be assigned an arbitrary interrupt vector (the address of the first instruction of the interrupt handler). A vector is set up by writing the address to one of the MMIO locations shown in Figure 3-6. The state of the MMIO vector locations is undefined after RESET. (Addresses of the MMIO vector registers are offset with respect to MMIO_BASE.)
MMIO_BASE offset    Register (bits 31..0)    Contents
0x10 08FC           INTVEC31 (r/w)           Source 31 vector
0x10 08F8           INTVEC30 (r/w)           Source 30 vector
...
0x10 0888           INTVEC2 (r/w)            Source 2 vector
0x10 0884           INTVEC1 (r/w)            Source 1 vector
0x10 0880           INTVEC0 (r/w)            Source 0 vector
Figure 3-6. Interrupt vector locations in MMIO address space.
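The vector locations of Figure 3-6 are evenly spaced, so the offset of INTVECn can be computed rather than tabulated. The C sketch below assumes a `mmio_base` pointer that the caller has obtained by some other means (for device drivers, the TriMedia SDE provides its own access method, which is not reproduced here); the function names are hypothetical.

```c
#include <stdint.h>

#define INTVEC0_OFFSET 0x100880u  /* INTVEC0, offset from MMIO_BASE (Figure 3-6) */
#define NUM_SOURCES    32u

/* Byte offset of INTVECn from MMIO_BASE, or 0 if n is out of range. */
static uint32_t intvec_offset(unsigned source)
{
    return (source < NUM_SOURCES) ? INTVEC0_OFFSET + 4u * source : 0u;
}

/* Store a handler address into INTVECn with a single 32-bit store.
 * mmio_base is assumed to point at the start of the MMIO aperture (or, in
 * a host-side test, at a buffer standing in for it). Returns 0 on success. */
static int set_interrupt_vector(volatile uint32_t *mmio_base,
                                unsigned source, uint32_t handler_addr)
{
    uint32_t off = intvec_offset(source);

    if (off == 0u)
        return -1;
    mmio_base[off / 4u] = handler_addr;   /* word index = byte offset / 4 */
    return 0;
}
```

Because the vector registers are undefined after RESET, a vector would normally be written before the corresponding source is allowed through IMASK.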
Programmer’s note: See the Philips TriMedia Cookbook (Book 2 of TriMedia SDE documentation) for information on writing interrupt handlers.
3.5.3.2 Interrupt modes
DSPCPU interrupt sources can be programmed to operate in either level-sensitive or edge-triggered mode. Operation in edge-triggered or level-sensitive mode is determined by a bit in the ISETTING MMIO locations corresponding to the source, as defined in Figure 3-7. On RESET, all ISETTING registers are cleared.
In edge-triggered mode, the leading edge of the signal on the device interrupt request line causes the VIC (Vectored Interrupt Controller) to set the interrupt pending flag corresponding to the device source number. Note that, for active high signals, the leading edge is the positive edge, whereas for active low request signals (such as PCI INTA#), the negative edge is the leading edge. The interrupt remains pending until one of two events occurs:
• The VIC successfully dispatches the vector corresponding to the source to the PNX1300 CPU, or
• PNX1300 CPU software clears the interrupt-pending flag by a direct write to the ICLEAR location.
No interrupt acknowledge to ICLEAR is needed for devices operating in edge-triggered mode, since the vector dispatch clears the IPENDING request. The device itself may however need a device-specific interrupt acknowledge to clear the requesting condition. Edge-triggered mode is not recommended for devices that can signal multiple simultaneous interrupt conditions. The on-chip timers must be operated in edge-triggered mode.
In level-sensitive mode, the device requests an interrupt by asserting the VIC source request line. The device holds the request until the device interrupt handler performs a device interrupt acknowledge. It is highly recommended that all off-chip and on-chip sources, with the exception of the timers, operate in level-sensitive mode.
3.5.3.3 Device interrupt acknowledge
All devices capable of generating level-triggered interrupts have interrupt acknowledge bits in their memory mapped control registers for this purpose. An interrupt acknowledge is performed by a store to such control register, with a ‘1’ in the bit position(s) corresponding to the desired acknowledge flags.
Programmer’s note: the store operation that performs the interrupt acknowledge should be issued at least 2 cycles before the (interruptible) jump that ends an interrupt handler. This ensures that the same interrupt is not dispatched twice due to request de-assertion clock delays.
3.5.3.4 Interrupt priorities
Each interrupt source can be programmed to request one out of eight levels of priorities. The highest priority level (level 7) corresponds to requesting an NMI, an interrupt that cannot be masked by the DSPCPU PCSW.IEN bit. The other levels request regular interrupts, that can be masked as a group by the PCSW.IEN flag. Level six represents the highest priority normal interrupt level and level zero represents the lowest. Refer to Figure 3-7 for details of programming the priority level.
The VIC arbitrates the highest-priority pending interrupt requestor. Sources programmed to request at the same level are treated with a fixed priority, from source number 0 (highest) to 31 (lowest). At such time as the DSPCPU is willing to process special events, the vector of highest priority NMI source will be dispatched. If no NMI is pending, and the DSPCPU allows regular interrupts (PCSW.IEN is asserted), the vector of the highest priority regular source is dispatched. Once a vector is dispatched, the corresponding interrupt pending flag is de-asserted (edge triggered mode sources only).
MMIO_BASE offset    Register           31:28  27:24  23:20  19:16  15:12  11:8   7:4    3:0
0x10 081C           ISETTING3 (r/w)    MP31   MP30   MP29   MP28   MP27   MP26   MP25   MP24
0x10 0818           ISETTING2 (r/w)    MP23   MP22   MP21   MP20   MP19   MP18   MP17   MP16
0x10 0814           ISETTING1 (r/w)    MP15   MP14   MP13   MP12   MP11   MP10   MP9    MP8
0x10 0810           ISETTING0 (r/w)    MP7    MP6    MP5    MP4    MP3    MP2    MP1    MP0
Each MP field (mode):
0xxx    source operates in edge-triggered mode
1xxx    source operates in level-sensitive mode
Each MP field (priority):
x111    NMI (highest) priority
x110    maskable level 6
...
x000    maskable level 0
Figure 3-7. Interrupt mode and priority MMIO locations and formats.
3.5.3.5 Interrupt masking
A single MMIO register (IMASK in Figure 3-8) allows masking of an arbitrary subset of the interrupt sources. Masking applies to both regular as well as NMI level requestors. Masking is used by software to disable unused devices and/or to implement nested interrupt handling. In the latter case, each interrupt handler can stack the old IMASK content for later restoration and insert a new mask that only allows the interrupts it is willing to handle. For level-triggered device handlers, IMASK should also exclude the device itself to prevent repeated handler activation.
Each interrupt source device typically has its own interrupt enable flag(s) that determine whether certain key device events lead to the request of an interrupt. In addition, the PCSW.IEN flag determines whether the DSPCPU is willing to handle regular interrupts. Non maskable interrupts ignore the state of this flag.
All three mechanisms are necessary: the PCSW.IEN flag is used to implement critical sections of code during which the RTOS (real-time operating system) is unable to handle regular interrupts. The IMASK is used to allow full control over interrupt handler nesting. The device interrupt flags set the operational mode of the device.
When RESET is asserted, IPENDING, ICLEAR, and IMASK are set to all zeroes. (MMIO register addresses shown in Figure 3-8 are offset addresses with respect to MMIO_BASE.)
3.5.3.6 Software interrupts and acknowledgment
The IPENDING register shown in Figure 3-8 can be read to observe the currently pending interrupts. Each bit read depends on the mode of the source:
• For a level-sensitive source, a bit value corresponds to the current state of the device interrupt request line.
• For an edge-triggered interrupt, a ‘1’ is read if and only if an interrupt request occurred and the corresponding vector has not yet been dispatched.
Software can request an interrupt for sources operating in edge-triggered mode. Writes to the IPENDING register assert an interrupt request for all sources where a 1 occurred in the bit position of the written value. The state of sources where a 0 occurred in the written value is unchanged. Writes have no effect on level-sensitive mode sources. The interrupt request, if not masked, will occur at the next successful interruptible jump. This differs from the conventional software interrupt-like semantics of many architectures. Any of the 32 sources can be requested in software. In normal operation however, software-requested interrupts should be limited to source vectors not allocated for hardware devices. Note that another PCI master can request interrupts by manipulating the IPENDING location in the MMIO aperture. This is useful for inter-processor communication.
The ICLEAR register reads the same as the IPENDING register. Writes to the ICLEAR register serve to clear pending flags for edge-triggered mode sources. All IPENDING flags corresponding to bit positions in which ‘1’s are written are cleared. IPENDING flags corresponding to bit positions in which ‘0’s are written are not affected. Writes have no effect on level-sensitive mode sources. When a pending interrupt bit is being cleared through a write to the ICLEAR register at the same time that the hardware is trying to set that interrupt bit, the hardware takes precedence.
3.5.3.7 Interrupt source assignment
Table 3-10 shows the assignment of devices to interrupt source numbers, as well as the recommended operating mode (edge or level triggered). Note that there are a total of 5 external pins available to assert interrupt requests. The PCI INTA to INTD requests are asserted by active low signal conventions, i.e. a zero level or a negative edge asserts a request. The USERIRQ pin operates with active high signalling conventions.
3.5.3.8 NMI sequentialization
In most applications, it is desirable not to nest NMIs. The NMI interrupt handler can accomplish this by saving the old IMASK content and clearing IMASK before the first interruptible jump is executed by the NMI handler.
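The ISETTING layout of Figure 3-7 packs eight 4-bit MP fields into each register, which makes the mode and priority of a source easy to compute. The C sketch below works on plain values rather than on hardware registers; the function names are hypothetical, and it assumes that MP(n) occupies bits 4*(n mod 8)+3 down to 4*(n mod 8) of ISETTING(n div 8), as the figure suggests.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRIORITY_NMI 7u   /* level 7 requests an NMI */

/* Build a 4-bit MP field: bit 3 selects level-sensitive mode,
 * bits 2:0 hold the priority level. */
static uint32_t mp_field(bool level_sensitive, unsigned priority)
{
    return (level_sensitive ? 0x8u : 0x0u) | (priority & 0x7u);
}

/* Index k of the ISETTINGk register that holds the MP field of a source. */
static unsigned isetting_index(unsigned source)
{
    return (source & 31u) / 8u;
}

/* Return old_value (the current content of the matching ISETTING register)
 * with the MP field of the given source replaced. */
static uint32_t isetting_update(uint32_t old_value, unsigned source,
                                bool level_sensitive, unsigned priority)
{
    unsigned shift = 4u * (source % 8u);

    old_value &= ~((uint32_t)0xFu << shift);
    return old_value | (mp_field(level_sensitive, priority) << shift);
}
```

A caller would read the ISETTING register selected by `isetting_index`, pass its value through `isetting_update`, and write the result back with a 32-bit store. For instance, making source 9 level-sensitive at priority 3 yields the field value 0xB in bits 7:4 of ISETTING1.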
3.6 PNX1300 TO HOST INTERRUPTS
In systems where PNX1300 is operating in the presence
of a host CPU on PCI, PNX1300 can generate interrupts
to the host, using any combination of the four PCI INTA#
to INTD# pins. In a typical host system, only one of these
pins needs to be wired to the PCI bus interrupt request
lines. Any unused pins of this group are then available for
use as software programmable I/O pins.
The INT_CTL register (see Figure 3-9) IEx bits, when set, enable the open collector driver of the four INTD#..INTA# pins. The INTx bits determine the output value generated (if enabled). A ‘1’ in INTx causes the corresponding PCI interrupt pin to be asserted (low INTx# pin). The ISx bits are read-only and reflect the current actual state of the pins. Note that the pins have negative logic (active low) polarity and are of the open collector output type. Hence the pin voltage is low (active) when the logical value set or seen in the INT_CTL register is a ‘1’.
The assertion and de-assertion of host interrupts is the responsibility of PNX1300 software.
See also Section 11.6.17, “INT_CTL Register.”
MMIO_BASE offset    Register (bits 31..0)    Meaning of each bit i
0x10 0828           IMASK (r/w)              On read or write, 0 ⇒ disallow source i interrupt request
                                             On read or write, 1 ⇒ allow source i interrupt request
0x10 0824           ICLEAR (r/w)             On read, same as IPENDING(i)
                                             On write, 1 ⇒ clear source i interrupt request
0x10 0820           IPENDING (r/w)           On read, 1 ⇒ source i interrupt request is pending
                                             On write, 1 ⇒ software source i interrupt request
Figure 3-8. Interrupt controller request, clear, and mask MMIO registers.
[Figure 3-9: INT_CTL (r/w) at MMIO_BASE offset 0x10 3038, a 32-bit register with the fields IS[D:A], IE[D:A] and INT[D:A].]
Figure 3-9. Host interrupt control register
Table 3-10. Interrupt source assignments
SOURCE NAME    SRC NUM    MODE      SOURCE DESCRIPTION
PCI INTA       0          level     PCI_INTA# pin signal
PCI INTB       1          level     PCI_INTB# pin signal
PCI INTC       2          level     PCI_INTC# pin signal
PCI INTD       3          level     PCI_INTD# pin signal
TRI_USERIRQ    4          either    external general-purpose pin
TIMER1         5          edge      general-purpose timer
TIMER2         6          edge      general-purpose timer
TIMER3         7          edge      general-purpose timer
SYSTIMER       8          edge      reserved for debugger
VIDEOIN        9          level     video in block
VIDEOOUT       10         level     video out block
AUDIOIN        11         level     audio in block
AUDIOOUT       12         level     audio out block
ICP            13         level     image coprocessor
VLD            14         level     VLD coprocessor
SSI            15         level     SSI interface
PCI            16         level     PCI BIU (DMA, etc.; see Table 11-14 for possible interrupt causes)
IIC            17         level     I2C interface
JTAG           18         level     JTAG interface
t.b.d.         19..24               reserved for future devices
SPDO           25         level     SPDO block
t.b.d.         26..27               reserved for future devices
HOSTCOM        28         edge      (software) host communication
APP            29         edge      (software) application
DEBUGGER       30         edge      (software) debugger
RTOS           31         edge      (software) RTOS
3.7 HOST TO PNX1300 INTERRUPTS
A host CPU can generate an interrupt to PNX1300 in several ways:
• by a PCI MMIO write to IPENDING to assert the HOSTCOMM interrupt (bit 28)
• by a hardware circuit that asserts one of the interrupt request pins TRI_USERIRQ, or INTA..INTD.
The first and most common method requires no circuitry and leaves the interrupt pins available for other purposes.
3.8 TIMERS
The DSPCPU contains four programmable timer/counters, all with the same function. The first three (TIMER1, TIMER2, TIMER3) are intended for general use. The fourth timer/counter (SYSTIMER) is reserved for use by the system software and should not be used by applications.
Each timer has three registers as shown in Figure 3-10. The MMIO register addresses shown are offset addresses with respect to the timer’s base address.
Each timer/counter can be set to count one of the event types specified in Table 3-12. Note that the DATABREAK event is special, in that the timer/counter may increment by zero, one or two in each clock cycle. For all other event types, increments are by zero or one. The CACHE1 and CACHE2 events serve as cache performance monitoring support. The actual event selected for CACHE1 and CACHE2 is determined by the MEM_EVENTS MMIO register, see Section 5.7, “Performance Evaluation Support.” If a PNX1300 pin signal (VICLK, etc.) is selected as an event, positive-going edges on the signal are counted.
Each timer increments its value until the modulus is reached. On the clock cycle where the incremented value would equal or exceed the modulus, the value wraps around to zero or one (in the case of an increment by two), and an interrupt is generated as defined in Table 3-10. The timer interrupt source mode should be set as edge-sensitive. No software interrupt acknowledge to the timer device is necessary.
Counting starts and continues as long as the run bit is set.
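The counting rule can be written down as a one-step model. The C sketch below is one reading of the text of this section, not a statement about the hardware: the names are hypothetical, and the treatment of a value at or above the modulus (it keeps counting and loops around the 32-bit range without an interrupt) and of a zero modulus (treated as 2^32) follows the remarks made elsewhere in Section 3.8.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of one counted clock cycle in this hypothetical model. */
struct timer_step {
    uint32_t value;      /* new content of TVALUE */
    bool interrupt;      /* true if the wrap-around raises an interrupt */
};

/* Advance the timer by 'increment' events (0, 1 or 2) in one cycle. */
static struct timer_step timer_advance(uint32_t value, uint32_t modulus,
                                       unsigned increment)
{
    /* A modulus of zero behaves like 2^32. */
    uint64_t mod = (modulus != 0u) ? modulus : ((uint64_t)1 << 32);
    uint64_t next = (uint64_t)value + increment;
    struct timer_step r;

    if (value < mod && next >= mod) {
        r.value = (uint32_t)(next - mod);   /* 0, or 1 after an increment by two */
        r.interrupt = true;
    } else {
        r.value = (uint32_t)next;           /* plain 32-bit count, may loop around */
        r.interrupt = false;
    }
    return r;
}
```

With a modulus of 10, a value of 9 wraps to 0 on a single event and to 1 on a double DATABREAK event, in both cases with an interrupt; a value that a store has set equal to the modulus simply moves on to 11.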
Loading a new modulus does not affect the contents of the value register. If a store operation to either the modulus or value register results in value and modulus being the same, no interrupt will be generated. If the run bit is set, the next value will be modulus+1 or modulus+2, and the counter will have to loop around before an interrupt is generated.
A modulus value of zero causes a wrap-around as if the modulus value was 2^32.
On RESET, the TCTL registers are cleared, and the value of the TMODULUS and TVALUE registers is undefined.
Timer base offset    Register          Fields (bits 31..0)
0                    TMODULUS (r/w)    MODULUS
4                    TVALUE (r/w)      VALUE
8                    TCTL (r/w)        PRESCALE, SOURCE, R
“PRESCALE”: Prescale value is 2^PRESCALE, i.e., in the range [1..32768]
“SOURCE” select: see Table 3-12
“RUN” bit (R):
0    Timer stopped
1    Timer running
Figure 3-10. Timer register definitions.
Table 3-11. Timer base MMIO address
TIMER1      MMIO_BASE+0x10 0C00
TIMER2      MMIO_BASE+0x10 0C20
TIMER3      MMIO_BASE+0x10 0C40
SYSTIMER    MMIO_BASE+0x10 0C60
Table 3-12. Timer source selections
Source Name      Source Bits Value    Source Description
CLOCK            0                    CPU clock
PRESCALE         1                    prescaled CPU clock
TRI_TIMER_CLK    2                    external clock pin
DATABREAK        3                    data breakpoints
INSTBREAK        4                    instruction breakpoints
CACHE1           5                    cache event 1
CACHE2           6                    cache event 2
VI_CLK           7                    video in clock pin
VO_CLK           8                    video out clock pin
AI_WS            9                    audio in word strobe pin
AO_WS            10                   audio out word strobe pin
SSI_RXFSX        11                   SSI receive frame sync pin
SSI_IO2          12                   SSI transmit frame sync pin
-                13-15                undefined
3.9 DEBUG SUPPORT
This section describes the special debug support offered by the DSPCPU. Instruction and data breakpoints can be defined through a set of registers in the MMIO register space. When a breakpoint is matched, an event is generated that can be used as a timer source (see Section 3.8, “Timers”). The timer TMODULUS has to be set to generate a DSPCPU interrupt after the desired number of breakpoint matches.
3.9.1 Instruction Breakpoints
The instruction-breakpoint control register is shown in Figure 3-11. On RESET, the BICTL register is cleared. (MMIO-register addresses shown are offset with respect to MMIO_BASE.)
The instruction-breakpoint address-range registers are shown in Figure 3-12. After RESET, the value of these registers is undefined. (MMIO-register addresses shown are offset with respect to MMIO_BASE.)
When the IC bit in the breakpoint control register is set to ‘1’, instruction breakpoints are activated. Any instruction address issued by the PNX1300 chip is compared against the low and high address-range values. The IAC bit in the breakpoint control register determines whether the instruction address needs to be inside or outside of the range defined by the low and high address-range registers. A successful comparison takes place when either:
• IAC = ‘0’ and low ≤ iaddr ≤ high, or
• IAC = ‘1’ and iaddr < low or iaddr > high.
On a successful comparison, an instruction breakpoint event is generated, which can be used as a clock input to a timer. After counting the programmed number of instruction breakpoint events, the timer will generate an interrupt request.
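The instruction-breakpoint comparison of Section 3.9.1 reduces to a range test whose sense is selected by IAC. The C sketch below models it on plain values; the function name and parameter names are hypothetical, with `low` and `high` standing for the contents of BINSTLOW and BINSTHIGH.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an instruction address produces a breakpoint event.
 * ic and iac model the IC and IAC bits of BICTL. */
static bool ibreak_match(bool ic, bool iac,
                         uint32_t low, uint32_t high, uint32_t iaddr)
{
    if (!ic)
        return false;                     /* instruction breakpoints disabled */

    bool inside = (low <= iaddr) && (iaddr <= high);

    /* IAC = 0: event when inside the range; IAC = 1: event when outside. */
    return iac ? !inside : inside;
}
```

Each `true` result corresponds to one INSTBREAK event that a timer selected for that source would count.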
PRELIMINARY SPECIFICATION
3-13
PNX1300/01/02/11 Data Book
MMIO_BASE
offset:
0x10 1000
Philips Semiconductors
31
27
23
19
15
11
7
3
0
BICTL (r/w)
IC
‘IAC’ Instruction address control:
0 Breakpoint if address inside range
1 Breakpoint if address outside range
‘IC’ Instruction control bit:
0 Disable instruction breakpoints
1 Enable instruction breakpoints
Figure 3-11. Instruction-breakpoint control register.
MMIO_BASE
offset:
0x10 1004
BINSTLOW (r/w)
Address Range Start
0x10 1008
BINSTHIGH (r/w)
Address Range End
31
27
23
19
15
11
7
3
0
11
7
3
0
Figure 3-12. Instruction-breakpoint address-range registers.
MMIO_BASE
offset:
0x10 1030
BDATAALOW (r/w)
0x10 1034
BDATAAHIGH (r/w)
0x10 1038
BDATAVAL (r/w)
0x10 103C
BDATAMASK (r/w)
31
27
23
19
15
Address Range Start
Address Range End
Data Breakpoint Value
Data Breakpoint Value Mask
Figure 3-13. Data-breakpoint address-range and value-compare registers.
3.9.2
Data Breakpoints
The data-breakpoint address-range and compare-value registers are shown in Figure 3-13. After RESET, the value of the data breakpoint registers is undefined. (MMIO register addresses shown are offset with respect to MMIO_BASE.)
The data-breakpoint control register is shown in Figure 3-14. On RESET, the BDCTL register is cleared. (The register address shown is offset with respect to MMIO_BASE.)
MMIO_BASE offset 0x10 1020: BDCTL (r/w)
‘DVC’ Data Value Control:
0 Breakpoint if data equal
1 Breakpoint if data not equal
‘DAC’ Data Address Control:
0 Breakpoint if address inside range
1 Breakpoint if address outside range
‘BS’ Break on Store:
0 Don’t check data stores
1 Do check data stores
‘BL’ Break on Load:
0 Don’t check data loads
1 Do check data loads
‘DC’ Data Control:
0 No checking
1 Check data addresses
2 Check data values
3 Check data value and addresses
Figure 3-14. Data-breakpoint control register.
When the DC bits in the data breakpoint control register are not set to ‘0’, data breakpoints are activated. When the value of the DC bits is ‘1’ or ‘3’, any data address from load operations (if the BL bit is set) and/or store operations (if the BS bit is set) issued by the DSPCPU is compared against the low and high address-range values. The DAC bit in the breakpoint control register determines whether data addresses need to be inside or outside of the range defined by the low and high address-range registers. A successful comparison occurs when either:
• DAC = ‘0’ and low ≤ daddr ≤ high, or
• DAC = ‘1’ and (daddr < low or daddr > high).
Note that this comparison works for all addresses regardless of the aperture to which they belong. When the
value of the DC bits is ‘2’ or ‘3’, any data value from load
operations (if the BL bit is set) and/or store operations (if
the BS bit is set) issued by the PNX1300 CPU is compared against the value in the BDATAVAL register. Only
the bits for which the corresponding BDATAMASK register bits are set to ‘1’ will be used in the comparison. The
DVC bit in the breakpoint control register determines
whether the data value needs to be equal or not equal to
the comparison value. A successful comparison occurs
when either of the following is true:
• DVC = ‘0’ and (data & BDATAMASK) = (BDATAVAL & BDATAMASK).
• DVC = ‘1’ and (data & BDATAMASK) != (BDATAVAL & BDATAMASK).
Note: use a nonzero datamask or the result is undefined.
When a successful comparison has taken place, a data
breakpoint event is generated, which can be used as a
clock input to a timer. After counting the set number of
data breakpoint events, the timer will generate an interrupt request.
When the value of the DC bits is ‘3’, a data breakpoint
event is generated if and only if a successful comparison
occurs on both address and data simultaneously.
Note that up to two data breakpoint events can occur per
clock cycle, due to the dual load/store capability of the
CPU and data cache.
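The comparison rules of this section can be summarized as a small software model. The following is only a sketch for reasoning about register settings, not a description of the hardware: the structure dbp_model, its member names, and the function dbp_event are hypothetical, and the structure does not reflect the bit positions of the fields inside BDCTL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the data-breakpoint comparison.
   Field meanings follow Figure 3-14; the layout is an assumption. */
typedef struct {
    unsigned dc;        /* 0: none, 1: address, 2: value, 3: both       */
    bool     dac;       /* false: break inside range, true: outside     */
    bool     dvc;       /* false: break if equal, true: if not equal    */
    bool     bl, bs;    /* check loads / check stores                   */
    uint32_t low, high; /* BDATAALOW, BDATAAHIGH                        */
    uint32_t val, mask; /* BDATAVAL, BDATAMASK (mask must be nonzero)   */
} dbp_model;

/* Returns true when one load or store would raise a breakpoint event. */
bool dbp_event(const dbp_model *b, bool is_store,
               uint32_t daddr, uint32_t data)
{
    if (b->dc == 0)
        return false;                       /* checking disabled */
    if (is_store ? !b->bs : !b->bl)
        return false;                       /* access type not checked */

    bool in_range = (b->low <= daddr) && (daddr <= b->high);
    bool addr_hit = b->dac ? !in_range : in_range;
    bool equal    = (data & b->mask) == (b->val & b->mask);
    bool data_hit = b->dvc ? !equal : equal;

    switch (b->dc) {
    case 1:  return addr_hit;
    case 2:  return data_hit;
    default: return addr_hit && data_hit;   /* dc == 3: both must match */
    }
}
```

With DC = ‘3’, the model reports an event only when the address test and the masked value test succeed for the same access, which mirrors the "if and only if" condition stated above.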
Custom Operations for Multimedia
Chapter 4
by Gert Slavenburg, Pieter v.d. Meulen, Yong Cho, Sang-Ju Park
4.1
CUSTOM OPERATIONS OVERVIEW
In this document, the generic PNX1300 name refers
to the PNX1300 Series, or the PNX1300/01/02/11
products.
Custom operations in the PNX1300 DSPCPU architecture are specialized, high-function operations designed
to dramatically improve performance in important multimedia applications. When properly incorporated into application source code, custom operations enable an application to take advantage of the highly parallel
PNX1300 microprocessor implementation. Achieving a
similar performance increase through other means—
e.g., executing a higher number of traditional microprocessor instructions per cycle—would be prohibitively expensive for PNX1300’s low-cost target applications.
Custom operations are simple to understand and consistent in their definition, but their unusual functions make it
difficult for automatic code generation algorithms to use
them effectively. Consequently, custom operations are
inserted into source code by the programmer. To make
this process as painless as possible, custom operation
syntax is consistent with the C programming language,
and, just as with all other operations generated by the
compiler, the scheduler takes care of register allocation,
operation packing, and flow analysis.
4.1.1
Custom Operation Motivation
For both general-purpose and embedded microprocessor-based applications, programming in a high-level language is desirable. To effectively support optimizing
compilers and a simple programming model, certain microprocessor architecture features are needed, such as
a large, linear address space, general-purpose registers,
and register-to-register operations that directly support
the manipulation of linear address pointers. A common
choice in microprocessor architectures is 32-bit linear
addresses, 32-bit registers, and 32-bit integer operations. PNX1300 is such a microprocessor architecture.
For the data manipulation in many algorithms, however,
32-bit data and operations are wasteful of expensive silicon resources. Important multimedia applications, such
as the decompression of MPEG video streams, spend
significant amounts of execution time dealing with eight-bit data items. Using 32-bit operations to manipulate
small data items makes inefficient use of 32-bit execution
hardware in the implementation. If these 32-bit resources
could be used instead to operate on four eight-bit data
items simultaneously, performance would be improved
by a significant factor with only a tiny increase in implementation cost.
Getting the highest execution rate from standard microprocessor resources is one of the motivations behind
custom operations in PNX1300. A range of custom operations is provided that each processes—simultaneously—four 8-bit or two 16-bit data items. There is little cost
difference between a standard 32-bit ALU and one that
can process either one pair of 32-bit operands or four
pairs of eight-bit operands, but there is a big performance difference for PNX1300’s target applications.
PNX1300’s custom operations go beyond simply making
the best use of standard resources. Some custom operations combine several simple operations. These combinations are tailored specifically to the needs of important
multimedia applications. Some high-function custom operations eliminate conditional branches, which helps the
scheduler make effective use of all five operation slots in
each PNX1300 instruction. Filling up all five slots is especially important in the inner loops of computationally intensive multimedia applications.
In short, custom operations help PNX1300 reach its
goals of extremely high multimedia performance at the
lowest possible cost.
4.1.2
Introduction to Custom Operations
Table 4-1 and Table 4-2 contain two listings of the custom operations available in the PNX1300 architecture.
Table 4-1 groups the custom operations by type of function while Table 4-2 lists the operations by operand size.
For more detailed information about the custom operations, see Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”
Some operations exist in several versions that differ in
the treatment of their operands and results, and the mnemonics for these versions make it easy to select the appropriate operation. For example, the sum of products
operations all have “fir” in their mnemonics; the prefix
and suffix of the mnemonic express the treatment of
the operands and result. The ifir8ii operation treats both
of its operands as signed (the “ii” suffix) and produces a signed
result (the “i” prefix). The ifir8iu operation treats its first operand
as signed (suffix “i”), the second as unsigned (suffix “u”), and
produces a signed result (prefix “i”). The ume8ii operation
implements an eight-bit motion-estimation function; it treats both
operands as signed but produces an unsigned result.
The operations beginning with “dsp” implement a clipping (sometimes called saturating) function before storing the result(s) in the destination register. Otherwise, their naming follows the rules given above where appropriate. For example, the dspuquadaddui operation implements four 8-bit additions; it treats the first operand of each addition as unsigned, the second operand as signed, and produces an unsigned result for each addition. Each result, which is computed with no loss of precision, is clipped into the representable range of a byte (0..255).

Table 4-1. Key Multimedia Custom Operations Listed by Function Type
Function             Custom Op        Description
DSP absolute value   dspiabs          Clipped signed 32-bit absolute value
                     dspidualabs      Dual clipped absolute values of signed 16-bit halfwords
Shift                dualasr          dual-16 arithmetic shift right
Clip                 dualiclipi       dual-16 clip signed to signed
                     dualuclipi       dual-16 clip signed to unsigned
Min, max             quadumax         Unsigned bytewise quad max
                     quadumin         Unsigned bytewise quad min
DSP add              dspiadd          Clipped signed 32-bit add
                     dspuadd          Clipped unsigned 32-bit add
                     dspidualadd      Dual clipped add of signed 16-bit halfwords
                     dspuquadaddui    Quad clipped add of unsigned/signed bytes
DSP multiply         dspimul          Clipped signed 32-bit multiply
                     dspumul          Clipped unsigned 32-bit multiply
                     dspidualmul      Dual clipped multiply of signed 16-bit halfwords
DSP subtract         dspisub          Clipped signed 32-bit subtract
                     dspusub          Clipped unsigned 32-bit subtract
                     dspidualsub      Dual clipped subtract of signed 16-bit halfwords
Sum of products      ifir16           Signed sum of products of signed 16-bit halfwords
                     ifir8ii          Signed sum of products of signed bytes
                     ifir8iu          Signed sum of products of signed/unsigned bytes
                     ufir16           Unsigned sum of products of unsigned 16-bit halfwords
                     ufir8uu          Unsigned sum of products of unsigned bytes
Merge, pack          mergedual16lsb   Merge dual-16 least-significant bytes
                     mergelsb         Merge least-significant bytes
                     mergemsb         Merge most-significant bytes
                     pack16lsb        Pack least-significant 16-bit halfwords
                     pack16msb        Pack most-significant 16-bit halfwords
                     packbytes        Pack least-significant bytes
Byte averages        quadavg          Unsigned byte-wise quad average
Byte multiplies      quadumulmsb      Unsigned quad 8-bit multiply most significant
Motion estimation    ume8ii           Unsigned sum of absolute values of signed 8-bit differences
                     ume8uu           Unsigned sum of absolute values of unsigned 8-bit differences

Table 4-2. Key Multimedia Custom Operations Listed by Operand Size
Op. Size   Custom Op        Description
32-bit     dspiabs          Clipped signed 32-bit abs value
           dspiadd          Clipped signed 32-bit add
           dspuadd          Clipped unsigned 32-bit add
           dspimul          Clipped signed 32-bit multiply
           dspumul          Clipped unsigned 32-bit multiply
           dspisub          Clipped signed 32-bit subtract
           dspusub          Clipped unsigned 32-bit subtract
16-bit     mergedual16lsb   Merge dual-16 least-significant bytes
           dualasr          dual-16 arithmetic shift right
           dualiclipi       dual-16 clip signed to signed
           dualuclipi       dual-16 clip signed to unsigned
           dspidualmul      Dual clipped multiply of signed 16-bit halfwords
           dspidualabs      Dual clipped absolute values of signed 16-bit halfwords
           dspidualadd      Dual clipped add of signed 16-bit halfwords
           dspidualsub      Dual clipped subtract of signed 16-bit halfwords
           ifir16           Signed sum of products of signed 16-bit halfwords
           ufir16           Unsigned sum of products of unsigned 16-bit halfwords
           pack16lsb        Pack least-significant 16-bit halfwords
           pack16msb        Pack most-significant 16-bit halfwords
8-bit      quadumax         Unsigned bytewise quad max
           quadumin         Unsigned bytewise quad min
           dspuquadaddui    Quad clipped add of unsigned/signed bytes
           ifir8ii          Signed sum of products of signed bytes
           ifir8iu          Signed sum of products of signed/unsigned bytes
           ufir8uu          Unsigned sum of products of unsigned bytes
           mergelsb         Merge least-significant bytes
           mergemsb         Merge most-significant bytes
           packbytes        Pack least-significant bytes
           quadavg          Unsigned byte-wise quad average
           quadumulmsb      Unsigned quad 8-bit multiply most significant
           ume8ii           Unsigned sum of absolute values of signed 8-bit differences
           ume8uu           Unsigned sum of absolute values of unsigned 8-bit differences

4.1.3
Example Uses of Custom Ops
The next three sections illustrate the advantages of using custom operations. Also, the more complex examples illustrate how custom operations can be integrated into application code by providing listings of C-language program fragments. The examples progress in complexity from simple to intricate; the most interesting examples are taken from actual multimedia codes, such as MPEG decompression.
4.2
EXAMPLE 1: BYTE-MATRIX TRANSPOSITION
The goal of this example is to provide a simple, introductory illustration of how custom operations can significantly increase processing speed in small kernels of applications. As in most uses of custom operations, the power of custom operations in this case comes from their ability to operate on multiple data items in parallel.
Imagine that our task is to transpose a packed, 4-by-4 matrix of bytes in memory; the matrix might, for example, contain 8-bit pixel values. Figure 4-1 illustrates both the organization of the matrix in memory and the task to be performed in standard mathematical notation.

Memory          Row Major                      Column Major
Location     31          0                   31          0
n+0:          a  b  c  d                      a  e  i  m
n+4:          e  f  g  h     Transpose        b  f  j  n
n+8:          i  j  k  l        →             c  g  k  o
n+12:         m  n  o  p                      d  h  l  p

              a  b  c  d                      a  e  i  m
              e  f  g  h     Transpose        b  f  j  n
              i  j  k  l        →             c  g  k  o
              m  n  o  p                      d  h  l  p
Figure 4-1. Byte-matrix transposition. Top shows byte matrices packed into memory words; bottom shows mathematical matrix representation.

Performing this operation with traditional microprocessor instructions is straightforward but time consuming. One way to perform the manipulation is to perform 12 load-byte instructions (since only 12 of the 16 bytes need to be repositioned) and 12 store-byte instructions that place the bytes back in memory in their new positions. Another way would be to perform four load-word instructions, reposition the bytes in registers, and then perform four store-word instructions. Unfortunately, repositioning the bytes in registers would require a large number of instructions to properly shift and mask the bytes. Performing the 24 loads and stores makes implicit use of the shifting and masking hardware in the load/store units and thus yields a shorter instruction sequence.
The problem with performing 24 loads and stores is that
loads and stores are inherently slow operations because
they must access at least the cache and possibly slower
layers in the memory hierarchy. Further, performing byte
loads and stores when 32-bit word-wide accesses run
just as fast wastes the power of the cache/memory interface. We would prefer a fast algorithm that takes full advantage of cache/memory bandwidth while not requiring
an inordinate number of byte-manipulation instructions.
PNX1300 has instructions that merge and pack bytes
and 16-bit halfwords directly and in parallel. Four of
these instructions can be applied in this case to speed up
the manipulation of bytes that are packed into words.
Figure 4-2 shows the application of these instructions to
the byte-matrix transposition problem, and the left side of
Figure 4-3 shows a list of the operations needed to implement the matrix transpose. When assembled into actual PNX1300 instructions, these custom operations
would be packed as tightly as dependencies allow, up to
five operations per instruction.
Note that a programmer would not need to program at
this level (PNX1300 assembler). The matrix transpose
would be expressed just as efficiently in C-language
source code, as shown on the right side of Figure 4-3.
The low-level code is shown here for illustration purposes only.
The first sequence of four load-word operations in Figure 4-3 brings the packed words of the input matrix into registers R10, R11, R12, and R13. The next sequence of four merge operations produces intermediate results into registers R14, R15, R16, and R17. The next sequence of four pack operations could then replace the original operands or place the transposed matrix in separate registers if the original matrix operands were needed for further computations (the PNX1300 optimizing C compiler performs this analysis automatically). In this example, the transposed matrix is placed in registers R18, R19, R20, and R21. The final four store-word operations put the transposed matrix back into memory.

ld32d(0)  r100 → r10          char matrix[4][4];
ld32d(4)  r100 → r11            .
ld32d(8)  r100 → r12            .
ld32d(12) r100 → r13            .
                              int *m = (int *) matrix;
mergemsb r10 r11 → r14        temp0 = MERGEMSB(m[0], m[1]);
mergemsb r12 r13 → r15        temp1 = MERGEMSB(m[2], m[3]);
mergelsb r10 r11 → r16        temp2 = MERGELSB(m[0], m[1]);
mergelsb r12 r13 → r17        temp3 = MERGELSB(m[2], m[3]);

pack16msb r14 r15 → r18       m[0] = PACK16MSB(temp0, temp1);
pack16lsb r14 r15 → r19       m[1] = PACK16LSB(temp0, temp1);
pack16msb r16 r17 → r20       m[2] = PACK16MSB(temp2, temp3);
pack16lsb r16 r17 → r21       m[3] = PACK16LSB(temp2, temp3);
                                .
st32d(0)  r101 r18              .
st32d(4)  r101 r19              .
st32d(8)  r101 r20
st32d(12) r101 r21
Figure 4-3. On the left is a complete list of operations to perform the byte-matrix transposition of Figure 4-1 and Figure 4-2. On the right is an equivalent C-language fragment.
Thus, using the PNX1300 custom operations, the byte-matrix transposition requires four load-word operations
and four store-word operations (the minimum possible)
and eight register-to-register data-manipulation operations. The result is 16 operations, or byte-matrix transposition at the rate of one operation per byte.
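The data movement of the eight merge and pack operations can be checked on any host with a portable model. The sketch below is an illustration only: the function names model_mergemsb, model_mergelsb, model_pack16msb, model_pack16lsb, and model_transpose4x4 are hypothetical, and the byte placement is inferred from Figure 4-1 and Figure 4-2 (byte a of a word [a b c d] is assumed to occupy bits 31..24). Appendix A remains the authoritative definition of the operations.

```c
#include <stdint.h>

/* [a b c d], [e f g h] -> [a e b f] */
uint32_t model_mergemsb(uint32_t x, uint32_t y)
{
    return (x & 0xFF000000u) | ((y >> 8) & 0x00FF0000u)
         | ((x >> 8) & 0x0000FF00u) | ((y >> 16) & 0x000000FFu);
}

/* [a b c d], [e f g h] -> [c g d h] */
uint32_t model_mergelsb(uint32_t x, uint32_t y)
{
    return ((x << 16) & 0xFF000000u) | ((y << 8) & 0x00FF0000u)
         | ((x << 8) & 0x0000FF00u) | (y & 0x000000FFu);
}

/* Upper halfword of x followed by upper halfword of y */
uint32_t model_pack16msb(uint32_t x, uint32_t y)
{
    return (x & 0xFFFF0000u) | (y >> 16);
}

/* Lower halfword of x followed by lower halfword of y */
uint32_t model_pack16lsb(uint32_t x, uint32_t y)
{
    return (x << 16) | (y & 0x0000FFFFu);
}

/* Transposes a 4-by-4 byte matrix held in four words, in place. */
void model_transpose4x4(uint32_t m[4])
{
    uint32_t t0 = model_mergemsb(m[0], m[1]);   /* a e b f */
    uint32_t t1 = model_mergemsb(m[2], m[3]);   /* i m j n */
    uint32_t t2 = model_mergelsb(m[0], m[1]);   /* c g d h */
    uint32_t t3 = model_mergelsb(m[2], m[3]);   /* k o l p */

    m[0] = model_pack16msb(t0, t1);             /* a e i m */
    m[1] = model_pack16lsb(t0, t1);             /* b f j n */
    m[2] = model_pack16msb(t2, t3);             /* c g k o */
    m[3] = model_pack16lsb(t2, t3);             /* d h l p */
}
```

The model works on word values rather than on bytes in memory, so it is independent of the endianness of the host on which it is run.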
While the advantage of the custom-operation-based algorithm over the brute-force code that uses 24 load- and
store-byte instructions seems to be only eight operations
(a 33% reduction), the advantage is actually much greater. First, using custom operations, the number of memory references is reduced from 24 to eight (a factor of
three). Since memory references are slower than register-to-register operations (such as the custom operations
in this example), the reduction in memory references is
significant.
Further, the ability of the PNX1300 VLIW compilation
system to exploit the performance potential of the
PNX1300 microprocessor hardware is enhanced by the
custom-operation-based code. This is because it is easier for the compilation system to produce an optimal
schedule (arrangement) of the code when the number of
memory references is in balance with the number of register-to-register operations. The PNX1300 CPU (like all
high-performance microprocessors) has a limit on the
number of memory references that can be processed in
a single cycle (two is the current limit). A long sequence
of code that contains only memory references can result
in empty operation slots in the long PNX1300 instructions. Empty operation slots waste the performance potential of the PNX1300 hardware.
As this example has shown, careful use of custom operations has the potential to not only reduce the absolute
number of operations needed to perform a computation
but can also help the compilation system produce code
that fully exploits the performance potential of the
PNX1300 CPU.
4.3
EXAMPLE 2: MPEG IMAGE
RECONSTRUCTION
The complete MPEG video decoding algorithm is composed of many different phases, each with computationally
intensive kernels. One important kernel deals with reconstructing a single image frame given that the forward- and backward-predicted frames and the inverse discrete
cosine transform (IDCT) results have already been computed. This kernel provides an excellent opportunity to illustrate the power of PNX1300’s specialized custom
operators.
In the code fragments that follow, the backward-predicted block is assumed to have been computed into an array back[], the forward-predicted block is assumed to
have been computed into forward[], and the IDCT results
are assumed to have been computed into idct[].
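Row Major     Merge step                            Pack step                              Column Major
a b c d       mergemsb (rows 0, 1) → a e b f        pack16msb (aebf, imjn) → a e i m       a e i m
e f g h       mergemsb (rows 2, 3) → i m j n        pack16lsb (aebf, imjn) → b f j n       b f j n
i j k l       mergelsb (rows 0, 1) → c g d h        pack16msb (cgdh, kolp) → c g k o       c g k o
m n o p       mergelsb (rows 2, 3) → k o l p        pack16lsb (cgdh, kolp) → d h l p       d h l p
Figure 4-2. Application of merge and pack instructions to the byte-matrix transposition of Figure 4-1.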
void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp;
    for (i = 0; i < 64; i += 1)
    {
        temp = ((back[i] + forward[i] + 1) >> 1) + idct[i];
        if (temp > 255)
            temp = 255;
        else if (temp < 0)
            temp = 0;
        destination[i] = temp;
    }
}
Figure 4-4. Straightforward code for MPEG frame reconstruction.
A straightforward coding of the reconstruction algorithm might look as shown in Figure 4-4. This implementation shares many of the undesirable properties of the first example of byte-matrix transposition. The code accesses memory a byte at a time instead of a word at a time, which wastes 75% of the available bandwidth. Also, in light of the many quad-byte-parallel operations introduced in Section 4.1.2, “Introduction to Custom Operations,” it seems inefficient to spend three separate additions and one shift to process a single eight-bit pixel. Perhaps even more unfortunate for a VLIW processor like PNX1300 is the branch-intensive code that performs the saturation testing; eliminating these branches could reap a significant performance gain.
Since MPEG decoding is the kind of task for which PNX1300 was created, there are two custom operations, quadavg and dspuquadaddui, that exactly fit this important MPEG kernel (and other kernels). These custom operations process four pairs of 8-bit pixel values in parallel. In addition, dspuquadaddui performs saturation tests in hardware, which eliminates any need to execute explicit tests and branches.
For readers familiar with the details of MPEG algorithms, the use of eight-bit IDCT values later in this example may be confusing. The standard MPEG implementation calls for nine-bit IDCT values, but extensive analysis has shown that values outside the range [–128..127] occur so rarely that they can be considered unimportant. Pursuant to this observation, the IDCT values are clipped into the eight-bit range [–128..127] with saturating arithmetic before the frame reconstruction code runs. The assumption that this saturation occurs permits some of PNX1300’s custom operations to have clean, simple definitions.
The first step in seeing how custom operations can be of value in this case is to unroll the loop by a factor of four. The unrolled code is shown in Figure 4-5. This creates code that is parallel with respect to the four pixel computations. As is easily seen in the code, the four groups of computations (one group per pixel) do not depend on each other.

void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp;
    for (i = 0; i < 64; i += 4)
    {
        temp = ((back[i+0] + forward[i+0] + 1) >> 1) + idct[i+0];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+0] = temp;
        temp = ((back[i+1] + forward[i+1] + 1) >> 1) + idct[i+1];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+1] = temp;
        temp = ((back[i+2] + forward[i+2] + 1) >> 1) + idct[i+2];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+2] = temp;
        temp = ((back[i+3] + forward[i+3] + 1) >> 1) + idct[i+3];
        if (temp > 255) temp = 255;
        else if (temp < 0) temp = 0;
        destination[i+3] = temp;
    }
}
Figure 4-5. Unrolled version of the MPEG frame reconstruction code; compare with Figure 4-4.

After some experience is gained with custom operations, it is not necessary to unroll loops to discover situations where custom operations are useful. Often, a good programmer with knowledge of the function of the custom operations can see by simple inspection opportunities to exploit custom operations.
To understand how quadavg and dspuquadaddui can be used in this code, we examine the function of these custom operations.
The quadavg custom operation performs pixel averaging on four pairs of pixels in parallel. Formally, the operation of quadavg is as follows:
    quadavg rsrc1 rsrc2 -> rdest
takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. rsrc1 = [abcd], rsrc2 = [wxyz], and rdest = [pqrs] where a, b, c, d, w, x, y, z, p, q, r, and s are all unsigned eight-bit values. Then, quadavg computes the output vector [pqrs] as follows:
    p = (a + w + 1) >> 1
    q = (b + x + 1) >> 1
    r = (c + y + 1) >> 1
    s = (d + z + 1) >> 1
The pixel averaging in Figure 4-5 is evident in the first statement of each of the four groups of statements. The rest of the code (adding the idct[i] value and performing the saturation test) can be performed by the dspuquadaddui operation. Formally, its function is as follows:
    dspuquadaddui rsrc1 rsrc2 -> rdest
takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. rsrc1 = [efgh], rsrc2 = [stuv], and rdest = [ijkl] where e, f, g, h, i, j, k, and l are unsigned 8-bit values; s, t, u, and v are signed 8-bit values. Then, dspuquadaddui computes the output vector [ijkl] as follows:
    i = uclipi(e + s, 255)
    j = uclipi(f + t, 255)
    k = uclipi(g + u, 255)
    l = uclipi(h + v, 255)
The uclipi operation is defined in this case as it is for the separate PNX1300 operation of the same name described in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.” Its definition is as follows:
    uclipi (m, n)
    {
        if (m < 0) return 0;
        else if (m > n) return n;
        else return m;
    }
To make it easier to see how these operations can subsume all the code in Figure 4-5, Figure 4-6 shows the
same code rearranged to group the related functions.
Now it should be clear that the quadavg operation can replace the first four lines of the loop assuming that we can
get the individual 8-bit elements of the back[] and forward[] arrays positioned correctly into the bytes of a 32-bit word. That, of course, is easy: simply align the byte arrays on word boundaries and access them with word (integer) pointers.
Similarly, it should now be clear that the dspuquadaddui
operation can replace the remaining code (except, of
course, for storing the result into the destination[] array)
assuming, as above, that the 8-bit elements are aligned
and packed into 32-bit words.
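For checking results on a host machine, the formal definitions of quadavg and dspuquadaddui given above can be written as portable C. This is a sketch under stated assumptions, not the compiler's own implementation: the names model_quadavg and model_dspuquadaddui are hypothetical, and each byte lane is simply processed independently, which matches the per-byte equations whichever end of the word holds the first pixel.

```c
#include <stdint.h>

/* Per-byte rounded average of two words of four unsigned bytes. */
uint32_t model_quadavg(uint32_t x, uint32_t y)
{
    uint32_t r = 0;
    for (int k = 0; k < 4; k++) {
        uint32_t a = (x >> (8 * k)) & 0xFFu;
        uint32_t w = (y >> (8 * k)) & 0xFFu;
        r |= ((a + w + 1u) >> 1) << (8 * k);   /* at most 255, fits a byte */
    }
    return r;
}

/* Per-byte add of unsigned bytes (x) and signed bytes (y),
   with each sum clipped into 0..255 as uclipi(sum, 255) does. */
uint32_t model_dspuquadaddui(uint32_t x, uint32_t y)
{
    uint32_t r = 0;
    for (int k = 0; k < 4; k++) {
        int e = (int)((x >> (8 * k)) & 0xFFu);  /* unsigned operand */
        int s = (int)((y >> (8 * k)) & 0xFFu);  /* signed operand   */
        if (s > 127)
            s -= 256;                           /* sign-extend the byte */
        int t = e + s;
        if (t < 0)
            t = 0;
        else if (t > 255)
            t = 255;
        r |= (uint32_t)t << (8 * k);
    }
    return r;
}
```

Composing the two models, model_dspuquadaddui(model_quadavg(b, f), d), reproduces for four pixels at once what the loop body of Figure 4-4 computes for one.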
Figure 4-7 shows the new code. The arrays are now accessed in 32-bit (int-sized) chunks, the loop iteration control has been modified to reflect the ‘four-at-a-time’ operations, and the quadavg and dspuquadaddui operations
have replaced the bulk of the loop code. Finally,
Figure 4-8 shows a more compact expression of the loop
code, eliminating the temporary variable. Note that
the PNX1300 C compiler performs this optimization by itself.
Again, note that the code in Figure 4-7 and Figure 4-8
assumes that the character arrays are 32-bit word
aligned and padded if necessary to fill an integral number
of 32-bit words.
The original code required three additions, one shift, two
tests, three loads, and one store per pixel. The new code
using custom operations requires only two custom operations, three loads, and one store for four pixels, which is
more than a factor of six improvement. The actual performance improvement can be even greater depending on
how well the compiler is able to deal with the branches in
the original version of the code, which depends in part on
the surrounding code. Reducing the number of branches
almost always improves the chances of realizing maximum performance on the PNX1300 CPU.
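The "factor of six" quoted above follows directly from the operation counts in the text. As a worked check (the symbols N_orig and N_new are introduced here only for illustration and do not appear elsewhere in this data book):

```latex
\begin{align*}
N_{\text{orig}} &= 4 \times (3 + 1 + 2 + 3 + 1) = 40
  && \text{operations per four pixels (adds, shift, tests, loads, store)} \\
N_{\text{new}}  &= 2 + 3 + 1 = 6
  && \text{operations per four pixels (custom ops, loads, store)} \\
\frac{N_{\text{orig}}}{N_{\text{new}}} &= \frac{40}{6} \approx 6.7
\end{align*}
```

This counts operations only; it does not account for branch behavior or scheduling, which is why the actual improvement can differ.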
The code in Figure 4-8 illustrates several aspects of using custom operations in C-language source code. First,
the custom operations require no special declarations or
syntax; they appear to be simple function calls. Second,
there is no need to explicitly specify register assignments
for sources, destinations, and intermediate results; the
compiler and scheduler assign registers for custom operations just as they would for built-in language operations
such as integer addition. Third, the scheduler packs custom operations into PNX1300 VLIW instructions as effectively as it packs operations generated by the compiler
for native language constructs.
Thus, although the burden of making effective use of
custom operations falls on the programmer, that burden
consists only of discovering the opportunities for exploiting the operations and then coding them using standard
C-language notation. The compiler and scheduler take
care of the rest.
void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp0, temp1, temp2, temp3;
    for (i = 0; i < 64; i += 4)
    {
        temp0 = ((back[i+0] + forward[i+0] + 1) >> 1);
        temp1 = ((back[i+1] + forward[i+1] + 1) >> 1);
        temp2 = ((back[i+2] + forward[i+2] + 1) >> 1);
        temp3 = ((back[i+3] + forward[i+3] + 1) >> 1);
        temp0 += idct[i+0];
        if (temp0 > 255) temp0 = 255;
        else if (temp0 < 0) temp0 = 0;
        temp1 += idct[i+1];
        if (temp1 > 255) temp1 = 255;
        else if (temp1 < 0) temp1 = 0;
        temp2 += idct[i+2];
        if (temp2 > 255) temp2 = 255;
        else if (temp2 < 0) temp2 = 0;
        temp3 += idct[i+3];
        if (temp3 > 255) temp3 = 255;
        else if (temp3 < 0) temp3 = 0;
        destination[i+0] = temp0;
        destination[i+1] = temp1;
        destination[i+2] = temp2;
        destination[i+3] = temp3;
    }
}
Figure 4-6. Re-grouped code of Figure 4-5.
void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i, temp;
    int *i_back    = (int *) back;
    int *i_forward = (int *) forward;
    int *i_idct    = (int *) idct;
    int *i_dest    = (int *) destination;
    for (i = 0; i < 16; i += 1)
    {
        temp = QUADAVG(i_back[i], i_forward[i]);
        temp = DSPUQUADADDUI(temp, i_idct[i]);
        i_dest[i] = temp;
    }
}
Figure 4-7. Using the custom operations quadavg and dspuquadaddui to speed up the loop of Figure 4-6.
4.4
EXAMPLE 3: MOTION-ESTIMATION
KERNEL
Another part of the MPEG coding algorithm is motion estimation. The purpose of motion estimation is to reduce
the cost of storing a frame of video by expressing the
contents of the frame in terms of adjacent frames. A given frame is reduced to small blocks, and a subsequent
frame is represented by specifying how these small
blocks change position and appearance; usually, storing
the difference information is cheaper than storing a
whole block. For example, in a video sequence where
the camera pans across a static scene, some frames can
be expressed simply as displaced versions of their predecessor frames. To create a subsequent frame, most
blocks are simply displaced relative to the output screen.
The code in this example is for a match-cost calculation,
a small kernel of the complete motion-estimation code.
As with the previous example, this code provides an excellent example of how to transform source code to make
the best use of PNX1300’s custom operations.
void reconstruct (unsigned char *back,
                  unsigned char *forward,
                  char *idct,
                  unsigned char *destination)
{
    int i;
    int *i_back    = (int *) back;
    int *i_forward = (int *) forward;
    int *i_idct    = (int *) idct;
    int *i_dest    = (int *) destination;
    for (i = 0; i < 16; i += 1)
        i_dest[i] = DSPUQUADADDUI(QUADAVG(i_back[i], i_forward[i]), i_idct[i]);
}
Figure 4-8. Final version of the frame-reconstruction code.
unsigned char A[16][16];
unsigned char B[16][16];
.
.
.
for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 1)
        cost += abs(A[row][col] - B[row][col]);
}
Figure 4-9. Match-cost loop for MPEG motion estimation.
unsigned char A[16][16];
unsigned char B[16][16];
.
.
.
for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 4)
    {
        cost += abs(A[row][col+0] - B[row][col+0]);
        cost += abs(A[row][col+1] - B[row][col+1]);
        cost += abs(A[row][col+2] - B[row][col+2]);
        cost += abs(A[row][col+3] - B[row][col+3]);
    }
}
Figure 4-10. Unrolled, but not parallel, version of the loop from Figure 4-9.
Figure 4-9 shows the original source code for the match-cost loop. Unlike the previous example, the code is not a
self-contained function. Somewhere early in the code,
the arrays A[][] and B[][] are declared; somewhere between those declarations and the loop of interest, the arrays are filled with data.
4.4.1
A Simple Transformation
First, we will look at the simplest way to use a PNX1300
custom operation.
We start by noticing that the computation in the loop of
Figure 4-9 involves the absolute value of the difference
of two unsigned characters (bytes). By now, we are familiar with the fact that PNX1300 includes a number of
operations that process all four bytes in a 32-bit word simultaneously. Since the match-cost calculation is fundamental to the MPEG algorithm, it is not surprising to find
a custom operation—ume8uu—that implements this operation exactly.
To understand how ume8uu can be used in this case, we
need to transform the code as in the previous example.
Though the steps are presented here in detail, a programmer with even a little experience can often perform these transformations by visual inspection.
To use a custom operation that processes 4 pixel values
simultaneously, we first need to create 4 parallel pixel
computations. Figure 4-10 shows the loop of Figure 4-9
unrolled by a factor of 4. Unfortunately, the code in the
unrolled loop is not parallel because each line depends
on the one above it. Figure 4-11 shows a more parallel
version of the code from Figure 4-10. By simply giving
each computation its own cost variable and then summing the costs all at once, each cost computation is completely independent.
unsigned char A[16][16];
unsigned char B[16][16];
.
.
.
for (row = 0; row < 16; row += 1)
{
    for (col = 0; col < 16; col += 4)
    {
        cost0 = abs(A[row][col+0] - B[row][col+0]);
        cost1 = abs(A[row][col+1] - B[row][col+1]);
        cost2 = abs(A[row][col+2] - B[row][col+2]);
        cost3 = abs(A[row][col+3] - B[row][col+3]);
        cost += cost0 + cost1 + cost2 + cost3;
    }
}
Figure 4-11. Parallel version of Figure 4-10.
unsigned char A[16][16];
unsigned char B[16][16];
.
.
.
unsigned char *CA = (unsigned char *) A;
unsigned char *CB = (unsigned char *) B;
for (row = 0; row < 16; row += 1)
{
    int rowoffset = row * 16;
    for (col = 0; col < 16; col += 4)
    {
        cost0 = abs(CA[rowoffset + col+0] - CB[rowoffset + col+0]);
        cost1 = abs(CA[rowoffset + col+1] - CB[rowoffset + col+1]);
        cost2 = abs(CA[rowoffset + col+2] - CB[rowoffset + col+2]);
        cost3 = abs(CA[rowoffset + col+3] - CB[rowoffset + col+3]);
        cost += cost0 + cost1 + cost2 + cost3;
    }
}
Figure 4-13. The loop of Figure 4-11 recoded with one-dimensional array accesses.
Excluding the array accesses, the loop body in
Figure 4-11 is now recognizable as the function performed by the ume8uu custom operation: the sum of 4
absolute values of 4 differences. To use the ume8uu operation, however, the code must access the arrays with
32-bit word pointers instead of with 8-bit byte pointers.
Figure 4-13 shows the loop recoded to access A[][] and
B[][] as one-dimensional instead of two-dimensional arrays. We take advantage of our knowledge of C-language array storage conventions to perform this code
transformation. Recoding to use one-dimensional arrays
prepares the code for transformation to 32-bit array accesses.
(From here on, until the final code is shown, the declarations of the A and B arrays will be omitted from the code
fragments for the sake of brevity.)
unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;
for (i = 0; i < 64; i += 1)
cost += UME8UU(IA[i], IB[i]);
Figure 4-12. The loop of Figure 4-14 with the inner
loop eliminated.
Figure 4-14 shows the loop of Figure 4-13 recoded to use ume8uu.
Once again taking advantage of our knowledge of the C-language
array storage conventions, the one-dimensional byte array is now
accessed as a one-dimensional 32-bit-word array. Declaring IA and
IB as pointers to integers is the key, but also notice that the
multiplier in the expression for the row offset has been scaled from
16 to 4 to account for the fact that there are 4 bytes in a 32-bit
word.
Of course, since we are now using one-dimensional arrays to access
the pixel data, it is natural to use a single for loop instead of
two. Figure 4-12 shows this streamlined version of the code without
the inner loop. Since C-language arrays are stored as a linear
vector of values, we can simply increase the number of iterations of
the outer loop from 16 to 64 to traverse the entire array.
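The equivalence of the byte loop and the single word loop can be checked on any host with a portable model of the operation. The sketch below assumes only what the text states about ume8uu (the sum of the four absolute differences of the bytes packed in two 32-bit words); the names ume8uu_ref, match_cost_bytes, and match_cost_words are hypothetical and are not part of the PNX1300 tool chain.

```c
#include <stdint.h>
#include <string.h>

/* Portable model of the behaviour the text attributes to ume8uu:
 * the sum of the absolute differences of the four unsigned bytes
 * packed in two 32-bit words. Illustrative stand-in only. */
static unsigned ume8uu_ref(uint32_t a, uint32_t b)
{
    unsigned sum = 0;
    int k;
    for (k = 0; k < 4; k += 1) {
        int x = (int)((a >> (8 * k)) & 0xFFu);
        int y = (int)((b >> (8 * k)) & 0xFFu);
        sum += (unsigned)(x > y ? x - y : y - x);
    }
    return sum;
}

/* Byte-at-a-time match cost, as in the original loop. */
static unsigned match_cost_bytes(unsigned char A[16][16],
                                 unsigned char B[16][16])
{
    unsigned cost = 0;
    int row, col;
    for (row = 0; row < 16; row += 1)
        for (col = 0; col < 16; col += 1) {
            int d = A[row][col] - B[row][col];
            cost += (unsigned)(d < 0 ? -d : d);
        }
    return cost;
}

/* Word-at-a-time match cost: 64 words of 4 pixels each.
 * memcpy is used to load each word so the sketch does not depend
 * on the alignment of the byte arrays. */
static unsigned match_cost_words(unsigned char A[16][16],
                                 unsigned char B[16][16])
{
    const unsigned char *pa = (const unsigned char *) A;
    const unsigned char *pb = (const unsigned char *) B;
    unsigned cost = 0;
    int i;
    for (i = 0; i < 64; i += 1) {
        uint32_t wa, wb;
        memcpy(&wa, pa + 4 * i, 4);
        memcpy(&wb, pb + 4 * i, 4);
        cost += ume8uu_ref(wa, wb);
    }
    return cost;
}
```

Because both words are loaded with the same byte order, corresponding pixels are always paired, so the result does not depend on the endian-ness of the host.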
unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;
for (row = 0; row < 16; row += 1)
{
    int rowoffset = row * 4;
    for (col4 = 0; col4 < 4; col4 += 1)
        cost += UME8UU(IA[rowoffset + col4], IB[rowoffset + col4]);
}
Figure 4-14. The loop of Figure 4-13 recoded with 32-bit array accesses and the ume8uu custom operation.

The recoding and use of the ume8uu operation has resulted in a
substantial improvement in the performance of the match-cost loop.
In the original version, the code executed 1280 operations
(including loads, adds, subtracts, and absolute values); in the
restructured version, there are only 256 operations: 128 loads,
64 ume8uu operations, and 64 additions. This is a factor-of-five
reduction in the number of operations executed. Also, the overhead
of the inner loop has been eliminated, further increasing the
performance advantage.
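The operation counts quoted above can be reproduced with a little arithmetic. The per-iteration breakdown of the original loop used below (two loads, one subtract, one absolute value, one add) is an assumption chosen because it matches the quoted total of 1280; the breakdown of the restructured loop follows the text directly.

```c
/* Operation counts for the 16x16 match-cost loop. */
enum {
    PIXELS           = 16 * 16,
    OPS_PER_PIXEL    = 2 + 1 + 1 + 1,   /* loads, subtract, abs, add (assumed) */
    ORIGINAL_OPS     = PIXELS * OPS_PER_PIXEL,
    WORDS            = PIXELS / 4,      /* 4 pixels per 32-bit word */
    OPS_PER_WORD     = 2 + 1 + 1,       /* loads, ume8uu, add */
    RESTRUCTURED_OPS = WORDS * OPS_PER_WORD,
    REDUCTION_FACTOR = ORIGINAL_OPS / RESTRUCTURED_OPS
};
```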
4.4.2 More Unrolling
The code transformations of the previous section
achieved impressive performance improvements, but
given the VLIW nature of the PNX1300 CPU, more can
be done to exploit PNX1300’s parallelism.
The code in Figure 4-12 has a loop containing only 4 operations
(excluding loop overhead). Since PNX1300’s branches have a
3-instruction delay and each instruction can contain up to 5
operations, a fully utilized minimum-sized loop can contain 16
operations (20 minus loop overhead).
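The slot budget behind these figures can be written out as follows. Reading the 20 as the branch instruction plus its three delay instructions, each five operations wide, is an assumption, as is the loop overhead of four operations implied by "20 minus loop overhead"; the constant names are illustrative.

```c
/* Issue-slot budget of a minimum-sized loop (a sketch under the
 * assumptions stated above). */
enum {
    BRANCH_DELAY_INSTRUCTIONS = 3,
    OPS_PER_INSTRUCTION       = 5,
    MIN_LOOP_INSTRUCTIONS     = BRANCH_DELAY_INSTRUCTIONS + 1,
    MIN_LOOP_SLOTS            = MIN_LOOP_INSTRUCTIONS * OPS_PER_INSTRUCTION,
    USEFUL_SLOTS              = 16,   /* figure quoted in the text */
    LOOP_OVERHEAD_OPS         = MIN_LOOP_SLOTS - USEFUL_SLOTS,
    OPS_IN_SIMPLE_LOOP_BODY   = 4,    /* 2 loads, ume8uu, add */
    COPIES_TO_FILL_SLOTS      = USEFUL_SLOTS / OPS_IN_SIMPLE_LOOP_BODY
};
```

Under these assumptions, four copies of the simple loop body are needed to fill the useful slots of a minimum-sized loop.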
The PNX1300 compilation system performs a wide variety of powerful code transformation and scheduling optimizations to ensure that the VLIW capabilities of the
CPU are exploited. It is still wise, however, to make program parallelism explicit in source code when possible.
Explicit parallelism can only help the compiler produce a
fast-running program.
To this end, we can unroll the loop of Figure 4-12 some
number of times to create explicit parallelism and help
the compiler create a fast-running loop. In this case,
where the number of iterations is a power of two, it
makes sense to unroll by a factor that is a power of two
to create clean code.
Figure 4-15 shows the loop unrolled by a factor of eight.
The compiler can apply common sub-expression elimination and other optimizations to eliminate extraneous
operations in the array indexing, but, again, improvements in the source code can only help the compiler produce the best possible code and fastest-running program.
Figure 4-16 shows one way to modify the code for simpler array indexing.
unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;
for (i = 0; i < 64; i += 8)
{
    cost0 = UME8UU(IA[i+0], IB[i+0]);
    cost1 = UME8UU(IA[i+1], IB[i+1]);
    cost2 = UME8UU(IA[i+2], IB[i+2]);
    cost3 = UME8UU(IA[i+3], IB[i+3]);
    cost4 = UME8UU(IA[i+4], IB[i+4]);
    cost5 = UME8UU(IA[i+5], IB[i+5]);
    cost6 = UME8UU(IA[i+6], IB[i+6]);
    cost7 = UME8UU(IA[i+7], IB[i+7]);
    cost += cost0 + cost1 + cost2 +
            cost3 + cost4 + cost5 +
            cost6 + cost7;
}
Figure 4-15. Unrolled version of Figure 4-12. This
code makes good use of PNX1300’s VLIW capabilities.
unsigned char A[16][16];
unsigned char B[16][16];
.
.
.
unsigned int *IA = (unsigned int *) A;
unsigned int *IB = (unsigned int *) B;
for (i = 0; i < 64; i += 8, IA += 8, IB += 8)
{
    cost0 = UME8UU(IA[0], IB[0]);
    cost1 = UME8UU(IA[1], IB[1]);
    cost2 = UME8UU(IA[2], IB[2]);
    cost3 = UME8UU(IA[3], IB[3]);
    cost4 = UME8UU(IA[4], IB[4]);
    cost5 = UME8UU(IA[5], IB[5]);
    cost6 = UME8UU(IA[6], IB[6]);
    cost7 = UME8UU(IA[7], IB[7]);
    cost += cost0 + cost1 + cost2 +
            cost3 + cost4 + cost5 +
            cost6 + cost7;
}
Figure 4-16. Code from Figure 4-15 with simplified
array index calculations.
Cache Architecture
Chapter 5
by Eino Jacobs
5.1 MEMORY SYSTEM OVERVIEW
In this document, the generic PNX1300 name refers
to the PNX1300 Series, or the PNX1300/01/02/11
products.
The separate on-chip data and instruction caches serve
only the DSPCPU since the data access patterns of the
autonomous I/O and graphics units exhibit little or no locality of reference (they access each piece of the multimedia data stream only once in each operation).
The high-performance video and audio throughput of PNX1300 is
implemented by its DSPCPU and autonomous I/O and co-processing
units, but the foundation of this processing is the PNX1300 memory
hierarchy. To realize the full potential of the chip’s processing
units, the memory hierarchy must read and write data (and DSPCPU
instructions) fast enough to keep the units busy.
Without the caches, the CPU would not be able to
achieve its performance potential. SDRAM has enough
bandwidth to handle serial streams of multimedia data,
but its bandwidth and latency are insufficient to satisfy
the CPU’s high rate of random data accesses and repeated instruction accesses.
To meet the requirements of its target applications,
PNX1300’s memory hierarchy must satisfy the conflicting goals of low cost, simple system design (e.g., low
parts count), and high performance. Since multimedia
video streams can require relatively large temporary
storage, a significant amount of external DRAM is required. Minimizing the cost of bulk memory is important.
PNX1300’s memory system achieves a good compromise between cost and
performance by coupling substantial on-chip caches with a glueless
interface to synchronous DRAM (SDRAM). SDRAM provides higher
bandwidth than standard DRAM for only a small cost premium. A block
diagram of the memory system is shown in Figure 5-1. SDRAM permits
PNX1300 to use a narrower and simpler interface than would be
required to achieve similar performance with standard DRAM.

[Block diagram. Labeled components: VLIW CPU; three branch units
(three sets, each has address, opcode, condition, and guard); two
memory units (two sets, each has a guard, opcode, data, and two
address components); decompressor (224 bits of decompressed
instruction); 32KB, 8-way instruction cache; 16KB, 8-way data cache;
internal data highway (32-bit address, 32-bit data), to on-chip
peripherals; main memory interface; main-memory bus (glueless, SDRAM
control with 32-bit data); SDRAM main memory.]
Figure 5-1. The main components of the PNX1300 memory system.

PNX1300’s processing units access the external SDRAM through the
on-chip central “data highway” bus. The highway consists of separate
32-bit address and data buses, and use of the bus is mediated by the
main-memory interface unit. The main-memory interface contains the
SDRAM controller and a central arbiter that determines how much of
the available SDRAM memory bandwidth is allocated to each unit.
Unused bandwidth is always made available to the VLIW CPU for cache
refill and memory accesses that bypass the caches.

Table 5-1. 100-MHz PNX1300 memory bandwidth parameters

Use                                             Magnitude
Instruction bandwidth (224 bits/instruction)    2800 MB/s
Data bandwidth (two 32-bit memory ports)        800 MB/s
Main-memory bandwidth (one 32-bit port)         400 MB/s

Table 5-1 shows bandwidth parameters for the PNX1300 DSPCPU and the
main-memory interface. Although 400 MB/s is a lot of bandwidth, it
is clear that the SDRAM alone cannot keep up with the CPU’s maximum
requirements for instructions and data. Luckily, multimedia
algorithms resemble other computer programs in terms of locality of
reference, so the on-chip caches typically supply the majority of
instructions and data to the DSPCPU. The wide paths to the caches
are matched to the bandwidth requirements of the DSPCPU.
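The magnitudes in Table 5-1 follow from path width and clock rate. The helper below is a minimal sketch that assumes one transfer per cycle on each path and takes 1 MB as 10^6 bytes; the function name peak_mb_per_s is illustrative.

```c
/* Peak bandwidth in MB/s of a path `bits` wide that transfers once
 * per cycle at `mhz` MHz (assumption: 1 MB = 10^6 bytes). */
static unsigned peak_mb_per_s(unsigned bits, unsigned mhz)
{
    return bits / 8u * mhz;
}
```

At 100 MHz, a 224-bit instruction path gives 2800 MB/s, two 32-bit memory ports give 800 MB/s, and one 32-bit main-memory port gives 400 MB/s.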
Table 5-2 gives a summary description of each component of
PNX1300’s memory system.

Table 5-2. Summary of memory system characteristics

Unit                    Description
Branch units            Branch units execute branch operations. Up to
                        three branch operations can be executed in
                        parallel, but the program must guarantee that
                        only one branch is taken.
Decompression unit      Instructions are stored in memory and in the
                        instruction cache in a space-saving,
                        compressed format. The decompression unit
                        expands instructions to their full, 28-byte
                        size before they are issued to the CPU.
Instruction cache       The instruction cache holds 32 KB, is 8-way
                        set-associative, and has a 64-byte block size.
                        A miss in a block causes the entire block to
                        be read from SDRAM. The cache can sustain an
                        issue rate of one instruction per cycle on
                        cache hits.
Memory units            Memory units execute load and store
                        operations. The data cache is dual ported to
                        allow the memory units to operate
                        concurrently.
Data cache              The data cache holds 16 KB, is 8-way
                        set-associative, has a 64-byte block size, and
                        implements a copyback, allocate-on-write
                        policy. A miss in a block causes the entire
                        block to be read from SDRAM. The cache
                        supports memory-mapped I/O through
                        non-cacheable address regions.
Data highway            The on-chip data highway bus serves all
                        on-chip units. The highway has separate 32-bit
                        data and address buses. Bus bandwidth is
                        allocated by the highway arbiter according to
                        one of several modes.
Main-memory interface   The main-memory interface contains the
                        data-highway access arbiter, the SDRAM
                        controller, and MMIO logic.
SDRAM main memory       External SDRAM connects gluelessly to PNX1300
                        over the 32-bit main-memory bus.

To improve cache behavior and thus program performance, the caches
have a locking mechanism. In addition, the instruction cache is
coupled with an instruction decompression unit. The compressed
instruction format improves the cache hit rate and reduces the bus
bandwidth required between main memory and cache. Instructions in
main memory and cache use the compressed format.

5.2 DRAM APERTURE

PNX1300 implements a 32-bit linear address space of bytes. Within
that address space, PNX1300 supports several different apertures for
specific purposes. The DRAM aperture describes the part of the
address space into which the external SDRAM is mapped. SDRAM must
consist of a single, contiguous region of memory, which is the most
practical configuration for PNX1300 systems.

The location and size of the DRAM aperture are defined by two
registers, DRAM_BASE and DRAM_LIMIT. These registers are both
readable and writeable as MMIO registers and as PCI configuration
space registers. The view of the registers in MMIO space is shown in
Figure 5-2. The view of the registers in PCI configuration space is
described in Chapter 11, “PCI Interface.” In normal operation, the
base address registers are assigned once during boot and not changed
when the DSPCPU is running. Refer to Chapter 11, “PCI Interface,”
and Chapter 13, “System Boot,” for a description of this process.

[Register layout. MMIO_BASE offset 0x10 0000: DRAM_BASE (r/w),
holding DRAM_BASE_FIELD in the high-order bits; MMIO_BASE offset
0x10 0004: DRAM_LIMIT (r/w), holding DRAM_LIMIT_FIELD in the
high-order bits. The low-order bits of both registers are 0.]
Figure 5-2. Formats of the DRAM_BASE and DRAM_LIMIT registers.

DRAM_LIMIT must be set equal to DRAM_BASE plus the actual size of
SDRAM present. The amount of SDRAM is not required to be a power of
2, but it must be a multiple of 64 KB. Note that the size of the
aperture as set in the PCI configuration space can be larger,
because it must be a power of 2.

A memory operation will access SDRAM if its address satisfies:

    [DRAM_BASE] ≤ address < [DRAM_LIMIT]

Any address outside this range cannot access SDRAM.

When PNX1300 is reset, DRAM_BASE_FIELD is set to 0x0 and DRAM_LIMIT
is set to 0x0010 0000 (1-MB DRAM aperture starting at address 0x0).
The boot process described in Chapter 13, “System Boot,” overrides
these initial settings.
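The aperture rules of this section can be expressed as two small host-side helpers. This is a sketch only: the function names are hypothetical, and the rejection of a zero size and of a range that would wrap past the top of the 32-bit address space are assumptions added for safety rather than rules stated in this section.

```c
#include <stdint.h>

/* Returns 1 when `address` falls inside the DRAM aperture
 * [dram_base, dram_limit), and 0 otherwise. */
static int in_dram_aperture(uint32_t address,
                            uint32_t dram_base, uint32_t dram_limit)
{
    return address >= dram_base && address < dram_limit;
}

/* Computes the DRAM_LIMIT value for `sdram_bytes` of SDRAM mapped at
 * `dram_base`. Returns 0 on success, or -1 when the size is zero
 * (assumed invalid), is not a multiple of 64 KB, or would extend
 * past the end of the 32-bit address space (assumed invalid). */
static int dram_limit_for(uint32_t dram_base, uint32_t sdram_bytes,
                          uint32_t *limit_out)
{
    if (sdram_bytes == 0 || (sdram_bytes & 0xFFFFu) != 0)
        return -1;
    if (sdram_bytes > UINT32_MAX - dram_base)
        return -1;
    *limit_out = dram_base + sdram_bytes;
    return 0;
}
```

With the reset values (base 0x0, limit 0x0010 0000), address 0x000F FFFF is inside the aperture and address 0x0010 0000 is outside it.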
5.3 DATA CACHE

The data cache serves only the DSPCPU and is controlled by two
memory units that execute the load and store operations issued by
the DSPCPU. The following sections describe the data cache and its
operation; Table 5-3 summarizes the important characteristics for
easy reference.

Table 5-3. Summary of data cache characteristics

Characteristic            PNX1300 Implementation
Cache size                16 KB
Cache associativity       8-way set-associative
Block size                64 bytes
Valid bits                One valid bit per 64-byte block
Dirty bits                One dirty bit per 64-byte block
Miss transfer order       Miss transfers begin with the critical
                          word first
Replacement policies      Copyback, allocate on write, hierarchical
                          LRU
Endianness                Either little- or big-endian, determined by
                          PCSW bit
Ports                     The cache is quasi dual ported; two accesses
                          can proceed concurrently if they reference
                          different banks (determined by bits [4:2] of
                          the computed addresses)
Alignment                 Access must be naturally aligned (32-bit
                          words on 32-bit boundaries, 16-bit halfwords
                          on 16-bit boundaries); the appropriate
                          number of LSBs of un-naturally aligned
                          addresses are set to zero. For misaligned
                          stores, PCSW.MSE is asserted to generate an
                          exception
Partial word operations   The cache implements 8-bit and 16-bit
                          accesses with the same performance as 32-bit
                          accesses
Operation latency         Three cycles for both load and store
                          operations
Coherency enforcement     Software uses special operations to enforce
                          cache coherency
Cache locking             Up to 1/2 (four out of 8 blocks of each set)
                          of the cache contents can be locked;
                          granularity is 64 bytes
Non-cacheable region      One non-cacheable aperture in the DRAM
                          address space is supported.

5.3.1 General Cache Parameters

The PNX1300 data cache is 16 KB in size with a 64-byte block size.
Thus, it contains 256 blocks, each with its own address tag. The
cache is 8-way set-associative, so there are 32 sets, each
containing 8 tags. A single valid bit is associated with a block, so
each block and associated address tag is either entirely valid in
the cache or invalid. On a cache miss, 64 bytes are read from SDRAM
to make the entire block valid.

Each block also contains a dirty bit, which is set whenever a write
to the block occurs. Each set contains 10 bits to support the
hierarchical LRU replacement policy.

The geometry of the data cache is available to software by reading
the MMIO register DC_PARAMS. Figure 5-3 shows the format of the
DC_PARAMS register; Table 5-4 lists its field values. The product of
block size, associativity, and number of sets gives the total cache
size (16 KB in this case).

Table 5-4. DC_PARAMS field values

Field Name        Value
BLOCKSIZE         64
ASSOCIATIVITY     8
NUMBER_OF_SETS    32

[Register layout. MMIO_BASE offset 0x10 001C: DC_PARAMS (r/o), with
fields BLOCKSIZE, ASSOCIATIVITY, and NUMBER_OF_SETS.]
Figure 5-3. Format of the DC_PARAMS register.

5.3.2 Address Mapping

PNX1300 data addresses are mapped onto the data cache storage
structure as shown in Figure 5-4. A data address is partitioned into
four fields as described in Table 5-5.

Table 5-5. Data address field partitioning

Field   Address Bits   Purpose
Byte    1..0           Byte offset within a word for byte or
                       halfword accesses
Word    5..2           Selects one of the words in a set (one of 16
                       words in the case of PNX1300)
Set     10..6          Selects one of the sets in the cache (one of
                       32 in the case of PNX1300)
Tag     31..11         Compared against address tags of set members

Data Cache Address
 31                  11 10        6 5         2 1       0
+----------------------+-----------+-----------+---------+
|         Tag          |    Set    |   Word    |  Byte   |
+----------------------+-----------+-----------+---------+
Figure 5-4. Data cache address partitioning.
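The cache geometry and the address partitioning described above can be captured in a few lines of C. The structure and function names are illustrative and are not part of any PNX1300 header; the constants repeat the values given in the text rather than being read from DC_PARAMS.

```c
#include <stdint.h>

/* Data cache geometry as given in the text. */
enum {
    DC_BLOCK_SIZE    = 64,   /* bytes per block   */
    DC_ASSOCIATIVITY = 8,    /* blocks per set    */
    DC_NUM_SETS      = 32,   /* sets in the cache */
    DC_TOTAL_BYTES   = DC_BLOCK_SIZE * DC_ASSOCIATIVITY * DC_NUM_SETS
};

/* The four fields of a data address. */
struct dc_addr_fields {
    uint32_t tag;    /* bits 31..11 */
    uint32_t set;    /* bits 10..6  */
    uint32_t word;   /* bits 5..2   */
    uint32_t byte;   /* bits 1..0   */
};

/* Splits a 32-bit data address into its cache fields. */
static struct dc_addr_fields dc_split_address(uint32_t address)
{
    struct dc_addr_fields f;
    f.byte = address & 0x3u;
    f.word = (address >> 2) & 0xFu;
    f.set  = (address >> 6) & 0x1Fu;
    f.tag  = address >> 11;
    return f;
}
```

For example, address 0x12345678 splits into tag 0x2468A, set 25, word 14, and byte 0.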
5.3.3 Miss Processing Order
When a miss occurs, the data cache fills the block containing the
requested word, starting with the critical word. The CPU is stalled
until the first word is transferred. The rest of the block is then
filled while the CPU keeps running.
5.3.4 Replacement Policies, Coherency
The cache implements a copyback replacement policy
with one dirty bit per 64-byte block. Thus, when a miss
occurs and the block selected for replacement has its
dirty bit set, the dirty block must be written to main memory to preserve its modified contents. On PNX1300, the
dirty block is written to memory before the needed block
is fetched.
Coherency is not maintained in any way by hardware between the data cache, the instruction cache, and main
memory. Special operations are available to implement
cache coherency in software. See Section 5.6, “Cache
Coherency,” for a discussion of coherency issues.
Write misses are handled with an allocate-on-write policy—the write that caused the miss stores its data in the
cache after the missing block is fetched into the cache.
The cache implements a hierarchical LRU replacement
algorithm to determine which of the eight elements
(blocks) in a set is replaced. The algorithm partitions the
eight set elements into four groups, each group with two
elements. The hierarchical LRU replacement victim is
determined by selecting the least-recently used group of
two elements and then selecting the least-recently used
element in that group. This hierarchical algorithm yields
performance close to full LRU but is simpler to implement.
See Section 5.5, “LRU Algorithm,” for a full discussion of
the LRU algorithm.
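The group-then-element selection can be modeled in 10 bits of state per set, matching the bit count given for PNX1300. The encoding below is an assumption made for illustration (four bits recording the most recently used element of each group, six bits recording the pairwise order of the four groups); the encoding actually used by the hardware is the subject of Section 5.5 and may differ. Locked blocks are ignored in this sketch.

```c
#include <stdint.h>

/* Assumed encoding of the 10 LRU bits of one set:
 *   bits 0..3  most recently used element (0 or 1) of groups 0..3
 *   bits 4..9  pairwise group order; for a < b the bit is 1 when
 *              group a was used more recently than group b       */
typedef uint16_t hlru_state;

/* Bit position holding the order of each pair of groups. */
static const int hlru_pair_bit[4][4] = {
    { -1,  4,  5,  6 },
    {  4, -1,  7,  8 },
    {  5,  7, -1,  9 },
    {  6,  8,  9, -1 }
};

/* Returns 1 if group g was used more recently than group h. */
static int hlru_newer(hlru_state s, int g, int h)
{
    int bit = (s >> hlru_pair_bit[g][h]) & 1;
    return g < h ? bit : !bit;
}

/* Records an access to `way` (0..7) and returns the new state. */
static hlru_state hlru_touch(hlru_state s, int way)
{
    int g = way >> 1;
    int h;
    s = (hlru_state)((s & ~(1u << g)) | ((unsigned)(way & 1) << g));
    for (h = 0; h < 4; h += 1) {
        if (h == g)
            continue;
        if (g < h)
            s = (hlru_state)(s | (1u << hlru_pair_bit[g][h]));
        else
            s = (hlru_state)(s & ~(1u << hlru_pair_bit[g][h]));
    }
    return s;
}

/* Returns the way (0..7) to replace: the least recently used
 * element of the least recently used group. */
static int hlru_victim(hlru_state s)
{
    int g, h;
    for (g = 0; g < 4; g += 1) {
        int oldest = 1;
        for (h = 0; h < 4; h += 1)
            if (h != g && hlru_newer(s, g, h))
                oldest = 0;
        if (oldest)
            return 2 * g + (1 - ((s >> g) & 1));
    }
    return 0;   /* not reached for states produced by hlru_touch */
}
```

After ways 0 through 7 are accessed in order, the victim is way 0. If way 0 is then accessed again, the victim becomes way 2, whereas a full LRU policy would choose way 1; this is the kind of small deviation from full LRU that the hierarchical scheme accepts in exchange for simpler hardware.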
5.3.5 Alignment, Partial-Word Transfers, Endian-ness
The cache implements 32-bit word, 16-bit half-word, and
8-bit byte transfers. All transfers, however, must be to
addresses that are naturally aligned; that is, 32-bit words
must be aligned on 32-bit boundaries, and 16-bit halfwords must be aligned on 16-bit boundaries.
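Table 5-3 states that the appropriate number of low-order bits of a misaligned address are set to zero. A minimal sketch of that rule follows; the function names are hypothetical, and `size` is assumed to be 1, 2, or 4 bytes.

```c
#include <stdint.h>

/* Address actually used by an access of `size` bytes (1, 2, or 4):
 * the low-order bits that would make it misaligned are cleared. */
static uint32_t dc_effective_address(uint32_t address, uint32_t size)
{
    return address & ~(size - 1u);
}

/* Returns 1 when `address` is naturally aligned for `size` bytes. */
static int dc_is_aligned(uint32_t address, uint32_t size)
{
    return (address & (size - 1u)) == 0;
}
```

A 32-bit access to 0x1003 therefore uses address 0x1000, and a 16-bit access to 0x1003 uses address 0x1002.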
Like other PNX1300 processing units, the CPU can use either big- or
little-endian byte order. It is recommended that all units and the
CPU run with the same endian-ness. A detailed description of
endian-ness can be found in Appendix C, “Endian-ness.”
5.3.6 Dual Ports
To allow two accesses to proceed in parallel, the data cache is
quasi-dual ported. The cache is implemented as eight banks of
single-ported memory, but the hardware allows each bank to operate
independently. Thus, when the addresses of two simultaneous accesses
select two different banks, both accesses can complete
simultaneously. Bank selection is determined by the three low-order
address bits [4..2] of each address. Thus, the words in a 64-byte
cache block are distributed among the eight banks, which prevents
conflicts between two simultaneously issued accesses to adjacent
words in a cache block. The PNX1300 compiling system attempts to
avoid bank conflicts as much as possible.
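The bank-selection rule can be sketched as follows. The function names are illustrative; the conflict test simply applies the rule that two simultaneous accesses proceed together only if they select different banks.

```c
#include <stdint.h>

/* Bank (0..7) selected by address bits [4..2]. */
static unsigned dc_bank(uint32_t address)
{
    return (address >> 2) & 0x7u;
}

/* Returns 1 when two simultaneously issued accesses select the same
 * bank and therefore cannot both complete in the same cycle. */
static int dc_bank_conflict(uint32_t a, uint32_t b)
{
    return dc_bank(a) == dc_bank(b);
}
```

Adjacent words such as 0x1000 and 0x1004 fall in different banks, while words 32 bytes apart, such as 0x1000 and 0x1020, select the same bank.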
The dual-ported cache can execute the load and store
opcodes (ild8d, uld8d, ild16d, uld16d, ld32d, h_st8d,
h_st16d, h_st32d, ild8r, uld8r, ild16r, uld16r, ld32r,
ild16x, uld16x, ld32x) in either or both of the two ports.
The special opcodes alloc, dcb, dinvalid, pref, rdtag and
rdstatus can only be executed in the second port, not in
the first port. Whenever any of these special opcodes is
issued in the second port, there should not be a concurrent load or store operation in the first. This is a special
scheduling constraint.
5.3.7 Cache Locking
The data cache allows the contents of up to one-half of
its blocks to be locked. Thus, on PNX1300, up to 8 KB of
the cache can be used as a high-speed local data memory. Only four out of eight blocks in any set can be
locked.
A locked block is never chosen as a victim by the replacement algorithm; its contents remain undisturbed until either (1) the block’s locked status is changed explicitly
by software, or (2) a dinvalid operation is executed that
targets the locked block.
Cache locking occurs only for the data in the address range
described by the MMIO registers DC_LOCK_ADDR and DC_LOCK_SIZE. The
granularity of the address range is one 64-byte cache block. The
MMIO register DC_LOCK_CTL contains the cache-locking enable bit
DC_LOCK_ENABLE. Figure 5-5 shows the layout of the data-cache lock
registers. Locking will occur for an address if locking is enabled
and both of the following are true:
1. The address is greater than or equal to the value in
DC_LOCK_ADDR.
2. The address is less than the sum of the values in
DC_LOCK_ADDR and DC_LOCK_SIZE.
Programmers (or compilers) must combine all data that
needs to be locked into this single linear address range.
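The locking condition and the size constraints given in this section can be sketched as two predicates. The function names are hypothetical. The 8-KB bound restates the limit given earlier in this section; a contiguous, block-aligned range of that size places at most four blocks in each of the 32 sets.

```c
#include <stdint.h>

enum {
    DC_BLOCK_BYTES    = 64,         /* lock granularity              */
    DC_MAX_LOCK_BYTES = 8 * 1024    /* one-half of the 16-KB cache   */
};

/* Returns 1 when locking is enabled and `address` lies in
 * [lock_addr, lock_addr + lock_size). The subtraction form avoids
 * overflow when the range ends at the top of the address space. */
static int dc_is_locked(uint32_t address, uint32_t lock_addr,
                        uint32_t lock_size, int lock_enable)
{
    return lock_enable &&
           address >= lock_addr &&
           address - lock_addr < lock_size;
}

/* Returns 1 when a proposed lock range respects the 64-byte
 * granularity and the 8-KB limit. */
static int dc_lock_range_ok(uint32_t lock_addr, uint32_t lock_size)
{
    return lock_addr % DC_BLOCK_BYTES == 0 &&
           lock_size % DC_BLOCK_BYTES == 0 &&
           lock_size <= DC_MAX_LOCK_BYTES;
}
```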
Setting DC_LOCK_ENABLE to ‘1’ causes the following
sequence of events:
1. All blocks that are in cache locations that will be used
for locking are copied back to main memory (if they
are dirty) and removed from the cache.
2. All blocks in the lock range are fetched from main
memory into the cache. If any block in the lock range
was already in the cache, it’s first copied back into
main memory (if it’s dirty) and invalidated.
3. The LRU status of any set that contains locked blocks
is set to the initialization value.
4. Cache locking is activated so that the locked blocks
cannot be victims of the replacement algorithm.
This sequence of events is triggered by writing ‘1’ to
DC_LOCK_ENABLE even if the enable is already set to ‘1’. Setting
DC_LOCK_ENABLE to ‘0’ causes no action except to allow the
previously locked blocks to be replacement victims.

To program a new lock range, the following sequence of operations is
used:
1. Disable cache locking by writing ‘0’ to DC_LOCK_ENABLE.
2. Define a new lock range by writing to DC_LOCK_ADDR and
DC_LOCK_SIZE.
3. Enable cache locking by writing ‘1’ to DC_LOCK_ENABLE.

Dirty locked blocks can be written back to main memory while locking
is enabled by executing copyback operations in software.

Programmer’s note: Software should not execute dinvalid operations
on a locked block. If it does, the block will be removed from the
cache, creating a ‘hole’ in the lock range (and the data cache) that
cannot be reused until locking is deactivated.

Cache locking is disabled by default when PNX1300 is reset.

The RESERVED field in DC_LOCK_CTL should be ignored on reads and
written as all zeroes.

Locking should not be enabled by PCI accesses to the MMIO registers.

[Register layout. MMIO_BASE offset 0x10 0010: DC_LOCK_CTL (r/w),
with fields reserved, APERTURE_CONTROL, and DC_LOCK_ENABLE;
MMIO_BASE offset 0x10 0014: DC_LOCK_ADDR (r/w), with field
DC_LOCK_ADDRESS; MMIO_BASE offset 0x10 0018: DC_LOCK_SIZE (r/w),
with field DC_LOCK_SIZE. Unused bits are 0.]
Figure 5-5. Formats of the registers in charge of data-cache locking.

Table 5-6. Aperture control field

Value         Memory map properties
00 (RESET)    Normal operation memory map (Section 3.4.1):
              • loads to 0..0xff always return 0 and cause no PCI
                read (memory hole is enabled)
              • PCI aperture(s) are enabled
01            • loads to address 0..0xff cause a PCI read, i.e. the
                memory hole is disabled
              • PCI aperture(s) are enabled
10            PCI apertures are disabled for loads
              • loads return a 0 and cause no PCI read
11            RESERVED for future extensions

5.3.9 Non-cacheable Region

The data cache supports one non-cacheable address region within the
DRAM address space aperture. The base address of this region is
determined by the value in the DRAM_CACHEABLE_LIMIT MMIO register,
which is shown in Figure 5-6. Since uncached memory operations
always incur many stall cycles, the non-cacheable region should be
used sparingly.

A memory operation is non-cacheable if its target address satisfies:
[dram_cacheable_limit]