PNX1301EH,557

厂商：
NXP(恩智浦)
封装：
HBGA292
描述：
IC MEDIA PROC 180MHZ 292-HBGA

数据手册：

下载PNX1301EH,557.pdf

立即购买

数据手册
价格&库存

PNX1301EH,557 数据手册

INTEGRATED CIRCUITS PNX1300 Series Media Processors Preliminary Specification Supersedes PNX1300 data of 2001 Oct 12 File under INTEGRATED CIRCUITS, TR1 2002 Feb 15 Philips Semiconductors Media Processors 2002 Feb 15 Preliminary Specification PNX1300 Series PNX1300 Series Data Book Foreword 13 System Boot Table of Contents 14 Image Coprocessor 1 Pin List 15 Variable Length Decoder 2 Overview 16 I2C Interface 3 DSPCPU Architecture 17 Synchronous Serial Interface 4 Custom Operations for Multimedia 18 JTAG Functional Specification 5 Cache Architecture 19 On-Chip Semaphore Assist Device 6 Video In 20 Arbiter 7 Enhanced Video Out 21 Power Management 8 Audio In 22 PCI-XIO Bus Functional Specification 9 Audio Out A DSPCPU Operations 10 SPDIF Out B MMIO Register Summary 11 PCI Interface C Endian-ness 12 SDRAM Memory System Index  2001 Philips Electronics North America Corporation All rights reserved. See Terms and Conditions on the next page. 2002 Feb 15 Preliminary Specification Terms and Conditions TERMS AND CONDITIONS Philips Semiconductors and Philips Electronics North America Corporation reserve the right to make changes, without notice, in the products, including circuits, standard cells, and/or software, described or contained herein in order to improve design and/or performance. Philips Semiconductors assumes no responsibility or liability for the use of any of these products, conveys no license or title under any patent, copyright, or most work right to these products, and makes no representations or warranties that these products are free from patent, copyright, or most work right infringement, unless otherwise specified. Applications that are described herein for any of these products are for illustrative purposes only. Philips Semiconductors makes no representation or warranty that such applications will be suitable for the specified use without further testing or modification. LIFE SUPPORT APPLICATIONS Philips Semiconductors and Philips Electronics North America Corporation products are not designed for use in life support appliances, devices, or systems where malfunction of a Philips Semiconductors and Philips Electronics North America Corporation product can reasonably be expected to result in a personal injury. Philips Semiconductors and Philips Electronics North America Corporation customers using or selling Philips Semiconductors and Philips Electronics North America Corporation products for use in such applications do so at their own risk and agree to fully indemnify Philips Semiconductors and Philips Electronics North America Corporation for any damages resulting from improper use or sale. Philips Semiconductors and Philips Electronics North America Corporation register eligible circuits under the Semiconductor Chips Protection Act. DEFINITIONS Data Sheet Identification Product Status Definition Objective Specification Formative or in Design This data sheet contains the design target or goal specifications for product development. Specifications may change in any manner without notice. Preliminary Specification Preproduction Product This data sheet contains preliminary data, and supplementary data will be published at a later date. Philips Semiconductors reserves the right to make changes at any time without notice in order to improve design and supply the best possible product. Product Specification Full Production This data sheet contains Final Specifications. Philips Semiconductors reserves the right to make changes at any time without notice, in order to improve the design and supply the best possible product.  2001, 2002 Philips Electronics North America Corporation All rights reserved. Printed in U.S.A. Business Line Media Processing, 811 E. Arques Avenue, Sunnyvale, CA 94088 Foreword The TriMedia PNX1300 Series is an enhanced version of the TM-1300 family of media processor. The PNX1300 Series contains an ultra-high performance Very Long Instruction Word processor, as well as a complete intelligent video and audio input/output subsystem. The processor has an instruction set that is optimized for processing audio, video and graphics. It includes powerful SIMD multimedia operators for eight- and 16-bit signal datatypes as well as a full complement of 32-bit IEEE compatible floating point operations. The PNX1300 Series is intended as a multi-standard programmable video, audio and graphics processor. It can either be used standalone, or as an accelerator to a general purpose processor. The architecture of the TriMedia family came about as the result of many years of effort of many dedicated individuals. Going back in history, the origin of TriMedia was laid by the LIFE-1 VLIW processor, designed by Junien Labrousse and myself in 1987. Work continued afterwards in Philips Research Labs, Palo Alto. My special thanks go to the entire Palo Alto research team: Mike Ang, Uzi Bar-Gadda, Peter Donovan, Martin Freeman, Eino Jacobs, Beomsup Kim, Bob Law, Yen Lee, Vijay Mehra, Pieter van der Meulen, Ross Morley, Mariette Parekh, Bill Sommer, Artur Sorkin and Pierre Uszynski. The Palo Alto period matured the architecture—we ported all video and audio algorithms that we could find to the compiler/simulator and refined the operation set. In addition, we learned how to give the architecture a market direction. In May 1994, Philips management—in particular Cees-Jan Koomen, Eddy Odijk, Theo Claasen and Doug Dunn—decided to develop TriMedia into a major Philips Semiconductors product line. Under the guidance of Keith Flagler, the TriMedia team was built. All of them contributed to take this from a set of interesting ideas to a reliable and competitive product in a short period of time. The initial TriMedia team included Fuad Abu Nofal, Karel Allen, Mike Ang, Robert Aquino, Manju Asthana, Patrick de Bakker, Shiv Balakrishnan, Jai Bannur, Marc Berger, Sunil Bhandari, Rusty Biesele, Ahmet Bindal, David Blakely, Hans Bouwmeester, Steve Bowden, Robert Bradfield, Nancy Breede, Shawn Brown, Sujay Chari, Catherine Chen, Howen Chen, Yan-ming Chen, Yong Cho, Scott Clapper, Matthew Clayson, Paul Coelho, Richard Dodds, Marc Duranton, Darcia Eding, Aaron Emigh, Li Chi Feng, Keith Flagler, Jean Gobert, Sergio Golombek, Mike Grimwood, Yudi Halim, Hari Hampapuram, Carl Hartshorn, Judy Heider, Laura Hrenko, Jim Hsu, Eino Jacobs, Marcel Janssens, Patricia Jones, Hann-Hwan Ju, Jayne Keith, Bhushan Kerur, Ayub Khan, Keith Knowles, Mike Kong, Ashok Krishnamurti, Yen Lee, Patrick Leong, Bill Lin, Laura Ling, Chialun Lu, Naeem Maan, Nahid Mansipur, Mike Maynard, Vijay Mehra, Jun Mejia, Derek Meyer, Prabir Mohanty, Saed Muhssin, Chris Nelson, Stephen Ness, Keith Ngo, Francis Nguyen, Kathleen Nguyen, Derek Noonburg, Ciaran O’Donnel, Sang-Ju Park, Charles Peplinski, Gene Pinkston, Maryam Pirayou, Pardha Potana, Bill Price, Victor Ramamoorthy, Babu Rao Kandamilla, Ehsan Rashid, Selliah Rathnam, Margaret Redmond, Donna Richardson, Alan Rodgers, Tilakray Roychoudhury, Hani Salloum, Chris Salzmann, Bob Seltzer, Ravi Selvaraj, Jim Shimandle, Deepak Singh, Bill Sommer, Juul van der Spek, Manoj Srivastava, Renga Sundararajan, Ken-Sue Tan, Ray Ton, Steve Tran, Cynthia Tripp, Ching-Yih Tseng, Allan Tzeng, Barbara Vendelin, John Vivit, Rudy Wang, Rogier Wester, Wayne Wonchoba, Anthony Wong, Sara Wu, David Wyland, Ken Xie, Vincent Xie, Bettina Yeung, Robert Yin, Charles Young, Grace Yun, Elena Zelayeta and Vivian Zhu. Expert help and feedback was received from many. In particular, I’d like to mention Kees van Zon of Philips Eindhoven for the help with filtering-related issues, and Craig Clapp of PictureTel for excellent feedback on all aspects of the architecture. My special thanks go to Joe Kostelec. He made me understand that my ambitions could better be realized in California than in Europe. Furthermore, his vision and his wisdom are credited with keeping this project alive and growing until the ‘investment decision.’ The vision of a universal media accelerator is credited to Jaap de Hoog. Jaap, I wish you were here to see it come to fruition. –Gerrit Slavenburg After the initial TM-1000 product, the TM-1100, TM-1300 and now PNX1300 Series chips have been successfully integrated in many video and audio products. It has been my pleasure to have been involved in these designs and would like to thank the people involved in TM-1300 and PNX1300 Series projects under the guidande of Cees Hartgring and Simon Wegerif. The team included Karel Allen, Tien-Cheng Bau, Jim Campbell, Anitamk Chan, John Chang, Roel Coppoolse, Taufik Dakhil, Mitch Daniil, Nam Dao, Patrick Debaumarche, Thuy Duong, Torsten Fink, Jan Grotenbreg, Mohammad Hafeez, Feng Hao, Farah Jubran, Babu Rao Kandamalla, Aki Kaniel, Yan-Ling Li, Ying-Chao Liu, Naeem Maan, Don Marshal, Thomas Meyer, Javed Mukarram, Long Nguyen, Tu Nghiem, Elaine Outler, Charles Peplinski, Duc T. Pham, Thorwald Rabeler, Raquel Ruiz, Ensieh Saffari, Hani Salloum, Wenyi Song, Stephen Tomasello, Tran Tung, Maria F. Wangsahamidjaja, Chang-Ming Yang, Mohammed I. Yousuf, Hui Zhang and Gerrit Slavenburg. - Luis Lucas PRELIMINARY INFORMATION 1 PNX1300/01/02/11 Data Book 2 PRELIMINARY INFORMATION Philips Semiconductors Table of Contents Foreword 1 Pin List 1.1 PNX1300 Series versus TM-1300 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1 1.2 Boundary Scan Notice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1 1.3 I/O Circuit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1 1.4 Signal Pin List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2 1.5 Power Pin List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8 1.6 Pin Reference Voltage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9 1.7 Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10 1.8 Ordering Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10 1.9 Parametric Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11 1.9.1 PNX1300/01/02/11 Absolute Maximum Ratings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11 1.9.2 PNX1300/01/02 Operating Range and Thermal Characteristics . . . . . . . . . . . . . . . . . . . . . . . 1-11 1.9.3 PNX1311 Operating Range and Thermal Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11 1.9.4 PNX1300/01/02/11 Power Supply Sequencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11 1.9.5 PNX1300/01/02 DC/AC Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12 1.9.6 PNX1311 DC/AC Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12 1.9.7 PNX1300 Series Power Consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13 1.9.7.1 Power Consumption for Applications on PNX1300 Series . . . . . . . . . . . . . . . . . . . . . . 1-13 1.9.7.2 PNX1300/01/02 DSPCPU Core Current and Power Consumption . . . . . . . . . . . . . . . . 1-14 1.9.7.3 PNX1311 DSPCPU Core Current and Power Consumption Details . . . . . . . . . . . . . . . 1-14 1.9.7.4 PNX1300/01/02 Current Consumption For On-Chip Peripherals . . . . . . . . . . . . . . . . . 1-15 1.9.7.5 PNX1311 Current Consumption For On-Chip Peripherals . . . . . . . . . . . . . . . . . . . . . . 1-16 1.9.7.6 STRG3, STRG5 type I/O circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17 1.9.7.7 NORM3 type I/O circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17 1.9.7.8 WEAK5 type I/O circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17 1.9.7.9 IICOD (I2c) type I/O circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17 1.9.7.10 SDRAM interface timing for PNX1300/01/02/11 speed grades. . . . . . . . . . . . . . . . . . 1-18 1.9.7.11 PCI Bus timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-18 1.9.7.12 JTAG I/O timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19 1.9.7.13 I2C I/O timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19 1.9.7.14 Video In I/O Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19 1.9.7.15 Video Out I/O Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19 1.9.7.16 AudioIn I/O timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20 1.9.7.17 Audio Out I/O timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20 1.9.7.18 SSI I/O timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20 PRELIMINARY SPECIFICATION 3 PNX1300/01/02/11 Data Book Philips Semiconductors 2 Overview 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1 2.2 PNX1300 Fundamentals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1 2.3 PNX1300 Chip Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1 2.4 Brief Examples of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3 2.4.1 Video Decompression in a PC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3 2.4.2 Video Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3 2.5 Introduction to PNX1300 Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3 2.5.1 Internal ‘Data Highway’ Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3 2.5.2 VLIW Processor Core . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4 2.5.3 Video In Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4 2.5.4 Enhanced Video Out Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4 2.5.5 Image Coprocessor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4 2.5.6 Variable-Length Decoder (VLD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5 2.5.7 Audio In and Audio Out Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.5.8 S/PDIF Out Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.5.9 Synchronous Serial Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.5.10 I2C Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.6 New In PNX1300 (Versus TM-1300) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.7 New In PNX1300 (Versus TM-1100) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 2.8 New In PNX1300 (Versus TM-1000) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 3 DSPCPU Architecture 3.1 Basic Architecture Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1 3.1.1 Register Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1 3.1.2 Basic DSPCPU Execution Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2 3.1.3 PCSW Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2 3.1.4 SPC and DPC—Source and Destination Program Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3 3.1.5 CCCOUNT—Clock Cycle Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3 3.1.6 Boolean Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3 3.1.7 Integer Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4 3.1.8 Floating Point Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4 3.1.9 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4 3.1.10 Software Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4 3.2 Instruction Set Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5 3.2.1 Guarding (Conditional Execution) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5 3.2.2 Load and Store Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5 3.2.3 Compute Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6 3.2.4 Special-Register Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6 3.2.5 Control-Flow Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6 4 PRELIMINARY SPECIFICATION Philips Semiconductors 3.3 PNX1300 Instruction Issue Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6 3.4 Memory and MMIO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7 3.4.1 Memory Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7 3.4.2 The Memory Hole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7 3.4.3 MMIO Memory Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7 3.5 Special Event Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8 3.5.1 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9 3.5.2 EXC (Exceptions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9 3.5.3 INT and NMI (Maskable and Non-Maskable Interrupts) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9 3.5.3.1 Interrupt vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9 3.5.3.2 Interrupt modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10 3.5.3.3 Device interrupt acknowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10 3.5.3.4 Interrupt priorities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10 3.5.3.5 Interrupt masking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10 3.5.3.6 Software interrupts and acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11 3.5.3.7 NMI sequentialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11 3.5.3.8 Interrupt source assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11 3.6 PNX1300 to Host Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11 3.7 Host to PNX1300 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12 3.8 Timers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12 3.9 Debug Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13 3.9.1 Instruction Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13 3.9.2 Data Breakpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14 4 Custom Operations for Multimedia 4.1 Custom OperationS Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1 4.1.1 Custom Operation Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1 4.1.2 Introduction to Custom Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1 4.1.3 Example Uses of Custom Ops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3 4.2 Example 1: Byte-Matrix Transposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3 4.3 Example 2: MPEG Image Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4 4.4 Example 3: Motion-Estimation Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7 4.4.1 A Simple Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8 4.4.2 More Unrolling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10 5 Cache Architecture 5.1 Memory System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1 5.2 DRAM Aperture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2 5.3 Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3 5.3.1 General Cache Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3 5.3.2 Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3 PRELIMINARY SPECIFICATION 5 PNX1300/01/02/11 Data Book Philips Semiconductors 5.3.3 Miss Processing Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4 5.3.4 Replacement Policies, Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4 5.3.5 Alignment, Partial-Word Transfers, Endian-ness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4 5.3.6 Dual Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 5-4 5.3.7 Cache Locking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4 5.3.8 Memory Hole and PCI Aperture Disable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5 5.3.9 Non-cacheable Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5 5.3.10 Special Data Cache Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6 5.3.10.1 Copyback and invalidate operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6 5.3.10.2 Data cache tag and status operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6 5.3.10.3 Data cache allocation operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7 5.3.10.4 Data cache prefetch operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7 5.3.11 Memory Operation Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7 5.3.12 Operation Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.3.13 MMIO Register References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.3.14 PCI Bus References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.3.15 CPU Stall Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.3.16 Data Cache Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.4 Instruction Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 5-8 5.4.1 General Cache Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.4.2 Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 5.4.3 Miss Processing Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.4 Replacement Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.5 Location of Program Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.6 Branch Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.7 Coherency: Special iclr Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.8 Reading Tags and Cache Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 5.4.9 Cache Locking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10 5.4.10 Instruction Cache Initialization and Boot Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10 5.5 LRU Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.5.1 Two-Way Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6 Cache Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.1 Example 1: Data-Cache/Input-Unit Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.2 Example 2: Data-Cache/Output-Unit Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.3 Example 3: Instruction-Cache/Data-Cache Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.4 Example 4: Instruction-Cache/Input-Unit Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.5 Four-Way Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 5.6.6 LRU Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12 5.6.7 LRU Bit Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12 5.6.8 LRU for the Dual-Ported Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12 6 PRELIMINARY SPECIFICATION Philips Semiconductors 5.7 Performance Evaluation Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12 5.8 MMIO Register Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13 6 Video In 6.1 video in overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1 6.1.1 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1 6.1.2 Diagnostic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2 6.1.3 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2 6.1.4 Hardware and Software Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2 6.2 Clock Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4 6.3 Fullres Capture Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4 6.4 Halfres Capture Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9 6.5 Raw Capture Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10 6.6 Message-Passing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11 6.6.1 VI_DVALID in Message Passing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12 6.7 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13 7 Enhanced Video Out 7.1 Enhanced Video Out Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1 7.2 About This Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1 7.3 Backward Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1 7.4 Function summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1 7.4.1 Detailed Feature Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2 7.4.2 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2 7.5 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2 7.6 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3 7.7 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3 7.8 Image Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4 7.8.1 CCIR 656 Pixel Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4 7.8.2 CCIR 656 Line Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4 7.8.3 SAV and EAV Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5 7.8.4 Video Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6 7.8.5 CCIR 656 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6 7.9 Enhanced Video Out Timing Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6 7.9.1 Active Video Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6 7.9.2 SAV and EAV Overlap Period . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7 7.9.3 Control of Frame and Image Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7 7.9.4 Horizontal and Frame Timing Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7 7.10 Genlock Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8 7.11 Data Transfer Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9 7.12 Image Data Memory Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9 PRELIMINARY SPECIFICATION 7 PNX1300/01/02/11 Data Book Philips Semiconductors 7.12.1 Video Image Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9 7.12.2 Planar Storage of Video Image Data in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10 7.12.3 Graphics Overlay Image Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10 7.13 Video Image Conversion Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10 7.13.1 YUV 4:2:2 Interspersed to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11 7.13.2 YUV 4:2:0 to YUV 4:2:2 Co-sited Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11 7.13.3 YUV-2x Upscaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11 7.13.4 Pixel Mirroring for Four-tap Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11 7.14 EVO Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13 7.15 Video Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13 7.15.1 Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13 7.15.2 Chroma Keying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14 7.15.3 Programmable Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14 7.16 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 7-14 7.16.1 VO Status Register (VO_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16 7.16.2 VO Control Register (VO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17 7.16.3 VO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18 7.16.4 EVO Control Register (EVO_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20 7.16.5 EVO-Related Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21 7.17 Enhanced Video Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21 7.17.1 Video Refresh Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21 7.18 Frame and field timing control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23 7.18.1 Recommended values for timing registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23 7.18.2 Data-transfer Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23 7.18.3 Interrupts and Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23 7.18.4 Latency and Bandwidth Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24 7.18.5 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-24 7.19 DDS and PLL Filter Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25 8 Audio In 8.1 Audio In Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1 8.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 8-1 8.3 Clock System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2 8.3.1 PNX1300 Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2 8.3.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2 8.4 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2 8.5 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3 8.6 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4 8.7 Audio In Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6 8.8 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7 8.9 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7 8 PRELIMINARY SPECIFICATION Philips Semiconductors 8.10 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7 8.11 Diagnostic Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7 9 Audio Out 9.1 Audio Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1 9.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1 9.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2 9.4 Internal Clock Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2 9.4.1 PNX1300 Standard Improved Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3 9.4.2 TM-1000 Compatibility Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4 9.5 Clock System Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4 9.6 Serial Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4 9.6.1 Serial Frame Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5 9.6.2 I2S Serial Framing Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6 9.7 Codec Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6 9.8 Memory Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-7 9.9 Audio Out Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8 9.10 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 9-9 9.11 Timestamp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . 9-10 9.12 powerdown and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10 9.13 Highway Latency and HBE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-10 9.14 Error Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11 10 SPDIF Out 10.1 SPDIF Out Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1 10.2 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1 10.3 Summary of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1 10.3.1 SPDIF Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1 10.3.2 Transparent DMA Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1 10.4 IEC-958 Serial Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2 10.5 IEC-958 Bit Cell and Pre-amble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2 10.6 IEC-958 Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3 10.7 IEC-958 Memory Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3 10.8 Sample Rate Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3 10.9 Transparent Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4 10.10 DMA Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4 10.11 DMA Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4 10.12 Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4 10.13 Timestamps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4 10.14 MMIO Register Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5 10.15 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 10-6 PRELIMINARY SPECIFICATION 9 PNX1300/01/02/11 Data Book Philips Semiconductors 10.16 Power Down and Sleepless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6 10.17 HBE and Highway Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6 10.18 Literature References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7 11 PCI Interface 11.1 PCI Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 11-1 11.2 PCI Interface as an Initiator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2 11.2.1 DSPCPU Single-Word Loads/Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2 11.2.2 I/O Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2 11.2.3 Configuration Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2 11.2.4 DMA Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2 11.3 PCI Interface as a Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.4 Transaction Concurrency, Priorities, and Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.5 Registers Addressed in PCI Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.5.1 Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.5.2 Device ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.5.3 Command Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3 11.5.4 Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-5 11.5.5 Revision ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6 11.5.6 Class Code Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6 11.5.7 Cache Line Size Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7 11.5.8 Latency Timer Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7 11.5.9 Header Type Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7 11.5.10 Built-In Self Test Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7 11.5.11 Base Address Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7 11.5.12 Subsystem ID, Subsystem Vendor ID Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.5.13 Expansion ROM Base Address Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.5.14 Interrupt Line Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.5.15 Interrupt Pin Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.5.16 Max_Lat, Min_Gnt Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.6 Registers in MMIO Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.6.1 DRAM_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.6.2 MMIO_BASE Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9 11.6.3 MMIO/DRAM_BASE updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-10 11.6.4 BIU_STATUS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11 11.6.5 BIU_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-11 11.6.6 PCI_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12 11.6.7 PCI_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12 11.6.8 CONFIG_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-12 11.6.9 CONFIG_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13 11.6.10 CONFIG_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13 10 PRELIMINARY SPECIFICATION Philips Semiconductors 11.6.11 IO_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13 11.6.12 IO_DATA Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13 11.6.13 IO_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-13 11.6.14 SRC_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14 11.6.15 DEST_ADR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14 11.6.16 DMA_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-14 11.6.17 INT_CTL Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15 11.7 PCI Bus Protocol Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-15 11.7.1 Single-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16 11.7.2 Multi-Data-Phase Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-16 11.8 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 11.8.1 Bus Locking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 11.8.2 No Expansion ROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 11.8.3 No Cacheline Wrap Address Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 11.8.4 No Burst for I/O or Configuration Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 11.8.5 Word-Only MMIO Register Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-17 12 SDRAM Memory System 12.1 New in PNX1300/01/02/11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1 12.2 PNX1300 Main Memory Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1 12.3 Main-Memory Address Aperture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1 12.4 Memory Devices Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2 12.4.1 SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2 12.4.2 SGRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2 12.5 Memory Granularity and Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2 12.6 Memory System Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3 12.6.1 MM_CONFIG Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3 12.6.2 PLL_RATIOS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-4 12.7 Memory Interface Pin List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5 12.8 Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5 12.8.1 Address Mapping in 32-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5 12.8.2 Address Mapping in 16-bit mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6 12.9 Memory Interface and SDRAM Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6 12.10 On-Chip SDRAM Interleaving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6 12.11 Refresh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 12-6 12.12 Power-Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7 12.13 Output Driver Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7 12.14 Signal Propagation Delay Compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7 12.15 Circuit Board Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7 12.15.1 General Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7 12.15.2 Specific Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8 PRELIMINARY SPECIFICATION 11 PNX1300/01/02/11 Data Book Philips Semiconductors 12.15.3 Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8 12.16 Timing Budget . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8 12.16.1 Main AC Parameter requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9 12.17 Example Block Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9 12.17.1 Block Diagrams for a 32-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9 12.17.1.1 16-Mbit Devices or Less . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9 12.17.1.2 64-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10 12.17.1.3 128-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-13 12.17.1.4 256-Mbit Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16 12.17.2 Block Diagrams for a 16-bit interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-17 13 System Boot 13.1 Boot Sequence Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-1 13.2 Boot Hardware Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2 13.2.1 Boot Procedure Common to Both Autonomous and Host-Assisted Bootstrap . . . . . . . . . . . . 13-2 13.2.2 Initial DSPCPU Program Load for Autonomous Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5 13.3 Host-Assisted Boot Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6 13.3.1 Stage 1: PNX1300 System Boot Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6 13.3.2 Stage 2: Host-System PCI Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6 13.3.3 Stage 3: PNX1300 Driver Executing on the Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6 13.4 Detailed EEPROM Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7 13.5 EEPROM Access Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-9 14 Image Coprocessor 14.1 Image Coprocessor Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1 14.2 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 14-1 14.2.1 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 14-1 14.2.2 Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1 14.2.3 Image Size and Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.3 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 14-3 14.4 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 14-3 14.4.1 Image Input Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.4.1.1 YUV 4:2:2 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.4.1.2 YUV 4:2:2 Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.4.1.3 YUV 4:2:0 XY Interspersed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.4.1.4 YUV 4:1:1 Co-Sited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-3 14.4.2 Image Overlay Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5 14.4.3 Alpha Blending Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5 14.4.4 Output Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5 14.5 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6 14.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6 12 PRELIMINARY SPECIFICATION Philips Semiconductors 14.5.2 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6 14.5.3 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6 14.5.4 YUV to RGB Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9 14.5.5 Overlay and Alpha Blending . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-9 14.5.6 Dithering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10 14.5.7 Implementation Overview: Horizontal Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . 14-11 14.5.7.1 Loading the extra pixels in the filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12 14.5.7.2 Mirroring pixels at the ends of a line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12 14.5.7.3 Horizontal filter SDRAM timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-12 14.5.8 Implementation Overview: Vertical Scaling and Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13 14.5.8.1 Mirroring lines at the ends of an image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15 14.5.8.2 Vertical filter SDRAM block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15 14.5.9 Horizontal Scaling and Filtering for RGB Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15 14.5.9.1 YUV sequence counter in YUV 4:2:2 output Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-15 14.5.9.2 PCI output block timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16 14.6 Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-16 14.6.1 ICP Register Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17 14.6.2 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-17 14.6.3 ICP Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18 14.6.4 ICP Microprogram Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18 14.6.5 ICP Processing Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-18 14.6.6 Priority Delay and ICP Minimum Bus Bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-21 14.6.7 ICP Parameter Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22 14.6.8 Load Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22 14.6.9 Horizontal Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22 14.6.9.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22 14.6.9.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-22 14.6.9.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-23 14.6.10 Vertical Filter - SDRAM to SDRAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24 14.6.10.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24 14.6.10.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-24 14.6.10.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25 14.6.11 Horizontal Filter with RGB/YUV Conversion to PCI or SDRAM . . . . . . . . . . . . . . . . . . . . . . 14-25 14.6.11.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-25 14.6.11.2 Parameter table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-26 14.6.11.3 Control word format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-27 15 Variable Length Decoder 15.1 VLD Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1 15.2 VLD Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1 15.3 Decoding up to A slice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2 PRELIMINARY SPECIFICATION 13 PNX1300/01/02/11 Data Book Philips Semiconductors 15.4 VLD Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2 15.5 VLD Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 15-3 15.5.1 Macroblock Header Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3 15.5.2 Run-Level Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4 15.6 VLD Time Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4 15.7 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 15-4 15.7.1 VLD Status (VLD_STATUS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4 15.7.2 VLD Interrupt Enable (VLD_IMASK) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4 15.7.3 VLD Control (VLD_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5 15.8 VLD DMA Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5 15.8.1 DMA Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5 15.8.2 Macroblock Header Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5 15.8.3 Run-Level Output DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5 15.9 VLD Operational Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7 15.9.1 VLD Command (VLD_COMMAND) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7 15.9.2 VLD Shift Register (VLD_SR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7 15.9.3 VLD Quantizer Scale (VLD_QS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7 15.9.4 VLD Picture Info (VLD_PI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8 15.10 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8 15.11 Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 15-8 15.12 RESET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 15-8 15.13 Endian-ness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 15-8 15.14 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 15-8 15.15 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-8 16 I2C Interface 16.1 I2C Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1 16.2 Compared TO TM-1000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1 16.3 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1 16.4 I2C Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 16-1 16.4.1 IIC_AR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-1 16.4.2 IIC_DR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2 16.4.3 IIC_SR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3 16.4.4 IIC_CR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-4 16.5 I2C Software Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5 16.6 I2C Hardware Operation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-5 16.6.1 Slave NAK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-6 16.7 I2C Clock Rate Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7 17 Synchronous Serial Interface 17.1 Synchronous Serial Interface Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1 14 PRELIMINARY SPECIFICATION Philips Semiconductors 17.2 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 17-1 17.3 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-1 17.3.1 General Purpose I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-2 17.3.2 Frame Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3 17.3.3 SSI Transmit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3 17.3.4 SSI Receive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3 17.4 SSI Transmit operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5 17.4.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5 17.4.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5 17.4.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5 17.5 SSI Receive Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6 17.5.1 Setup SSI_CTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6 17.5.2 Operation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6 17.5.3 Interrupt and Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6 17.6 Frame Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6 17.7 Interrupt Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7 17.8 16-bit Endian-ness and Shift Direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7 17.9 SSI Test Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8 17.9.1 Remote Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8 17.9.2 Local Loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8 17.10 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8 17.10.1 SSI Control Register (SSI_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-9 17.10.2 SSI Control/Status Register (SSI_CSR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-11 17.11 Timing Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12 17.12 Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12 18 JTAG Functional Specification 18.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 18-1 18.2 Test Access Port (TAP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1 18.2.1 TAP Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-1 18.2.2 PNX1300 JTAG Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2 18.3 Using JTAG for PNX1300 Debug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3 18.3.1 JTAG Instruction and Data Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-4 18.3.2 JTAG Communication Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5 18.3.3 Example Data Transfer Via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5 18.3.3.1 Transferring data to TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-5 18.3.3.2 Transferring data from TriMedia via JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6 18.3.4 JTAG Interface Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-6 19 On-Chip Semaphore Assist Device 19.1 OVERVIEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . 19-1 PRELIMINARY SPECIFICATION 15 PNX1300/01/02/11 Data Book Philips Semiconductors 19.2 SEM Device Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1 19.3 Constructing a 12-Bit ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1 19.4 Which SEM to Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1 19.5 Usage Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-1 20 Arbiter 20.1 Arbiter Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 20-1 20.2 Dual Priorities with Priority Raising Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-1 20.3 Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2 20.3.1 Weighted Round Robin Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2 20.3.2 Arbitration Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-3 20.4 Arbiter Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-4 20.5 Arbiter programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5 20.5.1 Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-5 20.5.2 Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-6 20.6 Extended Behavior Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7 20.6.1 Extended Bandwidth Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7 20.6.2 Extended Latency Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-7 20.6.3 Raising Priority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8 20.6.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-8 21 Power Management 21.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 21-1 21.2 Entering and Exiting Global Power Down Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1 21.3 Effect Of Global Power Down On Peripherals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1 21.4 Detailed Sequence of Events For Global Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2 21.5 MMIO Register POWER_DOWN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2 21.6 Block Power Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-2 22 PCI-XIO External I/O Bus 22.1 Summary Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1 22.1.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-1 22.2 Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . 22-3 22.3 Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . 22-5 22.4 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 22-5 22.4.1 PCI-XIO Bus Interface Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-5 22.4.1.1 Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6 22.4.1.2 68K Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6 22.4.1.3 x86/ISA Bus I/O device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6 22.4.1.4 Multiple Flash EEPROM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-6 22.5 XIO_CTL MMIO Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7 16 PRELIMINARY SPECIFICATION Philips Semiconductors 22.5.1 PCI_CLK Bus Clock Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-7 22.5.2 Wait State Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8 22.6 PCI-XIO Bus Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-8 22.7 PCI-XIO Bus Controller Operation and Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-12 A PNX1300/01/02/11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . DSPCPU Operations A.1 Alphabetic Operation List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1 A.2 Operation List By Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2 alloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4 allocd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-5 allocr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-6 allocx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-7 asl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-8 asli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-9 asr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-10 asri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-11 bitand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-12 bitandinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-13 bitinv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14 bitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15 bitxor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-16 borrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17 carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18 curcycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-19 cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-20 dcb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-21 dinvalid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-22 dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-23 dspiadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-24 dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-25 dspidualadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-26 dspidualmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-27 dspidualsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-28 dspimul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-29 dspisub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-30 dspuadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-31 dspumul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-32 dspuquadaddui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-33 dspusub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-34 dualasr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-35 PRELIMINARY SPECIFICATION 17 PNX1300/01/02/11 Data Book Philips Semiconductors dualiclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-36 dualuclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-37 fabsval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-38 fabsvalflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-39 fadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-40 faddflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-41 fdiv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-42 fdivflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-43 feql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-44 feqlflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-45 fgeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-46 fgeqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-47 fgtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-48 fgtrflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-49 fleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-50 fleqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-51 fles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-52 flesflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-53 fmul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-54 fmulflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-55 fneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-56 fneqflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-57 fsign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-58 fsignflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-59 fsqrt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-60 fsqrtflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-61 fsub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-62 fsubflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-63 funshift1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-64 funshift2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-65 funshift3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-66 h_dspiabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-67 h_dspidualabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-68 h_iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-69 h_st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-70 h_st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-71 h_st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-72 hicycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-73 iabs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-74 iadd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-75 18 PRELIMINARY SPECIFICATION Philips Semiconductors iaddi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-76 iavgonep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-77 ibytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-78 iclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-79 iclr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-80 ident . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-81 ieql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-82 ieqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-83 ifir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-84 ifir8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-85 ifir8ui . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-86 ifixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-87 ifixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-88 ifixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-89 ifixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-90 iflip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-91 ifloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-92 ifloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-93 ifloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-94 ifloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . A-95 igeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-96 igeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-97 igtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-98 igtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-99 iimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-100 ijmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-101 ijmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-102 ijmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-103 ild16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-104 ild16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-105 ild16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-106 ild16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-107 ild8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-108 ild8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-109 ild8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-110 ileq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-111 ileqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-112 iles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-113 ilesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-114 imax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-115 PRELIMINARY SPECIFICATION 19 PNX1300/01/02/11 Data Book Philips Semiconductors imin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-116 imul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-117 imulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-118 ineg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-119 ineq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-120 ineqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-121 inonzero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-122 isub . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-123 isubi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-124 izero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-125 jmpf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-126 jmpi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-127 jmpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-128 ld32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-129 ld32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-130 ld32r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-131 ld32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-132 lsl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-133 lsli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-134 lsr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-135 lsri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-136 mergedual16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. A-137 mergelsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-138 mergemsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . A-139 nop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-140 pack16lsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-141 pack16msb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . A-142 packbytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-143 pref . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-144 pref16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-145 pref32x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-146 prefd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-147 prefr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-148 quadavg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . A-149 quadumax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . A-150 quadumin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-151 quadumulmsb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . A-152 rdstatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-153 rdtag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-154 readdpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-155 20 PRELIMINARY SPECIFICATION Philips Semiconductors readpcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-156 readspc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-157 rol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-158 roli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-159 sex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-160 sex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-161 st16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-162 st16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-163 st32 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-164 st32d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-165 st8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . A-166 st8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-167 ubytesel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-168 uclipi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-169 uclipu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-170 ueql . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-171 ueqli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-172 ufir16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-173 ufir8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-174 ufixieee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-175 ufixieeeflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-176 ufixrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-177 ufixrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-178 ufloat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-179 ufloatflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-180 ufloatrz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-181 ufloatrzflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-182 ugeq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-183 ugeqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-184 ugtr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . A-185 ugtri . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-186 uimm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-187 uld16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-188 uld16d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-189 uld16r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-190 uld16x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-191 uld8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-192 uld8d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . A-193 uld8r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-194 uleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-195 PRELIMINARY SPECIFICATION 21 PNX1300/01/02/11 Data Book Philips Semiconductors uleqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-196 ules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-197 ulesi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-198 ume8ii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . A-199 ume8uu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-200 umin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-201 umul . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-202 umulm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-203 uneq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-204 uneqi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-205 writedpc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-206 writepcsw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-207 writespc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . A-208 zex16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-209 zex8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-210 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-212 B MMIO Register Summary B.1 MMIO Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1 C Endian-ness C.1 Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1 C.2 Little and Big Endian Addressing Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1 C.3 Test to Verify the Correct Operation of PNX1300 in Big and Little Endian Systems . . . . . . . . . . . . . . C-2 C.4 Requirement for the PNX1300 to Operate in Either Little Endian or Big Endian Mode . . . . . . . . . . . . C-2 C.4.1 Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2 C.4.2 Instruction Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3 C.4.3 PNX1300 PCI Interface Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3 C.4.4 Image Coprocessor (ICP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3 C.4.5 Video In (VI) and Video Out (VO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7 C.4.6 Audio In (AI), Audio-Out (AO), and SPDIF Out (SDO) Units . . . . . . . . . . . . . . . . . . . . . . . . . . C-7 C.4.7 Variable Length Encoder (VLD) Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-7 C.4.8 Synchronous Serial Interface (SSI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-8 C.4.9 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9 C.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . C-9 C.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9 Index 22 PRELIMINARY SPECIFICATION Pin List Chapter 1 by John Chang, Wenyi Song, Thorwald Rabeler, Luis Lucas 1.1 PNX1300 SERIES VERSUS TM-1300 The following summarizes differences between TM-1300 and PNX1300/01/02/11: • • • • • • • • • Lower core voltage for PNX1311 (2.2V core voltage) and therefore lower power consumption. DSPCPU speed of up to 200 MHz. SDRAM speed of up to 183 MHz. Support for 256 Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer for in Chapter 12, “SDRAM Memory System” for details. Support for 16- and 32-bit Main Memory Interface. Simplified power supplies sequencing (see Section 1.9.4). Additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal. Bug fixed for PCI Special Cycles. PNX1300 Series discards PCI Special Cycles issued by some PCI chipsets. Autonomous boot bug in non 1:1 ratio is fixed, resulting in 2KB boot EEPROM size for all CPU:SDRAM ratios. In the document, ‘PNX1300 Series’ is used interchangebly with ‘PNX1300/01/02/11’, and it always refers to PNX1300, PNX1301, PNX1302 and PNX1311 products. Any exception will be noted. 1.2 BOUNDARY SCAN NOTICE PNX1300 Series implements full IEEE 1149.1 boundary scan. Any PNX1300 Series pin designated “IN” only (from a functionality point of view) can become an output during boundary scan. 1.3 I/O CIRCUIT SUMMARY PNX1300 Series has a total of 169 functional pins, excluding VDDQ, VSSQ, VREF_PCI and VREF_PERIPH and digital power/ground. PNX1300 Series uses the types of I/O circuits shown in the table below. Pad Type Pad Type Description PCI PCI2.1 compliant I/O, capable of using 3.3-V or 5-V PCI signaling conventions. PCIOD PCI2.1 compliant Open Drain I/O, capable of using 3.3-V or 5-V PCI signaling conventions. IICOD Open drain 3.3-V or 5-V I2C I/O (for I2C pins). 3.3-V only low impedance I/O. Requires board level 27-33 ohm series terminator resistor to match 50 ohm PCB trace. 3.3-V only I/O circuit with regular drive strength and board trace matched drive impedance. STRG3 NORM3 STRG5 3.3-V low impedance output, combined with 5-V tolerant input. If used as output, it requires a board level 27-33 ohm series terminator resistor to match 50-ohm PCB trace. WEAK5 3.3-V regular impedance output, with slow rise/fall, combined with 5-V tolerant input. For the pins with 5-V input capability, the special pins VREF_PCI or VREF_PERIPH determine 3.3- or 5-V input tolerance, as per the table in Section 1.6. The above pad types are used in the modes listed in the following table. Modes Description IN Input only, except during boundary scan OUT OD Output only, except during boundary scan Open drain output - active pull low, no active drive high, requires external pull-up I/O Output or input I/OD Open drain output with input - active pull low, no active drive high, requires external pull-up Unused pins may remain floating, i.e. unconnected. All pins that drive a clock should drive a series resistor. PRELIMINARY SPECIFICATION 1-1 PNX1300/01/02/11 Data Book 1.4 Philips Semiconductors SIGNAL PIN LIST In the table below, a pin name ending in a ‘#’ designates an active-low signal (the active state of the signal is a low voltage level). All other signals have active-high polarity. Pin Name BGA Ball Pad Type Mode Description Main Clock Interface TRI_CLKIN L20 NORM3 IN Main input clock. The SDRAM clock outputs (MM_CLK0 and MM_CLK1) can be set to 2x or 3x this frequency. The on-chip DSPCPU clock (DSPCPU_CLK) can be set to 1x, 5/4, 4/3, 3/2 or 2x the SDRAM clock frequency. Maximum recommended ppm level is +/- 100 ppm or lower to improve jitter on generated clocks. Duty cycle should not exceed 30/70% asymmetry. The operating limits of the internal PLLs are: • 27 MHz < Output of the SDRAM PLL < 200 MHz • 33 MHz < Output of the CPU PLL < 266 MHz These are not the speed grades of the chips, just the PLL limits. VDDQ K20 N/A PWR Quiet VDD for the PLL subsystem. This pin should be supplied from VDD through a low-Q series inductor. It should be bypassed for AC to VSSQ, using a dual capacitor bypass (hi and low frequency AC bypass). VSSQ L19 N/A GND Quiet VSS for the PLL subsystem. Should be AC bypassed to VDDQ, but should otherwise be left DC floating. It is connected on-chip to VSS. No external coil or other connection to board ground is needed, such connection would create a ground loop. Miscellaneous System Interface TRI_RESET# G19 WEAK5 IN PNX1300/01/02/11 RESET input. This pin can be tied to the PCI RST# signal in PCI bus systems. Upon releasing RESET, PNX1300/01/02/11 initiates its boot protocol. BOOT_CLK T20 NORM3 IN Used for testing purposes. Must be connected to TRI_CLKIN for normal operation. TESTMODE P19 NORM3 IN Used for testing purposes. Must be connected to VSS for normal operation. SCANCPU D20 NORM3 IN Used for testing purposes. Must be connected to VSS for normal operation. RESERVED1 E19 NORM3 I/O Reserved pin. Has to be left unconnected for normal operation. RESERVED2 D19 STRG5 I/O Reserved pin. Has to be left unconnected for normal operation. F2 N/A PWR VREF_PCI determines the mode of operation of the PCI pins listed in Section 1.6. VREF_PCI must be connected to 5V for use in a 5-V PCI signaling environment or to VSS (0 V) for use in 3.3-V PCI signaling environment. The supply to this pin should be AC bypassed and provide 40 mA of DC sink or source capability. Note that this pin can not be directly connected to the PCI ‘I/O designated power pins’ in a dual voltage PCI plug-in card. Board level conversion circuitry is required. VREF_PERIPH C18 N/A PWR VREF_PERIPH determines the mode of operation of the I/O pins listed in Section 1.6. VREF_PERIPH should be connected to 5V if any of the listed I/O pins provided should be 5-V input voltage capable. VREF_PERIPH should be connected to VSS (0-V) if all listed I/O pins are 3.3-V only inputs. The supply to this pin should be AC bypassed and provide 40 mA of DC sink or source capability. TRI_USERIRQ G20 WEAK5 IN General purpose level/edge interrupt input. Vectored interrupt source number 4. TRI_TIMER_CLK H19 WEAK5 IN External general purpose clock source for timers. Max. 40 MHz. VREF_PCI 1-2 PRELIMINARY SPECIFICATION Philips Semiconductors Pin List BGA Ball Pad Type Mode Description MM_CLK0 MM_CLK1 Y10 W10 STRG3 OUT SDRAM output clock at 2x or 3x TRI_CLKIN frequency. Two identical outputs are provided to reliably drive several small memory configurations without external glue. A series terminating resistor close to PNX1300/01/02/11 is required to reduce ringing. For driving a 50-ohm trace, a resistor of 27 to 33 ohm is recommended. It is recommended against using higher impedance traces in the SDRAM signals. MM_A00 MM_A01 MM_A02 MM_A03 MM_A04 MM_A05 MM_A06 MM_A07 MM_A08 MM_A09 MM_A10 MM_A11 MM_A12 MM_A13 W12 Y12 W11 Y11 Y9 W9 V9 Y8 W8 Y7 V12 Y13 W13 Y14 NORM3 OUT Main memory address bus; used for row and column addresses MM_DQ00 MM_DQ01 MM_DQ02 MM_DQ03 MM_DQ04 MM_DQ05 MM_DQ06 MM_DQ07 MM_DQ08 MM_DQ09 MM_DQ10 MM_DQ11 MM_DQ12 MM_DQ13 MM_DQ14 MM_DQ15 MM_DQ16 MM_DQ17 MM_DQ18 MM_DQ19 MM_DQ20 MM_DQ21 MM_DQ22 MM_DQ23 MM_DQ24 MM_DQ25 MM_DQ26 MM_DQ27 MM_DQ28 MM_DQ29 MM_DQ30 MM_DQ31 Y20 V18 W19 W20 U18 V19 V20 T18 W18 V17 Y18 W17 Y17 W16 Y16 V15 W7 Y6 W6 V6 Y5 W5 Y4 W4 V2 V3 W1 W2 Y1 Y2 W3 Y3 NORM3 I/O 32-bit data I/O bus. The Main Memory Interface unit also supports a 16-bit I/O interface. Refer to Chapter 12, “SDRAM Memory System.” MM_CKE0 MM_CKE1 Y19 U1 NORM3 OUT Clock enable output to SDRAMs. Two identical outputs are provided in order to reliably drive several small memory configurations without external glue. MM_CS0# MM_CS1# MM_CS2# MM_CS3# U2 U20 U3 U19 NORM3 OUT Chip select for DRAM rank n; active low In PNX1300/01/02/11 the chip selects pins may be used as address pins to support the 256 Mbit SDRAM device organized in x16. Refer to Chapter 12, “SDRAM Memory System.” MM_RAS# W14 NORM3 OUT Row address strobe; active low MM_CAS# Y15 NORM3 OUT Column address strobe; active low MM_WE# W15 NORM3 OUT Write enable; active low Pin Name Main Memory Interface WARNING: MM_A[13:11] DO NOT CONNECT DIRECTLY TO SDRAM A[13:11] pins. Refer to Chapter 12, “SDRAM Memory System ” for accurate connection diagrams. PRELIMINARY SPECIFICATION 1-3 PNX1300/01/02/11 Data Book Pin Name MM_DQM0 MM_DQM1 MM_DQM2 MM_DQM3 Philips Semiconductors BGA Ball Pad Type Mode T19 R18 V1 V4 NORM3 OUT Description MM_DQ Mask Enable; these are byte enable signals for the 32-bit MM_DQ bus PCI Interface (Note: current buffer design allows drive/receive from either 3.3 or 5V PCI bus) PCI_CLK T2 PCI IN All PCI input signals are sampled with respect to the rising edge of this clock. All PCI outputs are generated based on this clock. Clock is required for normal operation of the PCI block. PCI_AD00 PCI_AD01 PCI_AD02 PCI_AD03 PCI_AD04 PCI_AD05 PCI_AD06 PCI_AD07 PCI_AD08 PCI_AD09 PCI_AD10 PCI_AD11 PCI_AD12 PCI_AD13 PCI_AD14 PCI_AD15 PCI_AD16 PCI_AD17 PCI_AD18 PCI_AD19 PCI_AD20 PCI_AD21 PCI_AD22 PCI_AD23 PCI_AD24 PCI_AD25 PCI_AD26 PCI_AD27 PCI_AD28 PCI_AD29 PCI_AD30 PCI_AD31 T1 R3 R2 R1 P2 P1 N2 N1 M2 M1 L2 L1 K1 K2 J1 J2 D1 D3 C1 B2 B1 C2 C3 A1 A3 C4 B4 A4 A5 C6 B6 A6 PCI I/O Multiplexed address and data. PCI_C/BE#0 PCI_C/BE#1 PCI_C/BE#2 PCI_C/BE#3 M3 J3 D2 B3 PCI I/O Multiplexed bus commands and byte enables. High for command, low for byte enable. PCI_PAR H1 PCI I/O Even parity across AD and C/BE lines. PCI_FRAME# E2 PCI I/O Sustained tri-state. Frame is driven by a master to indicate the beginning and duration of an access. PCI_IRDY# E1 PCI I/O Sustained tri-state. Initiator Ready indicates that the bus master is ready to complete the current data phase. PCI_TRDY# F3 PCI I/O Sustained tri-state. Target Ready indicates that the bus target is ready to complete the current data phase. PCI_STOP# G2 PCI I/O Sustained tri-state. Indicates that the target is requesting that the master stop the current transaction. PCI_IDSEL A2 PCI IN Used as chip select during configuration read/write cycles. PCI_DEVSEL# F1 PCI I/O Sustained tri-state. Indicates whether any device on the bus has been selected. PCI_REQ# B7 PCI I/O Driven by PNX1300/01/02/11 as PCI bus master to request use of the PCI bus. PCI_GNT# B5 PCI IN Indicates to PNX1300/01/02/11 that access to the bus has been granted. PCI_PERR# G1 PCI I/O Sustained tri-state. Parity error generated/received by PNX1300/01/02/11. PCI_SERR# H2 PCI OD System error. This signal is asserted when operating as target and detecting an address parity error. 1-4 PRELIMINARY SPECIFICATION Philips Semiconductors Pin List BGA Ball Pad Type PCI_INTA# PCI_INTB# PCI_INTC# PCI_INTD# C9 A8 B8 A7 PCIOD PCI PCIOD PCIOD JTAG_TDI F20 WEAK5 IN JTAG test data input JTAG_TDO F18 WEAK5 I/O JTAG test data output. This pin can either drive active low, high or float. JTAG_TCK F19 WEAK5 IN JTAG test clock input JTAG_TMS E20 WEAK5 IN JTAG test mode select input Pin Name Mode Description I/OD • Can operate as input (power up default) or output, as determined by direction control bits in PCI MMIO register INT_CTL. I/O/OD I/OD • As input, a PCI_INT# pin can be used to receive PCI interrupt requests (normal PCI use is active low, level sensitive mode, but the VIC can be set to treat these as I/OD positive edge triggered mode). As input, a PCI_INT# pin can also be used as a general interrupt request pin if not needed for PCI. • As output, the value of a PCI_INT# can be programmed through PCI MMIO registers to generate interrupts for other PCI masters. • Whenever XIO bus functionality is active, PCI_INTB# is a push-pull CMOS I/O pin. When the XIO bus is not active and regular PCI bus functionality is activated, then PCI_INTB# has a PCI compatible open drain output. JTAG Interface (debug access port and 1149.1 boundary scan port) Video In VI_CLK C20 STRG5 I/O • If configured as input (power up default):a positive transition on this incoming video clock pin samples all other VI_DATA input signals below if VI_DVALID is HIGH. If VI_DVALID is LOW, VI_DATA is ignored. Clock and data rates of up to 81 MHz are supported. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal, Section 6.6.1 on page 6-12. • If configured as output: programmable output clock to drive an external video A/D converter. Can be programmed to emit integral dividers of DSPCPU_CLK. If used as output, a board level 27-33 ohm series resistor is recommended to reduce ringing. VI_DVALID A17 WEAK5 IN VI_DVALID indicates that valid data is present on the VI_DATA lines. If HIGH, VI_DATA will be accepted on the next VI_CLK positive edge. If LOW, no VI_DATA will be sampled. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal, Section 6.6.1 on p age6-12. VI_DATA0 VI_DATA1 VI_DATA2 VI_DATA3 VI_DATA4 VI_DATA5 VI_DATA6 VI_DATA7 D18 C19 B20 B19 A20 A19 C17 B18 WEAK5 IN CCIR656 style YUV 4:2:2 data from a digital camera, or general purpose high speed data input pins. Sampled on VI_CLK if VI_DVALID HIGH. VI_DATA8 VI_DATA9 A18 B17 WEAK5 IN Extension high speed data input bits to allow use of 10 bit video A/D converters in raw10 modes. VI_DATA[8] serves as START and VI_DATA[9] as END message input in message passing mode. Sampled on positive transitions of VI_CLK if VI_DVALID HIGH. PNX1300 Series supports an additional mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal, Section 6.6.1 on p age6-12. I2C Interface IIC_SDA R19 IICOD I/OD I2C serial data IIC_SCL R20 IICOD I/OD I2C clock VO_DATA0 VO_DATA1 VO_DATA2 VO_DATA3 VO_DATA4 VO_DATA5 VO_DATA6 VO_DATA7 P20 N19 N20 M18 M19 M20 K19 J20 WEAK5 OUT CCIR656 style YUV 4:2:2 digital output data, or general purpose high speed data output channel. Output changes on positive edge of VO_CLK. Video Out PRELIMINARY SPECIFICATION 1-5 PNX1300/01/02/11 Data Book Philips Semiconductors BGA Ball Pad Type Mode VO_IO1 J18 WEAK5 I/O This pin can function as HS output or as STMSG (Start Message) output. • If set as HS output, it outputs the horizontal sync signal • In message passing mode, this pin acts as STMSG output. VO_IO2 H20 WEAK5 I/O This pin can function as FS (frame sync) input, FS output or as ENDMSG output. • If set as FS input, it can be set to respond to positive or negative edge transitions. • If the Video Out (VO) unit operates in external sync mode and the selected transition occurs, the VO unit sends two fields of video data. Note: this works only once after a reset. • In message passing mode, this pin acts as ENDMSG output. VO_CLK J19 STRG5 I/O The VO unit emits VO_DATA on a positive edge of VO_CLK. VO_CLK can be configured as input (reset default) or output. • If configured as input: VO_CLK is received from external display clock master circuitry. • If configured as output, PNX1300/01/02/11 emits a programmable clock frequency. The emitted frequency can be set between approx. 4 and 81 MHz with a sub-Hertz resolution. The clock generated is frequency accurate and has low jitter properties due to a combination of an on-chip DDS (Direct Digital Synthesizer) and VCO/PLL. If used as output, a board level 27-33 ohm series resistor is recommended to reduce ringing. Pin Name Description Audio In (always acts as receiver, but can be master or slave for A/D timing) AI_OSCLK B15 STRG3 OUT Over-sampling clock. This output can be programmed to emit any frequency up to 40 MHz with a sub-Hertz resolution. It is intended for use as the 256fs or 384fs over sampling clock by external A/D subsystem. A board level 27-33 ohm series resistor is recommended to reduce ringing. AI_SCK A16 STRG5 I/O • When the Audio In (AI) unit is programmed as a serial-interface timing slave (power-up default), AI_SCK is an input. AI_SCK receives the serial bit clock from the external A/D subsystem. This clock is treated as fully asynchronous to the PNX1300/01/02/11 main clock. • When the AI unit is programmed as the serial-interface timing master, AI_SCK is an output. AI_SCK drives the serial clock for the external A/D subsystem. The frequency is a programmable integral divisors of the AI_OSCLK frequency. AI_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the serial stream is variable. If used as output, a board level 27-33 ohm series resistor is recommended to reduce ringing. AI_SD C15 WEAK5 IN Serial data from external A/D subsystem. Data on this pin is sampled on positive or negative edges of AI_SCK as determined by the CLOCK_EDGE bit in the AI_SERIAL register. AI_WS B16 WEAK5 I/O • When the AI unit is programmed as the serial-interface timing slave (power-up default), AI_WS acts as an input. AI_WS is sampled on the same edge as selected for AI_SD. • When Audio In is programmed as the serial-interface timing master, AI_WS acts as an output. It is asserted on the opposite edge of the AI_SD sampling edge. AI_WS is the word-select or frame-synchronization signal from/to the external A/D subsystem. 1-6 PRELIMINARY SPECIFICATION Philips Semiconductors Pin Name BGA Ball Pad Type Pin List Mode Description Audio Out (always acts as sender, but can be master or slave for D/A timing) AO_OSCLK B14 STRG3 OUT Over sampling clock. This output can be programmed to emit any frequency up to 40 MHz, with a sub-Hertz resolution. It is intended for use as the 256 or 384fs over sampling clock by the external D/A conversion subsystem. A board level 27-33 ohm series resistor is recommended to reduce ringing. AO_SCK A14 STRG5 I/O • When the Audio Out (AO) unit is programmed to act as the serial interface timing slave (power up default), AO_SCK acts as input. It receives the Serial Clock from the external audio D/A subsystem. The clock is treated as fully asynchronous to the PNX1300/01/02/11 main clock. • When the AO unit is programmed to act as serial interface timing master, AO_SCK acts as output. It drives the serial clock for the external audio D/A subsystem. The clock frequency is a programmable integral divisor of the AO_OSCLK frequency. AO_SCK is limited to 22 MHz. The sample rate of valid samples embedded within the serial stream is variable. If used as output, a board level 27-33 ohm series resistor is recommended to reduce ringing. AO_SD1 B13 WEAK5 OUT Serial data to external stereo audio D/A subsystem for first 2 of 8 channels. The timing of transitions on this output is determined by the CLOCK_EDGE bit in the AO_SERIAL register, and can be on positive or negative AO_SCK edges. AO_SD2 A13 WEAK5 OUT Serial data. AO_SD3 C12 WEAK5 OUT Serial data. AO_SD4 B12 WEAK5 OUT Serial data. AO_WS A15 WEAK5 I/O • When the AO unit is programmed as the serial-interface timing slave (power-up default), AO_WS acts as an input. AO_WS is sampled on the opposite AO_SCK edge at which AO_SDx are asserted. • When the AO unit is programmed as serial-interface timing master, AO_WS acts as an output. AO_WS is asserted on the same AO_SCK edge as AO_SDx. AO_WS is the word-select or frame-synchronization signal from/to the external D/A subsystem. Each audio channel receives 1 sample for every WS period. S/PDIF Output (Output) SPDO A12 STRG3 OUT Self clocking serial data stream as per IEC958, with 1937 extensions. Note that the low impedance output buffer requires a 27 to 33 ohm series terminator close to PNX1300/01/02/11 in order to match the board trace impedance. This series terminator can be/must be part of the voltage divider needed to create the coaxial output through the AC isolation transformer. Synchronous Serial Interface (SSI) to an off-chip modem front-end SSI_CLK B11 WEAK5 IN Clock signal of the synchronous serial interface to an off-chip modem analog frontend or ISDN terminal adapter; provided by the receive channel of an external communication device. SSI_RXFSX A11 WEAK5 IN Receive frame sync reference of the synchronous serial interface, provided by the receive channel of an external communication device. SSI_RXDATA A10 WEAK5 IN Receive serial data input; provided by the receive channel of an external communication device. SSI_TXDATA B10 WEAK5 OUT Transmit serial data output; sent to the transmit channel of the external communication device. SSI_IO1 A9 WEAK5 I/O General purpose programmable I/O. Set to input on power up. SSI_IO2 B9 WEAK5 I/O General purpose programmable I/O. Set to input on power up. Can also be programmed to function as the transmit channel frame synchronization reference output. PRELIMINARY SPECIFICATION 1-7 PNX1300/01/02/11 Data Book 1.5 POWER PIN LIST VSS (ground) C5 C16 D4 D5 D16 D17 E3 E4 E17 E18 T3 T4 T17 U4 U5 U16 U17 V5 V16 1-8 Philips Semiconductors H8 H9 H10 H11 H12 H13 J8 J9 J10 J11 J12 J13 K8 K9 K10 K11 K12 K13 L8 VCC (3.3V I/O supply) L9 L10 L11 L12 L13 M8 M9 M10 M11 M12 M13 N8 N9 N10 N11 N12 N13 C7 C10 C11 C14 D6 D7 D10 D11 D14 D15 F4 F17 G3 G4 PRELIMINARY SPECIFICATION G17 G18 K3 K4 K17 K18 L3 L4 L17 L18 P3 P4 P17 P18 R4 R17 U6 U7 U10 U11 U14 U15 V7 V10 V11 V14 VDD (2.5V core supply) C8 C13 D8 D9 D12 D13 H3 H4 H17 H18 J4 J17 M4 M17 N3 N4 N17 N18 U8 U9 U12 U13 V8 V13 Philips Semiconductors 1.6 Pin List PIN REFERENCE VOLTAGE With the exception of Open Drain mode outputs, outputs always drive to a level determined by the 3.3-V I/O voltage. VREF_PERIPH and VREF_PCI purely determine input voltage clamping, not input signal thresholds or output levels. Inputs always in 3.3-V mode TRI_CLKIN BOOT_CLK TESTMODE SCANCPU RESERVED1 VREF_PCI determined mode PCI_AD00 PCI_AD01 PCI_AD02 PCI_AD03 PCI_AD04 PCI_AD05 PCI_AD06 PCI_AD07 PCI_AD08 PCI_AD09 PCI_AD10 PCI_AD11 PCI_AD12 PCI_AD13 PCI_AD14 PCI_AD15 PCI_AD16 PCI_AD17 PCI_AD18 PCI_AD19 PCI_AD20 PCI_AD21 PCI_AD22 PCI_AD23 PCI_AD24 PCI_AD25 PCI_AD26 PCI_AD27 PCI_AD28 PCI_AD29 PCI_AD30 PCI_AD31 PCI_CLK PCI_C/BE#0 PCI_C/BE#1 PCI_C/BE#2 PCI_C/BE#3 PCI_PAR PCI_FRAME# PCI_IRDY# PCI_TRDY# PCI_STOP# PCI_IDSEL PCI_DEVSEL# PCI_REQ# PCI_GNT# PCI_PERR# PCI_SERR# PCI_INTA# PCI_INTB# PCI_INTC# PCI_INTD# TRI_RESET# VREF_PERIPH determined mode TRI_USERIRQ TRI_TIMER_CLK JTAG_TDI JTAG_TDO JTAG_TCK JTAG_TMS VI_CLK VI_DVALID VI_DATA0 VI_DATA1 VI_DATA2 VI_DATA3 VI_DATA4 VI_DATA5 VI_DATA6 VI_DATA7 VI_DATA8 VI_DATA9 IIC_SDA IIC_SCL VO_IO1 VO_IO2 VO_CLK Output only pins VO_DATA0 VO_DATA1 VO_DATA2 VO_DATA3 VO_DATA4 VO_DATA5 VO_DATA6 VO_DATA7 AI_SCK AI_SD AI_WS AO_SCK AO_WS SSI_CLK SSI_RXFSX SSI_RXDATA SSI_IO1 SSI_IO2 RESERVED2 AI_OSCLK AO_OSCLK AO_SD1 AO_SD2 AO_SD3 AO_SD4 SSI_TXDATA SPDO SDRAM i/f (always 3.3-Volt mode) MM_CLK0 MM_CLK1 MM_A00 MM_A01 MM_A02 MM_A03 MM_A04 MM_A05 MM_A06 MM_A07 MM_A08 MM_A09 MM_A10 MM_A11 MM_A12 MM_A13 MM_DQ00 MM_DQ01 MM_DQ02 MM_DQ03 MM_DQ04 MM_DQ05 MM_DQ06 MM_DQ07 MM_DQ08 MM_DQ09 MM_DQ10 MM_DQ11 MM_DQ12 MM_DQM0 MM_DQM1 PRELIMINARY SPECIFICATION MM_DQM2 MM_DQM3 MM_DQ13 MM_DQ14 MM_DQ15 MM_DQ16 MM_DQ17 MM_DQ18 MM_DQ19 MM_DQ20 MM_DQ21 MM_DQ22 MM_DQ23 MM_DQ24 MM_DQ25 MM_DQ26 MM_DQ27 MM_DQ28 MM_DQ29 MM_DQ30 MM_DQ31 MM_CKE0 MM_CKE1 MM_CS0# MM_CS1# MM_CS2# MM_CS3# MM_RAS# MM_CAS# MM_WE# 1-9 PNX1300/01/02/11 Data Book 1.7 Philips Semiconductors PACKAGE HBGA292: plastic, heatsink ball grid array package; 292 balls; body 27 x 27 x 1.75 mm SOT553-1 B D A D1 ball A1 index area A ∅ j E1 E A2 A1 detail X k k e1 C v M B b e ∅w M v M A y y1 C Y W V U e T R P N M L e1 K J H G F E D C B A 2 1 4 3 6 5 8 7 10 9 12 11 14 13 16 15 18 17 20 19 X 0 10 scale DIMENSIONS (mm are the original dimensions) A UNIT max. mm 1.8 To To To To 2.51 A1 A2 b D D1 E E1 e e1 ∅j k v w y y1 0.70 0.50 1.83 1.63 0.90 0.60 27.2 26.8 24.1 23.9 27.2 26.8 24.1 23.9 1.27 24.13 21.0 15.4 4.2 3.8 0.2 0.2 0.15 0.25 nc product nc product nc product nc product code code code code 7097 7097 7098 7098 6557. 9557. 2557. 5557. ORDERING INFORMATION order 143-MHz/2.5V product, part number is order 180-MHz/2.5V product, part number is order 200-MHz/2.5V product, part number is order 166-MHz/2.2V product, part number is 1-10 20 mm ‘PNX1300’, ‘PNX1301’, ‘PNX1302’, ‘PNX1311’, PRELIMINARY SPECIFICATION 12 12 12 12 9352 9352 9352 9352 Philips Semiconductors 1.9 Pin List PARAMETRIC CHARACTERISTICS 1.9.1 PNX1300/01/02/11 Absolute Maximum Ratings Permanent damage may occur if these conditions are exceeded Symbol Parameter Min. Max Units Notes VDDMAX 2.5-V core supply voltage (PNX1300/01/02/11) -0.5 3.5 V VCCMAX 3.3-V I/O supply voltage -0.5 4.6 V VI-5V DC input voltage on all 5-V pins -0.5 VX+0.5 V VI-3.3V DC input voltage on all 3.3-V pins -0.5 VCC+0.3 V Tstg Storage temperature range -65 150 Deg. C T Operating case temperature range 0 120 Deg. C HBMESD Human Body Model Electrostatic handling for all pins - - CLASS 1C 2 MMESD Machine Model Electrostatic handling for all pins - - CLASS A 3 case 1 Notes: 1. VX in the 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6. 2. JEDEC Standard, June 2000 3. JEDEC Standard, October 1997 1.9.2 PNX1300/01/02 Operating Range and Thermal Characteristics Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below. Symbol Parameter Minimum Typical Maximum Units VDD PNX1300/01/02 Core supply voltage 2.375 2.50 2.625 VCC I/O supply voltage 3.135 3.30 3.465 V T case Operating case temperature range 85 °C Ψ jt junction to case thermal resistance 3.8 °C/W ϑja junction to ambient thermal resistance (natural convection) 15 °C/W 1.9.3 0 V PNX1311 Operating Range and Thermal Characteristics Functional operation, long-term reliability and AC/DC characteristics are guaranteed for the operating conditions below. Symbol Parameter Minimum Typical Maximum Units VDD PNX1311 Core supply voltage 2.090 2.20 2.310 VCC I/O supply voltage 3.135 3.30 3.465 V T Operating case temperature range 85 °C case 0 V Ψ jt junction to case thermal resistance 3.8 °C/W ϑja junction to ambient thermal resistance (natural convection) 15 °C/W 1.9.4 PNX1300/01/02/11 Power Supply Sequencing Power application and power removal should obey the following rule: VDD should never exceed V CC by more than 0.5 V Permanent damage may occur if this rule is not observed. PRELIMINARY SPECIFICATION 1-11 PNX1300/01/02/11 Data Book 1.9.5 PNX1300/01/02 DC/AC Characteristics Symbol V Philips Semiconductors Parameter Condition/Notes Core supply voltage DD Max Units 2.625 V 3.135 3.465 V I DD-typ Core supply current 200 MHz CPU operation (Max. application) 1400 mA I CC-typ I/O supply current 183 MHz SDRAM operation (Max. application) 160 mA I DD-pdn Core supply current CPU power down mode; 200 MHz 300 mA I CC-pdn I/O supply current CPU power down mode; 183 MHz 50 mA V Input HIGH voltage for I/O-5 V Note 1. All I/O’s except IICOD 2.0 VX+ 0.5 V VIH-3.3v Input HIGH voltage for I/O-3.3 V All I/Os except IICOD 2.0 V Input LOW voltage for I/O-5 V All I/Os except IICOD -0.5 0.8 Input LOW voltage for I/O-3.3 V All I/Os except IICOD -0.3 0.8 V Input leakage current for I/O-5 V 0 < VIN < 2.7V -70 70 uA Input leakage current for I/O-3.3 V 0 < VIN < 2.7V -0 10 uA 8 pF Units V V I I I/O supply voltage Min. 2.375 CC IH-5v IL-5v IL-3.3v IL-5v IL--3.3v C Input pin capacitance IN V CC + 0.3 V V Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6. 1.9.6 PNX1311 DC/AC Characteristics Symbol V V DD CC Min. Max Core supply voltage Parameter Condition/Notes 2.090 2.310 V I/O supply voltage 3.135 3.465 V I DD-typ Core supply current 166 MHz CPU operation (Max. application) 1110 mA I CC-typ I/O supply current 166 MHz SDRAM operation (Max. application) 145 mA I DD-pdn Core supply current CPU power down mode; 166 MHz 215 mA I CC-pdn I/O supply current CPU power down mode; 166 MHz 46 mA V Input HIGH voltage for I/O-5 V Note 1. All I/O’s except IICOD 2.0 VIH-3.3v Input HIGH voltage for I/O-3.3 V All I/Os except IICOD 2.0 V Input LOW voltage for I/O-5 V All I/Os except IICOD -0.5 0.8 V Input LOW voltage for I/O-3.3 V All I/Os except IICOD -0.3 0.8 V Input leakage current for I/O-5 V 0 < VIN < 2.7V -70 70 uA Input leakage current for I/O-3.3 V 0 < VIN < 2.7V -0 10 uA 8 pF V I I IH-5v IL-5v IL-3.3v IL-5v IL--3.3v C IN Input pin capacitance Notes: 1. VX for a 5V mode pin is either VREF_PCI or VREF_PERIPH, see Section 1.6. 1-12 PRELIMINARY SPECIFICATION VX+ 0.5 V CC + 0.3 V V Philips Semiconductors 1.9.7 Pin List PNX1300 Series Power Consumption The power consumption of PNX1300 Series is dependent on the activity of the DSPCPU, the amount of peripherals being used, the frequency at which the system is running as well as the loads on the pins. • The first section presents the power consumption for known applications. The other power related sections present the maximum power consumption. These maximum values are obtained with a ‘fake’ application that turns on all the peripherals and runs intensive compute on the CPU. 1.9.7.1 Power Consumption for Applications on PNX1300 Series The Table 1-1 and Table 1-2 present the power consumption for two typical applications: • • The DVD playback includes video display using the VO peripheral and audio streaming using AO peripheral. The bitstream is brought into the TM-1300 system over the PCI peripheral. The VLD co-processor is used to perform the bitstream parsing. The bitstream is not scrambled therefore the DVDD co-processor is not used and it is turned off. The MPEG4 application includes video and audio playback of an enocded CIF stream. The bit stream is brought into the PNX1300 system over the PCI peripheral. The Video and Audio subsystems of the PNX1300 were used to render the video and sound from the decoded stream into the video monitor and speakers. The H263 video conferencing application includes the following steps. It captures a CCIR656 video stream at 30 frames/second using the VI peripheral. The incoming video stream is downscaled, on the fly, to SIF resolution by VI. The captured frames are then downscaled to a QSIF resolution using the ICP coprocessor. The resulting QSIF image is sent over the PCI bus via the ICP co-processor to a SVGA card (PC monitor display) and encoded by the DSPCPU. The resulting bitstream is then decoded by the DSPCPU and displayed as a SIF image on the same PC monitor (also using the ICP co-processor). All the encoding/decoding part is done in the YUV color space. The display is in the RGB16 color space. Software is not optimized. Three main technics may be applied to reduce the ‘Out of the Box’ power consumption. • • • Turn off the unused peripherals. Refer to Section 21.6 on pag e21-2. Run the system at the required speed, i.e. some application may not require to run at the full speed grade of the chip. Powerdown the system or the DSPCPU each time the DSPCPU reached the Idle task. A more detailed description can be found in the application note ‘TM-1300 Power Saving Features’ available at the following website: http://www.semiconductors.philips.com/trimedia/ Table 1-1. Power Consumption of Example Applications for PNX1300/01/02 (Vdd = 2.5V) Optimizations APPLICATIONS AFTER POWER OPTIMIZATIONS WITHOUT POWER OPTIMIZATIONS Unused Peripherals Turned Off System Speed Adjustment Idle task power management DVD Playback 2.2 W 3.0 W @ 180 MHz 2.6 W @ 180 MHz 2.6 W @ 180 MHz 2.2 W @ 180 MHz H.263 Vconf 1.7 W 2.9 W @ 166 MHz 2.7 W @ 166 MHz 1.9 W @ 111 MHz 1.7 W @ 111 MHz Table 1-2. Power Consumption of Example Applications for PNX1311(Vdd = 2.2V) Optimizations APPLICATIONS AFTER POWER OPTIMIZATIONS MPEG4 (CIF) A/V Playback 1.2 W H.263 Vconf 1.5 W WITHOUT POWER OPTIMIZATIONS Unused Peripherals Turned Off System Speed Adjustment Idle task power management 2.5 W @ 166 MHz 2.1 W @ 166 MHz 1.3 W @ 70 MHz 1.2 W @ 70 MHz 2.4 W @ 166 MHz 2.2 W @ 166 MHz 1.7 W @ 111 MHz 1.5 W @ 111 MHz As previously mentioned the Table 1-1 and Table 1-2 show that the final power consumption for a realistic application may be lower than the values reported in the next section. Based on these results and the following section, the power consumption of PNX1300 Series, using an artifi- cial scenario depicting an extremely demanding application, for commonly used speeds, is as follows: • • • PNX1300/01/02 is < 3.4 W @ 166:133 MHz PNX1311 is < 2.9 W @ 166:133 MHz PNX1302 is < 4.0 W @ 200:133 MHz PRELIMINARY SPECIFICATION 1-13 PNX1300/01/02/11 Data Book 1.9.7.2 Philips Semiconductors PNX1300/01/02 DSPCPU Core Current and Power Consumption PNX1300 143:143 Symbol Current/Notes PNX1302 192:144 PNX1302 200:133 Pwd Typ Max Pwd Typ Max Pwd Typ Max Pwd Typ Max Units 225 1125 1200 250 1200 1300 300 1380 1475 300 1400 1525 mA ICC 40 125 135 40 120 135 40 130 135 36 125 130 mA Total Power Dissipation IDD , DSPCPU Only 0.8 - 3.2 820 3.5 920 0.8 - 3.4 900 3.7 1030 0.9 - 3.9 1030 4.1 1200 0.9 - 4.0 1050 4.2 1250 W mA PNX130x IDD (note 1) PNX1301 166:133 ICC , DSPCPU Only - 55 45 - 50 45 - 55 45 - 55 45 mA Power DSPCPU Only - 2.2 2.5 - 2.4 2.7 - 2.8 3.1 - 2.8 3.3 W PNX130x IDD , Standby - 550 - - 615 - - 720 - - 740 - mA (note 1,2) Power Standby IDD , Standby + bpwd - 1.5 405 - - 1.7 450 - - 1.9 525 - - 2.0 540 - W mA Power Standby + bpwd - 1.1 - - 1.2 - - 1.4 - - 1.5 - W Notes: 1. Consumption for PNX1300/01/02 is organized in several categories. The “Typ” column shows current consumption for a typical application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application with a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data pattern at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represnts an application that heavily uses the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ” measurements reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is activated. See Chapter 21, “Power Management.” 2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL Register”), all peripherals turned off (i.e. not enabled) and all peripherals powered down (+ bpwd row). 3. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V. 4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained. As an example, the data for CPU to SDRAM ratio 1:1 for 183:183 MHz can be calculated by using the data from the 143:143 MHz column, and scaling the currents by a factor of 1.279. 1.9.7.3 PNX1311 DSPCPU Core Current and Power Consumption Details PNX1311 100:100 Symbol Current/Notes PNX1311 166:166 PNX1311 166:133 Pwd Typ Max Pwd Typ Max Pwd Typ Max Pwd Typ Max Units 129 670 720 185 955 1025 215 1110 1200 200 1032 1100 mA ICC 28 87 100 40 125 140 46 145 170 37 123 130 mA Total Power Dissipation IDD , DSPCPU Only 0.4 - 1.8 490 1.9 550 0.5 - 2.5 700 2.7 785 0.6 - 2.9 815 3.2 915 0.6 - 2.7 756 2.9 880 W mA PNX131x IDD (note 1) PNX1311 143:143 ICC , DSPCPU Only - 38 31 - 55 45 - 65 55 - 50 45 mA - 1.2 325 1.3 - - 1.7 460 1.9 - - 2.0 535 2.2 - - 1.8 518 2.1 - W mA IDD , Standby + bpwd - 0.8 240 - - 1.1 340 - - 1.3 395 - - 1.3 375 - W mA Power Standby + bpwd - 0.6 - - 0.9 - - 1.0 - - 0.9 - W Power DSPCPU Only PNX131x IDD , Standby (note 1,2) Power Standby Notes: 1. Consumption for PNX1311 is organized in several categories. The “Typ” column shows current consumption for a typical application with a CPI (Clocks Per Instruction) of 1.4. The “Max” column provides current consumption for an application with a CPI of 1.1. The measurements were taken with all the peripheral units turned on (peripherals run on a random data pattern at the specified frequencies, except for VO which runs at 27 MHz). This “Max” data represnts an application that heavily uses the DSPCPU and does not reflect a realistic application; it is used to determine peak currents. The “Typ” measurements reflect real applications. The “Pwd” column shows current consumption when Global Powerdown mode is activated. See Chapter 21, “Power Management.” 2. Standby rows indicate current consumption when DSPCPU is maintained under RESET (See Section 11.6.5, “BIU_CTL Register”), all peripherals turned off (i.e. not enabled) and all peripherals powered down (+ bpwd row). 3. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V. 4. Currents do not scale with frequency unless the CPU to SDRAM ratio is maintained. 1-14 PRELIMINARY SPECIFICATION Philips Semiconductors 1.9.7.4 Pin List PNX1300/01/02 Current Consumption For On-Chip Peripherals PNX1300 143:143 Symbol PNX1301 166:133 PNX1302 192:144 PNX1302 200:133 Current/Notes Pwd Typ Max Pwd Typ Max Pwd Typ Max Pwd Typ Max Units VO 27 MHz IDD , running raw mode 50 28 39 55 29 38 65 16 26 72 27 36 mA ICC , running raw mode - 9 17 - 12 17 - 12 17 - 12 17 mA VO 81 MHz IDD , running raw mode - 23 75 - 33 54 - 30 58 - 47 72 mA ICC , running raw mode - 33 51 - 37 51 - 36 52 - 36 52 mA VI 27 MHz IDD , running raw mode 6 8 18 6 6 18 7 8 18 7 6 18 mA ICC , running raw mode - 7 14 - 6 14 - 8 15 - 9 15 mA AO 44 KHz IDD , stereo 16-bit 2 3 1 1 3 1 1 3 4 5 3 3 mA ICC , stereo 16-bit - 2 1 - 1 1 - 1 1 - 1 1 mA AI 44 KHz IDD , stereo 16-bit 1 2 2 1 3 3 1 3 2 1 3 3 mA ICC , stereo 16-bit - 1 1 - 1 1 - 1 1 - 1 1 mA SPDIF 48 KHz IDD running PCM audio 2 3 2 2 3 1 3 3 3 4 2 2 mA ICC running PCM audio - 3 3 - 2 2 - 2 2 - 2 2 mA ICP IDD , mem. block move 61 95 176 67 95 170 80 105 188 86 106 184 mA ICC , mem. block move - 28 28 - 27 54 - 30 61 - 29 59 mA PCI 33 MHz IDD , DMA transfer - 37 83 - 34 80 - 32 83 - 40 53 mA ICC , DMA transfer - 58 102 - 58 102 - 58 104 - 58 82 mA VLD IDD 3 - - 5 - - 6 - - 6 - - mA ICC - - - - - - - - - - - - mA IDD 4 - - 5 - - 6 - - 6 - - mA ICC - - - - - - - - - - - - mA IDD 18 - - 21 - - 24 - - 24 - - mA ICC - - - - - - - - - - - - mA SSI 10 MHz DVDD Notes: 1. Pwd. column for peripheral units indicates current savings when block powerdown is activated compared to when it is idle. See Chapter 21, “Power Management” for block powerdown activation. 2. Typ. column for peripheral units indicates current required when data pattern is random. The Max. column indicates current ratings when data is switching from high to low level each cycle. Again that Max. column is to show peak current and does not represent a real application. For both columns the current reported is the current required by the peripheral as well as the internal bus and MMI to transfer the data to/from the peripheral unit. 3. Some currents are not reported due to the difficulty to measure it or because they are not relevant. For example SSI current is difficult to measure because it heavily involves the DSPCPU and thus makes it almost impossible to separate the current consumed by the SSI or the DSPCPU. 4. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.5V and Vcc set to 3.3V. 5. Currents do not scale with frequency if the CPU:SDRAM ratio are different. Same ratio must be used. PRELIMINARY SPECIFICATION 1-15 PNX1300/01/02/11 Data Book 1.9.7.5 Philips Semiconductors PNX1311 Current Consumption For On-Chip Peripherals Symbol PNX1311-100:100 PNX1311-143:143 PNX1311-166:166 PNX1311-166:133 Current/Notes Pwd Typ Max Pwd Typ Max Pwd Typ Max Pwd Typ Max Units VO 27 MHz IDDL , running raw mode 33 17 23 47 25 33 56 29 38 48 24 31 mA ICC , running raw mode - 8 12 - 12 17 - 14 20 - 25 17 mA VO 81 MHz IDDL , running raw mode - 14 31 - 20 44 - 23 51 - 33 54 mA ICC , running raw mode - 25 36 - 36 52 - 42 60 - 37 51 mA IDDL , running raw mode 3 5 8 5 7 11 6 8 13 5 7 15 mA VI 27 MHz AO 44 KHz AI 44 KHz SPDIF 48 KHz ICP PCI 33 MHz VLD SSI 10 MHz DVDD ICC , running raw mode - 6 10 - 9 15 - 10 17 - 8 15 mA IDDL , stereo 16-bit 4 2 1 6 3 2 7 3 2 1 2 2 mA ICC , stereo 16-bit - 1 1 - 1 1 - 1 1 - 1 1 mA IDDL , stereo 16-bit 1 1 1 1 2 2 1 2 2 1 2 3 mA ICC , stereo 16-bit - 1 1 - 1 1 - 1 1 - 1 1 mA IDDL running PCM audio 2 2 1 3 3 2 3 3 2 2 2 2 mA ICC running PCM audio - 1 1 - 2 2 - 2 2 - 2 2 mA IDDL , mem. block move 40 55 101 57 79 144 66 92 167 60 76 136 mA ICC , mem. block move - 19 38 - 27 55 - 31 64 - 26 54 mA IDDL , DMA transfer - 17 36 - 25 51 - 29 59 - 20 50 mA ICC , DMA transfer - 41 57 - 58 82 - 67 95 - 45 81 mA IDDL 3 - - 4 - - 5 - - 4 - - mA ICC - - - - - - - - - - - - mA IDDL 2 - - 3 - - 3 - - 4 - - mA ICC - - - - - - - - - - - - mA IDDL 11 - - 16 - - 19 - - 18 - - mA ICC - - - - - - - - - - - - mA Notes: 1. The “Pwd” column for peripheral units indicates current savings when block powerdown is activated, compared to when it is idle. See Chapter 21, “Power Management” for block powerdown activation. 2. The “Typ” column for peripheral units indicates current required when data pattern is random. The “Max” column indicates current ratings when data is switching from high to low level each cycle. Again that “Max” column is to show peak current and does not represent a real application. For both columns the current reported is the current required by the peripheral as well as the internal bus and MMI to transfer the data to/from the peripheral unit. 3. Some currents are not reported due to the difficulty to measure it or because they are not relevant. For example SSI current is difficult to measure because it heavily involves the DSPCPU and thus makes it almost impossible to separate the current consumed by the SSI or the DSPCPU. 4. Measurements accuracy is +/- 5%. Measurements are done with Vdd set to 2.2V and Vcc set to 3.3V. 5. Currents do not scale with frequency if the CPU:SDRAM ratio are different. Same ratio must be used. 1-16 PRELIMINARY SPECIFICATION Philips Semiconductors 1.9.7.6 Pin List STRG3, STRG5 type I/O circuit PNX1300/01/02/11 Symbol Parameter Condition/Notes Min. Nominal Output HIGH voltage I OUT = V OL Output LOW voltage I OUT = -16.0 mA Z Output AC impedance HIGH level output state 11 11 OH Max Units 0.9VCC V OH 16.0 mA V 0.1VCC V ohm Output AC impedance LOW level output state tr Output rise time Test load of Figure 1-1. 2.0 ns tr Output fall time Test load of Figure 1-1. 2.0 ns Z OL 1.9.7.7 ohm NORM3 type I/O circuit PNX1300/01/02/11 Symbol V OH V OL Z OH Parameter Condition/Notes Min. Nominal Max. Units 0.9VCC Output HIGH voltage I OUT = Output LOW voltage I OUT = -8.0 mA Output AC impedance HIGH level output state 23 23 8.0 mA V 0.1VCC V ohm Output AC impedance LOW level output state tr Output rise time Test load of Figure 1-2. 4.0 ns tr Output fall time Test load of Figure 1-2. 4.0 ns Z OL 1.9.7.8 ohm WEAK5 type I/O circuit PNX1300/01/02/11 Symbol Parameter Condition/Notes Min. Nominal Max. Units 0.9VCC V OH Output HIGH voltage I OUT = 6.0 mA V OL Output LOW voltage I OUT = -6.0 mA Z Output AC impedance HIGH level output state 33 ohm Output AC impedance LOW level output state 33 ohm Z OH OL V 0.1VCC V tr Output rise time Test load of Figure 1-3. 4.0 ns tr Output fall time Test load of Figure 1-3. 4.0 ns 1.9.7.9 IICOD (I2c) type I/O circuit Symbol V V V IL-IIC IH-IIC HYS Parameter Condition/Notes Input LOW voltage Input HIGH voltage VX is 3.3V or 5V depending on VREF_PERIPH value Input Schmitt trigger hysteresis Min. Nominal Max. 1.0 V 2.3 VX+0.5 V 0.25 V OL Output LOW voltage I OUT tf Output fall time 10 - 400 pF load Units -0.5 = -6.0 mA 1.5 PRELIMINARY SPECIFICATION V 0.6 V 250 ns 1-17 PNX1300/01/02/11 Data Book 1.9.7.10 Philips Semiconductors SDRAM interface timing for PNX1300/01/02/11 speed grades. Symbol Parameter N o t e Max Min Max Min Max Units s PNX1300 PNX1301 PNX1301 143 166 180 Min Max Min Max Min PNX1311 166 PNX1302 200 f SDRAM MM_CLK frequency 143 166 166 166 183 MHz 1 TCS Skew between MM_CLK0, CLK1 0.05 0.05 0.05 0.05 0.05 ns 2 TPD Propagation delay of data, address, control TOH Output hold time of data, address and control TSU TIH ns 3 1.5 4.7 1.5 4.2 1.5 4.2 1.5 4.2 1.5 3.7 ns 3 Input data setup time 0 0 0 0 0 ns 4 Input data hold time 2.0 1.5 1.5 1.5 1.5 ns 4 Notes: 1. For best high speed SDRAM operation, 50-ohm matched PCB traces are recommended for all MM_xxx signals. Use 27-33 ohm series terminator resistors close to PNX1300/01/02/11 in the MM_CLK0 and MM_CLK1 line only. 2. Equal load circuit. MM_CLK0 and MM_CLK1 are matched output buffers. 3. The center of the two rising edges on MM_CLK0, MM_CLK1 are used as the clock reference point. Propagation delay guarantee is defined from 50% point of clock edge to 50% level on D/A/C. Output hold time guarantee is defined from 50% point of clock edge to 50% level on D/A/C. 4. MM_CLK0 is used as a reference clock. Input setup time requirement is defined as data value 50% complete to 50% level on clock. Input hold time requirement is defined as minimum time from 50% level on clock to 50% change on data. 1.9.7.11 PCI Bus timing The following specifications meet the PCI Specifications, Rev. 2.1 for 33-MHz bus operation. Min. Max Units Notes Tval-PCI (Bus) Symbol Clk to signal valid delay, bused signals Parameter 2 11 ns 1,2,3 Tval-PCI (ptp) Clk to signal valid delay, point-to-point signals 2 12 ns 1,2,3 Ton-PCI Float to active delay 2 TOff-PCI Active to float delay Tsu-PCI Input setup time to CLK - bused signals Tsu-PCI (ptp) Input setup time to CLK - point-to-point signals Th-PCI Input hold time from CLK Trst-PCI Reset active time after power stable Trst-clk-PCI Reset active time after CLK stable Trst-off-PCI Reset active to output float delay ns 1 ns 1,7 7 ns 3,4 12 ns 3,4 ns 4 28 0.2 1 1 ms 5 100 µs 5 ns 5,6,7 40 1. PCI Clock skew between two PCI devices must be lower than 1.8ns instead of the 2 ns as specified in PCI 2.1 specification Notes: 1. See the timing measurement conditions in Figure 1-4. 2. Minimum times are measured at the package pin with the load circuit shown in Figure 1-8. Maximum times are measured with the load circuit shown in Figure 1-6 and Figure 1-7. 3. REG# and GNT# are point-to-point signals and have different input setup times. All other signals are bused. 4. See the timing measurement conditions in Figure 1-5. 5. RST# is asserted and de-asserted asynchronously with respect to CLK. 6. All output drivers are floated when RST# is active. 7. For the purpose of Active/Float timing measurements, the Hi-Z or ‘off’ state is defined to be when the total current delivered through the component pin is less than or equal to the leakage current specification. 1-18 PRELIMINARY SPECIFICATION Philips Semiconductors 1.9.7.12 Pin List JTAG I/O timing Symbol Parameter Min. Max Units 20 MHz Notes f JTAG-CLK JTAG clock frequency Tclk-TDO JTAG_TCK to JTAG_TDO valid delay 2 ns 1 Tsu-TCK Input setup time to JTAG_TCK 3 ns 2 Th-TCK Input hold time from JTAG_TCK 7 ns 2 Max Units 400 kHz 1 10 Notes: 1. See the timing measurement conditions in Figure 1-10. 2. See the timing measurement conditions in Figure 1-9. I2C I/O timing 1.9.7.13 Symbol Parameter Min. Notes f SCL SCL clock frequency TBUF Bus free time 1 µs 2 Tsu-STA Start condition set up time 1 µs 3 Th-STA Start condition hold time 1 µs 3 TLOW SCL LOW time 1 µs 1 THIGH SCL HIGH time 1 µs 1 Tf SCL and SDA fall time (Cb = 10-400 pF, from VIH-IIC to VIL-IIC) ns 1 Tsu-SDA Data setup time 100 ns 4 Th-SDA Data hold time 0 ns 4 Tdv-SDA SCL LOW to data out valid Tdv-STO SCL HIGH to data out Notes: 1. 2. 3. 4. 5. See See See See See 1.9.7.14 the timing measurement conditions in Figure the timing measurement conditions in Figure the timing measurement conditions in Figure the timing measurement conditions in Figure the timing measurement conditions in Figure 20+0.1Cb 250 0.5 1 µs 5 ns 5 1-11. 1-12. 1-13. 1-14. 1-15. Video In I/O Timing Symbol Parameter Min. Max Units 81 MHz Notes f VI-CLK Video In clock frequency Tsu-CLK Input setup time to VI_CLK 2 ns 1 Th-CLK Input hold time from VI_CLK 2 ns 1 Max Units Notes 81 MHz Notes: 1. See the timing measurement conditions in Figure 1-16. 1.9.7.15 Video Out I/O Timing Symbol Parameter Min. f VO-CLK Video Out clock frequency TCLK-DV VO_CLK to VO_DATA (or VO_IO*) out 3 7.5 ns 1,3 TCLK-DV VO_CLK to VO_DATA (or VO_IO*) out 3 7.5 ns 1,4 Tsu-CLK VO_IO* setup time to VO_CLK 10 ns 2 Th-CLK VO_IO* hold time from VO_CLK 3 ns 2 Notes: 1. 2. 3. 4. See the timing measurement conditions in Figure 1-17. See the timing measurement conditions in Figure 1-18. CLKOUT asserted, i.e. the VO unit is the source of VO_CLK CLKOUT negated, i.e. the external world is the source of VO_CLK PRELIMINARY SPECIFICATION 1-19 PNX1300/01/02/11 Data Book 1.9.7.16 Philips Semiconductors AudioIn I/O timing Symbol Parameter Min. Max Units 22 MHz Notes f AI-SCK Audio In AI_SCK clock frequency Tsu-SCK Input setup time to AI_SCK 3 ns 1,2 Th-SCK Input hold time from AI_SCK 2 ns 1,2 TSCK-WS AI_SCK to AI_WS ns 3 10 Notes: 1. See the timing measurement conditions in Figure 1-19. 2. The timing measurements are done with respect to the clock edge according to CLOCK_EDGE 3. SER_MASTER asserted, i.e. Audio In is the source of AI_WS. See the timing measurement condition in Figure 1-20. 1.9.7.17 Audio Out I/O timing Symbol Parameter f AO-SCK Audio Out AO_SCK clock frequency TSCK-DV AO_SCK to AO_SDx valid TSCK-DV AO_SCK to AO_SDx valid Tsu-SCK Input setup time to AO_SCK Th-SCK Input hold time from AO_SCK TSCK-WS AO_SCK to AO_WS Notes: 1. 2. 3. 4. 5. 6. Min. Max Units 22 MHz 2 12 ns 1,3,4 2 12 ns 1,3,5 4 ns 2,3,5 2 ns 2,3,5 ns 3,4,6 10 See the timing measurement conditions in Figure 1-21. See the timing measurement conditions in Figure 1-23. The timing measurements are done with respect to the AO_SCK clock edge according to CLOCK_EDGE PNX1300/01/02/11 is the serial interface master, i.e. AO_SCK, AO_WS are outputs PNX1300/01/02/11 is serial interface slave, i.e. AO_SCK, AO_WS are inputs See the timing measurement conditions in Figure 1-22. 1.9.7.18 Symbol SSI I/O timing Parameter Min. Max Units Notes 20 MHz 1 12 ns 2 3 ns 3 2 ns 3 f SSI-CLK SSI_CLK clock frequency TCLK-DV SSI_CLK to data valid 2 Tsu-CLK Input setup time to SSI_CLK Th-CLK Input hold time from SSI_CLK Notes: 1. Interrupt latency limits SSI to a practical use at a bit rate of 1.5 Mbit/sec. 2. See the timing measurement conditions in Figure 1-24. 3. See the timing measurement conditions in Figure 1-25. 1-20 Notes PRELIMINARY SPECIFICATION Philips Semiconductors PNX1300 pin Pin List rise/fall test point 2” true length CLK V_th V_tl V_test 30-ohm Output T_su T_h 50-ohm Buffer 12 pF Input Figure 1-1. STRG3, STRG5 test load circuit PNX1300 pin V_th V_test V_tl 1/2 in. max Output Buffer 10 pF 25 Ω 30 pF Figure 1-6. PCI T val(max) Rising Edge Figure 1-2. NORM3 test load circuit PNX1300 pin V_max pin 50-ohm Buffer V_test Figure 1-5. PCI Input Timing Measurement Conditions rise/fall test point 2” true length Output inputs valid pin rise/fall test point 2” true length 1/2 in. max Output Output Buffer 50-ohm Buffer Vcc 10 pF 15 pF 25 Ω Figure 1-7. PCI T val(max) Falling Edge Figure 1-3. WEAK5 test load circuit pin 1/2 in. max CLK Output Buffer V_th V_tl V_test 10 pF 1K Ω T_fval Output Delay Vcc 1K Ω V_tfall Figure 1-8. PCI T val(min) and Slew Rate T_rval Output Delay V_trise Tri-State Output TCK Tsu_TCK T_on T_off TDI, TMS Figure 1-4. PCI Output Timing Measurement Conditions Th_TCK valid Figure 1-9. JTAG Input Timing PRELIMINARY SPECIFICATION 1-21 PNX1300/01/02/11 Data Book Philips Semiconductors SCL TCK Tclk_TDO Tdv_SDA Figure 1-15. I2C I/O Timing Figure 1-10. JTAG Output Timing THIGH valid SDA valid TDO Tdv_STO TLOW VI_CLK SCL Tf Tsu_CLK Tr Th_CLK valid VI_DATA, VI_IO Figure 1-16. VideoI n I/O Timing Figure 1-11. I2C I/O Timing SCL VO_CLK TTBUF TCLK_DV SDA VO_DATA Figure 1-12. I2C I/O Timing Figure 1-17. Video Out I/O Timing SCL VO_CLK Tsu_STA Tsu_CLK Th_STA SDA Th_CLK valid VO_IO Figure 1-13. I2C I/O Timing Figure 1-18. Video Out I/O Timing AI_SCK SCL Tsu_SDA Tsu_SCK Th_SDA valid SDA Figure 1-14. I2C I/O Timing 1-22 valid PRELIMINARY SPECIFICATION AI_SD, AI_WS Th_SCK valid Figure 1-19. Audio In I/O Timing Philips Semiconductors Pin List AI_SCK AO_SCK TSCK_WS AI_WS Tsu_SCK Th_SCK valid valid AO_WS Figure 1-20. Audio In I/O Timing Figure 1-23. Audio Out I/O Timing AO_SCK SSI_CLK TSCK_DV AO_SDx valid Figure 1-21. Audio Out I/O Timing TCLK_DV SSI I/O Figure 1-24. SSI I/O Timing AO_SCK SSI_CLK TSCK_WS AO_WS valid Tsu_CLK SSI_IO Figure 1-22. Audio Out I/O Timing Th_CLK valid valid Figure 1-25. SSI I/O Timing PRELIMINARY SPECIFICATION 1-23 PNX1300/01/02/11 Data Book 1-24 PRELIMINARY SPECIFICATION Philips Semiconductors Overview Chapter 2 by Gert Slavenburg 2.1 INTRODUCTION In this document, the generic PNX1300 name refers to the PNX1300 Series, or the PNX1300/01/02/11 products. PNX1300 is a successor to the TM-1300, TM-1100 and TM-1000 media processors. For those familiar with the TM-1300, the new features specific to the PNX1300 are summarized in Section 2.6. For those familiar with the TM-1100, the new features specific to the PNX1300 are summarized in Section 2.7. For those familiar with the TM-1000, new features for the PNX1300 are summarized in Section 2.8. 2.2 PNX1300 FUNDAMENTALS PNX1300 is a media processor for high-performance multimedia applications that deal with high-quality video and audio. These applications can range from low-cost, dedicated systems such as video phones, video editing, digital television, security systems or set-top boxes to reprogrammable, multipurpose plug-in cards for personal computers. PNX1300 easily implements popular multimedia standards such as MPEG-1 and MPEG-2, but its orientation around a powerful general-purpose CPU (called the DSPCPU) makes it capable of implementing a variety of multimedia algorithms, both open and proprietary. PNX1300 is also easily configured in multiple processor configurations for very high-end applications. More than just an integrated microprocessor with unusual peripherals, the PNX1300 is a fluid computer system controlled by a small real-time OS kernel running on a very-long instruction word (VLIW) processor core. PNX1300 contains a DSPCPU, a high-bandwidth internal bus, and internal bus-mastering DMA peripherals. Software compatibility between current and future Trimedia processor family members is at the source-code and library API level; binary compatibility between family members is not guaranteed. Defining software compatibility at the source-code level gives Philips the freedom to strike the optimum balance between cost and performance for all chips in the family. A powerful compiler and software development environment ensure that programmers never need to resort to non-portable assembler programming. Programmers use the library APIs and multimedia operations from C and C++ source code. PNX1300 is designed both for use as an accelerator in a PC environment or as the sole CPU in cost-effective standalone systems. In standalone system applications, the PNX1300 external bus allows for glueless connection of 8-bit wide ROM, EEPROM, or Flash memory for code storage. The external bus also allows intermixing of PCI2.1 master/slave peripherals and 8-bit simple peripherals, such as UARTs and other 8-bit microprocessor peripherals. This powerful external bus architecture gives system designers a variety of options to configure lowcost, high-performance system solutions. Because it is based on a general-purpose CPU, PNX1300 can also serve as a multifunctional PC enhancement vehicle. Typically, a PC must deal with multi standard video and audio streams; and applications require both decompression and compression. While the CPU chips used in PCs are becoming capable of lowresolution, real-time video decompression, high-quality decompression—not to mention compression—of studio-resolution video is still out of reach. Further, users expect their systems to handle live video and audio without sacrificing system responsiveness. PNX1300 enhances a PC system by providing real-time multimedia with the advantages of a special-purpose, embedded solution—low cost and chip count—and the advantages of a general-purpose processor—reprogrammability. For PC applications, PNX1300 far surpasses the capabilities of fixed-function multimedia chips. Future media processor family members will have different sets of interfaces appropriate for their intended use. 2.3 PNX1300 CHIP OVERVIEW Key features of PNX1300 include: • • • A very powerful, general-purpose VLIW processor core (the DSPCPU) that coordinates all on-chip activities. In addition to implementing the non-trivial parts of multimedia algorithms, the DSPCPU runs a small real-time operating system driven by interrupts from the other units. Independent DMA-driven multimedia I/O units that properly format data to make software media processing efficient. DMA-driven multimedia coprocessors that operate independently and in parallel with the DSPCPU to perform operations specific to important multimedia algorithms. PRELIMINARY SPECIFICATION 2-1 PNX1300/01/02/11 Data Book • • Philips Semiconductors A high-performance bus and memory system that provide communication between PNX1300’s processing units. A flexible external bus interface. 2Mx32 SDRAM Figure 2-1 shows a PNX1300 block diagram. The bulk of a PNX1300 system consists of the PNX1300 microprocessor itself, external synchronous DRAM (SDRAM), and the external circuitry needed to interface to incoming and/or outgoing video and audio data streams and communication lines. PNX1300’s external peripheral bus can gluelessly interface to PC! 2.1 components and/or 8-bit microprocessor peripherals. Figure 2-2 shows a possible minimally configured PNX1300 system. A video input stream might come directly from a CCIR 656-compliant video camera chip in YUV 4:2:2 format through a glueless interface in this case. An analog camera can be connected via a CCIR 656 interface chip (such as the Philips SAA7113H). PNX1300 outputs a CCIR656 video stream to drive a dedicated video monitor. Stereo audio input and up to 8channel audio output require only low-cost external ADC and DAC. The operation of the video and audio interface units is highly customizable through programmable parameters. CCIR656 digital video stereo audio in The glueless PCI interface allows the PNX1300 to display video in a host PC’s video card. The Image Coprocessor (ICP) provides display support for live video input an arbitrary number of arbitrarily overlapped windows. 32-bit data up to 572 MB/sec Huffman decoder Slice-at-a-time MPEG-1 & 2 VLD Coprocessor Stereo digital audio 8 and 16-bit data I2S DC, up to 22 MHz AI_SCK Audio In Video Out 2/4/6/8 ch. digital audio 16 and 32-bit data I2S DC, up to 22 MHz AO_SCK Audio Out Timers IEC958 up to 40 Mbit/sec SPDIF Out Synchronous Serial Interface I2C Interface DVDD VLIW CPU Image Coprocessor PCI-XIO Interface Figure 2-1. PNX1300 block diagram. PRELIMINARY SPECIFICATION modem front end Figure 2-2. PNX1300 system connections. A minimal PNX1300 requires few supporting components. Video In 32K I$ 16K D$ 2 - 8 ch audio out PC I a n d 8 -b i t p e rip he ra l b us CCIR656 dig. video YUV 4:2:2 up to 81 MHz (40 Mpix/sec) I2C bus to camera, etc. DAC ROM Main Memory Interface PNX1300 PNX1300 JTAG SDRAM 2-2 ADC CCIR656 dig. video CCIR656 digital video YUV 4:2:2 up to 81 MHz (40 Mpix/sec) Analog modem or ISDN front end Down & up scaling YUV → RGB 50 Mpix/sec External bus - PC!2.1 (32 bits, 33-MHz) + glueless 24A/8D slaves Philips Semiconductors Finally, the Synchronous Serial Interface (SSI) requires only an external ISDN or analog modem front-end chip and phone line interface to provide remote communication support. It can be used to connect PNX1300-based systems for video phone or videoconferencing applications, or it can be used for general-purpose data communication in PC systems. The PNX1300 JTAG port allows a debugger on a host system to access and control the state of a PNX1300 in a target system. It also implements 1149.1 boundary scan functionality. 2.4 BRIEF EXAMPLES OF OPERATION The key to understanding PNX1300 operation is observing that the DSPCPU and peripherals are time-shared and that communication between units is through SDRAM memory. The DSPCPU switches from one task to the next; first it decompresses a video frame, then it decompresses a slice of the audio stream, then back to video, etc. As necessary, the DSPCPU issues commands to the peripheral function units to orchestrate their operation. The DSPCPU can enlist the ICP and other coprocessors to help with some of the straightforward, tedious tasks associated with video processing. The ICP is very well suited for arbitrary size horizontal and vertical video resizing and color space conversion. The DSPCPU can enlist the input/output peripherals to autonomously receive or transmit digital video and audio data with minimal CPU supervision. The I/O units have been designed to interface to the outside world through industry standard audio and video interfaces, while delivering or taking data in memory in formats suitable for software processing. 2.4.1 Video Decompression in a PC An example PNX1300 implementation is as a video-decompression engine on a PCI card in a PC. In this case, the PC does not need to know the PNX1300 has a powerful, general-purpose CPU; rather, the PC just treats the hardware on the PCI card as a ‘black-box’ engine. Video decompression begins when the PC operating system hands the PNX1300 a pointer to compressed video data in the PC’s memory (the details of the communication protocol are handled by the software driver installed in the PC’s operating system). The DSPCPU fetches data from the compressed video stream via the PCI bus, decompresses frames from the video stream, and places them into local SDRAM. Decompression may be aided by the VLD (variable-length decoder) coprocessor unit, which implements Huffman decoding and is controlled by the DSPCPU. When a frame is ready for display, the DSPCPU gives the ICP a display command. The ICP then autonomously fetches the decompressed frame data from SDRAM and transfers it over the PCI bus to the frame buffer in the Overview PC’s video display card. Alternately, video can be sent to the graphics card using the VO unit. 2.4.2 Video Compression Another typical application for PNX1300 is in video compression. In this case, uncompressed video is usually supplied directly to the PNX1300 system via the Video In (VI) unit. A camera chip connected directly to the VI unit supplies YUV data in 8-bit, 4:2:2 format. The VI unit samples the data from the camera chip and demultiplexes the raw video to SDRAM in three separate areas, one each for Y, U, and V. When a complete video frame has been read from the camera chip by the VI unit, it interrupts the DSPCPU. The DSPCPU compresses the video data in software (using a set of powerful data-parallel multimedia operations) and writes the compressed data to a separate area of SDRAM. The compressed video data can now be transmitted or stored in any of several ways. It can be sent to a host system over the PCI bus for archival on local mass storage, or the host can transfer the compressed video over a network. The data can also be sent to a remote system using the modem/ISDN interface to create, for example, a video phone or videoconferencing system. Since the powerful, general-purpose DSPCPU is available, the compressed data can be encrypted before being transferred for security. 2.5 INTRODUCTION TO PNX1300 BLOCKS The remainder of this chapter provides a brief introduction to the internal components of PNX1300. 2.5.1 Internal ‘Data Highway’ Bus The internal bus (or data highway) connects all internal blocks together and provides access to internal control/ status registers of each block, external SDRAM, and the external bus peripheral chips. The internal bus consists of separate 32-bit data and address buses. Transactions on the bus use a block-transfer protocol. On-chip peripheral units and coprocessors can be masters or slaves on the bus. Access to the internal bus is controlled by a central arbiter, which has a request line from each potential bus master. The arbiter is programmable so that the arbitration algorithm can be tailored for different applications. Peripheral units make requests to the arbiter for bus access and, depending on the arbitration mode, bus bandwidth is allocated to the units in different amounts. Each mode allocates bandwidth differently, but each mode guarantees each unit a minimum bandwidth and maximum service latency. All unused bandwidth is allocated to the DSPCPU. The bus allocation mechanism is one of the features of PNX1300 that makes it a true real-time system instead of just a highly integrated microprocessor with unusual peripherals. PRELIMINARY SPECIFICATION 2-3 PNX1300/01/02/11 Data Book 2.5.2 VLIW Processor Core The heart of PNX1300 is a powerful 32-bit DSPCPU core. The DSPCPU implements a 32-bit linear address space and 128, fully general-purpose 32-bit registers. The registers are not separated into banks; any operation can use any register for any operand. The PNX1300 core uses a VLIW instruction-set architecture and is fully general-purpose. The VLIW instruction length allows five simultaneous operations to be issued every clock cycle. These operations can target any five of the 27 functional units in the DSPCPU, including integer and floating-point arithmetic units and data-parallel multimedia operation units. Although the processor core runs a real-time operating system to coordinate all activities in the PNX1300 system, the core is not intended for true general-purpose computer use. For example, the PNX1300 processor core does not implement demand-paged virtual memory, memory address translation, or 64-bit floating point - all essential features in a general-purpose computer system. PNX1300 uses a VLIW architecture to maximize processor throughput at the lowest possible cost. VLIW architectures have performance exceeding that of superscalar general-purpose CPUs without the cost and complexity of a superscalar CPU implementation. The hardware saved by eliminating superscalar logic reduces cost and allows the integration of multimedia-specific features that enhance the power of the processor core. The PNX1300 operation set includes all traditional microprocessor operations. In addition, multimedia operations are included that dramatically accelerate standard video and audio compression and decompression algorithms. As just one of the five operations issued in a single PNX1300 instruction, a single ‘custom’ or ‘media’ operation can implement up to 11 traditional microprocessor operations. These multimedia operations combined with the VLIW architecture result in tremendous throughput for multimedia applications. The DSPCPU core is supported by separate 16-KB data and 32-KB instruction caches. The data cache is dualported to allow two simultaneous accesses; both caches are 8-way set-associative with a 64-byte block size. 2.5.3 Video In Unit The Video In (VI) unit interfaces directly to any CCIR 601/ 656-compliant device that outputs 8-bit parallel, 4:2:2 YUV time-multiplexed data. Such devices include direct digital camera systems, which can connect gluelessly to PNX1300 or through the standard CCIR 656 connector with only the addition of ECL level converters. A single chip external device can be used to convert to/from serial D1 professional video. Non-CCIR-compliant devices can use a digital video decoder chip, such as the Philips SAA7113H, to interface to PNX1300. The VI unit demultiplexes the captured YUV data before writing it into local PNX1300 SDRAM. Separate planar data structures are maintained for Y, U, and V. 2-4 PRELIMINARY SPECIFICATION Philips Semiconductors The VI unit can be programmed to perform on-the-fly horizontal resolution subsampling by a factor of two if needed. Many camera systems capture a 640-pixel/line or 720-pixel/line image. With subsampling, direct conversion to a 320-pixel/line or a 360-pixel/line image can be performed with no DSPCPU intervention. Performing this function during video input reduces initial storage and bus bandwidth requirements for applications requiring reduced resolution. 2.5.4 Enhanced Video Out Unit The Enhanced Video Out (EVO) unit essentially performs the inverse function of the VI unit. EVO generates an 8-bit, CCIR656 digital video data stream that contains a composited video and graphics overlay image. The video image is taken from separate Y, U, and V planar data structures in SDRAM. The graphics overlay is taken from a pixel-packed YUV data structure in SDRAM. Compositing allows both alpha-blending and chroma keying. The EVO unit can also upscale the video image horizontally by a factor of two to convert from CIF/SIF to CCIR 601 resolution. The overlay image, if enabled, is always in full-pixel resolution. The EVO unit is capable of pixel emission rates up to 40 Mpix/sec and allows full programming of a horizontal and vertical frame/field structure. It is thus capable of refreshing both interlaced and non-interlaced (‘two fh’) video displays with 4:3 or 16:9 or other aspect ratios. The sample rate for EVO unit pixels is independently and dynamically programmable. The high-quality, on-chip sample clock generator circuit allows the programmer subtle control over the sampling frequency so that audio and video synchronization can be achieved in any system configuration. When changing the sample frequency, the instantaneous phase does not change, which allows sample frequency manipulation without introducing audio or video distortion. 2.5.5 Image Coprocessor The ICP off-loads common image scaling or filtering tasks from the DSPCPU. Although these tasks can be easily performed by the DSPCPU, they are a poor use of the relatively expensive CPU resource. When performed in parallel by the ICP, these tasks are performed efficiently by simple hardware, which allows the DSPCPU to continue with more complex tasks. The ICP can operate as either a memory-to-memory or a memory-to-PCI coprocessor device. In memory-to-memory mode, the ICP can perform either horizontal or vertical image filtering and resizing. A high quality algorithm is used (5-tap polyphase filter in each direction). Filtering or scaling is done in either the horizontal or vertical direction in one pass. Two invocations of the ICP are required to filter or resize in both directions. In memory-to-PCI mode, the ICP can perform horizontal resizing followed by color-space conversion. For example, assume an n × m pixel array is to be displayed in a Philips Semiconductors Overview PC Screen In SDRAM Image 2 Y FrameMaker 5 File Edit Format View Image 1 U IMAGE 1 Y V U 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 Calendar File Edit 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 1 1 1 1 1 0 0 00 0 0 1 1 1 1 1 1 1 1 1 1 0 0 00 0 0 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 V 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Image 1 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 111 111 111 111 111 111 111 011 011 011 111 111 111 111 111 111 111 111 111 111 11 1 11 1 11 1 11 1 11 1 11 1 11 1 11 1 11 1 11 1 Image 2 ICP Figure 2-3. ICP - Windows on the PC screen and data structures in SDRAM for two live video windows. window on the PC video screen while the PC is running a graphical user interface. The first step (if necessary) would use the ICP in memory-to-memory mode to perform a vertical resizing. The second step would use the ICP in memory-to-PCI mode to perform horizontal resizing and optional colorspace conversion from YUV to RGB. While sending the final, resampled and converted pixels over the PCI bus to the video frame buffer, the ICP uses a full, per-pixel occlusion bit mask—accessed in destination coordinates—to determine which pixels are actually written to the graphics card frame buffer for display. Conditioning the transfer with the bit mask allows PNX1300 to accommodate an arbitrary arrangement of overlapping windows on the PC video screen. Figure 2-3 illustrates a possible display situation and the data structures in SDRAM that support ICP operation. On the left, the PC video screen has four overlapping windows. Two, Image 1 and Image 2, are being used to display video generated by PNX1300. The right side shows a conceptual view of SDRAM contents. Two data structures are present, one for Image 1 and the other for Image 2. Figure 2-3 represents a point in time during which the ICP is displaying Image 2. When the ICP is displaying an image (i.e., copying it from SDRAM to a frame buffer), it maintains four pointers to the SDRAM data structures. Three pointers locate the Y, U, and V data arrays, the fourth locates the per-pixel occlusion bit map. The Y, U, and V arrays are indexed by source coordinates while the occlusion bit map is accessed with screen coordinates. As the ICP generates pixels for display, it performs horizontal scaling and colorspace conversion. The final RGB pixel value is then copied to the destination address in the screen’s frame buffer only if the corresponding bit in the occlusion bit map is a ‘1’. As shown in the conceptual diagram, the occlusion bit map has a pattern of 1s and 0s corresponding to the shape of the visible area of the destination window in the frame buffer. When the arrangement of windows on the PC screen changes, modifications to the occlusion bit map is performed by PNX1300 or host resident software. It is important to note that there is no preset limit on the number and sizes of windows that can be handled by the ICP. The only limit is the available bandwidth. Thus, the ICP can handle a few large windows or many small windows. The ICP can sustain a transfer rate of 50 megapixels per second, which is more than enough to saturate PCI when transferring images to video frame buffers. 2.5.6 Variable-Length Decoder (VLD) The variable-length decoder (VLD) relieves the DSPCPU of decoding Huffman-encoded video data streams. It can be used to help decode high bitrate MPEG-1 and MPEG2 video streams. The lower bitrate of videoconferencing can be adequately handled by DSPCPU software without coprocessor. The VLD is a memory-to-memory coprocessor. The DSPCPU hands the VLD a pointer to a Huffman-encoded bit stream, and the VLD produces a tokenized bit stream that is very convenient for the PNX1300 image decompression software to use. The format of the output token stream is optimized for the MPEG-2 decompression software so that communication between the DSPCPU and VLD is minimized. PRELIMINARY SPECIFICATION 2-5 PNX1300/01/02/11 Data Book 2.5.7 Audio In and Audio Out Units The Audio In (AI) and Audio Out (AO) units are similar to the video units. They connect to most serial ADC and DAC chips, and are programmable enough to handle most serial bit protocols. These units can transfer MSB or LSB first and left or right channel first. The audio sampling clock is driven by PNX1300 and is software programmable within a wide range. Like the VO unit, AI and AO sample rates are separately and dynamically programmable. The high-quality on-chip sample clock generator circuits allows the programmer subtle control over the sampling frequency so that audio and video synchronization can be achieved in any system configuration. When changing the sample frequency, the instantaneous phase does not change, which allows sample frequency manipulation without introducing audio or video distortion. As with the video units, the audio-in and audio-out units buffer incoming and outgoing audio data in SDRAM. The audio-in unit buffers samples in either 8- or 16-bit format, mono or stereo. The audio-out unit transfers 16- or 32-bit sample data for mono, stereo or up to 8 audio channels from memory to the external DACs. Any manipulation or mixing of sound data is performed by the DSPCPU since this processing will require only a small fraction of its processing capacity. 2.5.8 S/PDIF Out Unit The Sony/Philips Digital Interface Out (SPDO) unit allows output of a 1-bit high-speed serial data stream. The primary application is output of digital audio data in Sony/ Philips Digital Interface (S/PDIF) format to an external electrically isolated transformer. The SPDO unit can also be used as a general purpose high-speed data stream output device such as a UART. The SPDO unit supports 2-channel PCM audio, one or more Dolby Digital six-channel data streams, or one or more MPEG-1 or MPEG-2 audio streams (embedded per Project 1937). It supports arbitrary programmable sample rates independent of and asynchronous to the AO unit sample rate. 2.5.9 Synchronous Serial Interface The on-chip synchronous serial interface (SSI) is specially designed to interface to high integration analog modem frontends or ISDN frontend devices. In the analog modem case, all of the modem signal processing is performed in the PNX1300 DSPCPU. 2.5.10 I2C Interface The I2C bus is a 2-wire multi-master, multi-slave interface capable of transmitting up to 400kbit/sec. PNX1300 implements an I2C master for use in single master environments only. This interface allows PNX1300 to configure and inspect the status of I2C peripheral devices, such as video decoders, video encoders and some camera types. Philips Semiconductors 2.6 NEW IN PNX1300 (VERSUS TM-1300) PNX1300/01/02/11 offers the following improvements over the TM-1300: • • • • • • • • 2.7 PRELIMINARY SPECIFICATION NEW IN PNX1300 (VERSUS TM-1100) In addition to the features described in Section 2.6 PNX1300 offers also the following improvements over the TM-1100: • • • • • • • no external MATCHOUT to MATCHIN delay line. Video output speed improvement: up to 81 MHz. Video input speed improvement: up to 81 MHz. Prefetcheable SDRAM aperture to increase performance. See Chapter 11, “PCI Interface.” Individual powerdown capability for each coprocessor (e.g. ICP, EVO, etc.). New AO coprocessor with four separate channels and support of 16 or 32-bit samples. 8-bit samples are no longer supported. New SPDO coprocessor (for output of SPDIF and other 1-bit high-speed serial data streams) 2.8 NEW IN PNX1300 (VERSUS TM-1000) In addition to the features described in Section 2.7 PNX1300 offers also the following improvements over the TM-1000: • • • • • • • 2-6 Lower core voltage for PNX1311 (2.2V core voltage) and therefore lower power consumption. DSPCPU speed of up to 200 MHz for PNX1302. Support for 256 Mbit SDRAM organized in x16. The REFRESH counter must be changed. Refer for Section 12.11, “Refresh” in Chapter 12, “SDRAM Memory System” for details. Support for 16 and 32-bit Main Memory Interface. Bug fixes in VI message passing mode. Additional VI mode where VI_DATA[9:8] in message passing mode are not affected by the VI_DVALID signal. PCI bug fix on PCI Special Cycles. Autonomous boot in non 1:1 ratio is fixed. New DSPCPU instructions. See Appendix A, “PNX1300/01/02/11 DSPCPU Operations.” Video Output unit improvements (8-bit alpha blending, chroma keying, genlock). See Chapter 7, “Enhanced Video Out.” Capability to intermix PCI2.1 and 8-bit peripherals or ROM/Flash memories on the external bus. See Chapter 22, “PCI-XIO External I/O Bus.” An on-chip DVD authentication/descrambling coprocessor. Information available to DVD product developers on special request. Full 1149.1 boundary scan. Improved PCI DMA read performance. See Chapter 11, “PCI Interface.” Improved clock generation with new DDS blocks. DSPCPU Architecture Chapter 3 by Gert Slavenburg, Marcel Janssens 3.1 BASIC ARCHITECTURE CONCEPTS In the document the generic PNX1300 product name refers to PNX1300 Series, or the PNX1300/01/02/11 products. This section documents the system programmer or ‘bare-machine’ view of the PNX1300 CPU (or DSPCPU). 3.1.1 Register Model Figure 3-1 shows the DSPCPU’s 128 general purpose registers, r0...r127. In addition to the hardware program counter, PC, there are 4 user-accessible special purpose registers, PCSW, DPC (destination program counter), SPC (source program counter), and CCCOUNT. Table 3-1 lists the registers and their purposes. Register r0 always contains the integer value '0', corresponding to the boolean value 'FALSE' or the single-precision floating point value +0.0. Register r1 always contains the integer value '1' ('TRUE'). The programmer is NOT allowed to write to r0 or r1. Note: Writing to r0 or r1 may cause reads from r0 or r1 scheduled in adjacent clock cycles to return unpredictable values. The standard assembler prevents/ forbids the use of r0 or r1 as a destination register. Registers r2 through r127 are true general purpose registers; the hardware does not imply their use in any way, 31 though compiler or programmer conventions may assign particular roles to particular registers. The DPC and SPC relate to interrupt and exception handling and are treated in Section 3.1.4, “SPC and DPC—Source and Destination Program Counter.” The PCSW (Program Control and Status Word) register is treated in Section 3.1.3, “PCSW Overview.” CCCOUNT, the 64-bit clock cycle counter is treated in Section 3.1.5, “CCCOUNT—Clock Cycle Counter.” Table 3-1. DSPCPU registers Register Size Details r0 32 bits Always reads as 0x0; must not be used as destination of operations r1 32 bits Always reads as 0x1; must not be used as destination of operations r2–r127 PC PCSW 32 bits 126 general-purpose registers 32 bits Program counter 32 bits Program control & status word DPC 32 bits Destination program counter; latches target of taken branch that is interrupted SPC 32 bits Source program counter; latches target of taken branch that is not interrupted CCCOUNT 64 bits Counts clock cycles since reset 23 15 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 128 General-Purpose Registers • r0 & r1 fixed • r2–r127 variable • • • r0 r1 r2 r3 • • • r126 r127 31 23 15 7 0 PC PCSW System Status & Control Registers DPC 63 55 47 SPC 39 CCCOUNT Figure 3-1. PNX1300 registers. PRELIMINARY SPECIFICATION 3-1 PNX1300/01/02/11 Data Book 3.1.2 Philips Semiconductors Basic DSPCPU Execution Model 3.1.3 The DSPCPU issues one ‘long instruction’ every clock cycle. Each instruction consists of several operations (five operations for the PNX1300 microprocessor). Each operation is comparable to a RISC machine instruction, except that the execution of an operation is conditional upon the content of a general purpose register. Examples of operations are: IF r10 iadd r11 r12 → r13 (if r10 true, add r11 and r12 and write sum in r13) IF r10 ld32d(4) r15 → r16 (if r10 true, load 32 bits from mem[r15+4] into r16) IF r20 jmpf r21 r22 (if r20 true and r21 false, jump to address in r22) Each operation has a specific, known execution latency in clock cycles. For example, iadd takes 1 cycle; thus the result of an iadd operation started in clock cycle i is available for use as an argument to operations issued in cycle i+1 or later. The other operations issued in cycle i cannot use the result of iadd. The ld32d operation has a latency of 3 cycles. The result of an ld32d operation started in cycle j is available for use by other operations issued in cycle j+3 or later. Branches, such as the jmpf example above have three delay slots. This means that if a branch operation in cycle k is taken, all operations in the instructions in cycle k+1, k+2 and k+3 are still executed. In the above examples, r10 and r20 control conditional execution of the operations. Also known as ‘guarding’, here r10 and r20 contain the operation ‘guard’. See Section 3.2.1, “Guarding (Conditional Execution).” Certain restrictions exist in the choice of what operations can be packed into an instruction. For example, the DSPCPU in PNX1300 allows no more than two load/ store class operations to be packed into a single instruction. Also, no more than five results (of previously started operations) can be written during any one cycle. The packing of operations is not normally done by the programmer. Instead, the instruction scheduler (See Philips TriMedia SDE Reference Manual) takes care of converting the parallel intermediate format code into packed instructions ready for the assembler. The rules are formally described in the machine description file used by the instruction scheduler and other tools. 15 PCSW[15:0] 14 13 MSE WBE RSE 12 11 10 UNDEF CS IEN 9 PCSW Overview Figure 3-2 shows the PCSW register. The PNX1300 value of PCSW on reset is 0x800. For compatibility, any undefined PCSW fields should never be modified. Note that the DSPCPU architecture has no condition codes or integer arithmetic status flags. Integer operations that generate out-of-range results deliver an operation specific bit pattern. For examples, see dspiadd in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.” Predicate operations exist that take the place of integer status flags in a classical architecture. Multiword arithmetic is supported by the ‘carry’ operation which generates a ‘0’ or ‘1’ depending on the carry that would be generated if its arguments were summed. FP-Related Fields.The IEEE mode field determines the IEEE rounding mode of all floating point operations, with the exception of a few floating point conversion operations that use fixed rounding mode. For examples, see ifixrz, ifloatrz, ifixrz, ifloatrz in Appendix A, “PNX1300/01/ 02/11 DSPCPU Operations.” The FP exception flags are ‘sticky bits’ that are set as a side effect of floating-point computations. Each floating point operation can set one or more of the flags if it incurs the corresponding exception. The flags can only be reset by direct software manipulation of the PCSW (using the writepcsw operation). The bits have the meanings shown in Table 3-2. The FP exception trap enable bits determine which FP exception flags invoke CPU exception handling. An exception is requested if the intersection of the exception flags and trap enable flags is non-zero. The acceptance and handling of exceptions is described in Section 3.5, “Special Event Handling.” BSX (Bytesex). The DSPCPU has a switchable bytesex. The BSX flag in the PCSW can be written by software. Load/store operations observe little- or big-endian byte ordering based on the current setting of BSX. IEN (Interrupt Enable). The IEN flag disables or enables interrupt processing for most interrupt sources. Only NMI (non-maskable interrupt) bypasses IEN. The acceptance and handling of interrupts is described in Section 3.5.3, “INT and NMI (Maskable and Non-Maskable Interrupts).” 8 7 6 BSX IEEE MODE OFZ 5 4 3 2 1 0 IFZ INV OVF UNF INX DBZ FP exceptions IEEE rounding mode 0 ⇒ to nearest, 1 ⇒ to zero, 2 ⇒ to positive, 3 ⇒ to negative Misaligned store exception Write back error Reserved exception Byte sex (1 ⇒ little endian) PCSW = 0x800 after RESET Count stalls (1 ⇒ Yes) Interrupt enable (1 ⇒ allow interrupts) 31 PCSW[31:16] 30 29 TRP TRP TRP MSE WBE RSE Misaligned store exception trap enable Write back error trap enable 28 27 UNDEF Reserved exception trap enable 26 25 TFE 23 UNDEFINED Trap on first exit 22 21 20 19 18 17 16 TRP OFZ TRP IFZ TRP INV TRP OVF TRP UNF TRP INX TRP DBZ FP exception trap-enable bits Figure 3-2. PNX1300 PCSW (Program Control and Status Word) register format. 3-2 PRELIMINARY SPECIFICATION Philips Semiconductors Table 3-2. PCSW FP exception flag definitions Flag INV Function Standard IEEE invalid flag OVF Standard IEEE overflow flag UNF Standard IEEE underflow flag INX Standard IEEE inexact flag DBZ Standard IEEE divide-by-zero flag OFZ ‘Output flushed to zero’ set if an operation caused a denormalized result IFZ ‘Input flushed to zero’ set if an operation was applied to one or more denormalized operands CS (Count Stalls). The CS flag determines the mode of CCCOUNT, the 64-bit clock cycle counter. If CS = ‘1’, the cycle counter increments on all clock cycles. If CS = ‘0’, the clock cycle counter only increments on non-stall cycles. See also Section 3.1.5, “CCCOUNT—Clock Cycle Counter.” After RESET, CS is set to ‘1’. MSE and TRPMSE (Misaligned-Store Exception). The MSE bit will be set when the processor detects a store operation to an address that is not aligned. For example, a 32-bit store executed with an address that is not a multiple of four will cause MSE to be set. The TRPMSE bit enables the DSPCPU to raise misaligned address exceptions. An exception is requested if the intersection of MSE and TRPMSE is non-zero. The acceptance and handling of exceptions is described in Section 3.5, “Special Event Handling.” Unaligned load operations do not cause an exception, because load operations can be speculative (i.e. their result is thrown away). When the DSPCPU generates an unaligned address, the low order address bit(s) (one bit in the case of a 16-bit load, two bits for a 32-bit load) are forced to zero and the load/store is executed from this aligned address. WBE and TRPWBE (Write Back Error). The WBE flag will be set whenever a program attempts to write back more than 5 results simultaneously. This is indicative of a programming error, likely caused by the scheduler or assembler. The TRPWBE bit enables the corresponding exception. RSE, TRPRSE (Reserved Exception). RSE and TRPRSE are reserved for diagnostic purposes and not described here. TFE (Trap on First Exit). The TFE bit is a support bit for the debugger. The TFE bit is set by the debugger prior to taking a (non-interruptible) jump to the application program. On the next interruptible jump (the first interruptible jump in the application being debugged), an exception is requested because the TFE bit is set. The acceptance and handling of exception processing is described in Section 3.5, “Special Event Handling.” It is the responsibility of the exception handler software to clear the TFE bit. The hardware does not clear or set TFE. Corner-case note: Whenever a hardware update (e.g. an exception being raised) and a software update (through writepcsw) of the PCSW coincide, the new value of the DSPCPU Architecture PCSW will be the value that is written by the writepcsw instruction, except for those bits that the hardware is currently updating (which will reflect the hardware value). 3.1.4 SPC and DPC—Source and Destination Program Counter The SPC and DPC registers are support registers for exception processing. The DPC is updated during every interruptible jump with the target address of that interruptible jump. If an exception is taken at an interruptible jump, the value in the DPC register can be used by the exception handling routine as the return address to resume the program at the place of interruption. The SPC register is updated during every interruptible jump that is not interrupted by an exception. Thus on an interrupted interruptible jump, the SPC register is not updated. The SPC register allows the exception handling routine to determine the start address of the decision tree (a block of uninterruptible, scheduled PNX1300 code) that was executing when the exception was taken (see also Section 3.5, “Special Event Handling”). Corner-case note: Whenever a hardware update (during an interruptible jump) and a software update (through writedpc or writespc) coincide, the software update takes precedence. 3.1.5 CCCOUNT—Clock Cycle Counter CCCOUNT is a 64-bit counter that counts clock cycles since RESET. Cycle counting can occur in two modes, depending on PCSW.CS. If PCSW.CS = ‘1’, the cycle count increments on every CPU clock cycle. If PCSW.CS = ‘0’, the clock cycle count only increments on non-stall CPU cycles. CCCOUNT is implemented as a master counter/slave register pair. The master 64-bit counter gets updated continuously. The value of the CCCOUNT slave register is updated with the current master cycle count during successful interruptible jumps only. The cycles and hicycles DSPCPU operations return the content of the 32 LSBs and 32 MSBs, respectively, of the slave register. This ensures that the value returned by hicycles and cycles is coherent, as long as there is no intervening interruptible jump, which makes these operations suitable for 64-bit high resolution timing from C source code programs. The curcycles DSPCPU operation returns the 32 LSBs of the master counter. The latter operation can be used for instruction cycle precise timing. When used, it must be precisely placed, probably at the assembly code level. 3.1.6 Boolean Representation The bit pattern generated by boolean valued operations (ileq, fleq etc.) is '00...00' (FALSE) or '00...01' (TRUE). When interpreting a bit pattern as a boolean value, only the LSB is taken into account, i.e. 'xx..x0' is interpreted as FALSE and 'xx..x1' is interpreted as TRUE. In particular, wherever a general purpose register is used as a ‘guard’, the LSB determines whether execution of the guarded operation takes place. PRELIMINARY SPECIFICATION 3-3 PNX1300/01/02/11 Data Book 3.1.7 Integer Representation The architecture supports the notion of 'unsigned integers' and 'signed integers.' Signed integers use the standard two’s-complement representation. Arithmetic on integers does not generate traps. If a result is not representable, the bit pattern returned is operation specific, as defined in the individual operation description section. The typical cases are: • • • Wrap around for regular add- and subtract-type operations. Clamping against the minimum or maximum representable value for DSP-type operations. Returning the least significant 32-bit value of a 64-bit result (e.g., integer/unsigned multiply). 3.1.8 Floating Point Representation The PNX1300 architecture supports single precision (32bit) IEEE-754 floating point arithmetic. All arithmetic conforms to the IEEE-754 standard in flush-to-zero mode. All floating point compute operations round according to the current setting of the PCSW IEEE mode field. The current setting of the field determines result rounding (to nearest, to zero, to positive infinity, to negative infinity). Conversions from float to integer/unsigned are available in two forms: a PCSW rounding-mode-observing form and an ANSI-C-specific-rounding form. The ANSI-Cspecific form forces round to zero regardless of the PCSW IEEE rounding mode. Conversion from integer/ unsigned to float always observes the IEEE rounding mode. Floating point exceptions are supported with two mechanisms. Each individual floating point operation (e.g. fadd) has a counterpart operation (faddflags) that computes the exception flag values. These operations can be used for precise exception identification1. The second mechanism uses the ‘sticky’ exception bits in the PCSW that collect aggregate exception events. The PCSW exception bits can selectively invoke CPU exception handling. See Section 3.5.2, “EXC (Exceptions).” Table 3-3 shows the representation choices that were made in PNX1300’s floating point implementation . 3.1.9 Addressing Modes The addressing modes shown in Table 3-4 are supported by the DSPCPU architecture (store operations allow only displacement mode). 1. 3-4 This mechanism allows precise exception identification in the context of our multi-issue microprocessor core— where many floating point operations may issue simultaneously—at the expense of additional operations generated by the compiler. It also allows the compiler to issue compute operations speculatively and compute exceptions precisely. PRELIMINARY SPECIFICATION Philips Semiconductors Table 3-3. Special Float Value Representation Item Representation +inf 0x7f800000 -inf 0xff800000 self generated qNaN 0xffffffff result of operation on any NaN argument argument | 0x00400000 (forcing the NaN to be quiet) signalling NaN never generated by PNX1300, accepted as per IEEE-754 Table 3-4. Addressing Modes Mode Suffix Applies to Name d Load & Store Displacement R[i] + R[k] r Load only Index R[i] + scaled(R[k]) x Load only Scaled index R[i] + scaled(#j) In these addressing modes, R[i] indicates one of the general purpose registers. The scale factor applied (1/2/4) is Table 3-5. Minimum values for implementationdependent addressing mode components Parameter Minimum Range ‘i’ and ‘k’ 0..127 (i.e., each implementation has at least 128 registers) ‘j’ -64..63 (i.e., displacements will be at least 7 bits long and signed) equal to the size of the item loaded or stored, i.e. 1 for a byte operation, two for a 16-bit operation and four for a 32-bit operation. The range of valid 'i', 'j' and 'k' values may differ between implementations of the architecture; the minimum values for implementation-dependent characteristics are shown in Table 3-5. Note that the assembly code specifies the true displacement, and not the value to be scaled. For example, ‘ld32d(–8) r3’ loads a 32-bit value from address (r3 – 8). This is encoded in the binary operation pattern as a –2 in the seven-bit field by the assembler. At runtime, the scale factor four is applied to reconstruct the intended displacement of –8. 3.1.10 Software Compatibility The DSPCPU architecture expressly does not support binary compatibility between family members. The ANSI C compiler ensures that all family members are compatible at the source-code level. Philips Semiconductors 3.2 INSTRUCTION SET OVERVIEW 3.2.1 Guarding (Conditional Execution) In the PNX1300 architecture, all operations can be optionally 'guarded'. A guarded operation executes conditionally, depending on the value in the ‘guard' register. For example, a guarded add is written as: IF R23 iadd R14 R10 → R13 This should be taken to mean if R23 then R13 ← R14 + R10. The ’if R23' clause controls the execution of the operation based on the LSB of R23. Hence, depending on the LSB of R23, R13 is either unchanged or set to contain the integer sum of R14 and R10. Guarding applies to all DSPCPU operations, except iimm and uimm (load-immediate). It controls the effect on all programmer-visible states of the system, i.e. register values, memory content, exception raising and device state. 3.2.2 Load and Store Operations Memory is byte addressable. Loads and stores must be ‘naturally aligned’, i.e. a 16-bit load or store must target an address that is a multiple of 2. A 32-bit load or store must target an address that is a multiple of 4. The BSX bit in the PCSW determines the byte order of loads and stores. For example, see ld32 and st32 in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.” Only 32-bit load and store operations are allowed to access MMIO registers in the MMIO address aperture (see Section 3.4, “Memory and MMIO”). The results are undefined for other loads and stores. A load from a non-existent MMIO register returns an undefined result. A store to a non-existent MMIO register times out and then does not happen. There are no other side effects of an access to a nonexistent MMIO register. The state of the BSX bit has no effect on the result of MMIO accesses. Loads are allowed to be issued speculatively. Loads outside the range of valid data memory addresses for the active process return an implementation-dependent value and do not generate an exception. Misaligned loads also return an implementation dependent value and do not generate an exception. If a pair of memory operations involves one or more common bytes in memory, the effect on the common bytes is as defined in Table 3-6. Table 3-4 shows the supported addressing modes. The minimum values of implementation-dependent addressing-mode components are shown in Table 3-5. Note: The index and scaled-index modes are not allowed with store opcodes, due to the hardware DSPCPU Architecture Table 3-6. Behavior of loads and stores with coincident addresses Condition Behavior Tstore < Tload If a store is issued before a load, the value loaded contains the new bytes. Tload < Tstore If a load is issued before a store, the value loaded contains the old bytes. Tstore1 < Tstore2 If store1 is issued before store2, the resulting value contains the bytes of store2. Tstore = Tload If a load and store are issued in the same clock cycle, the result is UNDEFINED. Tstore1 = Tstore2 If two stores are issued in the same clock cycle, the resulting stored value is undefined. restriction that each operation have at most 2 source operand registers and 1 condition register. Stores use 1 operand register for the value to be stored leaving only 1 register to form an address. The scale factor applied (1/2/4) in the scaled addressing modes is equal to the size of the item loaded or stored, i.e. 1 for a byte operation, 2 for a 16-bit operation and 4 for a 32-bit operation. Table 3-7 lists the available load and store mnemonics for the three addressing modes. Table 3-7. Load and store mnemonics Operation Displacement Index ScaledIndex 8-bit signed load ild8d ild8r — 8-bit unsigned load uld8d uld8r — 16-bit signed load ild16d ild16r ild16x 16-bit unsigned load uld16d uld16r uld16x 32-bit load ld32d ld32r ld32x 8-bit store st8d — — 16-bit store st16d — — 32-bit store st32d — — Example usage of load and store operations: IF r10 ild16d(12) r12 → r13 If the LSB of r10 is set, load 16 bits starting at address (r12+12) using the byte ordering indicated in PCSW.BSX, sign-extend the value to 32 bits and store the result in r13. IF r10 st32d(40) r12 r13 If the LSB of r10 is set, store the 32-bit value from r13 to the address (r12+40) using the byte ordering indicated in PCSW.BSX. PRELIMINARY SPECIFICATION 3-5 PNX1300/01/02/11 Data Book 3.2.3 Philips Semiconductors Compute Operations 3.2.5 Compute operations are register-to-register operations. The specified operation is performed on one or two source registers and the result is written to the destination register. Immediate Operations. Immediate operations load an immediate constant (specified in the opcode) and produce a result in the destination register. Floating-Point Compute Operations. Floating-point compute operations are register-to-register operations. The specified operation is performed on one or two source registers and the result is written to the destination register. Unless otherwise mentioned all floating point operations observe the rounding mode bits defined in the PCSW register. All floating-point operations not ending in ‘flags’ update the PCSW exception flags. All operations ending in ‘flags’ compute the exception flags as if the operation were executed and return the flag values (in the same format as in the PCSW); the exception flags in the PCSW itself remain unchanged. Multimedia Operations. These special compute operations are like normal compute operations, but the specified operations are not usually found in general purpose CPUs. These operations provide special support for multimedia applications. 3.2.4 Control-flow operations change the value of the program counter. Conditional jumps test the value in a register and, based on this value, change the program counter to the address contained in a second register or continue execution with the next instruction. Unconditional jumps always change the program counter to the specified immediate address. Control-flow operations can be interruptible or non-interruptible. Execution of an interruptible jump is the only occasion where PNX1300 allows special event handling to take place (see Section 3.5, “Special Event Handling”). 3.3 Issue time constraints: • • an operation implies a need for a functional unit type (as documented in Appendix A, “PNX1300/01/02/11 DSPCPU Operations.”) each operation requires an issue slot that has an instance of the appropriate functional unit type attached issue slot 1 issue slot 2 issue slot 3 issue slot 4 issue slot 5 CONST CONST CONST CONST CONST ALU ALU ALU ALU ALU SHIFTER SHIFTER FCOMP DMEM DMEM FALU DSPMUL DSPMUL FALU DMEMSPEC BRANCH BRANCH BRANCH IFMUL IFMUL FTOUGH (latency 17, recovery 16) DSPALU DSPALU Figure 3-3. PNX1300 issue slots, functional units, and latency. 3-6 PNX1300 INSTRUCTION ISSUE RULES The PNX1300 VLIW CPU allows issue of 5 operations in each clock cycle according to a set of specific issue rules. The issue rules impose issue time constraints and a result writeback constraint. Any set of operations that meets all constraints constitutes a legal PNX1300 instruction. A more extensive description and a few special case issue rules and limitations can be found in the Philips TriMedia SDE documentation. Special-Register Operations Special register operations operate on the special registers: PCSW, DPC, SPC and CCCOUNT. Control-Flow Operations PRELIMINARY SPECIFICATION Philips Semiconductors • functional units should be ‘recovered’ from any prior operation issues Writeback constraint: • No more than 5 results should be simultaneously written to the register file at any point in time (writeback occurs ‘latency’ cycles after issue) Figure 3-3 shows all functional units of PNX1300, including the relation to issue slots, and each functional unit’s latency (e.g. 1 for CONST, 3 for FALU, etc.). With the exception of FTOUGH, each functional unit can accept an operation every clock cycle, i.e. has a recovery time of 1. The binding of operations to functional unit types is summarized in Table 3-8. In Appendix A, “PNX1300/01/02/ 11 DSPCPU Operations”, each operation lists the precise functional unit and unit latency. Table 3-8. Functional unit operations unit type operation category const immediate operations alu 32-bit arithmetic, logical, pack/unpack dspalu dual 16-bit, quad 8-bit multimedia arithmetic dspmul dual 16-bit and quad 8-bit multimedia multiplies dmem loads/stores dmemspec cache coherency, cache control, prefetch shifter multi-bit shift branch control flow falu floating point arithmetic & conversions ifmul 32-bit integer and floating point multiplies fcomp single cycle floating point compares ftough iterative floating point square root and division 3.4 MEMORY AND MMIO PNX1300 defines four apertures in its 32-bit address space: the memory hole, the DRAM aperture, the MMIO aperture and the PCI apertures (See Figure 3-4).The memory hole covers addresses 0..0xff. The DRAM and MMIO apertures are defined by the values in MMIO registers; the PCI apertures consist of every address that does not fall in the other three apertures. 3.4.1 Memory Map DRAM is mapped into an aperture extending from the address in DRAM_BASE to the address in DRAM_LIMIT. The maximum DRAM aperture size is 64 MB. DSPCPU Architecture not overlap; if they do, the consequences are undefined. The values of DRAM_BASE, DRAM_LIMIT, and MMIO_BASE are set during the boot process. In the case of a PCI host assisted boot, the values are determined by the host BIOS. In case of standalone boot (i.e., PNX1300 is the PCI host), the values are taken from the boot ROM. Refer to Chapter 13, “System Boot” for details. DSPCPU update of DRAM_BASE and MMIO_BASE is possible, but not recommended, see Section 11.6.3, “MMIO/DRAM_BASE updates.” 3.4.2 The Memory Hole The memory hole from address 0 to 0xff serves to protect the system from performance loss due to speculative loads. Due to the nature of C program references, most speculative loads issued by the DSPCPU fall in the range covered by the hole. Activated by default upon RESET, the hole serves to ensure that these speculative loads do NOT cause PCI read accesses and slow down the system. The value returned by any data load from the hole is 0. The hole only protects loads. Store operations in the hole do cause writes to PCI, SDRAM or MMIO as determined by the aperture base address values. If the SDRAM aperture overlaps the memory hole, the memory hole is ignored. The hole can be temporarily disabled through the DC_LOCK_CTL register. This is described in Section 5.3.8, “Memory Hole and PCI Aperture Disable.” 3.4.3 MMIO Memory Map Devices are controlled through memory-mapped device registers, referred to as MMIO registers. To ensure compatibility with future devices, any undefined MMIO bits should be ignored when read, and written as ‘0’s. Some devices can autonomously access data memory (DMA) and most devices can cause CPU interrupts. The 2-MB MMIO aperture is initially located at address 0xEFE00000 on RESET; it is relocated by the PCI BIOS 0xFFFF FFFFF PCI 2 MB MMIO Aperture MMIO_BASE PCI DRAM_LIMIT DRAM Aperture The MMIO aperture is located at address MMIO_BASE and is a fixed 2-MB size. In the default operating mode, al l memory accesses not going to either the hole, DRAM or MMIO space are interpreted as PCI accesses. This behavior can be overridden as described in Section 5.3.8, “Memory Hole and PCI Aperture Disable.” The MMIO aperture and the DRAM aperture can be at any naturally aligned location, in any order, but should 1 MB - 64 MB DRAM_BASE PCI 0x0000 0000 256byte hole Figure 3-4. PNX1300 memory map. PRELIMINARY SPECIFICATION 3-7 PNX1300/01/02/11 Data Book Philips Semiconductors for PC-hosted PNX1300 boards; its final location is determined by the boot EEPROM for standalone systems. See Chapter 13, “System Boot” for more information. Figure 3-5 gives a detailed overview of the MMIO memory map (addresses used are offsets with respect to the MMIO base). The operating system on PNX1300 can change MMIO_BASE by writing to the MMIO_BASE MMIO location. User programs should not attempt this. Refer to the TriMedia SDE Reference Manual for the standard method to access the device registers from C language device drivers. terrupts: ISETTING, IPENDING, ICLEAR, IMASK and the interrupt vectors. The timer MMIO locations are described in Section 3.8, “Timers.” The instruction and data breakpoint are described in Section 3.9, “Debug Support.” The MMIO locations of each device are treated in the respective device chapters. Only 32-bit load and store operations are allowed to access MMIO registers in the MMIO address aperture. The results are undefined for other loads and stores. Reads from non-existent MMIO registers return undefined values. Writes to nonexistent MMIO registers time out. There are no side effects of accesses to nonexistent MMIO registers. The state of the PCSW BSX bit has no effect on the result of MMIO accesses. With the exception of RESET, which is enabled at all times, the architecture of the DSPCPU allows special event handling to begin only during an interruptible jump operation (ijmpt, ijmpf or ijmpi) that succeeds (i.e., is a taken jump). EXC, NMI and INT handling can be initiated during handling of an EXC or an INT, butonly during successful interruptible jumps. The Icache tag and LRU bit access aperture give the DSPCPU read-only access to the Icache status. Refer to Section 5.4.8, “Reading Tags and Cache Status” for details. Table 3-9. Special Events and Event Vectors The EXCVEC MMIO location is explained in Section 3.5.2, “EXC (Exceptions).” Section 3.5.3, “INT and NMI (Maskable and Non-Maskable Interrupts),” describes the locations that deal with the setup and handling of in- 0x1F FFFFF Reserved for Future Use 0x10 3800 0x10 3400 0x10 3000 0x10 2C00 0x10 2800 0x10 2400 0x10 2000 0x10 1C00 0x10 1800 0x10 1400 0x10 1000 0x10 0C00 0x10 0800 0x10 0400 0x10 0000 JTAG interface I2C interface PCI interface SSI interface VLD coprocessor Image coprocessor Audio Out Audio In Video Out Video In Debug support Timers Vectored interrupt controller MMIO base Main memory, cache control Reserved for Future Use 0x01 0000 0x00 0000 Icache tags & LRU (r/o) 3.5 SPECIAL EVENT HANDLING The PNX1300 microprocessor responds to the special events shown in Table 3-9, ordered by priority. Event Vector RESET (Highest priority) vector to DRAM_BASE EXC (All exceptions) vector to EXCVEC (programmable) NMI, INT (Non-maskable interrupt, maskable interrupt) use the programmed vector (one of 32 vectors depending on the interrupt source) 0x10 1200 0x10 1000 data breakpoints instruction breakpoints 0x10 0C60 0x10 0C40 0x10 0C20 0x10 0C00 systimer timer3 timer2 timer1 0x10 08Fc 0x10 08F8 intvec31 intvec30 0x10 0888 0x10 0884 0x10 0880 intvec2 intvec1 intvec0 0x10 0828 0x10 0824 0x10 0820 0x10 081C 0x10 0818 0x10 0814 0x10 0810 0x10 0800 imask iclear ipending isetting3 isetting2 isetting1 isetting0 excvec 0x10 0400 MMIO_BASE 0x10 0004 0x10 0000 DRAM_LIMIT DRAM_BASE Figure 3-5. Memory map of MMIO address space (addresses are offset from MMIO_BASE). 3-8 PRELIMINARY SPECIFICATION Philips Semiconductors DSPCPU Architecture The instruction scheduler uses interruptible jumps exclusively for inter-decision tree jumps. Hence, within a decision tree, no special-event processing can be initiated. If a tree-to-tree jump is taken, special-event processing is allowed. Since the only registers live at this point (i.e., that contain useful data) are the global registers allocated by the ANSI C compiler, only a subset of the registers needs to be preserved by the event handlers. Refer to the TriMedia SDE Reference Manual for details on which registers can be in use. The DSPCPU register state can be described by the contents of this subset of general purpose registers and the contents of the PCSW and the DPC value (the target of the inter-tree jump). The priority resolution mechanism built into the DSPCPU hardware dispatches the highest-priority, non-masked special-event request at the time of a successful interruptible jump operation. In view of the simple, real-timeoriented nature of the mechanisms provided, only limited nesting of events should be allowed. 3.5.1 RESET RESET is the highest priority special event. It is asserted by external hardware or by the host CPU. PNX1300 will respond to it at any time. External hardware reset through the TRI_RESET# pin initiates boot protocol execution as described in Chapter 13, “System Boot.” This causes the current PC value to be lost and instruction execution to start from address DRAM_BASE. 1. DPC is assigned the intended destination address of the successful jump. 2. Instruction processing starts at EXCVEC. All other actions are the responsibility of the EXC handler software. Note that no other special event processing will take place until the handler decides to execute an interruptible jump that succeeds. 3.5.3 INT and NMI (Maskable and NonMaskable Interrupts) The on-chip Vectored Interrupt Controller (VIC) provides 32 INT request input hardware lines. The interrupt controller prioritizes and maps attention requests from several different peripherals onto successive INT requests to the DSPCPU. INT special event processing will occur under the following conditions: 1. RESET is de-asserted. 2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is empty and PCSW.TFE is not set. 3. The intersection of IPENDING and IMASK is nonempty. 4. The interrupt is at level NMI or PCSW.IEN = 1. 5. A successful interruptible jump is in the final jump execution stage. DSPCPU hardware takes the following actions on the initiation of NMI or INT processing: A PCI host CPU can perform a PNX1300 DSPCPU-only reset by an MMIO write to the BIU_CTL.SR and CR bits. Such a reset does not cause a full boot, instead the DSPCPU resumes execution from DRAM_BASE. 1. DPC gets assigned the intended destination address of the successful jump. 2. Instruction processing starts at the appropriate interrupt vector. 3.5.2 All other actions are the responsibility of the INT handler software. Note that no other special event processing will take place until the handler decides to execute an interruptible jump that succeeds. EXC (Exceptions) The DSPCPU enters EXC special-event processing under the following conditions: 1. RESET is de-asserted. 2. The intersection PCSW[15,6:0] & PCSW[31,22:16] is non-empty or PCSW.TFE is set. 3. A successful interruptible jump is in the final jump execution stage. DSPCPU hardware takes the following actions on the initiation of EXC processing: 3.5.3.1 Interrupt vectors Each of the 32 interrupt sources can be assigned an arbitrary interrupt vector (the address of the first instruction of the interrupt handler). A vector is setup by writing the address to one of the MMIO locations shown in Figure 3-6. The state of the MMIO vector locations is undefined after RESET. (Addresses of the MMIO vector registers are offset with respect to MMIO_BASE.) MMIO_BASE offset: 0x10 08FC 0x10 08F8 INTVEC31 (r/w) INTVEC30 (r/w) Source 31 vector Source 30 vector • • • • • • • • • 0x10 0888 0x10 0884 0x10 0880 INTVEC2 (r/w) INTVEC1 (r/w) INTVEC0 (r/w) Source 2 vector Source 1 vector Source 0 vector 31 0 Figure 3-6. Interrupt vector locations in MMIO address space. PRELIMINARY SPECIFICATION 3-9 PNX1300/01/02/11 Data Book Philips Semiconductors ister, with a ‘1’ in the bit position(s) corresponding to the desired acknowledge flags. Programmer’s note: See the Philips TriMedia Cookbook (Book 2 of TriMedia SDE documentation) for information on writing interrupt handlers. 3.5.3.2 Programmers note: the store operation that performs the interrupt acknowledge should be issued at least 2 cycles before the (interruptible) jump that ends an interrupt handler. This ensures that the same interrupt is not dispatched twice due to request de-assertion clock delays. Interrupt modes DSPCPU interrupt sources can be programmed to operate in either level-sensitive or edge-triggered mode. Operation in edge-triggered or level-sensitive mode is determined by a bit in the ISETTING MMIO locations corresponding to the source, as defined in Figure 3-7. On RESET, all ISETTING registers are cleared. 3.5.3.4 Each interrupt source can be programmed to request one out of eight levels of priorities. The highest priority level (level 7) corresponds to requesting an NMI—an interrupt that cannot be masked by the DSPCPU PCSW.IEN bit. The other levels request regular interrupts, that can be masked as a group by the PCSW.IEN flag. Level six represents the highest priority normal interrupt level and level zero represents the lowest. Refer to Figure 3-7 for details of programming the priority level. In edge-triggered mode, the leading edge of the signal on the device interrupt request line causes the VIC (Vectored Interrupt Controller) to set the interrupt pending flag corresponding to the device source number. Note that, for active high signals, the leading edge is the positive edge, whereas for active low request signals (such as PCI INTA#), the negative edge is the leading edge. The interrupt remains pending until one of two events occurs: • • The VIC arbitrates the highest-priority pending interrupt requestor. Sources programmed to request at the same level are treated with a fixed priority, from source number 0 (highest) to 31 (lowest). At such time as the DSPCPU is willing to process special events, the vector of highest priority NMI source will be dispatched. If no NMI is pending, and the DSPCPU allows regular interrupts (PCSW.IEN is asserted), the vector of the highest priority regular source is dispatched. Once a vector is dispatched, the corresponding interrupt pending flag is deasserted (edge triggered mode sources only). The VIC successfully dispatches the vector corresponding to the source to the PNX1300 CPU, or PNX1300 CPU software clears the interrupt-pending flag by a direct write to the ICLEAR location. No interrupt acknowledge to ICLEAR is needed for devices operating in edge-triggered mode, since the vector dispatch clears the IPENDING request. The device itself may however need a device-specific interrupt acknowledge to clear the requesting condition. Edge-triggered mode is not recommended for devices that can signal multiple simultaneous interrupt conditions. The on-chip timers must be operated in edge triggered mode. 3.5.3.5 Device interrupt acknowledge All devices capable of generating level-triggered interrupts have interrupt acknowledge bits in their memory mapped control registers for this purpose. An interrupt acknowledge is performed by a store to such control reg- Each interrupt source device typically has its own interrupt enable flag(s) that determine whether certain key MMIO_BASE offset: 0x10 081C ISETTING3 (r/w) MP31 MP30 MP29 MP28 MP27 MP26 MP25 MP24 0x10 0818 ISETTING2 (r/w) MP23 MP22 MP21 MP20 MP19 MP18 MP17 MP16 0x10 0814 ISETTING1 (r/w) MP15 MP14 MP13 MP12 MP11 MP10 MP9 MP8 0x10 0810 ISETTING0 (r/w) MP7 MP6 MP5 MP4 MP3 MP2 MP1 MP0 31 27 23 19 15 Each MP Field: 0xxx source operates in edge-triggered mode 1xxx source operates in level-sensitive mode Figure 3-7. Interrupt mode and priority MMIO locations and formats. 3-10 Interrupt masking A single MMIO register (IMASK in Figure 3-8) allows masking of an arbitrary subset of the interrupt sources. Masking applies to both regular as well as NMI level requestors. Masking is used by software to disable unused devices and/or to implement nested interrupt handling. In the latter case, each interrupt handler can stack the old IMASK content for later restoration and insert a new mask that only allows the interrupts it is willing to handle. For level-triggered device handlers, IMASK should also exclude the device itself to prevent repeated handler activation. In level-sensitive mode, the device requests an interrupt by asserting the VIC source request line. The device holds the request until the device interrupt handler performs a device interrupt acknowledge. It is highly recommended that all off-chip and on-chip sources, with the exception of the timers, operate in level-sensitive mode. 3.5.3.3 Interrupt priorities PRELIMINARY SPECIFICATION 11 7 Each MP x111 x110 ... x000 3 0 Field: NMI (highest) priority maskable level 6 maskable level 0 Philips Semiconductors DSPCPU Architecture The ICLEAR register reads the same as the IPENDING register. Writes to the ICLEAR register serve to clear pending flags for edge-triggered mode sources. All IPENDING flags corresponding to bit positions in which ‘1’s are written are cleared. IPENDING flags corresponding to bit positions in which ‘0’s are written are not affected. Writes have no effect on level-sensitive mode sources. When a pending interrupt bit is being cleared through a write to the ICLEAR register at the same time that the hardware is trying to set that interrupt bit, the hardware takes precedence. device events lead to the request of an interrupt. In addition, the PCSW.IEN flag determines whether the DSPCPU is willing to handle regular interrupts. Non maskable interrupts ignore the state of this flag. All three mechanisms are necessary: the PCSW.IEN flag is used to implement critical sections of code during which the RTOS (real-time operating system) is unable to handle regular interrupts. The IMASK is used to allow full control over interrupt handler nesting. The device interrupt flags set the operational mode of the device. When RESET is asserted, IPENDING, ICLEAR, and IMASK are set to all zeroes. (MMIO register addresses shown in Figure 3-8 are offset addresses with respect to MMIO_BASE.) 3.5.3.6 3.5.3.7 Software interrupts and acknowledgment The IPENDING register shown in Figure 3-8 can be read to observe the currently pending interrupts. Each bit read depends on the mode of the source: • • 3.5.3.8 Software can request an interrupt for sources operating in edge-triggered mode. Writes to the IPENDING register assert an interrupt request for all sources where a 1 occurred in the bit position of the written value. The state of sources where a 0 occurred in the written value is unchanged. Writes have no effect on level-sensitive mode sources. The interrupt request, if not masked, will occur at the next successful interruptible jump. This differs from the conventional software interrupt-like semantics of many architectures. Any of the 32 sources can be requested in software. In normal operation however, software-requested interrupts should be limited to source vectors not allocated for hardware devices. Note that another PCI master can request interrupts by manipulating the IPENDING location in the MMIO aperture. This is useful for inter-processor communication. 31 Interrupt source assignment Table 3-10 shows the assignment of devices to interrupt source numbers, as well as the recommended operating mode (edge or level triggered). Note that there are a total of 5 external pins available to assert interrupt requests. The PCI INTA to INTD requests are asserted by active low signal conventions, i.e. a zero level or a negative edge asserts a request. The USERIRQ pin operates with active high signalling conventions. For a level-sensitive source, a bit value corresponds to the current state of the device interrupt request line. For an edge-triggered interrupt, a ‘1’ is read if and only if an interrupt request occurred and the corresponding vector has not yet been dispatched. MMIO_BASE offset: 0x10 0828 NMI sequentialization In most applications, it is desirable not to nest NMIs. The NMI interrupt handler can accomplish this by saving the old IMASK content and clearing IMASK before the first interruptible jump is executed by the NMI handler. 3.6 PNX1300 TO HOST INTERRUPTS In systems where PNX1300 is operating in the presence of a host CPU on PCI, PNX1300 can generate interrupts to the host, using any combination of the four PCI INTA# to INTD# pins. In a typical host system, only one of these pins needs to be wired to the PCI bus interrupt request lines. Any unused pins of this group are then available for use as software programmable I/O pins. The INT_CTL register (see Figure 3-9) IEx bits, when set, enable the open collector driver of the four INTD#..INTA# pins. The INTx bits determine the output value generated (if enabled). A ‘1’ in INTx causes the corresponding PCI interrupt pin to be asserted (low INTx# pin). The ISx bits are read-only and reflect the cur- 23 15 7 0 IMASK (r/w) Each IMASK(i) bit: On read or write, 0 ⇒ disallow source i interrupt request On read or write, 1 ⇒ allow source i interrupt request 0x10 0824 ICLEAR (r/w) Each ICLEAR(i) bit: On read, same as IPENDING(i) On write, 1 ⇒ clear source i interrupt request 0x10 0820 IPENDING (r/w) Each IPENDING(i) bit: On read, 1 ⇒ source i interrupt request is pending On write, 1 ⇒ software source i interrupt request Figure 3-8. Interrupt controller request, clear, and mask MMIO registers. PRELIMINARY SPECIFICATION 3-11 PNX1300/01/02/11 Data Book MMIO_BASE offset: 0x10 3038 INT_CTL (r/w) Philips Semiconductors 31 27 23 19 15 11 7 3 0 IS[D:A] IE[D:A] INT[D:A] Figure 3-9. Host interrupt control register Table 3-10. Interrupt source assignments SOURCE NAME SRC NUM MODE SOURCE DESCRIPTION PCI INTA 0 level PCI_INTA# pin signal PCI INTB 1 level PCI_INTB# pin signal PCI INTC 2 level PCI_INTC# pin signal 3.7 HOST TO PNX1300 INTERRUPTS A host CPU can generate an interrupt to PNX1300 in several ways: • • by a PCI MMIO write to IPENDING to assert the HOSTCOMM interrupt (bit 28) by a hardware circuit that asserts one of the interrupt request pins TRI_USERIRQ, or INTA..INTD. PCI INTD 3 level PCI_INTD# pin signal TRI_USERIRQ 4 either external general-purpose pin TIMER1 5 edge general-purpose timer TIMER2 6 edge general-purpose timer 3.8 TIMER3 7 edge general-purpose timer SYSTIMER 8 edge reserved for debugger VIDEOIN 9 level video in block VIDEOOUT 10 level video out block AUDIOIN 11 level audio in block The DSPCPU contains four programmable timer/ counters, all with the same function. The first three (TIMER1, TIMER2, TIMER3) are intended for general use. The fourth timer/counter (SYSTIMER) is reserved for use by the system software and should not be used by applications. AUDIOOUT 12 level audio out block ICP 13 level image coprocessor VLD 14 level VLD coprocessor SSI 15 level SSI interface PCI 16 level PCI BIU (DMA, etc.; see Table 11-14 for possible interrupt causes) IIC 17 level I 2C interface JTAG 18 level JTAG interface t.b.d. 19..24 SPDO t.b.d. 25 reserved for future devices level 26..27 SPDO block reserved for future devices HOSTCOM 28 edge (software) host communication APP 29 edge (software) application DEBUGGER 30 edge (software) debugger RTOS 31 edge (software) RTOS rent actual state of the pins. Note that the pins have negative logic (active low) polarity and are of the open collector output type. Hence the pin voltage is low (active) when the logical value set or seen in the INT_CTL register is a ‘1’. The assertion and de-assertion of host interrupts is the responsibility of PNX1300 software. See also Section 11.6.17, “INT_CTL Register.” 3-12 PRELIMINARY SPECIFICATION The first and most common method requires no circuitry and leaves the interrupt pins available for other purposes. TIMERS Each timer has three registers as shown in Figure 3-10. The MMIO register addresses shown are offset addresses with respect to the timer’s base address. Each timer/counter can be set to count one of the event types specified in Table 3-12. Note that the DATABREAK event is special, in that the timer/counter may increment by zero, one or two in each clock cycle. For all other event types, increments are by zero or one. The CACHE1 and CACHE2 events serve as cache performance monitoring support. The actual event selected for CACHE1 and CACHE2 is determined by the MEM_EVENTS MMIO register, see Section 5.7, “Performance Evaluation Support.” If a PNX1300 pin signal (VICLK, etc.) is selected as an event, positive-going edges on the signal are counted. Each timer increments its value until the modulus is reached. On the clock cycle where the incremented value would equal or exceed the modulus, the value wraps around to zero or one (in the case of an increment by two), and an interrupt is generated as defined in Table 3-10. The timer interrupt source mode should be set as edge-sensitive. No software interrupt acknowledge to the timer device is necessary. Counting starts and continues as long as the run bit is set. Loading a new modulus does not affect the contents of the value register. If a store operation to either the modulus or value register results in value and modulus being the same, no interrupt will be generated. If the run bit is set, the next value will be modulus+1 or modulus+2, and Philips Semiconductors DSPCPU Architecture Timer base offset: 0 TMODULUS (r/w) 4 TVALUE (r/w) 8 TCTL (r/w) 31 27 23 19 15 11 7 3 0 MODULUS VALUE PRESCALE “PRESCALE”: Prescale value is 2^PRESCALE, i.e., in the range [1..32768] SOURCE “SOURCE” select: see table Table 3-12 R “RUN” bit: 0 Timer stopped 1 Timer running Figure 3-10. Timer register definitions. Table 3-11. Timer base MMIO address TIMER1 MMIO_BASE+0x10,0C00 TIMER2 MMIO_BASE+0x10,0C20 TIMER3 MMIO_BASE+0x10,0C40 SYSTIMER MMIO_BASE+0x10,0C60 Table 3-12. Timer source selections Source Name Source Bits Value Source Description CLOCK 0 PRESCALE 1 CPU clock prescaled CPU clock TRI_TIMER_CLK 2 external clock pin DATABREAK 3 data breakpoints INSTBREAK 4 instruction breakpoints CACHE1 5 cache event 1 CACHE2 6 cache event 2 VI_CLK 7 video in clock pin VO_CLK 8 video out clock pin AI_WS 9 audio in word strobe pin AO_WS 10 audio out word strobe pin SSI_RXFSX 11 SSI receive frame sync pin 12 SSI transmit frame sync pin SSI_IO2 — 13-15 undefined 3.9 DEBUG SUPPORT This section describes the special debug support offered by the DSPCPU. Instruction and data breakpoints can be defined through a set of registers in the MMIO register space. When a breakpoint is matched, an event is generated that can be used as a timer source (see Section 3.8, “Timers”). The timer TMODULUS has to be set to generate a DSPCPU interrupt after the desired number of breakpoint matches. 3.9.1 Instruction Breakpoints The instruction-breakpoint control register is shown in Figure 3-11. On RESET, the BICTL register is cleared. (MMIO-register addresses shown are offset with respect to MMIO_BASE.) The instruction-breakpoint address-range registers are shown in Figure 3-12. After RESET, the value of these registers is undefined. (MMIO-register addresses shown are offset with respect to MMIO_BASE.) When the IC bit in the breakpoint control register is set to ‘1’, instruction breakpoints are activated. Any instruction address issued by the PNX1300 chip is compared against the low and high address-range values. The IAC bit in the breakpoint control register determines whether the instruction address needs to be inside or outside of the range defined by the low and high address-range registers. A successful comparison takes place when either: IAC = ‘0’ and low ≤ iaddr ≤ high, or IAC = ‘1’ and iaddr < low or iaddr > high. the counter will have to loop around before an interrupt is generated. • • A modulus value of zero causes a wrap-around as if the modulus value was 232. On a successful comparison, an instruction breakpoint event is generated, which can be used as a clock input to a timer. After counting the programmed number of instruction breakpoint events, the timer will generate an interrupt request. On RESET, the TCTL registers are cleared, and the value of the TMODULUS and TVALUE registers is undefined. PRELIMINARY SPECIFICATION 3-13 PNX1300/01/02/11 Data Book MMIO_BASE offset: 0x10 1000 Philips Semiconductors 31 27 23 19 15 11 7 3 0 BICTL (r/w) IC ‘IAC’ Instruction address control: 0 Breakpoint if address inside range 1 Breakpoint if address outside range ‘IC’ Instruction control bit: 0 Disable instruction breakpoints 1 Enable instruction breakpoints Figure 3-11. Instruction-breakpoint control register. MMIO_BASE offset: 0x10 1004 BINSTLOW (r/w) Address Range Start 0x10 1008 BINSTHIGH (r/w) Address Range End 31 27 23 19 15 11 7 3 0 11 7 3 0 Figure 3-12. Instruction-breakpoint address-range registers. MMIO_BASE offset: 0x10 1030 BDATAALOW (r/w) 0x10 1034 BDATAAHIGH (r/w) 0x10 1038 BDATAVAL (r/w) 0x10 103C BDATAMASK (r/w) 31 27 23 19 15 Address Range Start Address Range End Data Breakpoint Value Data Breakpoint Value Mask Figure 3-13. Data-breakpoint address-range and value-compare registers. 3.9.2 When the DC bits in the data breakpoint control register are not set to ‘0’, data breakpoints are activated. When the value of the DC bits is ‘1’ or ‘3’, any data address from load operations (if the BL bit is set) and/or store operations (if the BS bit is set) issued by the DSPCPU is compared against the low and high address-range values. The DAC bit in the breakpoint control register determines whether data addresses need to be inside or outside of the range defined by the low and high address-range registers. A successful comparison occurs when either: Data Breakpoints The data-breakpoint address-range and compare-value registers are shown in Figure 3-13. After RESET, the value of the data breakpoint registers is undefined. (MMIOregister addresses shown are offset with respect to MMIO_BASE.) The data-breakpoint control register is shown in Figure 3-14. On RESET, the BDCTL register is cleared. (The register address shown is offset with respect to MMIO_BASE.) MMIO_BASE offset: 0x10 1020 31 27 • • 23 19 15 11 BDCTL (r/w) ‘DVC’ Data Value Control: 0 Breakpoint if data equal 1 Breakpoint if data not equal ‘BS’ Break on Store: 0 Don’t check data stores 1 Do check data stores 7 3 0 BS BL DC ‘DAC’ Data Address Control: 0 Breakpoint if address inside range 1 Breakpoint if address outside range ‘BL’ Break on Load: 0 Don’t check data loads 1 Do check data loads ‘DC’ Data Control: 0 No checking 1 Check data addresses 2 Check data values 3 Check data value and addresses Figure 3-14. Data-breakpoint control register. 3-14 DAC = ‘0’ and low ≤ daddr ≤ high, or DAC = ‘1’ and daddr < low or daddr > high. PRELIMINARY SPECIFICATION Philips Semiconductors Note that this comparison works for all addresses regardless of the aperture to which they belong. When the value of the DC bits is ‘2’ or ‘3’, any data value from load operations (if the BL bit is set) and/or store operations (if the BS bit is set) issued by the PNX1300 CPU is compared against the value in the BDATAVAL register. Only the bits for which the corresponding BDATAMASK register bits are set to ‘1’ will be used in the comparison. The DVC bit in the breakpoint control register determines whether the data value needs to be equal or not equal to the comparison value. A successful comparison occurs when either of the following are true: • • DVC = ‘0’ and (data & BDATAMASK) = (BDATAVAL & BDATAMASK). DVC = ‘1’ and (data & BDATAMASK) != (BDATAVAL & BDATAMASK). DSPCPU Architecture Note: use a nonzero datamask or the result is undefined. When a successful comparison has taken place, a data breakpoint event is generated, which can be used as a clock input to a timer. After counting the set number of data breakpoint events, the timer will generate an interrupt request. When the value of the DC bits is ‘3’, a data breakpoint event is generated if and only if a successful comparison occurs on both address and data simultaneously. Note that up to two data breakpoint events can occur per clock cycle, due to the dual load/store capability of the CPU and data cache. PRELIMINARY SPECIFICATION 3-15 PNX1300/01/02/11 Data Book 3-16 PRELIMINARY SPECIFICATION Philips Semiconductors Custom Operations for Multimedia Chapter 4 by Gert Slavenburg, Pieter v.d. Meulen, Yong Cho, Sang-Ju Park 4.1 CUSTOM OPERATIONS OVERVIEW In this document, the generic PNX1300 name refers to the PNX1300 Series, or the PNX1300/01/02/11 products. Custom operations in the PNX1300 DSPCPU architecture are specialized, high-function operations designed to dramatically improve performance in important multimedia applications. When properly incorporated into application source code, custom operations enable an application to take advantage of the highly parallel PNX1300 microprocessor implementation. Achieving a similar performance increase through other means— e.g., executing a higher number of traditional microprocessor instructions per cycle—would be prohibitively expensive for PNX1300’s low-cost target applications. Custom operations are simple to understand and consistent in their definition, but their unusual functions make it difficult for automatic code generation algorithms to use them effectively. Consequently, custom operations are inserted into source code by the programmer. To make this process as painless as possible, custom operation syntax is consistent with the C programming language, and, just as with all other operations generated by the compiler, the scheduler takes care of register allocation, operation packing, and flow analysis. 4.1.1 Custom Operation Motivation For both general-purpose and embedded microprocessor-based applications, programming in a high-level language is desirable. To effectively support optimizing compilers and a simple programming model, certain microprocessor architecture features are needed, such as a large, linear address space, general-purpose registers, and register-to-register operations that directly support the manipulation of linear address pointers. A common choice in microprocessor architectures is 32-bit linear addresses, 32-bit registers, and 32-bit integer operations. PNX1300 is such a microprocessor architecture. For the data manipulation in many algorithms, however, 32-bit data and operations are wasteful of expensive silicon resources. Important multimedia applications, such as the decompression of MPEG video streams, spend significant amounts of execution time dealing with eightbit data items. Using 32-bit operations to manipulate small data items makes inefficient use of 32-bit execution hardware in the implementation. If these 32-bit resources could be used instead to operate on four eight-bit data items simultaneously, performance would be improved by a significant factor with only a tiny increase in implementation cost. Getting the highest execution rate from standard microprocessor resources is one of the motivations behind custom operations in PNX1300. A range of custom operations is provided that each processes—simultaneously—four 8-bit or two 16-bit data items. There is little cost difference between a standard 32-bit ALU and one that can process either one pair of 32-bit operands or four pairs of eight-bit operands, but there is a big performance difference for PNX1300’s target applications. PNX1300’s custom operations go beyond simply making the best use of standard resources. Some custom operations combine several simple operations. These combinations are tailored specifically to the needs of important multimedia applications. Some high-function custom operations eliminate conditional branches, which helps the scheduler make effective use of all five operation slots in each PNX1300 instruction. Filling up all five slots is especially important in the inner loops of computational intensive multimedia applications. In short, custom operations help PNX1300 reach its goals of extremely high multimedia performance at the lowest possible cost. 4.1.2 Introduction to Custom Operations Table 4-1 and Table 4-2 contain two listings of the custom operations available in the PNX1300 architecture. Table 4-1 groups the custom operations by type of function while Table 4-2 lists the operations by operand size. For more detailed information about the custom operations, Appendix A, “PNX1300/01/02/11 DSPCPU Operations.” Some operations exist in several versions that differ in the treatment of their operands and results, and the mnemonics for these versions make it easy to select the appropriate operation. For example, the sum of products operations all have “fir” in their mnemonics; the prefix and suffix of the mnemonic expresses the treatment of the operands and result. The ifir8ii operation treats both of its operands as signed (ifir8ii) and produces a signed result (ifir8ii). The ifir8iu operation treats its first operand as signed (ifir8iu), the second as unsigned (ifir8i u), and produces a signed result (ifir8iu). The ume8ii operation implements an eight-bit motion-estimation; it treats both operands as signed but produces an unsigned result. The operations beginning with “dsp” implement a clipping (sometimes called saturating) function before storPRELIMINARY SPECIFICATION 4-1 PNX1300/01/02/11 Data Book Philips Semiconductors Table 4-1. Key Multimedia Custom Operations Listed by Function Type Function Custom Op Description DSP absolute value dspiabs Clipped signed 32-bit absolute value dspidualabs Dual clipped absolute values of signed 16-bit halfwords Shift dualasr dual-16 arithmetic shift right Clip dualiclipi dual-16 clip signed to signed dualuclipi dual-16 clip signed to unsigned quadumax Unsigned bytewise quad max quadumin Unsigned bytewise quad min Min,max DSP add DSP multiply DSP subtract Sum of products Merge, pack dspiadd Clipped signed 32-bit add Table 4-2. Key Multimedia Custom Operations Listed by Operand Size Op. Size 32-bit Custom Op dspiabs Description Clipped signed 32-bit abs value dspuadd Clipped unsigned 32-bit add dspiadd Clipped signed 32-bit add dspidualadd Dual clipped add of signed 16bit halfwords dspuadd Clipped unsigned 32-bit add dspimul Clipped signed 32-bit multiply dspuquadaddui Quad clipped add of unsigned/ signed bytes dspumul Clipped unsigned 32-bit multiply dspimul Clipped signed 32-bit multiply dspisub Clipped signed 32-bit subtract dspumul Clipped unsigned 32-bit multiply dspusub Clipped unsigned 32-bit subtract dspidualmul Dual clipped multiply of signed 16-bit halfwords mergedual16lsb Merge dual-16 least-significant bytes dspisub Clipped signed 32-bit subtract dualasr dual-16 arithmetic shift right dspusub Clipped unsigned 32-bit subtract dualiclipi dual-16 clip signed to signed dspidualsub Dual clipped subtract of signed 16-bit halfwords dualuclipi dual-16 clip signed to unsigned dspidualmul ifir16 Signed sum of products of signed 16-bit halfwords Dual clipped multiply of signed 16-bit halfwords dspidualabs ifir8ii Signed sum of products of signed bytes Dual clipped absolute values of signed 16-bit halfwords dspidualadd ifir8iu Signed sum of products of signed/unsigned bytes Dual clipped add of signed 16bit halfwords dspidualsub ufir16 Unsigned sum of products of unsigned 16-bit halfwords Dual clipped subtract of signed 16-bit halfwords ifir16 ufir8uu Unsigned sum of products of unsigned bytes Signed sum of products of signed 16-bit halfwords ufir16 Unsigned sum of products of unsigned 16-bit halfwords pack16lsb Pack least-significant 16-bit halfwords pack16msb Pack most-significant 16-bit halfwords mergedual16lsb Merge dual-16 least-significant bytes mergelsb Merge least-significant bytes mergemsb Merge most-significant bytes pack16lsb Pack least-significant 16-bit halfwords pack16msb Pack most-significant 16-bit halfwords packbytes Pack least-significant bytes Byte averages quadavg Unsigned byte-wise quad average Byte multiplies quadumulmsb Unsigned quad 8-bit multiply most significant Motion estimation ume8ii Unsigned sum of absolute values of signed 8-bit differences ume8uu Unsigned sum of absolute values of unsigned 8-bit differences 4-2 ing the result(s) in the destination register. Otherwise, their naming follows the rules given above where appropriate. For example, the dspuquadaddui operation implements four 8-bit additions; it treats the first operand of each addition as unsigned, the second operand as signed, and produces an unsigned result for each addition. Each result, which is computed with no loss of precision, is clipped into the representable range of a byte (0..255). PRELIMINARY SPECIFICATION 16-bit Philips Semiconductors Custom Operations for Multimedia Table 4-2. Key Multimedia Custom Operations Listed by Operand Size Memory Location 31 Op. Size 8-bit Custom Op Description 0 n+0: a b c d 31 i m Unsigned bytewise quad max n+4: e f g h b f j n quadumin Unsigned bytewise quad min n+8: i j k l c g k o dspuquadaddui Quad clipped add of unsigned/ signed bytes m n o p d h l p ifir8ii Signed sum of products of signed bytes ifir8iu Signed sum of products of signed/unsigned bytes ufir8uu Unsigned sum of products of unsigned bytes mergelsb Merge least-significant bytes mergemsb Merge most-significant bytes packbytes Pack least-significant bytes quadavg Unsigned byte-wise quad average quadumulmsb Unsigned quad 8-bit multiply most significant ume8ii Unsigned sum of absolute values of signed 8-bit differences ume8uu Unsigned sum of absolute values of unsigned 8-bit differences Example Uses of Custom Ops The next three sections illustrate the advantages of using custom operations. Also, the more complex examples illustrate how custom operations can be integrated into application code by providing listings of C-language program fragments. The examples progress in complexity from simple to intricate; the most interesting examples are taken from actual multimedia codes, such as MPEG decompression. 4.2 e quadumax n+12: Transpose Row Major 4.1.3 0 a EXAMPLE 1: BYTE-MATRIX TRANSPOSITION The goal of this example is to provide a simple, introductory illustration of how custom operations can significantly increase processing speed in small kernels of applications. As in most uses of custom operations, the power of custom operations in this case comes from their ability to operate on multiple data items in parallel. Imagine that our task is to transpose a packed, 4-by-4 matrix of bytes in memory; the matrix might, for example, contain 8-bit pixel values. Figure 4-1 illustrates both the organization of the matrix in memory and the task to be performed in standard mathematical notation. Performing this operation with traditional microprocessor instructions is straight forward but time consuming. One way to perform the manipulation is to perform 12 loadbyte instructions (since only 12 of the 16 bytes need to be repositioned) and 12 store-byte instructions that place the bytes back in memory in their new positions. Another way would be to perform four load-word instructions, re- a e i m b f j n c g k o d h l p Column Major Transpose a b c d e f g h i j k l m n o p Figure 4-1. Byte-matrix transposition. Top shows byte matrices packed into memory words; bottom shows mathematical matrix representation. position the bytes in registers, and then perform four store-word instructions. Unfortunately, repositioning the bytes in registers would require a large number of instructions to properly shift and mask the bytes. Performing the 24 loads and stores makes implicit use of the shifting and masking hardware in the load/store units and thus yields a shorter instruction sequence. The problem with performing 24 loads and stores is that loads and stores are inherently slow operations because they must access at least the cache and possibly slower layers in the memory hierarchy. Further, performing byte loads and stores when 32-bit word-wide accesses run just as fast wastes the power of the cache/memory interface. We would prefer a fast algorithm that takes full advantage of cache/memory bandwidth while not requiring an inordinate number of byte-manipulation instructions. PNX1300 has instructions that merge and pack bytes and 16-bit halfwords directly and in parallel. Four of these instructions can be applied in this case to speed up the manipulation of bytes that are packed into words. Figure 4-2 shows the application of these instructions to the byte-matrix transposition problem, and the left side of Figure 4-3 shows a list of the operations needed to implement the matrix transpose. When assembled into actual PNX1300 instructions, these custom operations would be packed as tightly as dependencies allow, up to five operations per instruction. Note that a programmer would not need to program at this level (PNX1300 assembler). The matrix transpose would be expressed just as efficiently in C-language source code, as shown on the right side of Figure 4-3. The low-level code is shown here for illustration purposes only. The first sequence of four load-word operations in Figure 4-3 brings the packed words of the input matrix into registers R10, R11, R12, and R13. The next sequence of four merge operations produces intermediate results into registers R14, R15, R16, and R17. The next sequence of four pack operations could then replace the original operands or place the transposed matrix in separate registers if the original matrix operands were needPRELIMINARY SPECIFICATION 4-3 PNX1300/01/02/11 Data Book Philips Semiconductors ld32d(0) r100 → r10 ld32d(4) r100 → r11 ld32d(8) r100 → r12 ld32d(12) r100 → r13 char matrix[4][4]; . . . int *m = (int *) matrix; mergemsb r10 r11 → r14 mergemsb r12 r13 → r15 mergelsb r10 r11 → r16 mergelsb r12 r13 → r17 pack16msb r14 r15 → r18 pack16lsb r14 r15 → r19 pack16msb r16 r17 → r20 pack16lsb r16 r17 → r21 temp0 temp1 temp2 temp3 m[0] m[1] m[2] m[3] st32d(0) r101 r18 st32d(4) r101 r19 st32d(8) r101 r20 st32d(12) r101 r21 = = = = = = = = MERGEMSB(m[0], m[1]); MERGEMSB(m[2], m[3]); MERGELSB(m[0], m[1]); MERGELSB(m[2], m[3]); PACK16MSB(temp0, temp1); PACK16LSB(temp0, temp1); PACK16MSB(temp2, temp3); PACK16LSB(temp2, temp3); . . . Figure 4-3. On the left is a complete list of operations to perform the byte-matrix transposition of Figure 4-1 and Figure 4-2. On the left is an equivalent C-language fragment. ed for further computations (the PNX1300 optimizing C compiler performs this analysis automatically). In this example, the transpose matrix is placed in registers R18, R19, R20, and R21. The final four store-word operations put the transposed matrix back into memory. Thus, using the PNX1300 custom operations, the bytematrix transposition requires four load-word operations and four store-word operations (the minimum possible) and eight register-to-register data-manipulation operations. The result is 16 operations, or byte-matrix transposition at the rate of one operation per byte. While the advantage of the custom-operation-based algorithm over the brute-force code that uses 24 load- and store-byte instruction seems to be only eight operations (a 33% reduction), the advantage is actually much greater. First, using custom operations, the number of memory references is reduced from 24 to eight (a factor of three). Since memory references are slower than register-to-register operations (such as the custom operations in this example), the reduction in memory references is significant. Further, the ability of the PNX1300 VLIW compilation system to exploit the performance potential of the PNX1300 microprocessor hardware is enhanced by the custom-operation-based code. This is because it is easier for the compilation system to produce an optimal schedule (arrangement) of the code when the number of memory references is in balance with the number of register-to-register operations. The PNX1300 CPU (like all high-performance microprocessors) has a limit on the number of memory references that can be processed in a single cycle (two is the current limit). A long sequence of code that contains only memory references can result in empty operation slots in the long PNX1300 instructions. Empty operation slots waste the performance potential of the PNX1300 hardware. As this example has shown, careful use of custom operations has the potential to not only reduce the absolute number of operations needed to perform a computation but can also help the compilation system produce code that fully exploits the performance potential of the PNX1300 CPU. 4.3 EXAMPLE 2: MPEG IMAGE RECONSTRUCTION The complete MPEG video decoding algorithm is composed of many different phases, each with computational intensive kernels. One important kernel deals with reconstructing a single image frame given that the forwardand backward-predicted frames and the inverse discrete cosine transform (IDCT) results have already been computed. This kernel provides an excellent opportunity to illustrate of the power of PNX1300’s specialized custom operators. In the code fragments that follow, the backward-predicted block is assumed to have been computed into an array back[], the forward-predicted block is assumed to have been computed into forward[], and the IDCT results are assumed to have been computed into idct[]. Row Major a e i m b f j n c g k o d h l p Column Major mergemsb a e b f pack16msb mergemsb i m j n pack16lsb mergelsb c g d h pack16msb mergelsb k o l p pack16lsb a b c d e f g h i j k l m n o p Figure 4-2. Application of merge and pack instructions to the byte-matrix transposition of Figure 4-1. 4-4 PRELIMINARY SPECIFICATION Philips Semiconductors Custom Operations for Multimedia void reconstruct (unsigned char *back, unsigned char *forward, char *idct, unsigned char *destination) { int i, temp; for (i = 0; i < 64; i += 1) { temp = ((back[i] + forward[i] + 1) >> 1) + idct[i]; if (temp > 255) temp = 255; else if (temp < 0) temp = 0; destination[i] = temp; } } Figure 4-4. Straightforward code for MPEG frame reconstruction. A straightforward coding of the reconstruction algorithm might look as shown in Figure 4-4. This implementation shares many of the undesirable properties of the first example of byte-matrix transposition. The code accesses memory a byte at a time instead of a word at a time, which wastes 75% of the available bandwidth. Also, in light of the many quad-byte-parallel operations introduced in Section 4.1.2, “Introduction to Custom Operations,” it seems inefficient to spend three separate additions and one shift to process a single eight-bit pixel. Perhaps even more unfortunate for a VLIW processor like PNX1300 is the branch-intensive code that performs the saturation testing; eliminating these branches could reap a significant performance gain. After some experience is gained with custom operations, it is not necessary to unroll loops to discover situations where custom operations are useful. Often, a good programmer with knowledge of the function of the custom operations can see by simple inspection opportunities to exploit custom operations. Since MPEG decoding is the kind of task for which PNX1300 was created, there are two custom operations—quadavg and dspuquadaddui—that exactly fit this important MPEG kernel (and other kernels). These custom operations process four pairs of 8-bit pixel values in parallel. In addition, dspuquadaddui performs saturation tests in hardware, which eliminates any need to execute explicit tests and branches. takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. rsrc1 = [abcd], rsrc2 = [wxyz], and rdest = [pqrs] where a, b, c, d, w, x, y, z, p, q, r, and s are all unsigned eight-bit values. Then, quadavg computes the output vector [pqrs] as follows: For readers familiar with the details of MPEG algorithms, the use of eight-bit IDCT values later in this example may be confusing. The standard MPEG implementation calls for nine-bit IDCT values, but extensive analysis has shown that values outside the range [–128..127] occur so rarely that they can be considered unimportant. Pursuant to this observation, the IDCT values are clipped into the eight-bit range [–128..127] with saturating arithmetic before the frame reconstruction code runs. The assumption that this saturation occurs permits some of PNX1300’s custom operations to have clean, simple definitions. The first step in seeing how custom operations can be of value in this case, is to unroll the loop by a factor of four. The unrolled code is shown in Figure 4-5. This creates code that is parallel with respect to the four pixel computations. As it is easily seen in the code, the four groups of computations (one group per pixel) do not depend on each other. To understand how quadavg and dspuquadaddui can be used in this code, we examine the function of these custom operations. The quadavg custom operation performs pixel averaging on four pairs of pixels in parallel. Formally, the operation of quadavg is as follows: quadavg rscr1 rsrc2 -> rdest p q r s = = = = (a (b (c (d + + + + w x y z + + + + 1) 1) 1) 1) >> >> >> >> 1 1 1 1 The pixel averaging in Figure 4-5 is evident in the first statement of each of the four groups of statements. The rest of the code—adding idct[i] value and performing the saturation test—can be performed by the dspuquadaddui operation. Formally, its function is as follows: dspuquadaddui rsrc1 rsrc2 -> rdest takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. rsrc1 = [efgh], rsrc2 = [stuv], and rdest = [ijkl] where e, f, g, h, i, j, k, and l are unsigned 8-bit values; s, t, u, and v are signed 8-bit values. Then, dspuquadaddui computes the output vector [ijkl] as follows: i j k l = = = = uclipi(e uclipi(f uclipi(g uclipi(h + + + + s, t, u, v, 255) 255) 255) 255) The uclipi operation is defined in this case as it is for the separate PNX1300 operation of the same name described in Appendix A, “PNX1300/01/02/11 DSPCPU Operations,”. Its definition is as follows: PRELIMINARY SPECIFICATION 4-5 PNX1300/01/02/11 Data Book Philips Semiconductors void reconstruct (unsigned char *back, unsigned char *forward, char *idct, unsigned char *destination) { int i, temp; for (i = 0; i < 64; i += 4) { temp = ((back[i+0] + forward[i+0] + 1) >> 1) + idct[i+0]; if (temp > 255) temp = 255; else if (temp < 0) temp = 0; destination[i+0] = temp; temp = ((back[i+1] + forward[i+1] + 1) >> 1) + idct[i+1]; if (temp > 255) temp = 255; else if (temp < 0) temp = 0; destination[i+1] = temp; temp = ((back[i+2] + forward[i+2] + 1) >> 1) + idct[i+2]; if (temp > 255) temp = 255; else if (temp < 0) temp = 0; destination[i+2] = temp; temp = ((back[i+3] + forward[i+3] + 1) >> 1) + idct[i+3]; if (temp > 255) temp = 255; else if (temp < 0) temp = 0; destination[i+3] = temp; } } Figure 4-5. MPEG frame reconstruction code using PNX1300 custom operations; compare with Figure 4-4. uclipi (m, n) { if (m < 0) return 0; else if (m > n) return n; else return m; } To make is easier to see how these operations can subsume all the code in Figure 4-5, Figure 4-6 shows the same code rearranged to group the related functions. Now it should be clear that the quadavg operation can replace the first four lines of the loop assuming that we can get the individual 8-bit elements of the back[] and forward[] arrays positioned correctly into the bytes of a 32bit word. That, of course, is easy: simply align the byte arrays on word boundaries and access them with word (integer) pointers. Similarly, it should now be clear that the dspuquadaddui operation can replace the remaining code (except, of course, for storing the result into the destination[] array) assuming, as above, that the 8-bit elements are aligned and packed into 32-bit words. Figure 4-7 shows the new code. The arrays are now accessed in 32-bit (int-sized) chunks, the loop iteration control has been modified to reflect the ‘four-at-a-time’ operations, and the quadavg and dspuquadaddui operations have replaced the bulk of the loop code. Finally, Figure 4-8 shows a more compact expression of the loop code, eliminating the temporary variable. Note that PNX1300 C compiler does the optimization by itself. Again, note that the code in Figure 4-7 and Figure 4-8 assumes that the character arrays are 32-bit word 4-6 PRELIMINARY SPECIFICATION aligned and padded if necessary to fill an integral number of 32-bit words. The original code required three additions, one shift, two tests, three loads, and one store per pixel. The new code using custom operations requires only two custom operations, three loads, and one store for four pixels, which is more than a factor of six improvement. The actual performance improvement can be even greater depending on how well the compiler is able to deal with the branches in the original version of the code, which depends in part on the surrounding code. Reducing the number of branches almost always improves the chances of realizing maximum performance on the PNX1300 CPU. The code in Figure 4-8 illustrates several aspects of using custom operations in C-language source code. First, the custom operations require no special declarations or syntax; they appear to be simple function calls. Second, there is no need to explicitly specify register assignments for sources, destinations, and intermediate results; the compiler and scheduler assign registers for custom operations just as they would for built-in language operations such as integer addition. Third, the scheduler packs custom operations into PNX1300 VLIW instructions as effectively as it packs operations generated by the compiler for native language constructs. Thus, although the burden of making effective use of custom operations falls on the programmer, that burden consists only of discovering the opportunities for exploiting the operations and then coding them using standard C-language notation. The compiler and scheduler take care of the rest. Philips Semiconductors Custom Operations for Multimedia void reconstruct (unsigned char *back, unsigned char *forward, char *idct, unsigned char *destination) { int i, temp0, temp1, temp2, temp3; for (i = 0; { temp0 = temp1 = temp2 = temp3 = i < 64; i += 4) ((back[i+0] ((back[i+1] ((back[i+2] ((back[i+3] + + + + forward[i+0] forward[i+1] forward[i+2] forward[i+3] + + + + 1) 1) 1) 1) >> >> >> >> 1); 1); 1); 1); temp0 += idct[i+0]; if (temp0 > 255) temp0 = 255; else if (temp0 < 0) temp0 = 0; temp1 += idct[i+1]; if (temp1 > 255) temp1 = 255; else if (temp1 < 0) temp1 = 0; temp2 += idct[i+2]; if (temp2 > 255) temp2 = 255; else if (temp2 < 0) temp2 = 0; temp3 += idct[i+3]; if (temp3 > 255) temp3 = 255; else if (temp3 < 0) temp3 = 0; destination[i+0] destination[i+1] destination[i+2] destination[i+3] = = = = temp 0; temp1; temp2; temp3; } } Figure 4-6. Re-grouped code of Figure 4-5. void reconstruct (unsigned char *back, unsigned char *forward, char *idct, unsigned char *destination) { int i, temp; int int int int *i_back *i_forward *i_idct *i_dest = = = = (int (int (int (int *) *) *) *) back; forward; idct; destination; for (i = 0; i < 16; i += 1) { temp = QUADAVG(i_back[i], i_forward[i]); temp = DSPUQUADADDUI(temp, i_idct[i]); i_dest[i] = temp; } } Figure 4-7. Using the custom operation dspquadaddui to speed up the loop of Figure 4-6. 4.4 EXAMPLE 3: MOTION-ESTIMATION KERNEL Another part of the MPEG coding algorithm is motion estimation. The purpose of motion estimation is to reduce the cost of storing a frame of video by expressing the contents of the frame in terms of adjacent frames. A given frame is reduced to small blocks, and a subsequent frame is represented by specifying how these small blocks change position and appearance; usually, storing the difference information is cheaper than storing a whole block. For example, in a video sequence where the camera pans across a static scene, some frames can be expressed simply as displaced versions of their predecessor frames. To create a subsequent frame, most blocks are simply displaced relative to the output screen. The code in this example is for a match-cost calculation, a small kernel of the complete motion-estimation code. As with the previous example, this code provides an excellent example of how to transform source code to make the best use of PNX1300’s custom operations. PRELIMINARY SPECIFICATION 4-7 PNX1300/01/02/11 Data Book Philips Semiconductors void reconstruct (unsigned char *back, unsigned char *forward, char *idct, unsigned char *destination) { int i; int int int int *i_back *i_forward *i_idct *i_dest = = = = (int (int (int (int *) *) *) *) back; forward; idct; destination; for (i = 0; i < 16; i += 1) i_dest[i] = DSPUQUADADDUI(QUADAVG(i_back[i], i_forward[i]), i_idct[i]); } Figure 4-8. Final version of the frame-reconstruction code. unsigned char A[16][16]; unsigned char B[16][16]; . . . for (row = 0; row < 16; row += 1) { for (col = 0; col < 16; col += 1) cost += abs(A[row][col] – B[row][col]); } Figure 4-9. Match-cost loop for MPEG motion estimation. unsigned char A[16][16]; unsigned char B[16][16]; . . . for (row = 0; row < 16; row += 1) { for (col = 0; col < 16; col += 4) { cost += abs(A[row][col+0] – B[row][col+0]); cost += abs(A[row][col+1] – B[row][col+1]); cost += abs(A[row][col+2] – B[row][col+2]); cost += abs(A[row][col+3] – B[row][col+3]); Figure 4-10. Unrolled, but not parallel, version of the loop from Figure 4-9. Figure 4-9 shows the original source code for the matchcost loop. Unlike the previous example, the code is not a self-contained function. Somewhere early in the code, the arrays A[][] and B[][] are declared; somewhere between those declarations and the loop of interest, the arrays are filled with data. 4.4.1 A Simple Transformation First, we will look at the simplest way to use a PNX1300 custom operation. We start by noticing that the computation in the loop of Figure 4-9 involves the absolute value of the difference of two unsigned characters (bytes). By now, we are familiar with the fact that PNX1300 includes a number of operations that process all four bytes in a 32-bit word simultaneously. Since the match-cost calculation is fundamental to the MPEG algorithm, it is not surprising to find 4-8 PRELIMINARY SPECIFICATION a custom operation—ume8uu—that implements this operation exactly. To understand how ume8uu can be used in this case, we need to transform the code as in the previous example. Though the steps are presented here in detail, a programmer with a even a little experience can often perform these transformations by visual inspection. To use a custom operation that processes 4 pixel values simultaneously, we first need to create 4 parallel pixel computations. Figure 4-10 shows the loop of Figure 4-9 unrolled by a factor of 4. Unfortunately, the code in the unrolled loop is not parallel because each line depends on the one above it. Figure 4-11 shows a more parallel version of the code from Figure 4-10. By simply giving each computation its own cost variable and then summing the costs all at once, each cost computation is completely independent. Philips Semiconductors Custom Operations for Multimedia unsigned char A[16][16]; unsigned char B[16][16]; . . . for (row = 0; row < 16; row += 1) { for (col = 0; col < 16; col += 4) { cost0 = abs(A[row][col+0] – B[row][col+0]); cost1 = abs(A[row][col+1] – B[row][col+1]); cost2 = abs(A[row][col+2] – B[row][col+2]); cost3 = abs(A[row][col+3] – B[row][col+3]); cost += cost0 + cost1 + cost2 + cost3; Figure 4-11. Parallel version of Figure 4-10. unsigned char unsigned char . . . unsigned char unsigned char A[16][16]; B[16][16]; *CA = A; *CB = B; for (row = 0; row < 16; row += 1) { int rowoffset = row * 16; for (col = 0; col < 16; col { cost0 = abs(CA[rowoffset cost1 = abs(CA[rowoffset cost2 = abs(CA[rowoffset cost3 = abs(CA[rowoffset += 4) + + + + col+0] col+1] col+2] col+3] – – – – CB[rowoffset CB[rowoffset CB[rowoffset CB[rowoffset + + + + col+0]); col+1]); col+2]); col+3]); cost += cost0 + cost1 + cost2 + cost3; Figure 4-13. The loop of Figure 4-11 recoded with one-dimensional array accesses. Excluding the array accesses, the loop body in Figure 4-11 is now recognizable as the function performed by the ume8uu custom operation: the sum of 4 absolute values of 4 differences. To use the ume8uu operation, however, the code must access the arrays with 32-bit word pointers instead of with 8-bit byte pointers. Figure 4-13 shows the loop recoded to access A[][] and B[][] as one-dimensional instead of two-dimensional arrays. We take advantage of our knowledge of C-language array storage conventions to perform this code transformation. Recoding to use one-dimensional arrays prepares the code for transformation to 32-bit array accesses. (From here on, until the final code is shown, the declarations of the A and B arrays will be omitted from the code fragments for the sake of brevity.) unsigned int *IA = (unsigned int *) A; unsigned int *IB = (unsigned int *) B; for (i = 0; i < 64; i += 1) cost += UME8UU(IA[i], IB[i]); Figure 4-12. The loop of Figure 4-14 with the inner loop eliminated. Figure 4-14 shows the loop of Figure 4-13 recoded to use ume8uu. Once again taking advantage of our knowledge of the C-language array storage conventions, the one-dimensional byte array is now accessed as a one-dimensional 32-bit-word array. The declarations of the pointers IA and IB as pointers to integers is the key, but also notice that the multiplier in the expression for row offset has been scaled from 16 to 4 to account for the fact that there are 4 bytes in a 32-bit word. Of course, since we are now using one-dimensional arrays to access the pixel data, it is natural to use a single for loop instead of two. Figure 4-12 shows this streamlined version of the code without the inner loop. Since Clanguage arrays are stored as a linear vector of values, we can simply increase the number of iterations of the outer loop from 16 to 64 to traverse the entire array. The recoding and use of the ume8uu operation has resulted in a substantial improvement in the performance of the match-cost loop. In the original version, the code executed 1280 operations (including loads, adds, subtracts, and absolute values); in the restructured version, there are only 256 operations—128 loads, 64 ume8uu operations, and 64 additions. This is a factor of five reduction in the number of operations executed. Also, the PRELIMINARY SPECIFICATION 4-9 PNX1300/01/02/11 Data Book Philips Semiconductors unsigned int *IA = (unsigned int *) A; unsigned int *IB = (unsigned int *) B; for (row = 0; row < 16; row += 1) { int rowoffset = row * 4; for (col4 = 0; col4 < 4; col4 += 1) cost += UME8UU(IA[rowoffset + col4], IB[rowoffset + col4]); } Figure 4-14. The loop of Figure 4-13 recoded with 32-bit array accesses and the ume8uu custom operation. overhead of the inner loop has been eliminated, further increasing the performance advantage. 4.4.2 More Unrolling The code transformations of the previous section achieved impressive performance improvements, but given the VLIW nature of the PNX1300 CPU, more can be done to exploit PNX1300’s parallelism. The code in Figure 4-12 has a loop containing only 4 operations (excluding loop overhead). Since PNX1300’s branches have a 3-instruction delay and each instruction can contain up to 5 operations, a fully utilized minimumsized loop can contain 16 operations (20 minus loop overhead). The PNX1300 compilation system performs a wide variety of powerful code transformation and scheduling optimizations to ensure that the VLIW capabilities of the CPU are exploited. It is still wise, however, to make program parallelism explicit in source code when possible. Explicit parallelism can only help the compiler produce a fast running program. To this end, we can unroll the loop of Figure 4-12 some number of times to create explicit parallelism and help the compiler create a fast running loop. In this case, where the number of iterations is a power-of-two, it makes sense to unroll by a factor that is a power-of-two to create clean code. Figure 4-15 shows the loop unrolled by a factor of eight. The compiler can apply common sub-expression elimination and other optimizations to eliminate extraneous operations in the array indexing, but, again, improvements in the source code can only help the compiler produce the best possible code and fastest-running program. Figure 4-16 shows one way to modify the code for simpler array indexing. unsigned int *IA = (unsigned int *) A; unsigned int *IB = (unsigned int *) B; for (i = 0; { cost0 = cost1 = cost2 = cost3 = cost4 = cost5 = cost6 = cost7 = i < 64; i += 8) UME8UU(IA[i+0], UME8UU(IA[i+1], UME8UU(IA[i+2], UME8UU(IA[i+3], UME8UU(IA[i+4], UME8UU(IA[i+5], UME8UU(IA[i+6], UME8UU(IA[i+7], IB[i+0]); IB[i+1]); IB[i+2]); IB[i+3]); IB[i+4]); IB[i+5]); IB[i+6]); IB[i+7]); cost += cost0 + cost1 + cost2 + cost3 + cost4 + cost5 + cost6 + cost7; } Figure 4-15. Unrolled version of Figure 4-12. This code makes good use of PNX1300’s VLIW capabilities. unsigned char A[16][16]; unsigned char B[16][16]; . . . unsigned int *IA = (unsigned int *) A; unsigned int *IB = (unsigned int *) B; for (i = 0; 8) { cost0 = cost1 = cost2 = cost3 = cost4 = cost5 = cost6 = cost7 = i < 64; i += 8, IA += 8, IB += UME8UU(IA[0], UME8UU(IA[1], UME8UU(IA[2], UME8UU(IA[3], UME8UU(IA[4], UME8UU(IA[5], UME8UU(IA[6], UME8UU(IA[7], IB[0]); IB[1]); IB[2]); IB[3]); IB[4]); IB[5]); IB[6]); IB[7]); cost += cost0 + cost1 + cost2 + cost3 + cost4 + cost5 + cost6 + cost7; } Figure 4-16. Code from Figure 4-15 with simplified array index calculations. 4-10 PRELIMINARY SPECIFICATION Cache Architecture Chapter 5 by Eino Jacobs 5.1 MEMORY SYSTEM OVERVIEW In this document, the generic PNX1300 name refers to the PNX1300 Series, or the PNX1300/01/02/11 products. The separate on-chip data and instruction caches serve only the DSPCPU since the data access patterns of the autonomous I/O and graphics units exhibit little or no locality of reference (they access each piece of the multimedia data stream only once in each operation). The high-performance video and audio throughput of PNX1300 is implemented by its DSPCPU and autonomous I/O and co-processing units, but the foundation of this processing is the PNX1300 memory hierarchy. To get the full potential of the chip’s processing units, the memory hierarchy must read and write data (and DSP CPU instructions) fast enough to keep the units busy. Without the caches, the CPU would not be able to achieve its performance potential. SDRAM has enough bandwidth to handle serial streams of multimedia data, but its bandwidth and latency are insufficient to satisfy the CPU’s high rate of random data accesses and repeated instruction accesses. To meet the requirements of its target applications, PNX1300’s memory hierarchy must satisfy the conflicting goals of low cost, simple system design (e.g., low parts count), and high performance. Since multimedia video streams can require relatively large temporary storage, a significant amount of external DRAM is required. Minimizing the cost of bulk memory is important. Table 5-1. 100-MHz PNX1300 memory bandwidth parameters PNX1300’s memory system achieves a good compromise between cost and performance by coupling substantial on-chip caches with a glueless interface to synchronous DRAM (SDRAM). SDRAM provides higher bandwidth than standard DRAM for only a small cost premium. A block diagram of the memory system is shown in Figure 5-1. SDRAM permits PNX1300 to use a narrower and simpler interface than would be required to achieve similar performance with standard DRAM. Three sets, each has address, opcode, condition, and guard Three Branch Units Decompressor VLIW CPU Magnitude 2800 MB/s 800 MB/s Data bandwidth (two 32-bit memory ports) 400 MB/s Main-memory bandwidth (one 32-bit port) Table 5-1 shows bandwidth parameters for the PNX1300 DSPCPU and the main-memory interface. Although 400 MB/s is a lot of bandwidth, it is clear that the SDRAM alone cannot keep up with the CPU’s maximum requirements for instructions and data. Luckily, multimedia algorithms resemble other computer programs in terms of locality of reference, so the on-chip caches typically supply Internal data highway: 32-bit address, 32-bit data 32KB, 8-way Instruction Cache Main Memory Interface 224 bits of decompressed instruction Two Memory Units Use Instruction bandwidth (224 bits/instruction) SDRAM Main Memory 16KB, 8-way Data Cache Two sets, each has a guard, opcode, data, and two address components To on-chip peripherals Main-memory bus: glueless, SDRAM control with 32-bit data Figure 5-1. The main components of the PNX1300 memory system. PRELIMINARY SPECIFICATION 5-1 PNX1300/01/02/11 Data Book Philips Semiconductors PNX1300’s processing units access the external SDRAM through the on-chip central “data highway” bus. The highway consists of separate 32-bit address and data buses, and use of the bus is mediated by the mainmemory interface unit. The main-memory interface contains the SDRAM controller and a central arbiter that determines how much of the available SDRAM memory bandwidth is allocated to each unit. Unused bandwidth is always made available to the VLIW CPU for cache refill and memory accesses that bypass the caches. the majority of instructions and data to the DSPCPU. The wide paths to the caches are matched to the bandwidth requirements of the DSPCPU. Table 5-2. Summary of memory system characteristics Unit Branch units Decompression unit Description Branch units execute branch operations. Up to three branch operations can be executed in parallel, but the program must guarantee that only one branch is taken. Table 5-2 gives a summary description of each component of PNX1300’s memory system. Instructions are stored in memory and in the instruction cache in a space-saving, compressed format. The decompression unit expands instructions to their full, 28-byte size before they are issued to the CPU. Instruction cache The instruction cache holds 32 KB, is 8-way set-associative, and has a 64-byte block size. A miss in a block causes the entire block to be read from SDRAM. The cache can sustain an issue rate of one instruction per cycle on cache hits. Memory units Memory units execute load and store operations. The data cache is dual ported to allow the memory units to operate concurrently. Data cache The data cache holds 16 KB, is 8-way setassociative, has a 64-byte block size, and implements a copyback, allocate-on-write policy. A miss in a block causes the entire block to be read from SDRAM. The cache supports memory-mapped I/O through non-cacheable address regions. Data highway The on-chip data highway bus serves all onchip units. The highway has separate 32-bit data and address buses. Bus bandwidth is allocated by the highway arbiter according to one of several modes. 5.2 PNX1300 implements a 32-bit linear address space of bytes. Within that address space, PNX1300 supports several different apertures for specific purposes. The DRAM aperture describes the part of the address space into which the external SDRAM is mapped. SDRAM must consist of a single, contiguous region of memory, which is the most practical configuration for PNX1300 systems. The location and size of the DRAM aperture is defined by two registers, DRAM_BASE and DRAM_LIMIT. These registers are both readable and writeable as MMIO registers and as PCI configuration space registers. The view of the registers in MMIO space is shown in Figure 5-2. The view of the registers in PCI configuration space is described in Chapter 11, “PCI Interface.” In normal operation, the base address registers are assigned once during boot and not changed when the DSPCPU is running. Refer to Chapter 11, “PCI Interface,” and Chapter 13, “System Boot,” for a description of this process. DRAM_LIMIT must be set equal to DRAM_BASE plus the actual size of SDRAM present. The amount of the SDRAM is not required to be a power of 2, but it must be a multiple of 64 KB. Note that the size of the aperture as set in the PCI configuration space can be larger, because it must be a power of 2. Main-memory The main-memory interface contains the datainterface highway access arbiter, the SDRAM controller, and MMIO logic. SDRAM main memory External SDRAM connects gluelessly to PNX1300 over the 32-bit main-memory bus. A memory operation will access SDRAM if its address satisfies: To improve cache behavior and thus program performance, the caches have a locking mechanism. In addition, the instruction cache is coupled with an instruction decompression unit. The compressed instruction format improves the cache hit rate and reduces the bus bandwidth required between main memory and cache. Instructions in main memory and cache use the compressed format. MMIO_BASE offset: 31 0x10 0000 DRAM_BASE (r/w) 0x10 0004 DRAM_LIMIT (r/w) 27 DRAM APERTURE [DRAM_BASE] ≤ address < [DRAM_LIMIT] Any address outside this range cannot access SDRAM. When PNX1300 is reset, DRAM_BASE_FIELD is set to 0x0 and DRAM_LIMIT is set to 0x0010 0000 (1-MB DRAM aperture starting at address 0x0). The boot process described in Chapter 13, “System Boot,” overrides these initial settings. 23 DRAM_BASE_FIELD 19 DRAM_LIMIT_FIELD Figure 5-2. Formats of the DRAM_BASE and DRAM_LIMIT registers. 5-2 PRELIMINARY SPECIFICATION 15 11 7 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Philips Semiconductors 5.3 Cache Architecture 5.3.1 DATA CACHE The PNX1300 data cache is 16 KB in size with a 64-byte block size. Thus, it contains 256 blocks each with its own address tag. The cache is 8-way set-associative, so there are 32 sets, each containing 8 tags. A single valid bit is associated with a block, so each block and associated address tag is either entirely valid in the cache or invalid. On a cache miss, 64 bytes are read from SDRAM to make the entire block valid. The data cache serves only the DSPCPU and is controlled by two memory units that execute the load and store operations issued by the DSPCPU. The following sections describe the data cache and its operation; Table 5-3 summarizes the important characteristics for easy reference. Table 5-3. Summary of data cache characteristics Characteristic Cache size Each block also contains a dirty bit, which is set whenever a write to the block occurs. Each set contains 10 bits to support the hierarchical LRU replacement policy. PNX1300 Implementation 16 KB Cache associativity 8-way set-associative Block size 64 bytes Valid bits One valid bit per 64-byte block Dirty bits One dirty bit per 64-byte block Miss transfer order Miss transfers begin with the critical word first Replacement policies Copyback, allocate on write, hierarchical LRU Endianness Either little- or big-endian, determined by PCSW bit Ports The cache is quasi dual ported; two accesses can proceed concurrently if they reference different banks (determined by bits [4:2] of the computed addresses) Alignment General Cache Parameters The geometry of the data cache is available to software by reading the MMIO register DC_PARAMS. Figure 5-3 shows the format of the DC_PARAMS register; Table 5-4 lists its field values. The product of block size, associativity, and number of sets gives the total cache size (16 KB in this case). Table 5-4. DC_PARAMS field values Field Name Value BLOCK SIZE 8 NUMBER_OF_SETS 32 5.3.2 Access must be naturally aligned (32-bit words on 32-bit boundaries, 16-bit halfwords on 16-bit boundaries); the appropriate number of LSBs of un-naturally aligned addresses are set to zero. For misaligned stores, PCSW.MSE is asserted to generate an exception 64 ASSOCIATIVITY Address Mapping PNX1300 data addresses are mapped onto the data cache storage structure as shown in Figure 5-4. A data address is partitioned into four fields as described in Table 5-5. Table 5-5. Data address field partitioning Partial word operations The cache implements 8-bit and 16-bit accesses with the same performance as 32-bit accesses Field Operation latency Three cycles for both load and store operations Address Bits Byte 1..0 Coherency enforce- Software uses special operations to ment enforce cache coherency Byte offset within a word for byte or halfword accesses Cache locking Non-cacheable region Purpose Word 5..2 Up to 1/2 (four out of 8 blocks of each set) of the cache contents can be locked; granularity is 64-byte Selects one of the words in a set (one of 16 words in the case of PNX1300) Set 10..6 Selects one of the sets in the cache (one of 32 in the case of PNX1300) One non-cacheable aperture in the DRAM address space is supported. Tag 31..11 Compared against address tags of set members MMIO_BASE offset: 31 27 23 0x10 001C DC_PARAMS (r/o) 19 BLOCKSIZE 15 11 ASSOCIATIVITY 7 3 0 NUMBER_OF_SETS Figure 5-3. Format of the DC_PARAMS register. 31 Data Cache Address 11 Tag 10 6 Set 5 2 Word 1 0 Byte Figure 5-4. Data cache address partitioning. PRELIMINARY SPECIFICATION 5-3 PNX1300/01/02/11 Data Book 5.3.3 Miss Processing Order When a miss occurs, the data cache fills the block containing the requested word from the critical word first. The CPU is stalled until the first word is transferred. The block is then filled up while the CPU keeps running. 5.3.4 Replacement Policies, Coherency The cache implements a copyback replacement policy with one dirty bit per 64-byte block. Thus, when a miss occurs and the block selected for replacement has its dirty bit set, the dirty block must be written to main memory to preserve its modified contents. On PNX1300, the dirty block is written to memory before the needed block is fetched. Coherency is not maintained in any way by hardware between the data cache, the instruction cache, and main memory. Special operations are available to implement cache coherency in software. See Section 5.6, “Cache Coherency,” for a discussion of coherency issues. Write misses are handled with an allocate-on-write policy—the write that caused the miss stores its data in the cache after the missing block is fetched into the cache. The cache implements a hierarchical LRU replacement algorithm to determine which of the eight elements (blocks) in a set is replaced. The algorithm partitions the eight set elements into four groups, each group with two elements. The hierarchical LRU replacement victim is determined by selecting the least-recently used group of two elements and then selecting the least-recently used element in that group. This hierarchical algorithm yields performance close to full LRU but is simpler to implement. See Section 5.5, “LRU Algorithm,” for a full discussion of the LRU algorithm. 5.3.5 Alignment, Partial-Word Transfers, Endian-ness The cache implements 32-bit word, 16-bit half-word, and 8-bit byte transfers. All transfers, however, must be to addresses that are naturally aligned; that is, 32-bit words must be aligned on 32-bit boundaries, and 16-bit halfwords must be aligned on 16-bit boundaries. Like other PNX1300 processing units, the CPU has the capability to use either big- or little-endian byte order. It is recommended that all units and the CPU run with the same endian-ness. Detailed endian-ness description can be found in Appendix C, “Endian-ness.” 5.3.6 Dual Ports To allow two accesses to proceed in parallel, the data cache is quasi-dual ported. The cache is implemented as eight banks of single-ported memory, but the hardware allows each bank to operate independently. Thus, when the addresses of two simultaneous accesses select two different banks, both accesses can complete simultaneously. Bank selection is determined by the three loworder address bits [4..2] of each address. Thus, the 5-4 PRELIMINARY SPECIFICATION Philips Semiconductors words in a 64-byte cache block are distributed among the eight blocks, which prevents conflicts between two simultaneously issued accesses to adjacent words in a cache block. The PNX1300 compiling system attempts to avoid bank conflicts as much as possible. The dual-ported cache can execute the load and store opcodes (ild8d, uld8d, ild16d, uld16d, ld32d, h_st8d, h_st16d, h_st32d, ild8r, uld8r, ild16r, uld16r, ld32r, ild16x, uld16x, ld32x) in either or both of the two ports. The special opcodes alloc, dcb, dinvalid, pref, rdtag and rdstatus can only be executed in the second port, not in the first port. Whenever any of these special opcodes is issued in the second port, there should not be a concurrent load or store operation in the first. This is a special scheduling constraint. 5.3.7 Cache Locking The data cache allows the contents of up to one-half of its blocks to be locked. Thus, on PNX1300, up to 8 KB of the cache can be used as a high-speed local data memory. Only four out of eight blocks in any set can be locked. A locked block is never chosen as a victim by the replacement algorithm; its contents remain undisturbed until either (1) the block’s locked status is changed explicitly by software, or (2) a dinvalid operation is executed that targets the locked block. Cache locking occurs only for the data in the address range described by the MMIO registers DC_LOCK_ADDR and DC_LOCK_SIZE. The granularity of the address range is one 64-byte cache block. The MMIO register DC_LOCK_CTL contains the cache-locking enable bit DC_LOCK_ENABLE. Figure 5-5 shows the layout of the data-cache lock registers. Locking will occur for an address if locking is enabled and both of the following are true: 1. The address is greater than or equal to the value in DC_LOCK_ADDR. 2. The address is less than the sum of the values in DC_LOCK_ADDR and DC_LOCK_SIZE. Programmers (or compilers) must combine all data that needs to be locked into this single linear address range. Setting DC_LOCK_ENABLE to ‘1’ causes the following sequence of events: 1. All blocks that are in cache locations that will be used for locking are copied back to main memory (if they are dirty) and removed from the cache. 2. All blocks in the lock range are fetched from main memory into the cache. If any block in the lock range was already in the cache, it’s first copied back into main memory (if it’s dirty) and invalidated. 3. The LRU status of any set that contains locked blocks is set to the initialization value. 4. Cache locking is activated so that the locked blocks cannot be victims of the replacement algorithm. This sequence of events is triggered by writing ‘1’ to DC_LOCK_ENABLE even if the enable is already set to Philips Semiconductors MMIO_BASE offset: 0x10 0010 DC_LOCK_CTL (r/w) Cache Architecture APERTURE_CONTROL 31 27 23 19 15 11 6 7 5 3 0 reserved 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DC_LOCK_ENABLE 0x10 0014 DC_LOCK_ADDR (r/w) 0x10 0018 DC_LOCK_SIZE (r/w) DC_LOCK_ADDRESS 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 DC_LOCK_SIZE 0 0 0 0 0 0 Figure 5-5. Formats of the registers in charge of data-cache locking. Table 5-6. Aperture control field ‘1’. Setting DC_LOCK_ENABLE to ‘0’ causes no action except to allow the previously locked blocks to be replacement victims. Value To program a new lock range, the following sequence of operations is used: Memory map properties 00 (RESET) Normal operation memory map (Section 3.4.1): • loads to 0..0xff always return 0 and cause no PCI read (memory hole is enabled) • PCI aperture(s) are enabled 1. Disable cache locking by writing ‘0’ to DC_LOCK_ENABLE. 2. Define a new lock range by writing to DC_LOCK_ADDR and DC_LOCK_SIZE. 3. Enable cache locking by writing ‘1’ to DC_LOCK_ENABLE. Dirty locked blocks can be written back to main memory while locking is enabled by executing copyback operations in software. 01 • loads to address 0..0xff cause a PCI read, i.e. the memory hole is disabled • PCI aperture(s) are enabled 10 PCI apertures are disabled for loads • loads return a 0 and cause no PCI read 11 RESERVED for future extensions 5.3.9 Programmer’s note: Software should not execute dinvalid operations on a locked block. If it does, the block will be removed from the cache, creating a ‘hole’ in the lock range (and the data cache) that cannot be reused until locking is deactivated. Non-cacheable Region Cache locking is disabled by default when PNX1300 is reset. The data cache supports one non-cacheable address region within the DRAM address space aperture. The base address of this region is determined by the value in the DRAM_CACHEABLE_LIMIT MMIO register, which is shown in Figure 5-6. Since uncached memory operations always incur many stall cycles, the non-cacheable region should be used sparingly. The RESERVED field in DC_LOCK_CTL should be ignored on reads and written as all zeroes. A memory operation is non-cacheable if its target address satisfies: Locking should not be enabled by PCI accesses to the MMIO registers. [dram_cacheable_limit]

下载 PDF

PNX1301EH,557 价格&库存

-> 查询更多价格&库存

很抱歉，暂时无法提供与“PNX1301EH,557”相匹配的价格&库存，您可以联系我们找货

免费人工找货

搜索历史

PNX1301EH,557

相关技术文章