• SPARC® high-performance RISC architecture
• 2 Kbytes 2-way set associative instruction cache
• 2 Kbytes 2-way set associative data cache
• Flexible locking mechanism for data and instruction cache entries
• Harvard-style separate instruction and data buses on-chip
• 8 window, 136 word register file
• Fast interrupt response time
• 247 address spaces, 4 Gbyte each
• User and supervisor modes
• Buffered writes and instruction pre-fetching
• Fast page-mode DRAM support
• Programmable address decoder and wait-state generator
• 16-bit auto reload timer
• On-chip clock generator circuit
• JTAG test interface
• Emulator support hardware
• Single vector trapping
• 0.8 micron gate, 3 level metal CMOS technology
On-chip data and instruction caches are included to help decouple the processor from external memory latency. Separate on-chip instruction and data paths provide a high bandwidth interface between the Integer Unit (IU) and caches. Chip select outputs, programmable wait-state generation and built-in support for a high performance connection to page-mode DRAM are included to maximize system performance with minimum glue logic. See MB86930 block diagram on page 5.
On-chip support for debug and diagnostic tools allows direct connection to hardware emulators and improves debug capability when using ROM-based monitors.
These features combine to give the MB86930 superior speed, flexibility and efficiency to make it the ideal choice for a wide variety of low-cost, high-performance embedded systems.
PIN CONFIGURATION
PIN ASSIGNMENT - 179-PIN PGA
ORDERING CODE
Note: The ordering code is for production level product. Early shipments of this device may be marked with "ES" to indicate that the part
is not yet at full production status. Contact your local Fujitsu representative for additional information on "ES" level products.
PIN ASSIGNMENT - 208-PIN QFP
BLOCK DIAGRAM
SIGNAL DESCRIPTIONS
The MB86930 instruction set is streamlined and hardwired for fast execution with most instructions executing in a single cycle. The Integer Unit (IU) features a 5-stage pipeline which has been designed to handle data interlocks, has an optimized branch handler for efficient control transfers, and a bus interface to handle single cycle bus accesses to on-chip memory.
An internal register file consisting of 136 registers organized into eight overlapping windows speeds interrupt response time and context switches. The register file minimizes accesses to memory during procedure linkages and facilitates passing of parameters and assignment of variables.
On-chip 2 Kbyte data and instruction caches have been added to decouple the processor from external memory. These caches have been designed with maximum flexibility in mind and allow entries to be locked to improve overall system performance.
Separate 32-bit on-chip instruction and data paths provide a high bandwidth interface between the IU and on-chip cache. These buses support single cycle instruction execution as well as single cycle data transfers with the cache. Future expansion of the MB86930 design is supported by this bus definition as well.
The MB86930 also includes hardware for integer multiply and divide. The hardware support significantly improves the performance of these operations with 32-bit integer multiplies executing in 5 clock cycles, 16-bit integer multiplies in 3 cycles, 8-bit integer multiplies in 2 cycles, and a multiply by zero can complete in a single cycle.
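The cycle counts quoted above can be folded into a small timing model, for example when budgeting cycles for an inner loop. This is a sketch only: the name `multiply_cycles` is hypothetical, and it assumes the cycle count is selected by the magnitude of an unsigned multiplicand (zero, fits in 8 bits, fits in 16 bits, otherwise 32 bits). How the hardware treats signed operands is not stated here.

```python
def multiply_cycles(multiplicand: int) -> int:
    """Estimated clock cycles for an integer multiply (illustrative model).

    Assumption: the multiplicand is an unsigned 32-bit value and the
    cycle count depends only on how many of its bits are significant.
    """
    if not 0 <= multiplicand < 2**32:
        raise ValueError("multiplicand must fit in 32 bits")
    if multiplicand == 0:
        return 1          # multiply by zero completes in a single cycle
    if multiplicand < 2**8:
        return 2          # 8-bit multiplicand
    if multiplicand < 2**16:
        return 3          # 16-bit multiplicand
    return 5              # full 32-bit multiplicand
```

Under these assumptions, scaling by a small constant such as 200 costs 2 cycles, while a full-width operand costs 5.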
KEY FEATURES
Fast Instruction Execution: Simple functions make up the bulk of instructions in most programs, so execution speed can be greatly improved by designing these instructions to execute in as short a time as possible. The majority of instructions execute in one cycle, with only a few of the more complex ones, such as integer multiply, taking additional cycles.
Large Register Set: The large register set reduces the number of required accesses to data memory. The registers are organized in overlapping groups called register windows which allows registers to be reserved for high priority tasks, such as interrupts, or for recurring requirements such as operating system working registers. The overlapping windows also simplify parameter passing during procedure linkage and reduce code in most programs.
On-Chip Caches: To decouple the speed of the processor from the memory sub-system, data and instruction caches have been added. The caches are organized as two-way set-associative for improved hit rates. In addition, the set-associative caches allow entries to be locked, individually or as a bank, without significantly degrading the cache performance.
Cache Locking: Both data and instruction entries can be locked into their respective caches to ensure deterministic response and highest performance for critical or frequently recurring routines. Maximum flexibility has been designed into the cache to allow all or selected portions to be locked.
Bus Interface: The requirement for glue logic between the MB86930 and the system is minimized by providing programmable chip selects, programmable wait-state circuitry, and support for connection to fast page-mode DRAM. Multiple bus masters are supported through a simple handshake protocol.
Clock Generator: To simplify the clock design, a crystal can be connected directly to the on-chip oscillator or an external clock source can be used. A built-in phase-locked loop minimizes the skew between on-chip and off-chip clocks.
Enhanced Instruction Set: The MB86930 incorporates a fast integer multiply instruction which executes in 5, 3 or 2 cycles for 32-bit, 16-bit or 8-bit multiplicands, respectively. An integer divide-step instruction cuts divide times by a factor of 10 over previous SPARC implementations. A scan instruction supports a single cycle search for the most significant 1 or 0 in a word.
Fully Static Circuit Design: Embedded applications that need a means to reduce power consumption can take advantage of the MB86930's fully static design. The processor clock can be slowed or stopped for arbitrary periods of time to reduce operating current with no loss of internal state. Noise immunity is improved as well. (Note: stopping the clock will result in the Phase-Lock Loop losing lock. Lock must be re-established before normal operation can be resumed.)
Test and Debug Interface: The MB86930 supports production test through industry standard JTAG boundary scan. Hardware emulation is supported with on-chip breakpoint and single step logic. A dedicated emulator bus provides a means to trace transactions between the integer unit and on-chip cache.
CPU
The MB86930 core is a high performance full
custom implementation of the SPARC architecture. The
core is compact to leave room for peripheral integration
and yet is designed in a way to allow the major blocks to
be customized for varying application requirements. The
core is made up of three functional units: the Instruction
block, the Address block and the Execute block (see
Figure 1).
A five stage instruction pipeline is responsible for decoding all instructions and generating the control signals to the other blocks. The 5-stage pipeline consists of Fetch (F), Decode (D), Execute (E), Memory (M) and Writeback (W). Instruction memory is addressed and returns instructions in the (F) stage, the register file is addressed and returns operands in the (D) stage, the ALU computes results in the (E) stage, external memory is addressed in the (M) stage, and the register file is written back in the (W) stage.
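The overlap between the five stages can be visualized with a short script. This is a minimal sketch assuming an ideal pipeline with no interlocks, cache misses or multi-cycle instructions; the function name `pipeline_chart` is hypothetical.

```python
STAGES = ["F", "D", "E", "M", "W"]

def pipeline_chart(num_instructions: int) -> list:
    """Return one text row per instruction, one column per clock cycle.

    Assumption: instruction i enters the F stage in cycle i and advances
    one stage per cycle without stalling.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        cells = ["."] * total_cycles
        for s, name in enumerate(STAGES):
            cells[i + s] = name      # stage s of instruction i occupies cycle i + s
        rows.append(" ".join(cells))
    return rows

for row in pipeline_chart(3):
    print(row)
```

Each column is one clock cycle. Once the pipeline is full, one instruction reaches the W stage every cycle, which is what single cycle execution means here.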
TABLE 1. MB86930 Instruction Set
ADDRESS SPACE
The MB86930 offers a large addressing range and allows separate user and supervisor spaces to be defined. In addition to 32 address lines, 8 alternate address space identifier (ASI) lines distinguish between protected and unprotected space. Of the 256 possible ASI values, two define accesses to user data and user instruction space while the remaining ASI values define supervisor space.
Anytime a reset, synchronous trap or asynchronous trap occurs, the processor is placed into the supervisor mode. In this mode, the processor executes instructions and moves data out of supervisor space. While in supervisor mode, the processor also has access to the remaining ASI values. Except for those mentioned and those reserved for control register space, the remaining ASI values can be used to access other alternate data spaces defined by the application.
The distinction of user versus supervisor space allows the hardware to protect against accidental or un-authorized access to system resources. For real time operating system (RTOS) development for example, the separate spaces provide a mechanism for effectively partitioning RTOS space from user space.
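The user/supervisor split described above can be expressed as a simple check. The sketch below assumes the ASI assignments of the SPARC V8 architecture for the four standard spaces (0x08 to 0x0B) apply unchanged to this device; the function name `access_allowed` is hypothetical and not part of any Fujitsu tool.

```python
# ASI values for the four standard spaces as assigned by SPARC V8
# (assumed to apply unchanged to this device).
ASI_USER_INSTRUCTION = 0x08
ASI_SUPERVISOR_INSTRUCTION = 0x09
ASI_USER_DATA = 0x0A
ASI_SUPERVISOR_DATA = 0x0B

USER_ASIS = {ASI_USER_INSTRUCTION, ASI_USER_DATA}

def access_allowed(asi: int, supervisor_mode: bool) -> bool:
    """True if an access with this ASI is permitted in the current mode."""
    if not 0 <= asi <= 0xFF:
        raise ValueError("ASI is an 8-bit value")
    # The two user ASIs are reachable from either mode; every other
    # value is treated as supervisor space.
    return asi in USER_ASIS or supervisor_mode
```

With this model, 254 of the 256 ASI values are rejected in user mode, which matches the statement that only two values define user space.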
REGISTERS
The MB86930 register set is divided into those used for general purpose functions and those used for control and status.
The 136 general purpose registers are divided into 8 global registers and 8 overlapping blocks or "windows". Each window contains 24 registers. Of these, 8 are local to the window, 8 "out" registers overlap with the next window and 8 "in" registers overlap with the previous window (see Figure 2).
This organization makes it easy to pass parameters to subroutines. Parameters that are to be passed along are written to the "out" registers and the subsequent procedure call decrements the window pointer to make a new set of registers available. The passed parameters are now available to the subroutine in the current window's "in" registers.
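The overlap can be checked with a small model that maps a window number and an architectural register number to a physical slot. Register numbering follows the SPARC convention (r0-r7 globals, r8-r15 outs, r16-r23 locals, r24-r31 ins); the physical slot layout and the names `physical_index` and `call_window` are assumptions made for illustration only.

```python
NUM_WINDOWS = 8
NUM_GLOBALS = 8
UNIQUE_REGS_PER_WINDOW = 16   # 8 locals + 8 outs; the ins are shared with a neighbour

def physical_index(cwp: int, reg: int) -> int:
    """Map register r0..r31 of window `cwp` to a physical slot 0..135."""
    if not (0 <= cwp < NUM_WINDOWS and 0 <= reg < 32):
        raise ValueError("cwp must be 0..7 and reg 0..31")
    if reg < NUM_GLOBALS:
        return reg                                   # globals are shared by all windows
    windowed = NUM_WINDOWS * UNIQUE_REGS_PER_WINDOW  # 128 windowed slots
    return NUM_GLOBALS + (cwp * UNIQUE_REGS_PER_WINDOW + (reg - 8)) % windowed

def call_window(cwp: int) -> int:
    """Window selected by a procedure call: the pointer is decremented modulo 8."""
    return (cwp - 1) % NUM_WINDOWS
```

In this model the caller's first "out" register (r8) and the callee's first "in" register (r24) resolve to the same physical slot, and the set of distinct slots over all windows has 136 members: 8 globals plus 8 x 16 windowed registers.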
Register windows improve performance in embedded applications because they function as local variable caches which retain interrupt, subroutine, context or operating system variables with no additional overhead. In addition, code size can be reduced by turning off in-lining compiler optimizations and relying on the efficient execution of procedure linkage instead.
The register file has three read ports and one write port. The use of a four port register file allows even store instructions, which may require that three operands be read out of the register file, to proceed at one instruction per cycle.
The control and status registers include those defined by the SPARC architecture (see Table 1) and those mapped into alternate address space to control peripheral functions (see Table 2).
INSTRUCTION SET
The MB86930 is upward code compatible with other SPARC processors. Additional instructions, previously not directly supported, have been added to improve performance in embedded applications. Integer multiply, integer divide step, and scan for first changed bit have been added to the already powerful SPARC instruction set. See Table 1 for a list of supported instructions.
INTERRUPTS
A key measure of a processor's suitability for use in embedded applications is its ability to handle interrupts with a minimum of delay and in a deterministic fashion. The MB86930 implementation has been tailored to ensure not only low average latency but low maximum latency as well.
Interrupt response time is the sum of two components: the time the processor takes to finish its current task after recognizing an interrupt, and the time it takes to begin executing interrupt service routine instructions. The MB86930 implements numerous features to minimize both.
To minimize the time it takes to finish the current task, the MB86930 is designed so that tasks can either be interrupted or completed in a minimum of cycles. Implementation details that accomplish this aim include cache line misses that are filled one word at a time through a pre-fetch buffer, integer divide that is interruptible through the use of a divide step instruction, fast multiply and a 1 word write buffer to limit pending bus transactions.
To minimize the time required to start executing the interrupt service routine the processor switches to a new register window when an interrupt is detected. This feature allows the service routine to be executed without first requiring that the current registers be saved. The user can also elect to lock the service routine into the cache. This makes the routine available for immediate access. The on-chip data cache can also serve the service routine as a fast local stack for minimum delay in accessing routine variables.
The MB86930 provides for up to 15 different interrupt levels and direct support for 15 separate interrupt sources. The highest interrupt level is non-maskable.
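The masking behaviour can be summarized in a few lines. The sketch below encodes the interrupt acceptance rule of the SPARC V8 architecture and assumes it applies unchanged to this device; the function name `interrupt_taken` is hypothetical.

```python
def interrupt_taken(irl: int, pil: int, traps_enabled: bool) -> bool:
    """SPARC V8 interrupt acceptance rule (assumed to apply unchanged here).

    irl: requested interrupt level, 0 (no request) to 15
    pil: processor interrupt level field of the PSR, 0 to 15
    traps_enabled: state of the PSR enable-traps (ET) bit
    """
    if not (0 <= irl <= 15 and 0 <= pil <= 15):
        raise ValueError("levels are 4-bit values")
    if irl == 0 or not traps_enabled:
        return False
    # Level 15 cannot be masked by PIL; lower levels must exceed PIL.
    return irl == 15 or irl > pil
```

For example, with PIL set to 7 a level 7 request stays pending, a level 8 request is taken, and a level 15 request is taken regardless of PIL as long as traps are enabled.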
CACHE
The MB86930 has separate on-chip data and instruction caches. This allows the user to build a high performance system without incurring the cost of requiring fast external memory and the associated control logic.
The data and instruction caches are each organized as two banks of sixty-four 16-byte lines (see Figure 4). The lines are organized as two-way set-associative for good performance even when cache locking is in effect. Lines are divided into four sub-blocks each four bytes wide. On a cache miss, the cache is updated in sub-block increments for efficient re-fill of typical code segments and to avoid interrupt latency incurred by long cache line replacements. An instruction pre-fetch buffer fetches the next sequential instruction anticipating that it will be needed to fill the next instruction cache miss.
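The geometry above implies a particular split of a 32-bit address. The following sketch assumes conventional indexing, with the low 4 bits selecting the byte within a line and the next 6 bits selecting one of the 64 sets; the actual tag and index bit assignment is not given in this section, and `split_address` is a hypothetical helper.

```python
LINE_BYTES = 16        # bytes per cache line
SUB_BLOCK_BYTES = 4    # bytes per sub-block
LINES_PER_BANK = 64    # lines in each of the two banks
BANKS = 2              # two-way set-associative

CACHE_BYTES = BANKS * LINES_PER_BANK * LINE_BYTES   # 2048 bytes = 2 Kbytes

def split_address(addr: int) -> dict:
    """Split a 32-bit address into tag, set index, sub-block and byte offset."""
    if not 0 <= addr < 2**32:
        raise ValueError("address must fit in 32 bits")
    return {
        "byte": addr % SUB_BLOCK_BYTES,                       # bits 1:0
        "sub_block": (addr % LINE_BYTES) // SUB_BLOCK_BYTES,  # bits 3:2
        "set": (addr // LINE_BYTES) % LINES_PER_BANK,         # bits 9:4
        "tag": addr // (LINE_BYTES * LINES_PER_BANK),         # bits 31:10
    }
```

Under this assumption, two addresses that differ by a multiple of 1024 bytes compete for the same set, and the two banks allow both to be resident at once.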
The caches can be used in either normal or one of two
lock modes. In normal mode, the caches use an LRU
(least recently used) algorithm to replace one of the two
appropriate entries. Alternately, the two locking modes
allow the entire cache or just selected entries to be locked.
The lock modes allow time critical routines to be locked
in cache.
Global locking allows the entire content of either the instruction or data cache to be frozen. Two control bits in the cache control register enable or disable locking for either cache. With the entire cache locked, no valid entry can be replaced. To ensure the best possible performance, however, invalid entries will be updated if they are accessed. This is done automatically and incurs no time penalty.
Local cache locking makes it possible to dynamically lock selected instructions or data entries into the appropriate cache. This feature gives the flexibility, for example, to assure deterministic response for certain critical interrupt routines by locking the routine's code into the cache. Entries can also be locked where it is desirable to give performance priority to certain often used routines which might otherwise be removed from cache. The 2-way set-associativity allows the cache to perform effectively even with some locked entries.
In local lock mode, each entry can either be locked individually by software or automatically with hardware assist. For individual locking, software writes the lock bit in the appropriate cache tag line. For automatic locking, a bit in each cache control register enables or disables the feature. The enable bit is set at the beginning of a routine for which the entries are to be locked. This causes the location of any cache access occurring while the bit is enabled to be locked into the cache. In addition to requiring just one initial cycle to enable, automatic entry locking incurs no overhead while in effect.
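The interaction between LRU replacement, global locking and local (per-entry) locking can be sketched as a victim-selection routine for one two-way set. This is an illustrative model, not the device's actual logic: the name `choose_victim`, the dictionary fields, the preference for invalid entries in normal mode, and the behaviour when both ways are locked are all assumptions.

```python
def choose_victim(ways, lru_way, global_lock=False):
    """Pick which of the two ways in a set to refill on a miss, or None.

    ways: two dicts with boolean "valid" and "locked" fields
    lru_way: index (0 or 1) of the least recently used way
    global_lock: True when the whole cache is locked
    """
    # An invalid entry can always be filled, even under global locking.
    for i, way in enumerate(ways):
        if not way["valid"]:
            return i
    if global_lock:
        return None                      # no valid entry may be replaced
    # Normal and local-lock modes: skip individually locked entries,
    # preferring the least recently used way.
    for i in (lru_way, 1 - lru_way):
        if not ways[i]["locked"]:
            return i
    return None                          # both ways locked (assumed: no refill)
```

With one way of a set locked, the other way still behaves like a direct-mapped cache for that set, which is why locking some entries does not disable caching for the remaining addresses.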
In unlocked operation, the data cache uses a write-through update policy and allocates a cache entry only on a load. Writes are buffered so that the processor can continue executing while data is written back to memory. In contrast, writes to locked data cache locations are not written through to main memory. Besides reducing external bus activity, this design supports configuring a portion of data cache as on-chip RAM which does not map to external memory.
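The update policy in this paragraph can be modelled at word granularity as follows. This is a toy sketch under stated assumptions: the class `DataCacheModel` and its methods are hypothetical, capacity, eviction and the write buffer are not modelled, and `memory` is a dictionary standing in for external memory.

```python
class DataCacheModel:
    """Word-granular toy model of the data cache update policy."""

    def __init__(self):
        self.lines = {}     # addr -> {"data": int, "locked": bool}
        self.memory = {}    # addr -> int, stands in for external memory

    def load(self, addr):
        if addr not in self.lines:                 # miss: allocate on a load
            self.lines[addr] = {"data": self.memory.get(addr, 0), "locked": False}
        return self.lines[addr]["data"]

    def store(self, addr, value):
        line = self.lines.get(addr)
        if line is None:
            self.memory[addr] = value              # miss: no allocation on a store
        elif line["locked"]:
            line["data"] = value                   # locked: cache only, no write-through
        else:
            line["data"] = value                   # hit: update the cache ...
            self.memory[addr] = value              # ... and write through to memory

    def lock(self, addr):
        self.load(addr)                            # make sure the entry is present
        self.lines[addr]["locked"] = True
```

Once an entry is locked, stores to it no longer reach `memory`, which is the behaviour that lets a locked region act as on-chip RAM.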
The data and instruction caches are designed to be accessed independently over separate data and instruction buses to allow data to be loaded from and stored to cache at peak rates of 1 CPI.
BUS INTERFACE
The Bus Interface Unit (BIU) is designed to simplify the interface between the MB86930 and the rest of the system. Separate address and data buses make it easy to build fast systems. At the same time, on-chip circuitry allows these systems to be built with a minimum of external hardware.
The bus interface supports fully programmable wait-state generation, address decoding with chip select outputs, same page detection to support page-mode DRAM, and an auto-reload timer to support a refresh counter.
CLOCK GENERATOR
The on-chip clock generator provides a means to directly connect the MB86930 to either a crystal oscillator or an external clock source. For either case, the external frequency is the same as the chip operating frequency.
A clock output signal provides the system with a reference by which external timing can be synchronized when not using an external clock source. The skew between the internal clock and an external input clock source is minimized by the inclusion of an on-chip phase lock loop circuit.
TABLE 1. MB86930 Control and Status registers (All registers are read/write)
TABLE 2. MB86930 Memory Mapped Control Registers (All registers are read/write)
Operation of the BIU
The BIU receives requests for external memory operations from the Cache Control Logic (CCL). In the case of reads from external memory, it performs the read operation and returns the data to the Cache and IU. A parallel path is used to make the data available to the IU in the same cycle that it is written to the cache.
In the case of a write to external memory, the BIU makes use of a write buffer which can hold a one word write transaction. When the BIU receives a request for a write transaction it stores the write data and address in the write buffer allowing the IU to continue operating out of on-chip cache and/or its register file. The BIU then proceeds to complete the write to external memory. In most cases the write buffer will hide external memory latency from the IU. The exceptions are in cases where the write buffer is still filled from a previous transaction or if the subsequent IU cycle results in an instruction cache miss. In these cases, IU execution is held until the write buffer is emptied.
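The two stall cases can be written as a single predicate. This is a sketch of one reading of the paragraph above: the function name `iu_held` is hypothetical, and the behaviour for a data cache load miss while the buffer is occupied is not covered because the text does not describe it.

```python
def iu_held(write_buffer_full: bool, new_store: bool, icache_miss: bool) -> bool:
    """True if the IU must wait for the one-word write buffer to drain."""
    # Case 1: a new store arrives while the previous one is still buffered.
    # Case 2: an instruction cache miss needs the bus while a write is pending.
    return write_buffer_full and (new_store or icache_miss)
```

In all other cases the buffered write proceeds in the background while the IU runs from the on-chip cache and register file.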
The BIU includes a one stage prefetch buffer for instruction fetches. This buffer is used to fetch the next sequential instruction after an instruction cache miss. The instruction is prefetched only if the BIU has no pending request for a bus transaction from the IU and no external device is requesting use of the bus. The prefetch buffer operation is suspended if the buffer is full. This occurs if the prefetched instruction is a hit in the instruction cache. The buffer restarts after another instruction cache miss. If an exception occurs during an instruction prefetch, the exception is not sent to the IU unless the instruction is actually requested by the IU. The prefetch buffer operates only when the instruction cache is on.
In any cycle the BIU can receive a request for access to instruction memory, data memory, or both. If it receives both in the same cycle, it completes the data memory transaction first.
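The ordering of BIU activity described in this section can be sketched as a priority function. Only the rules stated in the text are modelled; arbitration between IU requests and external bus masters is deliberately left out, and the function and parameter names are hypothetical.

```python
def next_bus_action(data_request: bool, instruction_request: bool,
                    external_bus_request: bool, icache_on: bool,
                    prefetch_suspended: bool):
    """Return "data", "instruction", "prefetch" or None for the next BIU action.

    prefetch_suspended: True while the prefetch buffer is full, that is,
    from a hit on the prefetched instruction until the next cache miss.
    """
    if data_request:
        return "data"              # the data memory transaction completes first
    if instruction_request:
        return "instruction"
    # Prefetch only when the bus would otherwise be idle.
    if icache_on and not external_bus_request and not prefetch_suspended:
        return "prefetch"
    return None
```

Because prefetch has the lowest priority, it never delays a demand access from the IU or a transaction by an external device.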
Exception Handling
The external memory system can indicate an exception during a memory operation. The BIU signals the appropriate data or instruction exception to the IU which will trap accordingly.
As mentioned above, the IU can continue operation after putting the data and address for a store in the write buffer. If an exception is detected while completing this buffered write, then the BIU indicates a data access exception to the IU.
Any system which needs to recover from this error should store the address and data of such write transactions in hardware. If the system can generate both read and write exceptions, then the system must also provide a status bit which indicates whether the exception was generated on a read or on a write transaction. With access to this information the data access exception service routine can determine the cause of the exception and recover accordingly.
Bus Cycles
Timings 1 through 9 illustrate representative combinations of bus cycles.
Load
Whenever an instruction fetch or a load from data memory has a miss in the cache, the BIU performs a read from external memory.
A read transaction begins with the BIU asserting -AS, to indicate a new bus transaction. The -AS signal is de-asserted after one cycle. At the same time the ADR<31:2> and ASI<7:0> bits are driven with the location to be read. The BIU drives the RD/-WR signal high to indicate a read transaction.
The external memory system responds with the read data on pins D<31:0>. It also asserts the -READY signal when the data is ready. For slow memory, the -READY signal can be delayed until data is valid.
A load double operation is treated as back-to-back reads.
Load with Exception
If the external memory system sees a memory exception it can terminate the current memory transaction by asserting the -MEXC and -READY signals. The data on the data bus is ignored by the MB86930.
Store
A write transaction begins with the BIU asserting -AS, to indicate a new bus transaction. The -AS signal is de-asserted after one phase. At the same time the ADR<31:2> and ASI<7:0> pins are driven with the location to be written while the D<31:0> pins carry the corresponding write data. The -BE0-3 pins indicate byte, half-word or word transaction width. The BIU drives the RD/-WR signal low to indicate a write transaction.
The external memory system responds by asserting the -READY signal when it has stored the data.
A store double operation is treated as back-to-back writes.
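How the byte enables might encode the transaction width can be sketched as follows. The mapping of enable pins to byte lanes is an assumption: the sketch takes -BE0 to correspond to the byte at offset 0 within the word (the most significant byte on a big-endian bus), which this section does not confirm. The function name `byte_enables` is hypothetical.

```python
def byte_enables(addr: int, size: int) -> tuple:
    """Return (-BE0, -BE1, -BE2, -BE3) pin levels for a store; 0 means asserted.

    size: 1 (byte), 2 (half-word) or 4 (word); accesses must be aligned.
    Assumption: -BE0 selects the byte at offset 0 within the word.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    offset = addr % 4
    if offset % size != 0:
        raise ValueError("misaligned access")
    # Assert (drive low) the enables for the byte lanes covered by the access.
    return tuple(0 if offset <= lane < offset + size else 1 for lane in range(4))
```

The low two address bits do not appear on ADR<31:2>, so the byte enables are what tells the memory system which bytes of the word to write.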
Store with Exception
If an access exception occurs on a write, the external memory system can terminate the current memory transaction by asserting the -MEXC and -READY signals. The external memory system is expected to ignore the data on the data bus in this situation.
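For both loads and stores, the way a transaction ends is determined by the -READY and -MEXC pins. A bus monitor or simulation model could classify each sampled cycle as below. The function name `classify_termination` is hypothetical, and treating -MEXC asserted without -READY as a wait cycle is an assumption, since that combination is not described in the text.

```python
def classify_termination(ready_n: int, mexc_n: int) -> str:
    """Interpret the active-low -READY and -MEXC pins sampled in one cycle.

    Pin levels: 0 = asserted, 1 = de-asserted.
    """
    if ready_n == 0 and mexc_n == 0:
        return "exception"     # transaction terminated with a memory exception
    if ready_n == 0:
        return "complete"      # normal termination
    # -READY not asserted: the memory system is still inserting wait cycles.
    # -MEXC without -READY is also reported as "wait" (assumption).
    return "wait"
```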
Atomic Load Store
An atomic load store executes as a load followed by a store with no operation allowed in between. The -LOCK signal is asserted to indicate that the bus is being used for more than one external memory operation.
There is one cycle between the termination of the
read and the beginning of the write to provide time for the
switching of the data bus drivers.
External Bus Request and Grant
Any external device can request ownership of the bus by asserting the -BREQ signal. The BIU asserts the -BGRNT signal to indicate that it is relinquishing control of the bus and also three-states all of its bus drivers. In the following cycle, the external device can complete its transaction. On completion of its transaction the external device de-asserts the -BREQ signal. The BIU responds by de-asserting the -BGRNT signal in the following cycle.
The MB86930 is the default owner of the bus.
PACKAGE THERMAL CHARACTERISTICS
Note: All numbers for package thermal characteristics assume multilayer PCB, except for the numbers for PGA package, which assume a
single layer PCB.
DC SPECIFICATIONS (Note 3) VCC = 5V ± 5%
AC CHARACTERISTICS (Notes 1, 2, 4) VCC = 5V ± 5%, TA = 0-70°C
1. Parameters are valid over specified temperature range and supply voltage range unless otherwise noted.
2. All voltage measurements are referenced to ground. All time measurements are referenced at input and output levels of 1.5V. For testing, all inputs swing between 0.4V and 2.4V (Except XTAL1 which swings from 0.4V to 3.0V). Input rise and fall times are 2ns or less.
3. Not more than one output may be shorted at a time for a maximum duration of one second.
4. Timing specifications apply to frequency of operation listed at top of column.
5. All output timings are based on a 50pF load.
6. The IRL input setup and hold times are measured with respect to the midpoint of the input clock cycle.
7. These specs will be improved in the future.
8. Data bus output driver control is same as for RD/-WR so timing is similar.
SPARClite is a trademark of Fujitsu Microelectronics, Inc.
SPARC is a registered trademark of SPARC International based on technology developed by Sun Microsystems, Inc.
All rights reserved. This publication contains information considered proprietary by Fujitsu Limited and Fujitsu Microelectronics, Inc. No part of this document may be copied or reproduced in any form or by any means or transferred to any third party without the prior written consent of Fujitsu Microelectronics, Inc.
Circuit diagrams utilizing Fujitsu products are included as a means of illustrating typical semiconductor applications. Consequently, complete information sufficient for design purposes is not necessarily given.
Fujitsu Limited and its subsidiaries reserve the right to change products or specifications without notice. Fujitsu advises its customers to obtain the latest version of device specifications to verify, before placing orders, that the information being relied upon by the customer is current.
The information contained in this document does not convey any license under copyrights, patent rights or trademarks claimed and owned by Fujitsu Limited or its subsidiaries. Fujitsu assumes no liability for Fujitsu applications assistance, customer's product design, or infringement of patents arising from use of semiconductor devices in such systems' designs. Nor does Fujitsu warrant or represent that any patent right, copyright, or other intellectual property right of Fujitsu covering or relating to any combination, machine, or process in which such semiconductor devices might be or are used.
Fujitsu Microelectronics, Inc.'s Semiconductor Division's products are not authorized for use in life support devices or systems. Life support devices or systems are devices or systems which are:
1. Intended for surgical implant into the human body.
2. Designed to support or sustain life; and when properly used according to label instructions, can reasonably be expected to cause significant injury to the user in the event of failure.
The information contained in this document has been carefully checked and is believed to be entirely accurate. However, Fujitsu Limited and Fujitsu Microelectronics, Inc. assume no responsibility for inaccuracies.
This document is published by the marketing department of Fujitsu Microelectronics, Inc., Semiconductor Division, 3545 North First Street, San Jose, California, U.S.A. 95134-1804.
FUJITSU MICROELECTRONICS, INC. SALES OFFICES
CALIFORNIA
Santa Clara Sales Office
2880 Lakeside Drive, #250
Santa Clara, CA 95054
(408) 982-1800
Irvine Sales Office
Century Center
2603 Main Street, #510
Irvine, CA 92714
(714) 724-8777
COLORADO (Denver Sales Ofc.)
5445 DTC Parkway, #P4
Englewood, CO 80111
(303) 740-8880
GEORGIA (Atlanta Sales Ofc.)
3500 Parkway Lane, #210
Norcross, GA 30092
(404) 449-8539
ILLINOIS (Chicago Sales Ofc.)
One Pierce Place, #1130 West
Itasca, IL 60143-2681
(708) 250-8580
MASSACHUSETTS (Boston Sales Ofc.)
1000 Winter Street, #2500
Waltham, MA 02154
(617) 487-0029
MINNESOTA (Minneapolis Sales Ofc.)
3460 Washington Drive, #209
Eagan, MN 55122-1303
(612) 454-0323
NEW YORK (New York Sales Ofc.)
898 Veterans Memorial Hwy.