**Computer Information Center** 

# **Computation Department**

# **LTSS: Livermore Time-Sharing System**


## **Lawrence Radiation Laboratory**

**University of California/Livermore**

## **DISCLAIMER**

**This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.**


LTSS-1 Edition 2

## LIVERMORE TIME SHARING SYSTEM

Part 1: Octopus


Chapter 1: Hardware


University of California Lawrence Radiation Laboratory Livermore, California 94550

#### HARDWARE

CIC-LTSS-1-ED. 2

Authors: John G. Fletcher, J. Dennis Lawrence

March 10, 1970

If you wish to be placed on a mailing list for LTSS-related documents, please send your name and L-code to:

LTSS LIST

COMPUTER INFORMATION CENTER

L-60

Acknowledgment: My thanks to Harry L. Nelson for carefully verifying the technical accuracy of sections 1.2 and 1.3. -- J.D.L.




#### 1. HARDWARE

### 1.1. THE PDP-6 COMPUTER SYSTEM$^\dagger$

#### 1.1.1. Hardware Components

The Octopus computer complex is centered on a "head" consisting of the following devices (illustrated in Figure 1.1):

- A. Two Digital Equipment Corporation (DEC) Programmed Data Processor-6 (PDP-6) computers, modified locally to include paged and segmented addressing. These are described more fully below, in Sections 1.1.2.-1.1.6.
- B. 256 K (where K = 1024) 36-bit words of high-speed random-access magnetic core memory. This memory consists of 16 "boxes" of 16 K words each. Eight of these boxes are products of the Lockheed Electronics Company (LEC), six of Ampex, and two of DEC. The DEC memories have a 1.75 microsecond cycle time; the others are one microsecond (see Section 1.1.7.).
- C. One General Precision Librascope (GPL) Disc File with a capacity of 0.807 billion bits (22.4 million words), and a transfer rate of 20.35 megabits/second. This is a fixed head disc; all access delays are rotational in origin (averaging about 35 milliseconds).
- D. One International Business Machines (IBM) Data Cell with a capacity of 3.24 billion bits (90 million words) and a transfer rate of 324 kilobits/second (reading) or 162 kilobits/second (writing). Random access averages half a second.
- E. One IBM Photo-Digital Storage Device with a capacity of 1.02 trillion bits (28.3 billion words), and a transfer rate of 1.5 megabits/second (reading) or 255 kilobits/second (writing). Random access is about 5 seconds.
- F. A number of line units, which are interfaces to devices which lie on the Octopus "tentacles", enabling those devices to directly read or write (under PDP-6 control) the core memory and to interrupt a processor. Currently, each CDC-6600 and CDC-7600 is connected to a line unit.
- G. A pair of I/O busses enabling the processors to send or receive data or control information to or from the disc, data cell, photo store, line units, and other devices. These other devices include a clock, PDP-8 computers which multiplex teletypewriters, a switch which can rearrange I/O bus connections, two magnetic tape transports, an IBM 1401 computer, a card reader-punch, a paper tape reader, two console teletypewriters, and a television monitor display system (TMDS).
- H. A video display which can examine memory and is used for maintenance and debugging.

† Caution: The PDP-6's may be replaced with PDP-10's in the near future. This will render this entire discussion obsolete.



Figure 1.1. PDP-6 Hardware

#### 1.1.2. Processor Registers

Each PDP-6 processor executes independently of the other. The only communication between them is through the shared core memory and by means of a mutual interrupt facility described in Section 1.1.6. A PDP-6 does not execute in synchronism with any clock; rather, each instruction proceeds as fast as circuitry and memory availability will allow. The typical instruction takes about 5 microseconds.

The following registers characterize the state of execution of a process:

- A. One program counter which holds the 18-bit effective address of the instruction next to be fetched.
- B. Sixteen accumulators, each of which can hold a 36-bit word. These registers are addressable as core locations 0 through 15. Except for accumulator 0, their right halves are also used as index registers.
- C. A number of registers used by the pagination-segmentation hardware, which are described more fully in Section 1.1.4. They include a descriptor base register, eight address base registers, two interrupt summary registers, and eight associative memory registers.
- D. Nine flags, bits indicative of various processor events and states, including arithmetic overflow and carry (3 bits), jump or skip occurrence (1 bit), byte manipulation state (1 bit), processor mode (2 bits), and optional interrupt enables (2 bits). Further information of a similar character, but including enough information to diagnose the source of processor-generated interrupts, is obtained through the use of special I/O instructions.
- E. Internal registers which retain the state of the interrupt channels, which are discussed in Section 1.1.5.

The PDP-6 repertoire of 393 instructions directly manipulates the contents of these registers (notably the accumulators) and the contents of memory words. The instructions are executed sequentially without significant overlap or look-ahead.

#### 1.1.3. Word Formats

Certain uses of words by the processors (e.g., as operands of move, Boolean, and byte manipulation instructions) regard them as bit patterns and assume no special format. Other uses, however, interpret them according to the formats now summarized.

A. Under various circumstances a word may be used to generate an *effective address.* (The word "generate" is used to distinguish this situation from one in which a word is merely interpreted as holding an effective address in one or both halves.) The format is

bits 0-12: not used; bit 13: I; bits 14-17: X; bits 18-35: A.


The effective address is generated as follows. The A (address) field is added to the contents of the index register (right half of the accumulator) specified by the X-field (unless zero is specified, in which case A is used directly) to obtain an indirect address. This is also the effective address unless the I (indirect) bit is one, in which case the word specified by the indirect address is read and is used to generate a new indirect address; the effective address is finally obtained only when a word with the I bit equal to zero is encountered. If the chain of indirect addresses loops on itself, the processor will remain in the loop until interrupted. Bits 0-12 are never involved in effective address generation.
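The procedure just described can be sketched in a few lines of Python. This is only an illustrative model, not the hardware logic: it assumes the I bit is bit 13, the X-field is bits 14-17, and the A field is bits 18-35 (bit 0 being the leftmost bit), and it assumes that the index addition wraps at 18 bits. The names `field`, `make_word`, and `effective_address` are hypothetical, and the `max_steps` limit merely stands in for the interrupt that would eventually break a looping indirect chain.

```python
MASK18 = (1 << 18) - 1


def field(word, first_bit, last_bit):
    """Return bits first_bit..last_bit of a 36-bit word (bit 0 is leftmost)."""
    width = last_bit - first_bit + 1
    return (word >> (35 - last_bit)) & ((1 << width) - 1)


def make_word(i, x, a):
    """Hypothetical helper: pack the I, X and A fields into bits 13-35."""
    return (i << 22) | (x << 18) | (a & MASK18)


def effective_address(word, accumulators, memory, max_steps=1000):
    """Apply indexing and follow indirect words until one with I = 0 is found."""
    for _ in range(max_steps):
        address = field(word, 18, 35)
        index = field(word, 14, 17)
        if index != 0:
            # The index register is the right half of the selected accumulator.
            address = (address + (accumulators[index] & MASK18)) & MASK18
        if field(word, 13, 13) == 0:
            return address
        # I bit set: the addressed word generates a new indirect address.
        word = memory[address]
    raise RuntimeError("indirect chain did not terminate")
```

For instance, with accumulator 3 holding 5 and memory word 105 holding a non-indirect word whose A field is 200, the word `make_word(1, 3, 100)` yields the effective address 200.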

- B. All instructions without exception generate an effective address as explained above; this address either is used to specify a word from which an operand is to be fetched and/or in which a result is to be stored, or is used itself as the right half of an operand having zeros in its left half. Bits 0-12 of the instruction are interpreted as follows.
	- 1. If bits 0-2 are all zeros, the instruction is a *programmed operator.* This instruction stores bits 0-12 of itself and the effective address it generates into core location 32 and then executes the instruction in location 33. It is intended that this be a jump to a subroutine which interpretively executes the information in 32, thus providing a means of extending the instruction repertoire.
	- 2. If bits 0-2 are all ones, the instruction is one of eight I/O instructions as specified by bits 10-12. Bits 3-9 specify one of a possible 128 devices (connected to the I/O bus) to which the instruction may refer.
	- 3. In all other cases, bits 0-8 specify one of 384 instructions. Usually the separate bits may be individually understood; for example, if bits 0-2 are 110, an accumulator mask and test instruction is indicated and bits 3-8 are respectively interpreted as complement masked bits, clear masked bits, mask with contents of effective address (rather than with the address itself), skip, reverse skip selection if masked bits are all zero, and swap halves of word before masking. Except for two jump instructions, for which they act as modifiers of the meaning of bits 0-8, bits 9-12 select an accumulator (or, in a few cases, a pagination-segmentation register) which contains an operand and/or is a repository for a result.
- C. A *byte pointer* is a word used in connection with certain instructions in order to select a byte. The effective address generated by this word locates the word in which the byte is to be found, while bits 0-5 describe the position of the byte in the word and bits 6-11 give the byte size, which may range from 0 to 36. Bit 12 is unused.
- D. All integer operands and results are two's complement numbers. That is, the 36-bit binary number implied by the bit pattern is interpreted directly if bit 0 is zero. Otherwise this number is interpreted as the negative number found by subtracting $2^{36}$. It should be noted that there is, therefore, only one representation of the integer zero.
- E. Floating point operands and results use bits 1-8 as a binary exponent plus 128, and bits 9-35 as a fraction. Bit 0 is zero for positive numbers. Negative numbers are the two's complement of the corresponding positive number. Both floating and fixed point zeros have all bits zero. It follows that the same arithmetic comparison instructions may be used for fixed and floating point numbers.
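The claim that one comparison serves both fixed and floating point numbers can be checked with a small Python model of formats D and E. The sketch below is illustrative only: `as_signed` and `encode_float` are hypothetical names, the fraction is assumed to be normalized (leading fraction bit set) and is simply truncated to 27 bits, and no rounding behavior of the real hardware is implied.

```python
import math

WORD_MASK = (1 << 36) - 1


def as_signed(word):
    """Interpret a 36-bit pattern as a two's complement integer."""
    return word - (1 << 36) if word >> 35 else word


def encode_float(value):
    """Pack value as bit 0 = sign, bits 1-8 = exponent + 128, bits 9-35 = fraction."""
    if value == 0:
        return 0
    mantissa, exponent = math.frexp(abs(value))  # 0.5 <= mantissa < 1
    if not 0 <= exponent + 128 <= 255:
        raise OverflowError("exponent does not fit in eight bits")
    fraction = int(mantissa * (1 << 27))  # truncate to 27 bits
    word = ((exponent + 128) << 27) | fraction
    # Negative numbers are the two's complement of the positive pattern.
    return (-word) & WORD_MASK if value < 0 else word
```

Sorting a list of floating point values by `as_signed(encode_float(v))` gives the same order as sorting by `v` itself, which is exactly why an integer comparison suffices.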

#### 1.1.4. Pagination - Segmentation

The PDP-6 uses two distinct schemes by which effective addresses (such as those generated by instructions) are translated into actual addresses (in core memory). The first scheme, *absolute addressing,* simply treats the effective address as the actual address. This scheme is used only when the processor is executing in *executive mode,* a mode intended to be reserved to the operating system itself. All computations (user problems) should be permitted to run only under the second scheme, *pagination-segmentation.* The complete rationale behind this rather involved scheme, now described, will be made clear when its role in an operating system is discussed (see Section 2.1.).

A computation will have access to a *virtual memory* of $2^{22}$ words. The 22-bit *virtual address* of one of these words may be analyzed into a 7-bit (high-order) *segment number* specifying one of 128 *segments* into which the virtual memory is divided, and a 15-bit (low-order) *word number* specifying a word within the segment. The contents of each of the 22-bit *address base registers* (ABR) is interpreted as a virtual address. An (18-bit) effective address, when it is used to specify a memory word, is converted into a virtual address as follows (except for addresses 0-15, which always specify an accumulator). The high-order three bits of the effective address select an ABR. The word number in this register (called the *augment*) is added to the low-order fifteen bits of the effective address (called the *offset*) to produce a word number for the resultant virtual address. The segment number in the ABR becomes the segment number of the resultant address. The intention is that the ABR contain an entry point address (or pointer to an array) which serves as a reference point from which the offset is measured.
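A minimal Python sketch of this conversion follows. The function name `to_virtual` and the representation of the eight ABRs as a list of 22-bit integers are assumptions made for illustration. The text does not say what happens when augment plus offset exceeds fifteen bits; the sketch simply wraps, which is also an assumption.

```python
WORD_NUMBER_MASK = (1 << 15) - 1


def to_virtual(effective, abr):
    """Convert an 18-bit effective address to a 22-bit virtual address.

    abr is a list of eight 22-bit values (7-bit segment, 15-bit augment).
    Returns None for addresses 0-15, which always specify an accumulator.
    """
    if effective < 16:
        return None
    select = (effective >> 15) & 0o7          # high-order three bits pick an ABR
    offset = effective & WORD_NUMBER_MASK     # low-order fifteen bits
    segment = abr[select] >> 15
    augment = abr[select] & WORD_NUMBER_MASK
    # Assumption: the sum wraps at fifteen bits.
    word_number = (augment + offset) & WORD_NUMBER_MASK
    return (segment << 15) | word_number
```

With ABR 2 holding segment 5 and augment 100, the effective address with high-order bits 010 and offset 20 maps to word 120 of segment 5.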

Address base register 0 is called the *procedure base register* (PBR), and always contains a zero augment. The program counter always has its high-order three bits equal to zero (except in executive mode), and thus selects this register to determine the segment containing instructions. Jump instructions reload the PBR if the jump is out of segment; a special pair of jump instructions save and restore this register, thus permitting ready return from subroutines.

Every segment is divided into *pages* of a uniform size which may be 128, 512, 2 K, or 8 K words. A segment is brought into actual (core) memory a page at a time from the disc. A core table (the page table), associated with each segment that has at least one page in core, has one entry for each page of the segment. This entry contains the actual core address of the page or shows it to not be in core. A segment is limited to 32 pages; hence segments using the smaller page sizes cannot have the maximum 32 K words.

Every active computation is associated with a core table (the segment table), with an entry for each segment in the virtual memory of the computation. The entry contains the actual core address of the page table for that segment, or shows that the page table (and hence every page) for the segment is not in core. Several segment tables may point to the same page table; segments may be shared. The *descriptor base register* (DBR) contains the address of the segment table associated with the executing process.

Thus at each memory reference, the processor hardware, as well as converting the effective address into a virtual address as described above, goes to a segment table and a page table and finds the actual address of the desired word (see Figure 1.2.). If for any reason the actual address cannot be found (such as, if the page is not in core), the instruction execution is aborted and the processor interrupted (see Section 1.1.6.). The two *interrupt summary registers* then contain information which enables the operating system to diagnose and, if possible, correct the difficulty (such as by retrieving the page from disc while time-shared execution of other processes continues).
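The two-level lookup can be modeled as below. This is a sketch under stated assumptions rather than a description of the actual table formats: the segment table is represented as a Python dictionary, each entry is assumed to carry the page size of its segment together with its page table, a missing page is marked by `None`, and the processor interrupt is represented by the hypothetical exception `TranslationFault`.

```python
class TranslationFault(Exception):
    """Stands in for the interrupt raised when no actual address can be found."""


def to_actual(virtual, segment_table):
    """Translate a 22-bit virtual address through segment and page tables.

    segment_table maps a segment number to (page_size, page_table), where
    page_table[i] is the core address of page i, or None if not in core.
    """
    segment = virtual >> 15
    word_number = virtual & 0x7FFF
    entry = segment_table.get(segment)
    if entry is None:
        raise TranslationFault(f"page table for segment {segment} not in core")
    page_size, page_table = entry
    page, within_page = divmod(word_number, page_size)
    if page >= len(page_table) or page_table[page] is None:
        raise TranslationFault(f"page {page} of segment {segment} not in core")
    return page_table[page] + within_page
```

An operating system written against this model would catch `TranslationFault`, bring the page in from disc, update the page table, and retry the reference.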

The processor has eight *associative memory registers* which contain the segment and page numbers of eight pages together with the corresponding page table entry. At each memory reference, these registers are interrogated in parallel by the hardware to find whether they hold information about the referenced page; if so, the two memory references to the segment and page tables may be bypassed. Only because of these registers can paging be efficient. The associative memory registers are ranked by the hardware according to recentness of use; every time segment and page tables are actually referenced, the page table information obtained replaces the information in the least recently used register. Clearly, programs which "jump around" will be less efficient than those which do not.
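The effect of the least-recently-used replacement rule can be seen in a small simulation. The class below is a hypothetical software model (the hardware searches all eight registers in parallel, whereas this sketch uses an ordered dictionary); `walk` is an assumed callback that performs the segment and page table references.

```python
from collections import OrderedDict


class AssociativeMemory:
    """Toy model of eight registers replaced in least-recently-used order."""

    def __init__(self, size=8):
        self.size = size
        self.entries = OrderedDict()  # (segment, page) -> page table entry
        self.table_walks = 0          # times the tables in core were consulted

    def lookup(self, segment, page, walk):
        key = (segment, page)
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            return self.entries[key]
        self.table_walks += 1
        entry = walk(segment, page)
        if len(self.entries) >= self.size:
            self.entries.popitem(last=False)   # discard least recently used
        self.entries[key] = entry
        return entry
```

In this model a program cycling over eight pages consults the tables only eight times in total, while one cycling over nine pages consults them on every reference, which illustrates why programs that "jump around" run less efficiently.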

Paging has the advantages that programs need not occupy contiguous core and need not be fully loaded in order to execute, thus increasing the efficiency of core utilization.

#### 1.1.5. Protection

In addition to executive mode, the processors have two other modes, in both of which pagination and segmentation are effective. One of these modes, the *IOT mode,* is intended to be used only by the operating system, since, except for the addressing scheme, it differs little from executive mode. The other mode, *user mode,* is intended for general use. In this mode the following things, possible in executive mode, are illegal and cause an interrupt (thus shifting control to the operating system, see Section 1.1.6.).

- A. Memory references are limited to areas located by the chain of pointers: DBR to segment table to page table to page. Further, the kinds of references may be limited. Each page table entry contains three bits, the read (R), write (W), and execute (X) bits. If and only if the corresponding bit is set may the words of the page be referenced to fetch data (R), store results (W), or fetch instructions (X). (All references to accumulators are permitted.) Thus, shared segments will usually not have "W" access. It is assumed that segment and page tables and locations 32-47, if they are accessible at all in user mode, occur in pages with "W" access denied.
- B. All I/O instructions are forbidden.
- C. Certain other instructions are also forbidden, such as those which load the descriptor base register, change the processor mode, or halt the processor. It is assumed that the instruction in location 33 is carefully chosen.






#### 1.1.6. Interrupt Capability

Each processor has seven interrupt channels. Each device on the I/O bus may be connected to one of the channels by a suitable I/O instruction. The processor itself is considered to be device 0 and thus may (and should) be connected to a channel so that processor interrupts may be effective; these include push-down-list overflow, arithmetic overflow (if enabled), program counter skip or jump (if enabled for flow tracing), pagination-segmentation faults, nonexistent memory (because of faulty or missing box), and memory parity failure. The seven channels are ranked in priority from 1 (high) to 7 (low); any number of devices may be on a single channel. A suitable I/O instruction can clear, turn on, or turn off the entire interrupt system, can turn on or off any channel, and can request an interrupt on any channel. Also, a suitable I/O instruction to the I/O bus switch device will cause it to interrupt the other processor.

When an interrupt is requested on a channel (either by a device connected to that channel with the channel on or by the special I/O instruction just mentioned), the interrupt begins as soon as the current instruction reaches a suitable point and all higher priority channels are dismissed. This beginning of the interrupt consists of the execution of the instruction at location (32+2n), where n is the channel number. This instruction should be carefully chosen. Usually it is a jump to subroutine, entering executive mode and saving the program counter (which may be that of a lower priority channel). This subroutine should determine the cause of the interrupt (by checking the status of all devices on that channel), service the needs indicated (saving and restoring any accumulators or other registers used), reset the interrupting devices to a noninterrupting status, and then return to the interrupted routine with a special jump instruction which dismisses the channel.
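The location rule and the priority rule can be written out as a short Python sketch. Both function names are hypothetical. The second function encodes one reading of the text, namely that a requested channel may begin only when no channel of equal or higher priority is still in progress; the treatment of a request on a channel that is itself in progress is an assumption.

```python
def interrupt_location(channel):
    """Core location of the instruction executed when channel n interrupts."""
    if not 1 <= channel <= 7:
        raise ValueError("channels are numbered 1 (high) through 7 (low)")
    return 32 + 2 * channel


def channel_to_start(requested, in_progress):
    """Pick the channel whose interrupt may begin now, or None.

    requested and in_progress are sets of channel numbers; a lower number
    means a higher priority.
    """
    if not requested:
        return None
    candidate = min(requested)
    # Assumption: a channel waits until every channel of equal or higher
    # priority has been dismissed.
    if any(active <= candidate for active in in_progress):
        return None
    return candidate
```

The seven channels therefore use locations 34, 36, ..., 46, which fall inside the range 32-47 that Section 1.1.5 expects to be write-protected in user mode.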

#### 1.1.7. Core Memory

The core memories are accessible through six *ports.* Two of these ports are attached to the processors; the others to multiplexors to which several devices can be attached. Priority schemes in the porting structure and in the multiplexors select which device desiring access to a given box gets the next memory cycle. Accesses to different boxes do not conflict. Clearly, the port and multiplexor connection of each device must be made with full consideration of its speed and its ability to survive delay without data loss. Consecutive addresses may be optionally placed in a single box or alternated between two interlaced boxes.

An important feature is that the processors may, on a given single access, not only read or write the memory, but may also read the memory and rewrite the same word with different data, with no other processor or device being able to access the word in the interim. Many instructions exploit this feature. The result is vast simplifications in the programming required to handle certain "close call" situations when two processors share a common memory.
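One such "close call" is two processors trying to claim the same shared table at once. The Python sketch below shows how an indivisible read-and-rewrite makes a simple lock word safe. Everything here is a model: `CoreBox`, `exchange`, and the word addresses are hypothetical, Python threads stand in for the two processors, and a `threading.Lock` stands in for the indivisibility of a single memory cycle.

```python
import threading
import time


class CoreBox:
    """Toy memory box; the internal lock models one indivisible memory cycle."""

    def __init__(self, size):
        self._words = [0] * size
        self._cycle = threading.Lock()

    def read(self, address):
        with self._cycle:
            return self._words[address]

    def write(self, address, value):
        with self._cycle:
            self._words[address] = value

    def exchange(self, address, value):
        """Read a word and rewrite it with different data in a single access."""
        with self._cycle:
            old = self._words[address]
            self._words[address] = value
            return old


LOCK_WORD, COUNTER = 0, 1  # hypothetical word addresses


def worker(box, repeats):
    for _ in range(repeats):
        # Whoever reads back 0 is the one processor that set the flag.
        while box.exchange(LOCK_WORD, 1) == 1:
            time.sleep(0)  # let the other thread run
        box.write(COUNTER, box.read(COUNTER) + 1)  # protected update
        box.write(LOCK_WORD, 0)                    # release
```

Without the single-access `exchange`, both workers could read the lock word as 0 before either wrote 1, and both would enter the protected update together.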

### 1.2. THE CDC 6600 COMPUTER SYSTEM

#### 1.2.1. System Organization

The Control Data Corporation 6600 computer system$^3$ consists of eleven independent computers, a shared central memory, and twelve input/output data channels (Figure 1.3.). Ten of these eleven computers are called *peripheral and control processing units* (PPU); the eleventh is termed the *central processing unit* (CPU).

The CPU is a high speed arithmetic device with access to the central memory. Each PPU has its own memory, and can also access the central memory. A PPU may execute a program independently of the CPU and the other nine PPU's, may control and start the CPU, and may transfer data between the central memory and any of the twelve data channels.

The 6600 computer is capable of concurrent operations of three types: program execution, CPU functional unit operation, and memory access. Concurrent program execution may occur when the CPU and any PPU's are operating simultaneously, as described above. The other two types will be discussed below.





#### 1.2.2. Hardware Components

The Lawrence Radiation Laboratory presently has three 6600 computing systems in the time sharing system. The equipment for each system is very similar to the other two; the components of a typical system are given below.

- A. One CDC 6601 main frame, containing the CPU, ten PPU's, central memory, and some input/output synchronizers.
- B. One CDC 6602 console display unit, with a manual keyboard and two cathode ray tube display units.
- C. One CDC 405 high speed card reader.
- D. One CDC 415 card punch.
- E. One CDC 501 high speed printer.
- F. One CDC 1612 high speed printer.
- G. Eight CDC 607 one-half inch magnetic tape units.
- H. Three CDC 6603 mass storage disc files.
- I. One 280-C cathode ray tube display system, commonly known as the DD80-C.

#### 1.2.3. Central Memory Characteristics

The 6600 *central memory* is a random access coincident-current magnetic core memory of 131,072 sixty-bit words, arranged in thirty-two logically independent banks of 4096 words each. The central memory is common to all eleven system computers. It requires one major cycle (1000 nanoseconds) to perform a read-write operation. Several banks may be in operation at any given time. The maximum memory reference rate is one address per minor cycle (100 nanoseconds). Similarly, the maximum rate of data flow to and from memory is one word per minor cycle.

The location of each word in central memory is specified by a seventeen-bit address. Consecutive words are located in different banks to obtain the rapid access rates described above. Thus, each address is composed of two fields: a twelve-bit left-hand portion defining a location within a bank and a five-bit right-hand portion defining the bank. (The remaining bit of the eighteen-bit address field, located to the left of the twelve-bit portion, is not used.)

0AA AAAAA AAAAA BBBBB

where A = address within bank, and B = bank.
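The address layout above can be checked with a short Python sketch; `split_address` is a hypothetical name. It shows why consecutive addresses land in different banks: the bank number is taken from the five low-order bits.

```python
def split_address(address):
    """Split a central memory address into (bank, location within bank)."""
    bank = address & 0o37                # five right-hand bits
    location = (address >> 5) & 0o7777   # twelve bits to their left
    return bank, location


# Thirty-two consecutive addresses touch thirty-two different banks.
banks = {split_address(a)[0] for a in range(1000, 1032)}
```

Here `banks` contains all thirty-two bank numbers, so a sequential sweep through memory can keep every bank busy.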

Certain precautions are required so that independent programs may time share a computer. In the 6600, every program has a *reference address* (RA) and a *field length* (FL) attached to it by the PPU that initiates the program. All CPU references to central memory (for instructions or data) are therefore made relative to the reference address and are checked by the hardware to insure that they do not exceed the field length. This allows easy relocation of the program in central memory, and insures the integrity of each program in central memory. As an example, suppose P is an address in a program (relative to the program's first word address of zero). The program is loaded into central memory beginning at location RA (that is, program address zero corresponds to memory address RA). Any reference by the program to location P results first in a check to insure that

 $0 < P < FL$ .

Then, the sum  $P + RA$  is formed, and this word of central memory is accessed.
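The check-then-relocate step is simple enough to state as code. In this illustrative Python sketch the name `relocate` is hypothetical and an exception stands in for whatever the hardware does on an out-of-range reference. The lower bound is taken as inclusive, since program address zero is the program's first word.

```python
def relocate(p, ra, fl):
    """Map program-relative address p to a central memory address."""
    if not 0 <= p < fl:
        raise IndexError(f"address {p} is outside the field length {fl}")
    return ra + p
```

Moving a program elsewhere in central memory then requires only a new RA; no address inside the program changes.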

#### 1.2.4. Peripheral Processor Characteristics

There are ten identical *peripheral and control processors* (PPU), each with a twelve-bit 4096-word random access coincident-current magnetic core memory. There are twelve input/output data channels in the 6600; all PPU's have access to all twelve channels. In addition, all PPU's have access to central memory, and may cause the CPU to begin execution of a program in central memory. Each PPU may operate independently of and simultaneously with the other nine. The PPU's act as system control computers, performing input, output, and supervisory tasks while the CPU carries out high-speed arithmetic computations.

The basic time units for the 6600 are a minor cycle of 100 nanoseconds and a major cycle of 1000 nanoseconds. A PPU may access its own 4096 word memory in one major cycle, and may transmit or receive data through an I/O channel at a maximum rate of one twelve-bit word per major cycle. There is a real time clock, available on a channel of its own (not one of the twelve I/O channels), which counts major cycles. The clock period is 4096 major cycles (4.096 milliseconds); that is, the twelve-bit word holding the clock time overflows every 4.096 ms.
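Because the clock word wraps every 4096 major cycles, an interval must be computed modulo 4096. The Python sketch below (with the hypothetical name `elapsed_major_cycles`) shows the arithmetic; the result is meaningful only if the two readings are less than one clock period apart.

```python
CLOCK_PERIOD = 4096  # major cycles; one major cycle is 1000 nanoseconds


def elapsed_major_cycles(earlier, later):
    """Major cycles between two twelve-bit clock readings taken < 4.096 ms apart."""
    return (later - earlier) % CLOCK_PERIOD
```

A reading of 4090 followed by a reading of 10 therefore represents 16 major cycles (16 microseconds), not a negative interval.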

An important feature of a PPU is its ability to control the CPU, by issuing an *exchange jump* instruction. This instruction sends an eighteen-bit address to the CPU, and causes the CPU to cease executing instructions. The address is the starting location of a sixteen word *exchange jump package* containing various pieces of information about a CPU program to be executed. Hardware in the CPU replaces the exchange jump package with similar data from the interrupted program.

A PPU can also monitor a CPU program by obtaining the current program address.

#### 1.2.5. Central Processor Characteristics

The 6600 CPU is a high speed arithmetic processor that has access only to the central memory; it is incapable of any input or output functions. It is composed of several functional units, to carry out arithmetic and logical operations, and a control unit, to direct the functional units, initiate instruction fetching, and perform fetching and storing of data. The high speed of the CPU is obtained by minimizing memory references for both instructions and data, by the bank interlacing of the central memory, and by the concurrent functional unit operation on unrelated instructions.

#### 1.2.6. Instruction Format

The 6600 instructions may occupy fifteen bits or thirty bits; one instruction word may contain any of five different instruction combinations, as indicated below:
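The count of five follows from the sixty-bit word length if an instruction is assumed never to straddle a word boundary. The Python sketch below enumerates the ways of filling a word with fifteen-bit and thirty-bit instructions; the function name `packings` is hypothetical.

```python
def packings(bits_left=60):
    """List every ordered way to fill bits_left with 15- and 30-bit instructions."""
    if bits_left == 0:
        return [()]
    result = []
    for size in (15, 30):
        if size <= bits_left:
            result += [(size,) + rest for rest in packings(bits_left - size)]
    return result


for combination in packings():
    print(combination)
```

The enumeration yields four fifteen-bit instructions, three arrangements of one thirty-bit with two fifteen-bit instructions, and two thirty-bit instructions, so a word holds two, three, or four instructions.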



Groups of bits in an instruction are commonly identified by the letters f, m, i, j, k (all three-bit groups), and K (eighteen-bit constant). A typical fifteen-bit instruction has five three-bit fields:

f m i j k

(total of 15 bits),

where



The typical thirty-bit instruction has four three-bit fields and one eighteen-bit field:

f m i j K


where



Details of the CPU instructions are summarized in the appendix.

#### 1.2.7. CPU Instruction Registers

The CPU contains eight sixty-bit *instruction registers* (commonly called the *stack*), designated by I0, I1, ..., I7. During the execution of a program, instruction words are transferred from central memory to register I0, one at a time from (usually) sequential locations. The instruction word is transferred by the control unit from I0 to a device called the *U-register,* where it is decoded into the component two, three, or four instructions. These are then issued, sequentially, to the functional units for execution.

If the instruction word does not contain a branch instruction, a new instruction word is requested for I0 as soon as the current instruction word has been transferred to the U-register. At this point, all instruction words in registers I0 - I6 are transferred to the next higher instruction register: I6 → I7, I5 → I6, ..., I0 → I1. The contents of I7 are lost from the stack.

If the instruction word contains a branch instruction, this process changes somewhat. First, the branch is tested to see if the branch condition is satisfied; if not, processing continues as described in the above paragraph. If the branch is to be taken, however, the branch address is examined to see if the next instruction word is already in a stack register. If not, the new word is fetched into I0 and the foregoing procedure continues. (There is a considerable delay while this fetch from memory to I0 takes place.) If the new instruction word is already in the stack (say, in register I5), it is fetched into the U-register from that instruction register. Succeeding instructions will be taken from successive stack registers as long as possible at a more rapid rate (that is, from I4, ..., I0). No more instruction words are brought to I0 until an instruction word is requested that is not in the stack. A loop of seven words or less (at most 27 instructions) can be executed very rapidly, since no waiting is necessary for instruction words to be brought from memory. Efficient 6600 programming may require maximal use of stack loops and minimal use of out-of-stack branch instructions.

#### 1.2.8. CPU Operating Registers

References to central memory for fetching and storing data are minimized by the use of twenty-four operating registers, divided into three sets of eight registers:

> Eight 18-bit address registers (A-registers) A0, ..., A7
>
> Eight 60-bit operand registers (X-registers) X0, ..., X7
>
> Eight 18-bit increment registers (B-registers) B0, ..., B7.

All X-registers are used to hold operands for and operation results from the functional units. The five registers XI, ..., X5 can hold operands read from central memory; the two registers X6 and X7 similarly can hold results to be sent to central memory. As an illustration of the use of these registers, consider the following instruction sequence:

> Transfer [A] to B†
>
> Set D equal to [A] + [A] * [C].

This may be accomplished by the following sequence of operations:

- 1. Fetch [A] into register X2.
- 2. Fetch [C] into register X4.
- 3. Multiply the contents of X2 and X4 together; send the result to X0.
- 4. Move the contents of X2 to X6.
- 5. Add the contents of X0 and X2 together and send the result to X7.
- 6. Store [X6] into B.
- 7. Store [X7] into D.

The address registers are used to fetch operands from memory and store results into memory. Placing a number P in address register A1, ..., A5 will cause the contents of memory word P to be placed into the corresponding X-register. Similarly, placing a number P in the address register A6 or A7 will cause the contents of the corresponding X-register to be placed into memory location P. Registers A0 and X0 are independent and have no connection with central memory. As an illustration, consider the sequence given above. It may more accurately be given by changing certain steps as follows:

† "[A]" means the contents of the variable named "A".

- 1. Place (A) into register A2, causing [A] to be placed into X2.†
- 2. Place (C) into A4, causing [C] to be placed in X4.
- 3. Multiply the contents of X2 and X4 together; send the result to X0.
- 4. Move the contents of X2 to X6.
- 5. Add the contents of X0 and X2 together and send the result to X7.
- 6. Place (B) into A6, causing [X6] to be transferred to B.
- 7. Place (D) into A7, causing [X7] to be transferred to D.

The B-registers have no direct connection to central memory; registers B1, ..., B7 are used to provide program indexing. Register B0 is eternally fixed as an eighteen-bit zero. In the example above, suppose we wish to perform the instructions

> Set B(I) = A(I)
>
> Set D(I) = A(I) + C*A(I)

for  $I = 0, 1, \ldots, 10$ . Instructions to do this, using two B-registers, might be the following:

- 1. Initialize the B-registers by setting B1 = 0 and B3 = 11.
- 2. Place $(A) + [B1]$ into A2 (that is, the location of the first word of array A, plus the increment given in B1), resulting in $[A(I)]$ in X2.
- 3. Place (C) into A4, resulting in [C] in X4.
- 4. Multiply [X2] and [X4] together, and send the result to X0.
- 5. Move [X2] to X6.
- 6. Add [X0] and [X2] together, sending the result to X7.
- 7. Place $(B) + [B1]$ into A6, resulting in $B(I) = A(I)$.
- 8. Place $(D) + [B1]$ into A7, resulting in $D(I) = A(I) + C*A(I)$.
- 9. Place [B1] + 1 into B1.
- 10. If $[B1] < [B3]$, jump to step 2 for the next iteration of this loop.
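The ten steps above can be traced with a small model of the register coupling. This is only an illustrative sketch, not a 6600 simulator: the class name `Cpu6600Sketch`, the helper `run_loop`, and the memory layout are hypothetical, and Python integers stand in for 60-bit and 18-bit words (no wraparound, no floating point format).

```python
class Cpu6600Sketch:
    """Toy model of the A/X/B register coupling (hypothetical, for illustration)."""

    def __init__(self, memory):
        self.mem = memory      # central memory as a list of words
        self.A = [0] * 8       # address registers
        self.X = [0] * 8       # operand registers
        self.B = [0] * 8       # increment registers

    def set_a(self, i, address):
        """Place an address in Ai and perform the implied memory reference."""
        self.A[i] = address
        if 1 <= i <= 5:        # A1..A5: memory word is read into Xi
            self.X[i] = self.mem[address]
        elif i >= 6:           # A6, A7: Xi is stored into memory
            self.mem[address] = self.X[i]
        # A0 has no connection with central memory

    def set_b(self, i, value):
        """B0 is fixed at zero, so writes to it are ignored."""
        if i != 0:
            self.B[i] = value


def run_loop(cpu, loc_a, loc_b, loc_c, loc_d, count):
    """Steps 1 to 10: B(I) = A(I) and D(I) = A(I) + C*A(I) for I = 0..count-1."""
    cpu.set_b(1, 0)                        # step 1
    cpu.set_b(3, count)
    while True:
        cpu.set_a(2, loc_a + cpu.B[1])     # step 2: [A(I)] arrives in X2
        cpu.set_a(4, loc_c)                # step 3: [C] arrives in X4
        cpu.X[0] = cpu.X[2] * cpu.X[4]     # step 4
        cpu.X[6] = cpu.X[2]                # step 5
        cpu.X[7] = cpu.X[0] + cpu.X[2]     # step 6
        cpu.set_a(6, loc_b + cpu.B[1])     # step 7: X6 is stored to B(I)
        cpu.set_a(7, loc_d + cpu.B[1])     # step 8: X7 is stored to D(I)
        cpu.set_b(1, cpu.B[1] + 1)         # step 9
        if not cpu.B[1] < cpu.B[3]:        # step 10
            break
```

Note that no explicit load or store appears anywhere in `run_loop`: every memory reference is a side effect of writing an A-register, which is the point of the example.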

#### 1.2.9. CPU Functional Units

Additional operating speed is obtained by the use of ten *functional units* which may operate simultaneously on unrelated instructions, as long as no conflicts are present. The multiply and increment units are duplexed, so that an instruction may be sent to the second unit whenever the first one is busy.

† "(A)" means the location of the variable named "A".



#### 1.2.10. Exchange Jump

An eighteen-bit *P-register* is used to hold the address of each program instruction word as it is being executed. P is advanced in the following ways:

- A. P is advanced by one when all instructions in a sixty-bit word have been extracted and sent to the instruction registers.
- B. P is set to the address specified by a branch instruction.
- C. P is set to the address specified in the exchange jump package.

A program is begun in the CPU by means of an *exchange jump* instruction from a PPU. This causes initial values to be entered into all operating registers and the P-register from a 16-word *exchange jump package* located in central memory. The PPU actually provides the CPU with the first word address of this package, and the CPU exchanges the current contents of the program's registers with the new contents as given in the exchange jump package. Thus, the controlling data for two programs is interchanged; another exchange jump may later cause control to go back to the interrupted program.

The exchange jump package provides the following items of information for a program to be executed:

- A. Program address (P).
- B. Reference address for central memory (RA).
- C. Field length for central memory (FL).
- D. Program exit mode (EM).
- E. Initial contents of operating registers.

These quantities are located as indicated in Figure 1.4.




Location is relative to the first word of the exchange jump package. Bits are numbered 0 to 59, reading right to left.

Figure 1.4. CDC 6600 Exchange Jump Package

#### 1.2.11. CPU Exit Mode

Execution of a program by the CPU will continue until a PPU causes an exchange jump to take place, or until an error occurs. The *exit mode* feature allows the programmer to control the three error conditions that may occur. These are:



To select the exit conditions that he wishes to cause execution to stop, the programmer must preset the EM bits of the exchange jump package as follows



When an error is detected for which the exit mode is set, the CPU generates a halt instruction at location zero of the program (location RA of central memory), containing the upper six bits of the exit condition and the contents of the P-register.





The CPU then sets the P-register to zero, causing a jump to this generated halt instruction. When an error is detected for which the exit mode is not set, it is ignored.

#### 1.2.12. Floating Point Arithmetic

All arithmetic in the 6600 is performed using one's complement numbers. This means that a K-bit number is interpreted directly if the sign bit (bit K-1) is zero, and is interpreted by complementing the entire number if the sign bit is one. Note that there are therefore two representations of zero (00...0 and 77...7, octal).
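The interpretation rule can be written out directly. The following is a minimal sketch with hypothetical function names; it treats a word as a non-negative Python integer holding the raw bit pattern.

```python
def ones_complement_value(word, bits):
    """Interpret a raw bit pattern of the given width as a one's complement integer."""
    mask = (1 << bits) - 1
    word &= mask
    if word >> (bits - 1):            # sign bit set: complement the entire number
        return -((~word) & mask)
    return word


def ones_complement_encode(value, bits):
    """Produce the raw bit pattern for a (small enough) signed integer."""
    mask = (1 << bits) - 1
    if value >= 0:
        return value & mask
    return (~(-value)) & mask         # complement of the magnitude
```

For example, the eighteen-bit patterns 000000 and 777777 (octal) both decode to zero, which is the double representation of zero noted above.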

Since the 6600 is intended to be used primarily for large scientific problems, floating point arithmetic is used for most calculations. A floating point data word on the 6600 computer system occupies an entire sixty-bit word, and contains three fields:



The binary point is considered to be to the right of the coefficient. The sign bit is zero for plus and one for minus; negative numbers are represented in one's complement notation. The exponent is biased by 2000 octal ($2^{10}$). This means that the characteristic is formed by adding 2000 to the true exponent if it is positive, or by adding 1777 to the true exponent if it is negative:

> True exponent 274 becomes biased exponent 2274.
>
> True exponent -36 becomes biased exponent 1741.
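The two worked values can be reproduced with a short sketch. The function names are hypothetical. Matching the examples (2274 and 1741), a non-negative true exponent is offset from 2000 octal and a negative one from 1777 octal, which is the same as taking the eleven-bit one's complement exponent and inverting its top bit.

```python
def bias_exponent(true_exp):
    """Return the biased exponent (characteristic) for a true exponent."""
    if true_exp >= 0:
        return 0o2000 + true_exp
    return 0o1777 + true_exp          # true_exp is negative here


def unbias_exponent(biased):
    """Recover the true exponent from an eleven-bit biased exponent."""
    if biased >= 0o2000:
        return biased - 0o2000
    return -(0o1777 - biased)
```

Under this rule the biased value 1777 octal is never produced by `bias_exponent`; the text reserves that exponent for indefinite results.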

For example, floating point numbers 1.0 and 0.75 would be given by

> 1.0: 20000 00000 00000 00001
>
> 0.75: 17750 00000 00000 00003

if unnormalized, and by

> 1.0: 17204 00000 00000 00000
>
> 0.75: 17176 00000 00000 00000

if normalized.

Note that exponent arithmetic uses the one's complement notation. Floating point numbers may appear with exponents from 0000 to 3776 (thus ranging from -1023 to +1022, decimal). This allows floating point numbers in the range  $10^{-293}$  to  $10^{322}$ , approximately.
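As a rough check on these bounds (a sketch, assuming the lower bound refers to a normalized coefficient of at least $2^{47}$ and the upper bound to the largest 48-bit coefficient, just under $2^{48}$), the extreme magnitudes work out to approximately

$$
2^{-1023} \cdot 2^{47} = 2^{-976} \approx 10^{-293.8}, \qquad 2^{1022} \cdot 2^{48} = 2^{1070} \approx 10^{322.1}
$$

using $\log_{10} 2 \approx 0.30103$.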

Floating point numbers may be normalized or unnormalized, and performing an arithmetic operation on two normalized floating point numbers need not produce a normalized result. The 6600 also has capability for operating on double precision floating point numbers, each with its own exponent and coefficient.

An arithmetic operation that results in an exponent larger than 3777 is treated as an *infinite* quantity; a coefficient of all zeros and an exponent of 3777 is created for such a result. Use of infinity (and, in some cases, zero) as operands may produce an *indefinite* result; a coefficient of all zeros and an exponent of 1777 is created for such a result. Note that no error occurs when an infinite or indefinite result is generated. An optional exit is available when such results are used as operands. For more information on these topics, please refer to reference 3.

#### 1.3. THE CDC 7600 COMPUTER SYSTEM

#### 1.3.1. Hardware Components

The Control Data Corporation 7600 computer system<sup>5,6</sup> actually consists of eleven independent computers, much like the CDC 6600. The machine was, in fact, designed to be upward compatible with the 6600 for user programs. The input-output section has been greatly changed, however, and core memory now comes in two types.

The *central processing unit* (CPU) contains a high speed computation section, with access to both types of central memory. The *peripheral processor units* (PPU) operate independently of the CPU and of each other, and control the input-output functions of the system. The *small core memory* (SCM) is a very fast coincident current multibank device, and is used to contain executing programs, the resident system monitor program, and some I/O buffer areas. The *large core memory* (LCM) is a much larger and slower linear selection type of memory, and provides the basic working storage for the CPU. The system is illustrated in Figure 1.5.

#### 1.3.2. Small Core Memory Characteristics

The 7600 CPU contains two memories. The *small core memory* (SCM) is a random access coincident-current magnetic core memory of 65,536 sixty-bit words, arranged in 32 logically independent banks of 2048 words each. Up to ten banks may be in operation at any given time. The maximum memory reference rate is one word per clock period (27.5 nanoseconds); the read-write cycle time is ten clock periods (275 ns).

The location of each word in SCM is specified by a sixteen-bit address. Consecutive words are located in different banks to obtain the rapid access rates. Thus, each address is composed of two fields: an eleven-bit left-hand portion defining a location within a bank, and a five-bit right-hand portion selecting the bank. (The remaining portion of the eighteen-bit address field, located to the left of the eleven-bit portion, is not used.)





> 00A AAAAA AAAAA BBBBB,

where A = address within bank, and B = bank.
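The address layout can be illustrated with a few lines of code. This is a sketch under the layout just described (five low-order bank bits, eleven bits of in-bank address, two unused high-order bits); the function name is hypothetical.

```python
def split_scm_address(addr):
    """Split an SCM address into (bank, word_within_bank)."""
    addr &= 0xFFFF                 # the top two bits of the 18-bit field are not used
    bank = addr & 0o37             # five-bit right-hand portion selects the bank
    word_within_bank = addr >> 5   # eleven-bit left-hand portion
    return bank, word_within_bank
```

Because the bank number comes from the low-order bits, addresses 0, 1, 2, ... fall in banks 0, 1, 2, ..., and only every 32nd address returns to the same bank. Sequential references are therefore spread over the banks.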

The 7600 memory protection scheme is similar to the 6600 process. Every program in SCM has a reference address (RAS) and a field length (FLS). All CPU references to SCM are made relative to RAS, and are checked by the hardware to ensure that they do not exceed FLS.
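The relocation and bounds check can be sketched as follows. This is an assumption-laden illustration rather than a description of the hardware: the exception class and function name are hypothetical, and the failing case is taken to be a relative address greater than or equal to FLS, matching the range conditions described in Section 1.3.11.

```python
class ScmRangeError(Exception):
    """Raised when a relative address falls outside the program's field (hypothetical)."""


def scm_absolute_address(relative, ras, fls):
    """Relocate a program-relative SCM address, checking it against the field length."""
    if not 0 <= relative < fls:
        raise ScmRangeError(f"address {relative:o} (octal) is outside field length {fls:o}")
    return ras + relative
```

A program therefore sees addresses 0 through FLS - 1 regardless of where the monitor has placed it in SCM.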

The first 4096 addresses of SCM are reserved for input-output and control buffers. All data entering the CPU from a PPU goes through these buffers; when a buffer is full, the CPU normally transfers the data to LCM.

#### 1.3.3. Large Core Memory Characteristics

The 7600 *large core memory* (LCM) is a linear selection type memory of 512,000 sixty-bit words, arranged in eight independent banks of 64,000 words each. A reference to a word in LCM results in eight sixty-bit words being read simultaneously into a 480-bit register; there is one such register per bank. The memory cycle time is 1.76 microseconds (64 clock periods).

The large core memory is intended to provide basic working storage for a CPU program. Instructions cannot be executed directly from LCM; they must be read into the SCM and executed from there. Maximum data transfer rates between LCM and SCM occur when many consecutive words are moved. One sixty-bit word per clock period can be transferred during such block copy instructions. However, single words may also be read from LCM. A contiguous group of eight words is brought to a 480-bit register in 1.76 $\mu$s. The other seven words in this packet may then be referenced directly from this register in only three clock periods each (82.5 ns).

Memory protection is similar to that described for SCM, using a reference address (RAL) and a field length (FLL).

#### 1.3.4. Peripheral Processor Characteristics

Each of the ten *peripheral processor units* (PPU) is an independent computer with a twelve-bit 4096 word random access coincident current magnetic core memory. Each PPU has provision for eight fully duplex input/output channels, one of which leads to the small core memory buffer area. The PPU's are used to perform input and output at the request of the CPU system monitor program.

The basic time unit for a 7600 PPU is the clock period of 27.5 ns. A PPU may access its own memory in 275 ns, and may transmit or receive data through an I/O channel at a maximum rate of one twelve-bit word per nine clock periods (247.5 ns).

#### 1.3.5. Central Processor Characteristics

The 7600 *Central Processing Unit* (CPU) consists of a computation section, both memories, and an input/output multiplexor. The *computation section* is a high speed arithmetic processor that has access only to the central memories; it is incapable of any input or output function. It has a sixty-bit internal word, nine independent functional units to carry out arithmetic and logical operations, and a twelve-word instruction stack. The high speed of the CPU is obtained, as in the 6600, by minimizing memory references for instructions and data, by the interlacing of the SCM, and by the concurrent operation of the functional units. In addition, the functional units are segmented (Section 1.3.9).

The CPU *I/O Multiplexor* (MUX) is used to communicate between the PPU's and the SCM, by a data-buffering mechanism. Communication between CPU and PPU is over a twelve-bit fully duplex channel. The MUX has 15 of these channels, each with separate SCM buffer areas for input and for output.

#### 1.3.6. Instruction Format


The CDC 7600 instruction format is identical to that of the 6600. Each instruction may occupy fifteen or thirty bits; one instruction word may contain any of five different instruction combinations, as indicated below:

(bits)



Groups of bits in an instruction are commonly identified by the letters g, h, i, j, k (three-bit groups), and K (eighteen-bit constant). A typical fifteen-bit instruction has five three-bit fields:



where



The typical thirty-bit instruction has four three-bit fields and one eighteen-bit field:



where



The g bits generally identify the type of instruction and the functional unit. The h bits usually specify the functional unit mode. Details of the instructions are summarized in the appendix.
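Field extraction for the two typical formats can be sketched as below. The function names are hypothetical, and the left-to-right field order (g, h, i, j, then k or K) is an assumption consistent with the octal patterns listed in the appendix, such as `73ijk` and `70ijKKKKKK`.

```python
def decode_15_bit(instr):
    """Split a fifteen-bit instruction into its three-bit fields g, h, i, j, k."""
    shifts = {"g": 12, "h": 9, "i": 6, "j": 3, "k": 0}
    return {name: (instr >> shift) & 0o7 for name, shift in shifts.items()}


def decode_30_bit(instr):
    """Split a thirty-bit instruction into fields g, h, i, j and the constant K."""
    shifts = {"g": 27, "h": 24, "i": 21, "j": 18}
    fields = {name: (instr >> shift) & 0o7 for name, shift in shifts.items()}
    fields["K"] = instr & 0o777777   # eighteen-bit constant
    return fields
```

For instance, `decode_15_bit(0o73123)` yields g = 7, h = 3, i = 1, j = 2, k = 3, which the appendix table reads as "Set X1 to [X2] + [B3]".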

#### 1.3.7. CPU Instruction Registers

The CPU contains twelve sixty-bit *instruction registers,* commonly called the *stack.* During the execution of a program, instruction words are transferred to the stack from the SCM, two words ahead of the instruction currently being executed. As each new instruction word is fetched into the stack, the ones already there are shifted one register. The oldest instruction word is lost.

The stack is used most efficiently for small program loops that can be contained entirely within the stack. Only ten of the stack registers may be used in such a loop, so it may contain at most 40 instructions.

An 18-bit *P-register* is used to hold the address of each program instruction word as it is being executed. P is advanced in the following ways:

- A. P is advanced by one when all instructions in a sixty-bit stack register have been extracted and sent to the *current instruction word* register.
- B. P is set to the address specified by a branch instruction.
- C. P is set to the address specified in the exchange package.

#### 1.3.8. CPU Operating Registers

The twenty-four operating registers are nearly identical in arrangement, number, and function to those of the 6600. (See Section 1.2.8.) There are three sets of eight registers each:

> Eight 60-bit operand registers (X-registers) X0, ..., X7,
>
> Eight 18-bit address registers (A-registers) A0, ..., A7,
>
> Eight 18-bit increment registers (B-registers) B0, ..., B7.

All X-registers are used to hold operands for and operation results from the functional units and LCM. The five registers X1, X2, ..., X5 can hold operands read from SCM, and the two registers X6 and X7 can hold results to be sent to SCM.

The A-registers are used to fetch operands from SCM and store results into SCM. Placing a number in address register A1, A2, ..., A5 will cause the contents of the corresponding SCM word to be transferred to the corresponding X-register. Similarly, placing a number in A6 or A7 causes the contents of X6 or X7 to be transferred to SCM.

Registers A0 and X0 have no connection with SCM. They are used to hold addresses for the block copy instructions that transfer data between LCM and SCM.

Any X-register may be used to hold the 21-bit address of a single word in LCM. This word may then be brought to an X-register from LCM, or sent to LCM from an X-register, by means of the read and write LCM instructions (014 and 015; see Appendix).

The B-registers are used primarily to provide program indexing and loop counting. Register B0 is eternally fixed as an eighteen-bit zero.

#### 1.3.9. CPU Functional Units

The *functional units* of the 7600 are considerably different from those of the 6600. Except for the multiply and divide units, all functional units have *one clock period segmentation.* This means that the operands for the unit move through the unit in a "pipe-line" fashion, freeing previous portions each clock period. Thus, a new set of operands may be entered into the functional unit each clock period. The multiply unit has a two clock period segment, and the divide unit has no segmentation at all (it uses an iterative algorithm, and requires eighteen clock periods before a new divide instruction may begin).






Functional unit times are given in Figure 1.6.




Figure 1.6. CDC 7600 Timing

#### 1.3.10. Exchange Jump

The *exchange jump* instruction is a special instruction to allow the CPU to switch execution from one program to another. It causes initial values to be entered into all operating registers and the program address register from a sixteen word *exchange package,* and previous values of those registers to be stored in the same package. The following quantities are involved:

- A. Program address (P).
- B. Reference address for small core (RAS) and large core (RAL) memories.
- C. Field length for small core (FLS) and large core (FLL) memories.
- D. Program status designation register (PSD).
- E. Normal exit address (NEA).
- F. Error exit address (EEA).
- G. Breakpoint address (BPA).
- H. The operating registers.

These quantities are diagrammed in Figure 1.7.

An exchange jump may be initiated under several different conditions.

- A. An exchange exit instruction is issued by the system monitor program.
- B. An error exit instruction is issued by a user program or by a CPU error condition.
- C. An input or output interrupt occurs.
- D. A real time interrupt occurs by an overflow of the clock period counter (every 3.6 ms).
- E. A program breakpoint is reached.
- F. The program is operating in step mode.

The 7600 has two exit addresses in the exchange package. These designate absolute SCM addresses, assumed to reside within the system monitor program. NEA is used by a user program issuing an exchange exit instruction, to request monitor services (such as an input or output request). EEA is used by a user program issuing an error exit instruction, or if a CPU error is detected (such as arithmetic overflow, indefinite results, hardware failure, or address out of range). In the latter case, the type of error will be specified in the PSD register. (See Section 1.3.11.)

bits 59...54, 53...36, 35...18, 17...0

Location is relative to the first word of the exchange package. Bits are numbered 0 to 59, reading right to left.

Figure 1.7. CDC 7600 Exchange Package

A real time interrupt occurs after a user program has run for about 3.6 milliseconds. This is used to segment program execution, and to enable the system monitor program to perform required periodic "house-keeping": initiate input or output processing, update the clock, update the charge to the user, and possibly initiate a new user program.

During debugging, a program may be executed in small sections by using the *breakpoint address* (BPA) register. Whenever the program reaches the breakpoint (P = BPA), execution ceases with a jump to EEA. Normally, no instructions are executed at BPA. A program executing in *step mode* will cease execution, with a jump to EEA, at the end of each instruction word.

#### 1.3.11. Program Status Designators

Execution of a program by the CPU will continue until an exchange jump occurs, as discussed in the previous section. The reason for the exit may be determined by examining the bits of the *Program Status Designator* (PSD) in the exchange package.

| Bit | Function | Bit | Function |
|---|---|---|---|
| 17 | Exit Mode Flag | 8 | SCM Block Range Condition Flag |
| 16 | Monitor Mode Flag | 7 | LCM Direct Range Condition Flag |
| 15 | Step Mode Flag | 6 | SCM Direct Range Condition Flag |
| 14 | Indefinite Mode Flag | 5 | Program Range Condition Flag |
| 13 | Overflow Mode Flag | 4 | Breakpoint Condition Flag |
| 12 | Underflow Mode Flag | 3 | Step Condition Flag |
| 11 | LCM Parity Condition Flag | 2 | Indefinite Condition Flag |
| 10 | SCM Parity Condition Flag | 1 | Overflow Condition Flag |
| 9 | LCM Block Range Condition Flag | 0 | Underflow Condition Flag |

The six *mode flags* are set before the program is initiated, and govern its execution. The *exit mode flag* controls the source of the exchange package address for the execution of an exchange exit instruction, and the *monitor mode flag* determines whether the program can be interrupted by an input or output request. These two flags are never set for user programs.


The *step mode flag* causes the program to be interrupted at the end of each instruction word. The *indefinite mode flag* will cause a program interrupt whenever an indefinite floating point result is detected. The *overflow mode flag* and *underflow mode flag* cause a program interrupt whenever a floating point overflow or underflow is detected in a floating point result. The interrupt will occur only after the current instruction word is completed, and control is sent to EEA in all four cases.

The twelve *condition flags* are set by the hardware whenever the specified conditions occur; control is sent to EEA at the end of the current instruction word. The *LCM parity condition flag* and the *SCM parity condition flag* are set whenever a parity error is detected in a memory reference. (The SCM has one parity bit per 12 bits, and the LCM has one parity bit per 15 bits.) The *LCM block range condition flag* and the *SCM block range condition flag* are set whenever a block copy instruction (instructions 011 and 012) requests a reference to an address greater than or equal to FLL or FLS. The *LCM direct range condition flag* is set whenever a direct read or write (instructions 014 and 015) requests a reference to an address greater than or equal to FLL. The *SCM direct range condition flag* is set whenever any SCM reference (other than block copy) is greater than or equal to FLS, and whenever the program register is greater than or equal to FLS.

The *breakpoint condition flag* is set whenever P = BPA, and the *step condition flag* is set whenever an instruction issues and the step mode flag is set. The *indefinite condition flag, overflow condition flag,* and *underflow condition flag* are set whenever the corresponding conditions are detected by a floating point functional unit. If the corresponding mode flag is also set, execution will terminate at the end of the current instruction word. This word probably does not contain the instruction that caused the condition, because of the overlapping of the functional unit processing. Indefinite and overflow conditions may occur during the execution of instructions 30-35, 40-42, and 44-45; underflow may occur during the execution of instructions 32-33, 40-42, and 44-45.
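A monitor routine examining the PSD might decode it along the following lines. This is a sketch only: the names `PSD_FLAGS`, `decode_psd`, and `arithmetic_interrupts` are hypothetical, and placing the breakpoint condition flag at bit 4 is an inference from the text (it is the one condition flag left for the one remaining bit position).

```python
PSD_FLAGS = {
    17: "exit mode", 16: "monitor mode", 15: "step mode",
    14: "indefinite mode", 13: "overflow mode", 12: "underflow mode",
    11: "LCM parity condition", 10: "SCM parity condition",
    9: "LCM block range condition", 8: "SCM block range condition",
    7: "LCM direct range condition", 6: "SCM direct range condition",
    5: "program range condition",
    4: "breakpoint condition",        # assumption: bit position inferred
    3: "step condition", 2: "indefinite condition",
    1: "overflow condition", 0: "underflow condition",
}

# (mode flag bit, condition flag bit) pairs for the three arithmetic conditions
ARITHMETIC_PAIRS = {"indefinite": (14, 2), "overflow": (13, 1), "underflow": (12, 0)}


def decode_psd(psd):
    """List the names of all flags set in an 18-bit PSD, highest bit first."""
    return [PSD_FLAGS[bit] for bit in range(17, -1, -1) if (psd >> bit) & 1]


def arithmetic_interrupts(psd):
    """Return the arithmetic conditions whose mode flag and condition flag are both set."""
    return [name for name, (mode, cond) in ARITHMETIC_PAIRS.items()
            if (psd >> mode) & 1 and (psd >> cond) & 1]
```

The second function reflects the rule that an arithmetic condition terminates execution only when the corresponding mode flag is also set; a condition flag on its own is merely recorded.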

#### 1.3.12. Floating Point Arithmetic

All arithmetic in the 7600 is performed using one's complement numbers. This means that a k-bit number is interpreted directly if the sign bit is zero, and is interpreted by complementing the entire number if the sign bit is one. A floating point data word occupies sixty bits, including a one-bit sign (the leftmost bit), an eleven-bit biased exponent, and a 48-bit integer coefficient. The binary point is considered to be to the right of the coefficient. The exponent is biased by 2000 octal ($2^{10}$). This means that the biased exponent is formed by adding 2000 to the true exponent if it is positive, and adding 1777 to the true exponent if it is negative.

Floating point numbers need not be normalized, and arithmetic operations performed on normalized operands need not produce a normalized result. The 7600 can operate with double precision numbers, each with its own exponent and coefficient.

It is sometimes necessary to convert a floating point number into decimal notation. One way to do this is described here.

- A. Examine the sign bit.
	- 1. If the sign bit is zero, the number is positive, so let  $S = +1$ .
	- 2. If the sign bit is one, the number is negative. Let  $S = -1$ , and complement the number.
- B. Separate the exponent, E, and the coefficient, C. The exponent is the leftmost four octal digits.
- C. Examine the exponent.
	- 1. If $E \geq 2000$ (octal), let $R = E - 2000$. (R is the true exponent.)
	- 2. If $E < 2000$ (octal), let $R = -(1777 - E)$.
- D. Convert the coefficient to a decimal integer. Suppose the coefficient has octal digits $C_{16} C_{15} C_{14} \ldots C_2 C_1$. Then, this is converted to a decimal integer D by the formula

$$
D = C_{16} \cdot 8^{15} + C_{15} \cdot 8^{14} + \dots + C_2 \cdot 8 + C_1
$$

- E. The final result is formed by attaching the true exponent and the sign: $Y = S \cdot 2^R \cdot D$.

This can be clarified by some examples.

- 1. The number is X = 17204 00000 00000 00000.
	- A. $S = +1$.
	- B. E = 1720, C = 4 00000 00000 00000.
	- C. $R = -57_8 = -47_{10}$.
	- D. $D = 4 \cdot 8^{15} = 4 \cdot 2^{45} = 2^{47}$.
	- E. $Y = (+1)(2^{-47})(2^{47}) = +1$.
- 2. The number is X = 60511 57777 77777 77777.
	- A. $S = -1$. X = 17266 20000 00000 00000.
	- B. E = 1726, C = 6 20000 00000 00000.
	- C. $R = -51_8 = -41_{10}$.
	- D. $D = 6 \cdot 8^{15} + 2 \cdot 8^{14} = 25 \cdot 2^{43}$.
	- E. $Y = (-1)(2^{-41})(25 \cdot 2^{43}) = -25 \cdot 2^{2} = -100$.
- 3. The number is 16744 14336 75013 27554.
	- A. $S = +1$.
	- B. E = 1674, C = 4 14336 75013 27554.
	- C. $R = -103_8 = -67_{10}$.
	- D. $D = 4 \cdot 8^{15} + 1 \cdot 8^{14} + 4 \cdot 8^{13} + 3 \cdot 8^{12} + \dots$
	- E. $Y = (+1)(2^{-67})(2147)(2^{36}) = 2147 \cdot 2^{-31} = (2147)(.465)(10^{-9}) = (.998)(10^{-6}) \approx 10^{-6}$.
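Steps A through E translate almost line for line into code. The sketch below uses a hypothetical function name, takes the sixty-bit word as a Python integer, and returns an exact rational so that the worked examples can be checked without rounding. It does not treat the infinite (3777) and indefinite (1777) exponents specially.

```python
from fractions import Fraction


def decode_float(word):
    """Convert a sixty-bit floating point word to its exact rational value."""
    mask60 = (1 << 60) - 1
    word &= mask60

    # Step A: examine the sign bit; complement the whole word if negative.
    sign = 1
    if word >> 59:
        sign = -1
        word = ~word & mask60

    # Step B: separate the exponent (leftmost four octal digits) and coefficient.
    exponent = word >> 48
    coefficient = word & ((1 << 48) - 1)

    # Step C: remove the bias to obtain the true exponent R.
    if exponent >= 0o2000:
        true_exp = exponent - 0o2000
    else:
        true_exp = -(0o1777 - exponent)

    # Steps D and E: the coefficient is already an integer; attach exponent and sign.
    return sign * coefficient * Fraction(2) ** true_exp
```

With this function, `decode_float(0o17204000000000000000)` gives exactly 1 and `decode_float(0o60511577777777777777)` gives exactly -100, in agreement with examples 1 and 2.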

Biased floating point exponents may range from 2000 to 3777 (octal) for positive exponents, and from 1776 to 0000 for negative exponents. If a number is generated with a biased exponent greater than 3777, an overflow condition occurs, and the overflow condition flag will be set.† The exponent will be forced to 3777, and the coefficient (or its complement) will be forced to zeros. If a number is generated with a biased exponent less than 0000, an underflow condition occurs. A word of all zeros or all ones is produced.

 $^{\texttt{T}}$ Some exceptions to these remarks may occasionally occur; see reference 6.

An indefinite condition occurs whenever a floating point functional unit cannot complete a calculation (such as dividing zero by zero). An exponent of 1777 and a zero coefficient will be generated.

The standard cases are as follows:



Overflow and indefinite conditions are unlikely to occur during error-free computations, since the permissible exponents allow numbers in the approximate range  $10^{-293}$  to  $10^{+322}$ .

† Can only occur from packing or Boolean operations.

#### REFERENCES

- 1. *Programmed Data Processor-6 Handbook*, Digital Equipment Corporation, Maynard, Massachusetts, DEC Publication Number F-65 and F-65 Change Notice number 3 (July, 1965).
- 2. David L. Pehrson, *Pagination and Segmentation of the PDP-6 and Other Problems in Memory Control,* Lawrence Radiation Laboratory, Livermore, California, UCRL-70084 (August 15, 1966).
- 3. *Control Data 6000 Series Computer Systems: Reference Manual,* second edition, Control Data Corporation, St. Paul, Minnesota, CDC Publication Number 60100000 (July, 1965).
- 4. Harry L. Nelson, *Program Optimizing Techniques for the CDC 6600 Central Processor,* Lawrence Radiation Laboratory, Livermore, California, UCRL-12489 (April 6, 1965).
- 5. *Control Data 7600 Computer System: Preliminary System Description,* Control Data Corporation, St. Paul, Minnesota, CDC Publication Number 60258400 (September, 1969).
- 6. *Control Data 7600 Computer System: Preliminary Reference Manual,* second edition, Control Data Corporation, St. Paul, Minnesota, CDC Publication Number 60258200 (1969).
- 7. John E. Ranelletti, *CPU76, A 6600/7600 Central Assembler,* Lawrence Radiation Laboratory, Livermore, California, Computer Information Center, CIC Report Ll-004 (March 5, 1970).

#### APPENDIX

#### CDC 6600/7600 Operation Codes

The instruction sets for the CDC 6600 and CDC 7600 computers are nearly identical, so they can be discussed together. The following conventions are used.

- A. Each character in the octal column represents three bits.
- B. Lower-case letters  $(i, j, k)$  designate the three-bit portions of an instruction (usually operating registers). Upper-case letters (KKKKKK) designate the eighteen-bit portion of some instructions.
- C. A dash indicates an unused portion of an instruction.
- D. The mnemonics are those used by the CPU76 Assembler.<sup>7</sup>
- E. Operating registers are designated by an upper case/lower case pair (Ai, Bj, Xk).
- F. The contents of an operating register are indicated by brackets ([Ai], [Bj], [Xk]).
- G. Instruction 00 has a different function on the two machines. Instructions 01i, for $i \neq 0$, are available on the 7600 only.








| octal | mnemonic | function |
|---|---|---|
| 70ijKKKKKK | FR | Set Xi to [Aj] + K. |
| 71ijKKKKKK | FI | Set Xi to [Bj] + K. |
| 72ijKKKKKK | FO | Set Xi to [Xj] + K. |
| 73ijk | FOI | Set Xi to [Xj] + [Bk]. |
| 74ijk | FRI | Set Xi to [Aj] + [Bk]. |
| 75ijk | FRO | Set Xi to [Aj] - [Bk]. |
| 76ijk | FII | Set Xi to [Bj] + [Bk]. |
| 77ijk | FID | Set Xi to [Bj] - [Bk]. |


#### LEGAL NOTICE

This report was prepared as an account of Government sponsored work. Neither the United States, nor the Commission, nor any person acting on behalf of the Commission:

A. Makes any warranty or representation, expressed or implied, with respect to the accuracy, completeness, or usefulness of the information contained in this report, or that the use of any information, apparatus, method, or process disclosed in this report may not infringe privately owned rights; or

B. Assumes any liabilities with respect to the use of, or for damages resulting from the use of any information, apparatus, method or process disclosed in this report.

As used in the above, "person acting on behalf of the Commission" includes any employee or contractor of the Commission, or employee of such contractor, to the extent that such employee or contractor of the Commission, or employee of such contractor prepares, disseminates, or provides access to, any information pursuant to his employment or contract with the Commission, or his employment with such contractor.