Notes 3

Computer Architecture

There are several main parts to the structure of a computer. We have talked about the main memory and the secondary storage systems. These are connected to each other by the system bus. Also attached to the bus is the CPU (Central Processing Unit). The CPU has two primary components: the control unit (CU) and the Arithmetic and Logic Unit (ALU). The control unit directs information and control signals around the machine. The ALU does calculations, both numeric and logical.

To do this work, the CPU has a number of specialized memory cells called registers. There are general purpose (GP) and special purpose (SP) registers. The GP registers are involved in most calculations in the CPU. When an add operation is performed, the two operands are moved from main memory to GP registers. The result of the addition is stored back in a GP register and then moved to memory. All the functions mentioned except the actual addition are performed by the CU. Much of the machine can be viewed as levels of memory.

At the lowest level, closest to the CPU, we have the registers. Just above that, and often in the CPU itself, is cache memory. Information that was just used or is expected to be used in the near future is stored here. The next level is the main memory. After that is secondary storage like a disk. Finally, we have tertiary storage like tape drives. The main difference between the levels of memory is speed. Registers and cache are the fastest and are physically closest to the CPU. Main memory is nearby, but access to it has to go through the system bus. The others are further away and slower.

The system bus, which connects the pieces here, is essentially a collection of wires. When data is to be transferred from the CPU to memory, all the data bits (including parity) are put out on the bus along with a signal that causes the appropriate memory cell to be updated. The CPU also tells the bus which memory cell to update by providing the address. This works the other way during a read.
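The read and write transfers described above can be pictured with a small sketch. This is only a toy model: the class name Bus and its methods are made up for illustration, the memory size of 256 cells is an assumption, and parity is left out.

```python
class Bus:
    """Toy system bus: carries an address, a read/write signal and one data byte."""

    def __init__(self, size=256):
        self.cells = [0] * size   # main memory attached to the bus (size is an assumption)

    def write(self, address, data):
        # CPU -> memory: the address, the data bits and a write signal go out together
        self.cells[address] = data & 0xFF

    def read(self, address):
        # memory -> CPU: the address and a read signal go out, the data comes back
        return self.cells[address]


bus = Bus()
bus.write(0x30, 0x2A)
print(hex(bus.read(0x30)))  # 0x2a
```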
 

Instructions

To be a computer, a machine has to be able to do a certain set of things. We have already seen some of these, like get data from memory, add two registers, store data, etc. The collection of all the operations the CPU can perform is called its instruction set. While we could get away with only a handful of operations, most modern computers have hundreds. These don't really add to the theoretical functions of the computer, but they do make it easier to work with. At a very low level, the instructions are the inputs to the millions of AND, OR and XOR gates that make up the computer. If you imagine the machine as being made up of millions of little on/off switches, the instructions tell which of these should be on or off at any instant. We are going to divide the instruction set of a machine into three broad categories.
 

Data Transfer

These instructions move data from one part of the machine to another. Some of them move data back and forth between main memory and the registers. The term 'move' is actually a little misleading. The data that is transferred is not removed or erased from the original location. It is actually copied from one place to another. Some other instructions transfer data from memory to the I/O system. In some machines, the I/O devices are seen as just another set of memory locations, so the same instructions can be used.

Arithmetic/Logical

These are operations that manipulate the contents of the registers in the CPU. They include operations like AND and OR as well as arithmetic functions. There are some that are used as part of machine operations rather than calculation. These are things like shifting and rotating.
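A short sketch of these operations on one-byte values may help. It uses Python's bitwise operators; the 8-bit width and the helper names shift_left, rotate_left and rotate_right are assumptions made for the example, not part of any particular machine.

```python
WIDTH = 8                      # assume one-byte registers
MASK = (1 << WIDTH) - 1        # 0xFF keeps results inside one byte


def shift_left(value, n=1):
    # bits pushed off the left end are lost, zeros come in on the right
    return (value << n) & MASK


def rotate_left(value, n=1):
    # bits pushed off the left end re-enter on the right
    n %= WIDTH
    return ((value << n) | (value >> (WIDTH - n))) & MASK


def rotate_right(value, n=1):
    # bits pushed off the right end re-enter on the left
    n %= WIDTH
    return ((value >> n) | (value << (WIDTH - n))) & MASK


a = 0b11001010
b = 0b10101100
print(format(a & b, "08b"))           # 10001000
print(format(a | b, "08b"))           # 11101110
print(format(a ^ b, "08b"))           # 01100110
print(format(shift_left(a), "08b"))   # 10010100
print(format(rotate_left(a), "08b"))  # 10010101
```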

Control

A program consists of a sequence of instructions. The two groups above manipulate the data that the program operates on. The control group manipulates the program. These include operations to change the order of instructions like BRANCH. BRANCH instructions control what instruction will be processed next. They can either depend on some condition or they can be independent.

Some instructions we are used to seeing at a high level, like FOR loops, are actually implemented as a sequence of these low level operations. The FOR loop would include instructions to get the start and end values from memory, compare the current value against the end value, run the body and increment the counter if the comparison is ok, and jump back to the beginning of the loop. If the counter is bigger than the end value, it jumps to the end.
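As a rough illustration, the FOR loop can be written out by hand using only a compare, a conditional exit and a jump back to the top. The function name count_up is made up for the example, and Python's while/break stand in for the low level jumps.

```python
def count_up(start, end):
    """Hand-expanded version of: for counter in range(start, end + 1)."""
    visited = []
    counter = start              # get the start value
    while True:                  # top of the loop
        if counter > end:        # compare the current value against the end value
            break                # conditional jump to the end of the loop
        visited.append(counter)  # loop body
        counter += 1             # increment the counter
        # falling off the bottom acts as the unconditional jump back to the top
    return visited


print(count_up(1, 3))            # [1, 2, 3]
```

The result matches list(range(1, 4)), which is what the high level FOR loop would visit.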
 

Programs and instruction processing

In the early days of computing, the machines were not programmed in the way we think of it now. It was more like re-wiring. Imagine a TV with cable input, a converter box and a VCR. I can connect the cable to the converter and the converter to the VCR. Then I can record whatever is on the converter. But if I want to record directly from the cable, I can't. I can't just tell the VCR to use a different input. I have to move the cable from one device to the other. I didn't have to build a new kind of recording device, but I had to change the connections. The machines were programmed by changing the connections between the pieces. The old telephone switchboards worked this way. Calls were connected by plugging wires between holes in the board. A modern computer is more like a player piano. A roll of paper is inserted that has instructions on it about which keys to play in what order. The piano has no knowledge of either the Beatles or Chopin but can play either.

Our computer has the ability to perform a large number of instructions but knows nothing about the kinds of things we want it to do. We give it a long list of instructions to run and suddenly it can play games or balance checkbooks. The idea of the program being recorded in the computer instead of in its physical structure is called stored program. The key idea is that the program instructions look like data. They are both just bits. This is one of those ideas that seems obvious as soon as you hear it but isn't the moment before. The computer is built to recognize certain bit patterns as instructions. These bit patterns are referred to as the machine language. These patterns typically consist of two parts. The first is the opcode, which is a number that tells us which machine instruction, like ADD, to perform. The other part is the operand, which is data that the instruction uses somehow. For example, the JUMP instruction might have an opcode of 3 and an operand of the address to jump to.
 

The hypothetical machine

The book uses an imaginary computer for discussion purposes. It's much simpler than real computers and has just the right properties to show off all the things the book wants to cover. The details of it are in Appendix C.

There are 16 General Purpose registers numbered 0 through F in hexadecimal (base 16). Each GP register is 1 byte wide. Main memory has 256 cells, each one byte, numbered from 00 to FF. There are two special purpose registers. One is the Program Counter (PC) and it is one byte wide. The other is the Instruction Register (IR) and it is 2 bytes wide.

Each instruction in the machine is 16 bits wide, consisting of 4 hex digits. The opcode is in the first 4 bits. This means there are at most 16 instructions. Our machine only uses 12, numbered 1 to C.
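To make the layout concrete, here is a minimal sketch that splits a 16-bit instruction, written as 4 hex digits, into its opcode and the three remaining hex digits. The function name split_instruction is made up for the example.

```python
def split_instruction(word):
    """Split a 4-hex-digit instruction string into (opcode, [three operand digits])."""
    value = int(word, 16)            # the whole 16-bit pattern
    opcode = value >> 12             # first 4 bits
    operands = [(value >> 8) & 0xF,  # second hex digit
                (value >> 4) & 0xF,  # third hex digit
                value & 0xF]         # fourth hex digit
    return opcode, operands


print(split_instruction("1FA1"))     # (1, [15, 10, 1])
print(2 ** 4)                        # 16 possible opcodes in a 4-bit field
```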

Each of the other 3 hex digits (4 bits each) in an instruction is part of the operand. The format is dependent on the operator. For example, there is a LOAD instruction that has an opcode of 1. The first operand digit is the register that is to be loaded. The next two operand digits are the address of the memory cell whose contents are to be copied into the register. So, to copy the contents of memory location A1 into register F, the instruction would be 1FA1.

One of the ADD instructions shows another way to use the operands. This one adds the contents of two registers and stores the result in another one. The first operand digit is the register to store the result in. The next two are the source registers. To add the contents of register A to the contents of register B and store the result in register F, the instruction is 5FAB.

A third type is 2FA1. This is another LOAD instruction, but it copies the literal bit pattern A1 into register F. It doesn't look in memory.

The JUMP instruction is overloaded. Jumping is the process of changing the order in which instructions are executed. There are both conditional and unconditional jumps. Unconditional jumps change the execution order regardless of what is currently happening. Conditional jumps only happen if the condition is true. So BFA1 says to start executing the program at memory location A1 if the contents of register F are the same as the contents of register 0. Otherwise, the next instruction is run. To make this into an unconditional jump, use register 0. Thus B0A1 jumps to A1 if the contents of register 0 are equal to the contents of register 0. This is always true, regardless of the contents of the register.
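The operand formats described in this section can be summarized with a small decoder. It is only a sketch: the function name describe and the wording of the returned strings are made up, and it covers only the four instruction types discussed above (opcodes 1, 2, 5 and B).

```python
def describe(word):
    """Return a plain-English reading of a 4-hex-digit instruction string."""
    opcode, r, s, t = word.upper()   # one character per hex digit
    if opcode == "1":
        return f"LOAD register {r} with the contents of memory cell {s}{t}"
    if opcode == "2":
        return f"LOAD register {r} with the bit pattern {s}{t}"
    if opcode == "5":
        return f"ADD registers {s} and {t}, result in register {r}"
    if opcode == "B":
        if r == "0":
            # register 0 always equals itself, so this jump is always taken
            return f"JUMP to {s}{t} (unconditional)"
        return f"JUMP to {s}{t} if register {r} equals register 0"
    return "not covered in this sketch"


for example in ["1FA1", "2FA1", "5FAB", "BFA1", "B0A1"]:
    print(example, "->", describe(example))
```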
 

An example program

Let's examine a small program in detail.
Line  Address  Instruction  Comments
1     1,2      2101         R1 = 1
2     3,4      3130         mem(30) = R1
3     5,6      2102         R1 = 2
4     7,8      3131         mem(31) = R1
5     9,A      1530         R5 = mem(30)
6     B,C      1631         R6 = mem(31)
7     D,E      5056         R0 = R5 + R6
8     F,10     3032         mem(32) = R0
9     11,12    C000         Halt

The instructions are stored in memory starting at location 01. Since each instruction is 16 bits and each memory cell is 8 bits, each instruction uses 2 memory locations. The computer knows this, so when it retrieves an instruction from memory, it always fetches 2 bytes.

The first two instructions store the literal number 1 in register 1 (2101) and then copy the contents of R1 to the memory cell whose address is 30 (3130). Remember, 30 in hex is 48 in decimal. The next two instructions do the same thing, storing a 2 in location 31.
Then these two values are loaded into registers 5 and 6 (1530, 1631). This is a set up for the addition in line 7. The 5056 instruction adds the contents of R5 and R6 and stores the result in R0. Line 8 stores R0 into memory location 32.

This corresponds to the high level language statements
A=1
B=2
C=A+B

How the machine executes instructions

Instructions are stored in memory, just like data. In fact, in a stored program computer, you can't tell programs from data. Just like with integers and floating point numbers, the difference is in the interpretation. A program is a sequence of instructions. They are executed in the order they appear in memory, except for JUMP instructions.

Controlling this is where the PC and IR come in. The PC contains the memory address of the next instruction to be run. The IR holds the instruction currently being executed. It is used to decode and process the instruction.

The computer knows how to perform a very simple algorithm. It knows how to fetch an instruction, decode it and execute it. The cycle is to copy (fetch) from memory to the IR the instruction stored at the address in the PC. Then the first 4 bits of the IR (the opcode) are used to determine (decode) what instruction this is. Finally, the instruction is executed.
The PC is incremented right after the fetch. So it always points at the next instruction.

A variation on this loop is the processing of a JUMP instruction. For example, when executing the instruction B315, the machine first compares the contents of R3 to the contents of R0. If R3 is not equal (!=) to R0, the execute part is over. Go to the top of the loop and fetch the next instruction in sequence. If R3 is equal to (==) R0, then change the value of the program counter to 15. This will fetch the instruction stored at that location, rather than the next one.
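The fetch, decode, execute cycle can be sketched as a short simulator and tried on the small program from the earlier example. This is a sketch under stated assumptions, not the book's definition of the machine: the names load and run are made up, unknown memory cells and registers start as 0, addition wraps around at one byte, and only the opcodes used in these notes (1, 2, 3, 5, B, C) are handled.

```python
def load(program, start=0x01):
    """Place 4-hex-digit instruction strings into a 256-cell memory."""
    mem = [0] * 256                    # assumption: unknown cells start as 0
    for i, word in enumerate(program):
        mem[start + 2 * i] = int(word[:2], 16)
        mem[start + 2 * i + 1] = int(word[2:], 16)
    return mem


def run(mem, start=0x01, max_steps=1000):
    """Fetch, decode and execute until Halt; return the registers."""
    reg = [0] * 16                     # assumption: registers start as 0
    pc = start
    for _ in range(max_steps):
        ir = (mem[pc] << 8) | mem[(pc + 1) & 0xFF]   # fetch two bytes into the IR
        pc = (pc + 2) & 0xFF                         # PC now points at the next instruction
        opcode, r = ir >> 12, (ir >> 8) & 0xF        # decode
        s, t, xy = (ir >> 4) & 0xF, ir & 0xF, ir & 0xFF
        if opcode == 0x1:              # LOAD from memory cell xy
            reg[r] = mem[xy]
        elif opcode == 0x2:            # LOAD the literal pattern xy
            reg[r] = xy
        elif opcode == 0x3:            # store register r in memory cell xy
            mem[xy] = reg[r]
        elif opcode == 0x5:            # ADD registers s and t into r
            reg[r] = (reg[s] + reg[t]) & 0xFF
        elif opcode == 0xB:            # JUMP to xy if register r equals register 0
            if reg[r] == reg[0]:
                pc = xy
        elif opcode == 0xC:            # Halt
            return reg
        else:
            raise ValueError(f"opcode {opcode:X} is not covered by this sketch")
    raise RuntimeError("step limit reached without a Halt")


FIRST_PROGRAM = ["2101", "3130", "2102", "3131", "1530",
                 "1631", "5056", "3032", "C000"]
memory = load(FIRST_PROGRAM)
registers = run(memory)
print(memory[0x30], memory[0x31], memory[0x32])   # 1 2 3
```

Note that the PC is bumped right after the fetch, so a JUMP only has to overwrite it.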

Here is another example, including a jump.
Line  Address  Instruction  Comments             PC  mem(30)  mem(31)  mem(32)  mem(33)
1     1,2      2101         R1 = 1               3   ????     ????     ????     ????
2     3,4      3130         mem(30) = R1         5   1        ????     ????     ????
3     5,6      2102         R1 = 2               7   1        ????     ????     ????
4     7,8      3131         mem(31) = R1         9   1        2        ????     ????
5     9,A      1530         R5 = mem(30)         B   1        2        ????     ????
6     B,C      1631         R6 = mem(31)         D   1        2        ????     ????
7     D,E      5056         R0 = R5 + R6         F   1        2        ????     ????
8     F,10     2303         R3 = 3               11  1        2        ????     ????
9     11,12    B317         jump 17 if R3 == R0  ??  1        2        ????     ????
10    13,14    3032         mem(32) = R0         15  1        2        3        ????
11    15,16    C000         Halt                 17  1        2        3        ????
12    17,18    3033         mem(33) = R0         19  1        2        ????     3
13    19,1A    C000         Halt                 1B  1        2        ????     3

The question marks in the memory columns indicate that we don't know what is stored there. It doesn't matter to us because we are careful not to read any of those locations until we have stored something there.

The question marks in the PC column at line 9 are there because the value of the PC depends on the result of comparing R3 and R0 at line 9. If the result is true, then the PC value here is 17. This results in the program running the instructions that start there, and the result 3 is stored in location 33. If it is false, the PC is left with the value of 13 and then the 3 is stored in location 32.

In this particular case, since we are loading a 3 into R3 in line 8, the result is true. But if line 8 were 2302, the result would be false.

One consequence of programs and data being indistinguishable to the computer can be seen by changing line 2 in the above to be 3110 and running it again.
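These cases can be tried out with a small simulator sketch. The names load, run and variant are made up, unknown cells and registers are assumed to start as 0, and only the opcodes used in these notes are handled. The three runs are the program as listed, the program with line 8 changed to 2302, and the program with line 2 changed to 3110.

```python
def load(program, start=0x01):
    """Place 4-hex-digit instruction strings into a 256-cell memory."""
    mem = [0] * 256                    # assumption: unknown cells start as 0
    for i, word in enumerate(program):
        mem[start + 2 * i] = int(word[:2], 16)
        mem[start + 2 * i + 1] = int(word[2:], 16)
    return mem


def run(mem, start=0x01, max_steps=1000):
    """Fetch, decode and execute until Halt; return the registers."""
    reg = [0] * 16
    pc = start
    for _ in range(max_steps):
        ir = (mem[pc] << 8) | mem[(pc + 1) & 0xFF]   # fetch
        pc = (pc + 2) & 0xFF                         # increment right after the fetch
        opcode, r = ir >> 12, (ir >> 8) & 0xF        # decode
        s, t, xy = (ir >> 4) & 0xF, ir & 0xF, ir & 0xFF
        if opcode == 0x1:
            reg[r] = mem[xy]
        elif opcode == 0x2:
            reg[r] = xy
        elif opcode == 0x3:
            mem[xy] = reg[r]
        elif opcode == 0x5:
            reg[r] = (reg[s] + reg[t]) & 0xFF
        elif opcode == 0xB:
            if reg[r] == reg[0]:
                pc = xy
        elif opcode == 0xC:
            return reg
        else:
            raise ValueError(f"opcode {opcode:X} is not covered by this sketch")
    raise RuntimeError("step limit reached without a Halt")


ORIGINAL = ["2101", "3130", "2102", "3131", "1530", "1631", "5056",
            "2303", "B317", "3032", "C000", "3033", "C000"]


def variant(line, word):
    """Copy of ORIGINAL with one line (numbered from 1) replaced."""
    changed = list(ORIGINAL)
    changed[line - 1] = word
    return changed


mem_a = load(ORIGINAL)
run(mem_a)
print(mem_a[0x32], mem_a[0x33])        # 0 3  (jump taken, result in cell 33)

mem_b = load(variant(8, "2302"))
run(mem_b)
print(mem_b[0x32], mem_b[0x33])        # 3 0  (jump not taken, result in cell 32)

mem_c = load(variant(2, "3110"))
reg_c = run(mem_c)
print(format(mem_c[0x0F], "02X"), format(mem_c[0x10], "02X"))   # 23 01
```

In the third run the store lands on address 10, which is the second byte of the instruction at line 8, so that instruction has become 2301 by the time it is fetched. Cell 30 is never written in that run, so on a real machine R5 would pick up whatever happened to be there; the 0 used here is just the simulator's assumption.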

The following program changes itself while it runs.
Line  Address  Instruction  Comments      PC  mem(20)  mem(21)
1     1,2      2101         R1 = 1        3   ????     ????
2     3,4      3120         mem(20) = R1  5   1        ????
3     5,6      B007         Jump to 7     7   1        ????
4     7,8      1020         R0 = mem(20)  9   1        ????
5     9,A      2200         R2 = 0        B   1        ????
6     B,C      5201         R2 = R0 + R1  D   1        ????
7     D,E      3221         mem(21) = R2  F   1        2
8     F,10     2015         R0 = 15       11  1        2
9     11,12    3006         mem(6) = R0   ??  1        2
10    13,14    B005         Jump to 5     15  1        2
11    15,16    C000         Halt          17  1        2
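
To see where the program changes itself, it can be run on a small simulator sketch. The names load and run are made up, unknown cells and registers are assumed to start as 0, and only the opcodes used in these notes are handled. Lines 4 and 5 are encoded here as 1020 and 2200 so that they match the Comments column (R0 = mem(20), R2 = 0).

```python
def load(program, start=0x01):
    """Place 4-hex-digit instruction strings into a 256-cell memory."""
    mem = [0] * 256                    # assumption: unknown cells start as 0
    for i, word in enumerate(program):
        mem[start + 2 * i] = int(word[:2], 16)
        mem[start + 2 * i + 1] = int(word[2:], 16)
    return mem


def run(mem, start=0x01, max_steps=1000):
    """Fetch, decode and execute until Halt; return the registers."""
    reg = [0] * 16
    pc = start
    for _ in range(max_steps):
        ir = (mem[pc] << 8) | mem[(pc + 1) & 0xFF]   # fetch
        pc = (pc + 2) & 0xFF                         # increment right after the fetch
        opcode, r = ir >> 12, (ir >> 8) & 0xF        # decode
        s, t, xy = (ir >> 4) & 0xF, ir & 0xF, ir & 0xFF
        if opcode == 0x1:
            reg[r] = mem[xy]
        elif opcode == 0x2:
            reg[r] = xy
        elif opcode == 0x3:
            mem[xy] = reg[r]           # a store can overwrite an instruction byte
        elif opcode == 0x5:
            reg[r] = (reg[s] + reg[t]) & 0xFF
        elif opcode == 0xB:
            if reg[r] == reg[0]:
                pc = xy
        elif opcode == 0xC:
            return reg
        else:
            raise ValueError(f"opcode {opcode:X} is not covered by this sketch")
    raise RuntimeError("step limit reached without a Halt")


SELF_MODIFYING = ["2101", "3120", "B007", "1020", "2200", "5201",
                  "3221", "2015", "3006", "B005", "C000"]
memory = load(SELF_MODIFYING)
before = memory[0x05:0x07]             # the two bytes of line 3 before the run
run(memory)
after = memory[0x05:0x07]              # the same two bytes after the run
print([format(b, "02X") for b in before])   # ['B0', '07']
print([format(b, "02X") for b in after])    # ['B0', '15']
print(memory[0x20], memory[0x21])           # 1 2
```

Line 9 stores 15 into address 6, the second byte of the jump at line 3, turning B007 into B015. When line 10 jumps back to address 5, the rewritten jump goes to the Halt at 15. Without that store, the program would keep jumping from line 10 to line 3 and back to line 4 forever.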