Notes 4

Computer Architecture

CISC vs RISC

To change topics for a moment, there are two major ways processors are designed. The CISC (Complex Instruction Set Computer) approach involves a large set of powerful instructions, with many variations of each instruction. The programs people write are shorter because many things can be done with a single instruction. But many of the instructions are rarely used, and designing and building the processor can be very difficult. The Intel x86 processors are an example of this design.

RISC (Reduced Instruction Set Computer) machines, like the PowerPC, have a smaller set of instructions. The processors are simpler and the instructions are carefully implemented to be fast and efficient. But programs are longer and somewhat more complicated. Since most programs at this level are written by other programs (compilers), this is less important.

Machine Performance

Increasing the processor speed only goes so far. There are limits to miniaturization. Since it takes about 1 nanosecond (10^-9 seconds) for an electrical signal to travel 1 foot, the distance between components in a computer will be a limiting factor. One way around this is pipelining. At any given moment, several instructions are in different stages of processing. While one is being executed, another is being decoded and a third is being fetched. Different parts of the CPU do these things at the same time. So while the processor speed hasn't changed, the throughput or instruction rate can be up to three times higher. If the instruction being executed is a JUMP, the advantage is lost for that instruction, since the instructions being decoded and fetched are the wrong ones and have to be thrown away.
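
To put rough numbers on the pipelining claim above, here is a minimal sketch of a cycle count for a three-stage (fetch, decode, execute) pipeline. The function names and the flush penalty are my own assumptions for illustration: I assume every stage takes one cycle and that each JUMP throws away the two instructions behind it. Real processors are more complicated.

```python
def cycles_unpipelined(n_instructions, stages=3):
    # Each instruction goes through every stage before the next one starts.
    return n_instructions * stages


def cycles_pipelined(n_instructions, stages=3, jumps=0):
    if n_instructions == 0:
        return 0
    # The first instruction needs `stages` cycles to fill the pipeline;
    # after that, one instruction finishes every cycle.
    base = stages + (n_instructions - 1)
    # Assumption: each JUMP discards the (stages - 1) instructions
    # already fetched and decoded behind it, costing that many cycles.
    return base + jumps * (stages - 1)


n = 1000
print(cycles_unpipelined(n))           # 3000
print(cycles_pipelined(n))             # 1002
print(cycles_pipelined(n, jumps=200))  # 1402
```

With no jumps, the pipelined count is very close to one third of the unpipelined count. With a jump every five instructions, a good part of that gain is lost.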

Parallel processing

In an earlier lecture, I mentioned that in WW2, rooms full of people were doing pieces of calculations for the Manhattan Project. The results were accumulated into the final answer. In a similar way, multiple processors can be used to split up the work on a problem within and between computers. There are several different architectures.

MIMD

Multiple Instructions, Multiple Data. Each processor is doing a different thing on a different piece of the data. This is a superset of SIMD and SISD.

SIMD

Single Instruction, Multiple Data. Each processor is doing the same thing on different pieces of the data. A lot of image processing falls into this category.
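
As a small illustration of the image processing case, here is a sketch of one operation ("add an amount, but stop at 255") applied to every pixel in a row. The function name and pixel values are made up. On real SIMD hardware the additions would happen at the same time; this plain Python loop only models the idea of one instruction over many pieces of data.

```python
def brighten(pixels, amount):
    # The same operation is applied to every pixel value.
    # 255 is the largest value an 8-bit pixel can hold.
    return [min(p + amount, 255) for p in pixels]


row = [0, 100, 200, 250]
print(brighten(row, 10))  # [10, 110, 210, 255]
```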

SISD

Single Instruction, Single Data. Seems a little redundant if you have multiple processors.

These are architectures that involve multiple processors in a single machine. We can go up a level and connect multiple machines together. Some of these structures are loosely-coupled and some are closely-coupled. In the closely-coupled category is the bank of computers used by Pixar to render frames in their animated films, like Toy Story. A loosely-coupled parallel machine is used in processing SETI data. People all over the world download a screen saver program and sections of SETI data. The screen saver analyzes the signal for anomalies and reports the results back to the project via the Internet. This technique brings thousands of computers to bear on a problem without most of them being physically connected to each other. While the individual parts of the problem are relatively simple, the coordination of these kinds of projects can be very difficult.
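
The pattern described above (split the data into pieces, hand each piece to a worker, collect the results) can be sketched in a few lines. This is only a model of the coordination, not of any real project: the analyze function is a made-up stand-in that counts samples above a threshold, and the workers here are Python threads in one machine rather than separate computers.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(chunk):
    # Stand-in for the real work on one piece of data:
    # count the samples that look "anomalous" (above 90).
    return sum(1 for sample in chunk if sample > 90)


def split(data, n_chunks):
    # Cut the data into roughly equal pieces, one per worker.
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]


def run(data, workers=4):
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(analyze, chunks))
    # Accumulate the partial results into the final answer.
    return sum(partial_results)


data = list(range(100))
print(run(data))  # 9
```

Python threads do not actually speed up this kind of calculation, so the sketch shows only the bookkeeping: splitting, distributing, and accumulating.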

Inside the ALU

The ALU performs the logic and arithmetic functions for the computer. The most common logic operations are AND, OR and XOR. We have seen these before on single bits. They are often used on a collection of bits to do masking. This is a method of turning certain bits in a collection on or off, or of testing whether certain bits are on. This is used in working with bitmaps. These are not the kind of bitmaps used to represent pictures. The picture bitmaps have one bit for each pixel on screen. The bitmaps used here are a collection of bits where each bit represents something else. For example, on a disk drive, there might be a bitmap where each bit represents whether the corresponding sector is being used. Here's how it would be used. If we want to keep only part of the bitmap and zero out the rest, we can use AND with a mask that has 1s in the positions we want to keep.

	00001111	Mask
AND	10101010	Data bitmap
	00001010	Result

We can use the masking technique to do several other things. We can check the state of a bit.
If the bit I want to check is 1, then the result of the AND is not 0.

	00010000	Mask
AND	10010001	Data bitmap
	00010000	Result

If the bit I want to check is 0, the result is 0.

	00010000	Mask
AND	10000001	Data bitmap
	00000000	Result

We can also clear and set a bit. To set a bit, put a 1 in the mask where you want a 1 in the result and 0 everywhere else. Use the OR operation.

	00001000	Mask
OR	11000001	Data bitmap
	11001001	Result

To clear a bit, use AND with a mask of all ones, except for the bit we want to turn off.

	11101111	Mask
AND	11010001	Data bitmap
	11000001	Result

To complement a bitmap, use XOR.

	11111111	Mask
XOR	10101010	Data bitmap
	01010101	Result

There are also instructions used to manipulate bit strings. The shift instruction moves the bits in a register one slot to the left or right. The bit that falls off one end is lost and a 0 is inserted at the other end. The rotate instruction moves the bit from one end to the other and shifts the rest.
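
Here is a sketch of shift and rotate on an 8-bit register. Python integers have no fixed width, so the code masks the result down to 8 bits by hand. The function names and the 8-bit width are assumptions for illustration.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1   # keeps only the low 8 bits


def shift_left(value):
    # The top bit falls off; a 0 comes in on the right.
    return (value << 1) & MASK


def shift_right(value):
    # The bottom bit falls off; a 0 comes in on the left.
    return value >> 1


def rotate_left(value):
    # The top bit wraps around to the bottom.
    top = (value >> (WIDTH - 1)) & 1
    return ((value << 1) & MASK) | top


def rotate_right(value):
    # The bottom bit wraps around to the top.
    low = value & 1
    return (value >> 1) | (low << (WIDTH - 1))


register = 0b10010011
print(format(shift_left(register), "08b"))    # 00100110
print(format(shift_right(register), "08b"))   # 01001001
print(format(rotate_left(register), "08b"))   # 00100111
print(format(rotate_right(register), "08b"))  # 11001001
```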

Input/Output

Each device attached to a computer that is used for input or output is connected to the system bus. Physically, there are boards installed in slots in the backplane. These cards often have whole computers built into them. In memory-mapped I/O machines, each device is allocated a piece of the main memory address space. This space includes memory locations that are used to control the device. When the CPU wants to initiate some action or check the status of the device, it simply reads or writes bits at these memory addresses. In addition, the controller card may access the rest of memory, using Direct Memory Access (DMA). The CPU makes a read request to the device and gives it an address in memory. The CPU can then go off and do something else while the device transfers information directly into the machine's memory. In all cases, the speed bottleneck is the bus, as all data must pass through it.
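
To make the memory-mapped idea concrete, here is a toy model. Everything in it is invented for illustration: the 256-byte memory, the device with a status register and a data register, and the addresses 0xF0 and 0xF1 where those registers live. The point is only that the CPU uses the same read and write operations for the device as it does for ordinary memory.

```python
class StatusDevice:
    """A made-up device with a status register (offset 0) and a data register (offset 1)."""

    def __init__(self):
        self.status = 0       # 0 = idle, 1 = data has been received
        self.received = []

    def read(self, offset):
        return self.status if offset == 0 else 0

    def write(self, offset, value):
        if offset == 1:
            self.received.append(value)
            self.status = 1


class Bus:
    DEVICE_BASE = 0xF0   # hypothetical address of the first device register
    DEVICE_SIZE = 2

    def __init__(self, device):
        self.ram = bytearray(256)
        self.device = device

    def _is_device(self, address):
        return self.DEVICE_BASE <= address < self.DEVICE_BASE + self.DEVICE_SIZE

    def read(self, address):
        # Addresses in the device's range go to the device, not to RAM.
        if self._is_device(address):
            return self.device.read(address - self.DEVICE_BASE)
        return self.ram[address]

    def write(self, address, value):
        if self._is_device(address):
            self.device.write(address - self.DEVICE_BASE, value)
        else:
            self.ram[address] = value


bus = Bus(StatusDevice())
bus.write(0x10, 42)      # an ordinary memory write
print(bus.read(0x10))    # 42
print(bus.read(0xF0))    # 0  (device status: idle)
bus.write(0xF1, 7)       # same kind of write, but it reaches the device
print(bus.read(0xF0))    # 1  (device status: data received)
```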

Communication

There are two kinds of communication techniques between machines: parallel and serial.

Parallel

There are the same number of wires connecting machines as there are bits you want to send. All the bits are sent at once. Very fast, but more expensive due to the number of wires.

Serial

The bits are sent one at a time. This is slower but cheaper.
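
Here is a small sketch of what sending a byte serially looks like: the sender breaks the byte into individual bits and the receiver puts them back together. The function names are made up, and the choice to send the least significant bit first is an assumption. Either order works as long as both ends agree.

```python
def to_serial(byte):
    # Assumption: the least significant bit goes down the wire first.
    return [(byte >> i) & 1 for i in range(8)]


def from_serial(bits):
    # Rebuild the byte from bits arriving in the same order.
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value


bits = to_serial(ord("A"))       # "A" is 65, or 01000001 in binary
print(bits)                      # [1, 0, 0, 0, 0, 0, 1, 0]
print(chr(from_serial(bits)))    # A
```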

Baud Rate

This is a commonly used term, but it is not the same as bits per second. Baud rate is the number of changes in the state of the communications line per second. If the line has only two possible states, each change carries 1 bit, so the bits per second is equal to the baud rate. If there are 4 states, each change can carry 2 bits, so the bits per second is equal to twice the baud rate.
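
The relationship between baud rate and bits per second can be written as a short calculation: the number of bits carried by each change is the base-2 logarithm of the number of states. The function names below are my own, and the sketch assumes the number of states is a power of two.

```python
def bits_per_change(states):
    # 2 states carry 1 bit, 4 states carry 2 bits, 8 states carry 3 bits...
    if states < 2 or states & (states - 1):
        raise ValueError("states must be a power of two, at least 2")
    return states.bit_length() - 1


def bits_per_second(baud, states):
    return baud * bits_per_change(states)


print(bits_per_second(1200, 2))   # 1200
print(bits_per_second(1200, 4))   # 2400
```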

Compression

Data compression is a way to increase the throughput without increasing the data transfer rate. If we have a 1200 baud communications device that gives us a transfer rate of 14,400 bits per second, and we can get 4 to 1 compression, the effective transfer rate is 57,600 bps. One kind of compression is Huffman coding, which uses shorter codes for frequently sent data. Morse code works on a similar principle. The most common letters get the shortest codes (E is a single dot and T is a single dash), and less common letters get longer sequences of dots and dashes.
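
Here is a minimal sketch of Huffman coding, to show how the frequent characters end up with the short codes. It is one common way to build the codes (repeatedly merging the two least frequent groups), written for clarity rather than speed. The function name and the sample message are my own.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    if not text:
        return {}
    counts = Counter(text)
    # Each heap entry is (frequency, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, adding one more bit
        # to the front of every code in each group.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


message = "aaaabbc"
codes = huffman_codes(message)
encoded_bits = sum(len(codes[ch]) for ch in message)
print(len(codes["a"]), len(codes["b"]), len(codes["c"]))  # 1 2 2
print(encoded_bits, "bits instead of", 8 * len(message))  # 10 bits instead of 56
```

The most frequent character, "a", gets a 1-bit code, and the whole message takes 10 bits instead of the 56 needed at 8 bits per character.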