Notes 4
Computer Architecture
CISC vs RISC
To change topics abruptly: there are two major ways processors are
designed. A CISC (Complex Instruction Set Computer) design involves a large
set of powerful instructions, with many variations of each.
The programs people write are shorter because many things can be done with
a single instruction. But many of the instructions are rarely used, and the design
and building of the processor can be very difficult. A familiar example
of this is the Intel line of processors.
RISC (Reduced Instruction Set Computer) machines, like the PowerPC,
have a smaller set of instructions. The processors are simpler and the
instructions are carefully implemented to be fast and efficient. But programs
are longer and somewhat more complicated. Since most programs at this
level are written by other programs (compilers), this is less important.
Machine Performance
Increasing the processor speed only goes so far. There are limits to miniaturization.
Since it takes about 1 nanosecond (10⁻⁹ seconds) for an electrical signal to travel
1 foot, the distance between components in a computer is a limiting
factor. One way around this is pipelining. At any given moment,
several instructions are in different stages of processing: while one is
being executed, another is being decoded and a third is being fetched.
Different parts of the CPU do these things at the same time. So while the
processor speed hasn't changed, the throughput, or instruction rate, has
roughly tripled with this three-stage pipeline. If the instruction being executed
is a JUMP, this removes the advantage, since the instructions being decoded
and fetched are the wrong ones and must be discarded.
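The throughput gain can be seen with a back-of-the-envelope model. This is a minimal sketch, assuming an idealized three-stage pipeline with one cycle per stage and no jumps:

```python
# A toy model of a three-stage pipeline (fetch, decode, execute).
# The stage count and cycle model are simplifications for illustration.

def cycles_unpipelined(n_instructions, stages=3):
    # Without pipelining, each instruction passes through every stage
    # before the next one starts.
    return n_instructions * stages

def cycles_pipelined(n_instructions, stages=3):
    # With pipelining, the first instruction takes `stages` cycles to
    # drain through; after that, one instruction completes every cycle.
    return stages + (n_instructions - 1)

print(cycles_unpipelined(100))  # 300 cycles
print(cycles_pipelined(100))    # 102 cycles -- nearly triple the throughput
```

A jump would force the pipeline to discard its in-flight instructions and refill, which is why real speedups fall short of this ideal.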
Parallel processing
In an earlier lecture, I mentioned that in WW2, rooms full of people were
doing pieces of calculations for the Manhattan Project. The results were
accumulated into the final answer. In a similar way, multiple processors
can be used to split up the work on a problem within and between computers.
There are several different architectures.
MIMD
Multiple Instructions, Multiple Data. Each processor is doing
a different thing on a different piece of the data. This is a superset
of SIMD and SISD.
SIMD
Single Instruction, Multiple Data. Each processor is doing the same
thing on different pieces of the data. A lot of image processing falls
into this category.
SISD
Single Instruction, Single Data. Seems a little redundant
if you have multiple processors.
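The SIMD/MIMD distinction can be illustrated in ordinary Python, where the "processors" are just iterations of a loop. The pixel values and operations here are made up for illustration:

```python
# SIMD-style: the same operation is applied to every piece of data,
# as in image processing (brighten every pixel the same way).
pixels = [10, 200, 35, 90]
brightened = [min(p + 50, 255) for p in pixels]
print(brightened)  # [60, 250, 85, 140]

# MIMD-style: different operations run on different pieces of data.
tasks = [(lambda x: x + 1, 10), (lambda x: x * 2, 10)]
print([f(x) for f, x in tasks])  # [11, 20]
```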
These are architectures that involve multiple processors in a single
machine. We can go up a level and connect multiple machines together. Some
of these structures are loosely-coupled and some are closely-coupled.
In the closely-coupled category is the bank of computers used by Pixar
to render frames in their animated films, like Toy Story. A loosely-coupled
parallel machine is used in processing SETI data. People all over the world
download a screen saver program and sections of SETI data. The screen saver
analyzes the signal for anomalies and reports the results back to the project
via the Internet. This technique brings thousands of computers to bear
on a problem without most of them being physically connected to each other.
While the individual parts of the problem are relatively simple, the coordination
of these kinds of projects can be very difficult.
Inside the ALU
The ALU performs the logic and arithmetic functions for the computer.
The most common logic operations are AND, OR, and XOR. We have seen these
before on single bits. They are often used on a collection of bits to do
masking.
This is a method of turning certain bits in a collection on or off, or
of testing whether certain bits are on. It is used in working with bitmaps.
These are not the kind of bitmaps used to represent pictures. The picture
bitmaps have one bit for each pixel on screen. The bitmaps used here are
a collection of bits where each bit represents something else. For example,
on a disk drive, there might be a bitmap where each bit represents whether
the corresponding sector is being used. Here's how masking would be used. If
we want to keep only part of the bitmap, zeroing out the rest, we can use AND.
      00001111   Mask
AND   10101010   Data bitmap
      --------
      00001010   Result
We can use the masking technique to do several other things. We can
check the state of a bit.
If the bit I want to check is 1, then the result of the AND is not
0.
      00010000   Mask
AND   10010001   Data bitmap
      --------
      00010000   Result
If the bit I want to check is 0, the result is 0.
      00010000   Mask
AND   10000001   Data bitmap
      --------
      00000000   Result
We can also clear and set a bit. To set a bit, put a 1 in the mask where
you want a 1 in the result and 0 everywhere else. Use the OR operation.
      00001000   Mask
OR    11000001   Data bitmap
      --------
      11001001   Result
To clear a bit, use AND with a mask of all ones, except for the bit we
want to turn off.
      11101111   Mask
AND   11010001   Data bitmap
      --------
      11000001   Result
To complement a bitmap, use XOR.
      11111111   Mask
XOR   10101010   Data bitmap
      --------
      01010101   Result
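All of the masking operations above can be checked directly in Python, where `&`, `|`, and `^` are bitwise AND, OR, and XOR. The values are the same ones used in the examples:

```python
# Keep only part of a bitmap with AND:
print(format(0b10101010 & 0b00001111, '08b'))  # 00001010

# Test a bit: a nonzero result means the bit was 1.
print((0b10010001 & 0b00010000) != 0)  # True
print((0b10000001 & 0b00010000) != 0)  # False

# Set a bit with OR:
print(format(0b11000001 | 0b00001000, '08b'))  # 11001001

# Clear a bit with AND and a mask of all ones except that bit:
print(format(0b11010001 & 0b11101111, '08b'))  # 11000001

# Complement with XOR against all ones:
print(format(0b10101010 ^ 0b11111111, '08b'))  # 01010101
```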
There are also instructions used to manipulate bit strings. Shift
moves the bits in a register one slot to the left or right; a 0 is inserted
at the vacated end. The rotate instruction shifts the bits and moves the
bit that falls off one end around to the other end.
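Python has shift operators but no rotate, so an 8-bit rotate can be sketched by masking the result back to 8 bits. The helper names here are invented for illustration:

```python
def shift_left8(x, n=1):
    # Bits that fall off the left end are lost; zeros enter on the right.
    return (x << n) & 0xFF

def rotate_left8(x, n=1):
    # The bits that fall off the left end re-enter on the right.
    return ((x << n) | (x >> (8 - n))) & 0xFF

print(format(shift_left8(0b10000001), '08b'))   # 00000010
print(format(rotate_left8(0b10000001), '08b'))  # 00000011
```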
Input/Output
Each device attached to a computer that is used for input or output is
connected to the system bus. Physically, there are boards installed in
slots in the backplane. These cards often have whole computers built into
them. In memory-mapped I/O machines, each device is allocated a
piece of the main memory address space. This space includes memory locations
that are used to control the device. When the CPU wants to initiate some
action or check the status of the device, it simply reads or writes bits
at these memory addresses. In addition, the controller card may access
the rest of memory using Direct Memory Access (DMA). The CPU makes
a read request to the device and gives it an address in memory. The CPU
can then go off and do something else while the device transfers information
directly into the machine's memory. In all cases, the speed bottleneck is
the bus, since all data must pass through it.
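The idea of memory-mapped I/O can be sketched as a toy simulation, where the device's registers are just addresses in an ordinary array. The register offsets and the command convention here are invented for illustration:

```python
# "Main memory"; the top few addresses are mapped to a hypothetical device.
memory = bytearray(256)
CTRL, STATUS, DATA = 0xF0, 0xF1, 0xF2  # invented register addresses

def device_tick():
    # The device watches its control register; a 1 means "read requested".
    if memory[CTRL] == 1:
        memory[DATA] = 0x42   # device deposits the data it read
        memory[STATUS] = 1    # signal completion
        memory[CTRL] = 0

memory[CTRL] = 1   # CPU starts the operation with an ordinary memory write
device_tick()      # device works on its own; the CPU could be doing other things
if memory[STATUS] == 1:        # CPU polls the status register
    print(hex(memory[DATA]))   # 0x42
```

The point is that the CPU needs no special I/O instructions: starting the device and checking on it are just loads and stores.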
Communication
There are two kinds of communication techniques between machines: parallel
and serial.
Parallel
There are the same number of wires connecting machines as there are bits
you want to send. All the bits are sent at once. Very fast, but more expensive
due to the number of wires.
Serial
Each bit is sent one at a time. This is slower but cheaper.
Baud Rate
This is a commonly used term but it is not the same as bits per second.
Baud rate is the number of changes in the state of the communications line
per second. If there is only one change per second, this corresponds to
1 bit per second. If the line can take on 4 states, each change can carry 2 bits,
so the bits per second is twice the baud rate.
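The relationship is just bits-per-symbol arithmetic: each change of the line can carry log2(number of states) bits. A quick sketch:

```python
import math

def bits_per_second(baud, states):
    # Each change of the line (one symbol) carries log2(states) bits.
    return baud * int(math.log2(states))

print(bits_per_second(1200, 2))  # 1200 bps -- 1 bit per change
print(bits_per_second(1200, 4))  # 2400 bps -- 2 bits per change
```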
Compression
Data compression is a way to increase throughput without increasing
the data transfer rate. If we have a communications device with a raw
transfer rate of 14,400 bits per second and we can achieve 4-to-1
compression, we get an effective transfer rate of 57,600 bps.
One kind of compression is Huffman coding. In this scheme, you use shorter
codes for frequently sent data. Morse code is a kind of Huffman code:
some characters are represented by a single dot or dash, while others use
longer sequences of dots and dashes, depending on how common they are in
the language.
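A minimal Huffman coder can be built with Python's `heapq`: repeatedly merge the two least frequent entries, prefixing '0' to the codes on one side and '1' to the other. This is a sketch of the classic algorithm, not any particular library's API:

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(text):
    counts = Counter(text)
    tie = count()  # unique tiebreaker so the heap never compares dicts
    # Each heap entry: (frequency, tiebreaker, {symbol: code-so-far}).
    heap = [(freq, next(tie), {sym: ''}) for sym, freq in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent groups
        f2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}
        merged.update({s: '1' + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

codes = huffman_codes("aaaabbc")
# 'a' is most frequent, so it gets the shortest code.
print(codes['a'], codes['b'], codes['c'])  # 1 01 00
```

Just as in Morse code, the common symbol gets the one-symbol code while the rarer ones get longer codes.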