Computer and Software Architecture
Introduction
- Course topics
- C++
The main thrust of the class is programming in C++. We will try to
cover most of the language.
- Object Oriented Programming
This is primarily the grouping of code and data together.
Encapsulation and inheritance are the main features.
This differs from structured programming
mostly in syntax and the inheritance idea.
- Java
The other major OO language these days is Java. Time permitting, I will
introduce examples in Java to show how they are related. But this isn't
a Java class.
Computer Basics
A computer consists of several components.
These are the CPU (Central Processing Unit), the I/O system
(keyboard, mouse and monitor), main memory and secondary memory (disk).
There are many variations on these
components: multiple CPUs, networked file systems, tertiary memory, and
unusual input and output devices (sensors and effectors).
These make up the hardware part of the computer.
We are more interested in the software part
of the system.
"Software without hardware is an idea. Hardware without
software is a space heater".
The part we are most interested
in for now is the memory.
The main memory in a computer looks to us like
a long row of little boxes.
Each of these boxes has two properties.
One is its location or address.
This is simply its number in the list.
Since computers count from 0, the second box is number 1.
This explains why software people insist that the new millennium
didn't start until 2001.
Memory addresses start at 0 and go up to one less than the number of boxes
in the system.
For example, a computer with 1Meg of memory would have boxes numbered
from 0 to 1048575.
Note that in the computer world, 1 meg doesn't mean
1 million, it means 1024 * 1024.
This is due to people's interest in powers of ten and computers' interest
in binary numbers.
The nearest power of two to 1000 is 1024 or 2^10.
This is close to the use of kilo in the metric
system so memory got measured in K's which is 1024 units, not 1000.
The units involved in measuring memory are bytes.
One byte is 8 bits.
A bit is a single binary digit.
Just as decimal arithmetic
uses 10 digits, 0-9, binary arithmetic uses two digits, 0 and 1.
So all numbers in the computer are ultimately represented as a bunch of
ones and zeros.
To make this a little easier to work with, people started
using hexadecimal arithmetic.
Decimal arithmetic is base 10, hex arithmetic is base 16.
The digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F.
Each hex digit represents 4 bits, so a single byte is written as 2 hex digits.
Instead of writing 11000011, we could write C3.
The other property of a memory cell is its content.
Each byte in memory can hold a single number, from 0 to 255.
In hex this is 00 to FF.
It is important to remember that the location of the memory cell doesn't tell
us anything about its contents.
The same memory cell could, over time, hold many different numbers.
While I just said that memory cells contain numbers, that is
not strictly true.
They contain 8 bits, each of which can be 1 or 0.
So there are 256 (2^8) different combinations that can be contained in
any cell.
These bit patterns can be interpreted many ways.
They can be interpreted as a number (integer or floating point), a character,
a machine instruction, a pixel or other things.
How they are interpreted is largely up to the programmer.
There are a number of standard codes or mappings that describe how a certain
bit pattern will be interpreted.
For example, in two's complement arithmetic (one way to represent integers),
the bit pattern 01000001 is the number 65.
But looked at another way, using the ASCII code for characters, this is the
letter A.
In general, to interpret a given bit pattern, you have to know the code that
is being used.
All this discussion of memory will become more valuable when we talk
about variables, arrays and pointers.
There is another level of memory in the computer called the secondary storage.
This is usually a disk drive.
The advantage of disk drives is they are larger, cheaper and keep the
information intact even without power.
These days, it is common for computers to have 128 megabytes (roughly a million
bytes each) of main memory and 60-100 gigabytes (roughly a billion bytes each)
of disk storage.
Software Basics
The first kind of software most people come in contact with is the
operating system.
The OS controls the actions of the computer and allocates resources.
Common examples are Linux, Mac OS and Windows.
A program is a sequence of instructions that implements an algorithm.
Programs and instruction processing
In the early days of computing, the machines were not programmed in
the way we think of it now.
It was more like re-wiring.
Imagine a TV with cable input, a converter box and a VCR.
I can connect the cable to the converter and the converter to the VCR.
Then I can record whatever is on the converter.
But if I want to record directly from the cable, I can't.
I can't just tell the VCR to use a different input.
I have to move the cable from one device to the other.
I didn't have to build a new kind of recording device, but I had to change
the connections.
The machines were programmed by changing the connections between the pieces.
The old telephone switchboards worked this way.
Calls were connected by plugging wires between holes in the board.
A modern computer is more like a player piano.
A roll of paper is inserted that has instructions on it
about which keys to play in what order.
The piano has no knowledge of either the Beatles or Chopin but can play either.
Our computer has the ability to perform a large number of instructions
but knows nothing about the kinds of things we want it to do.
We give it a long list of instructions to run and suddenly, it can play
games or balance checkbooks.
The idea of the program being recorded in the computer instead
of in its physical structure is called stored program.
The key idea is that the program instructions look like data.
They are both just bits.
This is one of those ideas that seems obvious as soon as you
hear it but isn't the moment before.
The computer is built to recognize certain bit patterns as instructions.
These bit patterns are referred to as the machine language.
These patterns typically consist of two parts.
The first is the opcode, a number that tells the machine which
instruction, like ADD, to perform.
The second is the operand, the data that the instruction works on.
For example, the JUMP instruction might have an opcode of 3 and an operand of
the address to jump to.
Fortunately, we won't be writing programs in machine language.
A simple statement like a = b + c would be many machine language
instructions.
We are interested in high level languages, in particular, C++.
But the machine can directly understand only programs written in machine
language.
So the high level version of the program is converted into the
low level version through a process of translation.
The translator is usually called a compiler.
The original version of the program, the one you write and work on,
is called the source code.
The result of the translation by the compiler is called the object code.
The object code from your program is combined with some standard, pre-compiled
code, called libraries, in a step called linking.
This results in a version of the program that is
executable, or runnable,
under the operating system, on the computer hardware.
Problem solving and algorithms
The first step in writing a program is developing the algorithm.
In general, an algorithm is an ordered set of unambiguous, executable steps
defining a terminating process.
Each word in that definition matters.
- Ordered
At each step, it is clear what to do next.
- Unambiguous
Ambiguity is in the eye of the beholder.
The algorithm may be clear even when its representation is not.
'Make a peanut butter sandwich' is clear enough for me but a small
child needs more details.
- Executable
Each step must be something that can be done.
'Calculate all the digits of pi' is not.
'Count all the M&M's in a bag' is.
- Terminating
Things like the pi calculation that never end
are of little use.
That said, some programs are designed to be non-terminating, like
operating systems or medical monitors.
Algorithm representation is tricky.
Using natural language leads to ambiguities.
For example, the phrase 'time flies like an arrow' has several meanings.
It is important that all the
partners agree on the terms used to build the representation.
These are called primitives.
Programming languages are built to be unambiguous. We all agree what
the primitives are and how they are combined.
The compiler is the enforcer.
Each primitive in a language has both a syntax (what it looks
like) and a semantics (what it means).
The syntax of
assignment in C is
a = b;
The semantics are that the value stored in b is copied to a.
Programs
A program is different from an algorithm.
A program is a precise implementation
of an algorithm and is written in a formal computer language.
Designing the algorithm is the hard part of programming.
A good approach is to first state clearly what problem is being solved.
Then work out the algorithm that solves it.
Test the algorithm by running it in your head on some test cases.
Then implement the algorithm in a computer language, compile it and test
the actual running application.
We'll see examples of this as we go.
First, a quick overview of object oriented programming, which
is related to design.
Overview of OOP
- Why OOP?
What problem were we trying to solve?
Mostly complexity and re-use.
People noticed that code that was grouped together with the data it manipulated
was easier to follow.
Also, complex data problems could be simplified by
making a graph of them.
- Software re-use
Code is hard to re-use since everyone codes the interfaces slightly
differently.
At least differently than you want them to be.
C++ classes allow a developer to present an interface to the user
so they can access the data without messing with the internals.
- Control of complexity
Systems with a large number of related but different data objects were
hard to understand.
Inheritance and access control make that easier.
This is the magic number 7, plus or minus 2.
People can only hold about 7 things in their head at the same time.
So if you can lump things together, you can hold more of the program in
your head.
- Code understanding
Similar problem.
Break code into smaller modules to ease understanding.
Definitions and principles
- Encapsulation
- Data and code together
- Object data can only be used by object code
- Attached code called methods
Also known as member functions
- Like C structures with code fields
- Polymorphism
Multiple functions with the same name.
Allows more natural function naming.
Most data objects have a print operation.
Now they can all be called print(). We can also overload operators like plus (+)
so that string concatenation can look like arithmetic.
- Inheritance
There is often a natural hierarchy to data objects.
All rooms have some things in common like walls and doors.
But some rooms have windows.
Some rooms have plumbing.
Some walls go to the ceiling and some don't.
Rather than describe everything for each kind of room, create a generic
room object and have the special rooms add to that, subtract from that
or override that.