Computer and Software Architecture

Introduction

Computer Basics

A computer consists of several components: the CPU (Central Processing Unit), the I/O system (keyboard, mouse and monitor), main memory and secondary memory (disk). There are many variations on these components: multiple CPUs, networked file systems, tertiary memory, and unusual input and output devices (sensors and effectors). These make up the hardware part of the computer. We are more interested in the software part of the system. "Software without hardware is an idea. Hardware without software is a space heater."

The part we are most interested in for now is the memory. The main memory in a computer looks to us like a long row of little boxes. Each of these boxes has two properties. One is its location, or address. This is simply its number in the list. Since computers count from 0, the second box is number 1. This explains why software people insist that the new millennium didn't start until 2001. Memory addresses start at 0 and go up to one less than the number of boxes in the system. For example, a computer with 1 Meg of memory would have boxes numbered from 0 to 1048575. Note that in the computer world, 1 meg doesn't mean 1 million; it means 1024 * 1024 = 1048576. This is due to people's interest in powers of ten and computers' interest in powers of two. The nearest power of two to 1000 is 1024, or 2^10. This is close to the meaning of kilo in the metric system, so memory came to be measured in K's, where one K is 1024 units, not 1000.

The units involved in measuring memory are bytes. One byte is 8 bits. A bit is a single binary digit. Just as decimal arithmetic uses 10 digits, 0-9, binary arithmetic uses two digits, 0 and 1. So all numbers in the computer are ultimately represented as a bunch of ones and zeros. To make this a little easier to work with, people started using hexadecimal arithmetic. Decimal arithmetic is base 10; hex arithmetic is base 16. The hex digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. Each hex digit represents 4 bits, so a single byte is 2 hex digits. Instead of writing 11000011, we could write C3.

The other property of a memory cell is its content. Each byte in memory can hold a single number, from 0 to 255. In hex this is 00 to FF. It is important to remember that the location of the memory cell doesn't tell us anything about its contents. The same memory cell could, over time, hold many different numbers.

While I just said that memory cells contain numbers, that is not strictly true. They contain 8 bits, each of which can be 1 or 0. So there are 2^8 = 256 different combinations that can be contained in any cell. These bit patterns can be interpreted many ways. They can be interpreted as a number (integer or floating point), a character, a machine instruction, a pixel or other things. How they are interpreted is largely up to the programmer. There are a number of standard codes or mappings that describe how a certain bit pattern will be interpreted. For example, in two's complement arithmetic (one way to represent integers), the bit pattern 01000001 is the number 65. But looked at another way, using the ASCII code for characters, this is the letter A. In general, to interpret a given bit pattern, you have to know the code that is being used. All this discussion of memory will become more valuable when we talk about variables, arrays and pointers.

There is another level of memory in the computer called secondary storage. This is usually a disk drive. The advantages of disk drives are that they are larger and cheaper per byte than main memory, and that they keep the information intact even without power. At the time of writing, it is common for computers to have 128 megabytes (roughly a million bytes each) of main memory and 60-100 gigabytes (roughly a billion bytes each) of disk storage.

Software Basics

The first kind of software most people come in contact with is the operating system. The OS controls the actions of the computer and allocates resources. Common examples are Linux, Mac OS and Windows.

A program is a sequence of instructions that implements an algorithm.

Programs and instruction processing

In the early days of computing, the machines were not programmed in the way we think of it now. It was more like re-wiring. Imagine a TV with cable input, a converter box and a VCR. I can connect the cable to the converter and the converter to the VCR. Then I can record whatever is on the converter. But if I want to record directly from the cable, I can't. I can't just tell the VCR to use a different input. I have to move the cable from one device to the other. I didn't have to build a new kind of recording device, but I had to change the connections. The machines were programmed by changing the connections between the pieces. The old telephone switchboards worked this way. Calls were connected by plugging wires between holes in the board. A modern computer is more like a player piano. A roll of paper is inserted that has instructions on it about which keys to play in what order. The piano has no knowledge of either the Beatles or Chopin but can play either.

Our computer has the ability to perform a large number of instructions but knows nothing about the kinds of things we want it to do. We give it a long list of instructions to run and suddenly it can play games or balance checkbooks. The idea of the program being recorded in the computer's memory instead of in its physical structure is called the stored program concept. The key idea is that the program instructions look like data. They are both just bits. This is one of those ideas that seems obvious as soon as you hear it but isn't the moment before. The computer is built to recognize certain bit patterns as instructions. These bit patterns are referred to as the machine language. They typically consist of two parts. One is the opcode, a number that tells the machine which instruction, like ADD, to perform. The other is the operand, which is data that the instruction uses somehow. For example, the JUMP instruction might have an opcode of 3 and, as its operand, the address to jump to. Fortunately, we won't be writing programs in machine language. A simple statement like a = b + c would take several machine language instructions.

We are interested in high level languages, in particular C++. But the machine can directly understand only programs written in machine language. So the high level version of the program is converted into the low level version through a process of translation. The translator is usually called a compiler. The original version of the program, the one you write and work on, is called the source code.

The result of the translation by the compiler is called the object code. The object code from your program is combined with some standard, pre-compiled code, called libraries, in a step called linking. This results in a version of the program that is executable, or runnable, under the operating system, on the computer hardware.

Problem solving and algorithms

The first step in writing a program is developing the algorithm. In general, an algorithm is an ordered set of unambiguous, executable steps defining a terminating process. Ordered means that at each step it is clear what to do next. Unambiguous is harder, because ambiguity is in the eye of the beholder: the algorithm may be clear while its representation is not. 'Make a peanut butter sandwich' is clear enough for me, but a small child needs more details. Executable means each step must be something that can be done. 'Calculate all the digits of pi' is not; 'count all the M&M's in a bag' is. Terminating means the process eventually ends. Things like the pi calculation that never end are of little use. There are exceptions: some programs, like operating systems or medical monitors, are designed to be non-terminating.

Algorithm representation is tricky. Using natural language leads to ambiguities. For example, the phrase "time flies like an arrow" has several meanings. It is important that all the partners agree on the terms used to build the representation. These agreed-upon terms are called primitives.

Programming languages are built to be unambiguous. We all agree what the primitives are and how they are combined. The compiler is the enforcer.

The primitives in a language have both syntax (what it looks like) and semantics (what it means). The syntax of assignment in C is

a = b;

The semantics are that the value stored in b is copied to a.

Programs

A program is different from an algorithm. A program is a precise implementation of an algorithm and is written in a formal computer language. Designing the algorithm is the hard part of programming. A good approach is to first state clearly what problem is being solved. Then work out the algorithm that solves it. Test the algorithm by running it in your head on some test cases. Then implement the algorithm in a computer language, compile it and test the actual running application. We'll see examples of this as we go. First, a quick overview of object oriented programming, which is related to design.

Overview of OOP

  • Definitions and principles