Early computers had no real OS. Programs were loaded into memory each time they were run. Usually they were read in from punched cards or paper tape. Any extra routines they needed were loaded in with them from standard decks of cards. There was only one program running in the computer at any time and it ran until it completed before anybody else could use the computer. After a while, the extra routines that everybody used would be loaded into the computer when it was turned on and left there so other programs could use them without having to load them. Saved wear and tear on the cards as well.
Later, a simple OS was developed that would allow a bunch of jobs to be loaded into the computer and put into a queue that would be run in order. A job is a program along with any setup instructions it needs. A queue is a data structure where things are put in at one end and removed at the other. This ensures that the jobs get done in the order they came in. We will study queues in detail later. This technique is known as batch processing because a batch of programs was loaded and run at a time.
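The queue idea can be sketched in a few lines (a toy model in Python; the job names are made up):

```python
from collections import deque

# A hypothetical batch queue: jobs go in at one end and come out the
# other, so they run in the order they were submitted (first in, first out).
jobs = deque()
jobs.append("payroll")      # first job submitted
jobs.append("inventory")    # second
jobs.append("report")       # third

run_order = []
while jobs:
    job = jobs.popleft()    # remove from the front of the queue
    run_order.append(job)   # "run" the job

print(run_order)  # ['payroll', 'inventory', 'report']
```

Each job runs to completion before the next one starts, just as in the batch systems described above.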
In our machine, the program to be run is the one whose starting address is stored in the program counter. By changing the address stored in the PC, you can change what program is being run. This simple OS would assign each program a section of memory that it could use and store it there.
Often, before the program could be run, the operator had to take actions to set things up for it. Examples include mounting tapes, loading other programs, putting special paper in the printer, etc. Some of these could be done automatically by the computer and others required people. These instructions were put on cards before the main program and were written in a special language called JCL, the Job Control Language, which was used to specify files, tapes and other setup operations.
Another growing desire was to be able to use computers to monitor and control other machines. But in a batch mode, if a program was running and waiting for some device, the computer just sat idle. Since these machines were very expensive to buy and maintain, people wanted to get as much use out of them as they could.
There was also the observation that when interacting with people or some kinds of devices like printers, the machine was very much faster than the person or printer. So it spent idle time waiting for the printer to finish a line so it could send the next.
All this led to the concept of time-sharing.
This has nothing to do with condos in Aspen. Since the computer was so much faster than humans, why not have it do something useful while it was waiting for the user to type? By having it switch between jobs, it gives the illusion that it is doing multiple things at once. Each program in memory is given a fixed amount of time to execute. This is called a time-slice or quantum. After that time, the computer switches to the next program.
Let's say that the quantum is 10 milliseconds. That doesn't seem very long. But with a 100 MHz clock rate, some kind of instruction is being executed every 10^-8 seconds. That is 10 nanoseconds. So in 10 milliseconds we could run a million instructions. Think about it another way. If you can
type 60 words a minute that's one word per second. If a word is 5 characters,
that's 200 milliseconds per character. So, while you are typing a character,
the computer can switch between programs 20 times.
Humans measure time in seconds, machines in nanoseconds.
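These figures are easy to check (a quick sketch in Python; note that 10 ms at one instruction per 10 ns works out to a million instructions per quantum):

```python
clock_hz = 100e6                  # 100 MHz clock rate
instr_time = 1 / clock_hz         # one instruction every 1e-8 s = 10 ns
quantum = 10e-3                   # 10 millisecond time slice

# How many instructions fit in one quantum?
instrs_per_quantum = quantum / instr_time          # about 1,000,000

# Typing speed: 60 words per minute, 5 characters per word
chars_per_sec = 60 * 5 / 60                        # 5 characters per second
ms_per_char = 1000 / chars_per_sec                 # 200 ms per character

# How many 10 ms quanta elapse while you type one character?
switches_per_char = (ms_per_char / 1000) / quantum # about 20
```
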
This ability allows the computer to appear to do many things at the same time. Even with the overhead of switching between programs, the overall throughput of the machine goes up because there isn't so much idle time. Programs that have no interactions with the outside world (CPU bound) will run slightly slower in this scheme but most programs talk to something. I/O in general is very slow relative to CPU speeds so a lot of time that would be wasted waiting for disks to rotate or print heads to move (I/O bound) is used by other programs.
A variation on this is the need to handle real-time processing. The term refers to handling events as they happen. An example of this is the processing of data from measuring devices. If the computer isn't ready to accept the data when it appears, it is lost. The operating systems most of us see are not intended for real time use as they may be busy doing something else when the data comes in.
This ability to handle multiple programs at once is called multi-tasking. This is not the same thing as multi-processing. I think of multi-processing as the use of actual multiple processors (CPUs) in the machine. I can multi-task on a single processor machine using the time-sharing technique described above. There is also the concept of multi-user. This simply means that multiple users can work on the same machine. All multi-user OSes are multi-tasking. Not all multi-tasking OSes are actually used by multiple users. For example, Linux on an Intel box is a multi-tasking, multi-user, single processor system. Windows 95 is a single-user, multi-tasking (mostly), single processor OS.
We talked about parallel processing machines earlier.
Another way to use multiple processors to good advantage is using a multi-tasking
OS. For example, the OS could hand separate programs to separate CPUs to
be run at the same time. Again, each program runs just as fast as if it
had only one CPU, but overall, the throughput of the machine increases.
This is often referred to as SMP, Symmetric Multi-Processing.
We can also go up one more level and connect multiple computers together with a network and pass jobs around between machines. This can be done to make use of computers that would otherwise sit idle.
So a file stored as follows:
Root -> usr1 (directory) -> play (directory) -> game.exe (file)
would be referred to as /usr1/play/game.exe
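That lookup can be modeled as a walk down the directory tree, one path component at a time (a toy sketch in Python; the nested dict stands in for real directories):

```python
# A toy directory tree matching the example: directories are dicts,
# files are strings (their contents). All names are illustrative.
root = {
    "usr1": {                      # directory
        "play": {                  # directory
            "game.exe": "binary"   # file
        }
    }
}

def lookup(tree, path):
    """Walk a /-separated path from the root, one component at a time."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]          # descend one level in the tree
    return node

print(lookup(root, "/usr1/play/game.exe"))  # binary
```
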
The file manager creates file descriptors for
each file you are using. These descriptors are a collection of information
about a file such as its name, where it is on the disk, its size, etc.
These are used by various other kernel and API level programs to perform
the application level tasks of reading and writing. The file manager also
controls the buffers that the file system uses. To speed up file processing,
information read from disk is stored in chunks of memory. This way, if
the user wants to use the same data again, it is already in the memory.
The buffers are arranged as a linked list of blocks of memory. A
list is another data structure we will see later. In a list, each element
contains the location of the next one. Another part of this is the device
drivers. These are an interface between the device, like a disk, and
the OS. They provide a standard set of operations like open, close, read
and write on all devices. This way, a printer can be operated on just like
a file.
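The buffer list described above might be sketched like this (a toy model in Python; the field names and block numbers are made up):

```python
class Buffer:
    """One block in the file system's buffer list: some cached data
    plus the location of the next buffer (None at the end)."""
    def __init__(self, block_no, data):
        self.block_no = block_no   # which disk block this buffer caches
        self.data = data           # the cached contents
        self.next = None           # link to the next buffer in the list

# Build a three-buffer chain, head first.
head = Buffer(7, b"first block")
head.next = Buffer(12, b"second block")
head.next.next = Buffer(3, b"third block")

def find_block(head, block_no):
    """Scan the chain; a hit means the data is already in memory."""
    node = head
    while node is not None:
        if node.block_no == block_no:
            return node.data
        node = node.next
    return None  # cache miss: would have to read the disk
```

A hit in this list avoids a disk read, which is the whole point of buffering.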
What do we do if the total memory needed by programs
is more than we have? This was a common problem. The solution is to pretend
that part of the disk drive is main memory. To a program, the only difference
between the real memory and the disk memory is that the disk memory is
slower. We call all of this storage space virtual memory. The space
is divided into fixed size chunks called pages. A given program
is loaded into a bunch of pages and these pages are read into real memory
as needed. When a new program comes along that needs memory, some of the
pages of the old program are written out to the disk. This is called swapping.
The space on disk used for this is called swap space. All
this effort results in us being able to pretend we have a gigabyte of memory
when we really only have 64 megabytes.
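The paging mechanism can be sketched as a simple address translation (a toy model in Python; the 4 KB page size and the frame numbers are assumptions, not from the text):

```python
PAGE_SIZE = 4096  # a common (but assumed) page size in bytes

# Pages 0 and 2 are in real memory (frames 5 and 1); page 1 is on disk.
page_table = {0: 5, 2: 1}

def translate(virtual_addr, page_table):
    """Split a virtual address into page number and offset, then map
    the page number to a physical frame via the page table."""
    page = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table.get(page)
    if frame is None:
        return None  # page fault: the page must be swapped in from disk
    return frame * PAGE_SIZE + offset

physical = translate(8200, page_table)  # page 2, offset 8 -> frame 1
fault = translate(4100, page_table)     # page 1 is on disk -> fault
```

A fault is where swapping happens: the OS reads the missing page in from swap space, possibly writing some other page out to make room.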
Your computer is built so that when the power
is turned on, it loads a certain address into the program counter and starts
running. That address is a program in ROM that is very short and simple.
Typically, it reads in a certain fixed chunk of the disk. This piece of
the disk typically holds a larger program that reads in the rest of the
operating system.
The set of all processes in a machine at any time is kept in the process table. This table holds the process state, priority, owner, etc. A process can be ready, meaning it can run as soon as a time slice becomes available. It can be blocked, which means it is waiting for some external event like user input. It can also be running. The scheduler examines this table and determines what runs next. There are a lot of scheduling algorithms.
When a process is started (or restarted) the dispatcher sets a timer for the time slice. At the end of the quantum, an interrupt goes off that causes the CPU to record where it is in the current process and handle the interrupt. The program that it calls at this point is called the interrupt handler. When this interrupt goes off, the scheduler wakes up, evaluates the process table based on priority and ready state, and selects the next process to run. There is an interesting symbiosis between modern processors and OSes. The processors have been built with special hardware to enable process switching in support of multi-tasking OSes. This kind of mutual development has been going on for a long time. New software is developed to allow new capabilities and hardware is modified to support them. A recent example is the MMX extensions to Intel processors. Instructions were added to speed up some kinds of graphics processing that had been done in software before.
If a process starts to do I/O, it may stop before its time slice is up. In this case, it is marked blocked and a new process is started. The first process may start again when the I/O is completed and its state is changed back to ready.
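The process table, the ready/blocked states, and round-robin switching can be sketched together (a toy model in Python; the process names are made up):

```python
from collections import deque

READY, RUNNING, BLOCKED = "ready", "running", "blocked"

# A toy process table: name -> state. 'ed' is blocked waiting on input.
process_table = {"shell": READY, "ed": BLOCKED, "cc": READY}
ready_queue = deque(p for p, s in process_table.items() if s == READY)

def schedule():
    """Pick the next ready process, round-robin style."""
    if not ready_queue:
        return None              # nothing ready: the CPU would sit idle
    proc = ready_queue.popleft()
    process_table[proc] = RUNNING
    return proc

def quantum_expired(proc):
    """Timer interrupt: the current process goes back on the ready queue."""
    process_table[proc] = READY
    ready_queue.append(proc)

first = schedule()               # 'shell' runs first
quantum_expired(first)           # its time slice ends
second = schedule()              # 'cc' runs next; 'ed' is skipped (blocked)
```

When the I/O that 'ed' is waiting on completes, the OS would mark it ready and put it back on the queue.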
Processes are allowed to communicate using inter-process
communication. There are a variety of ways to do this including shared
memory and networking protocols. An overall structure to this is provided
by the client-server model. This term is usually used to describe
a problem solution where part of the answer is on the user's desk (client)
and part on the main machine (server). This can also apply to processes
within one machine. One process is the client. It uses information provided
by the server. An example is a database system. There is a server process
that actually retrieves data from the disk storage and there are multiple
client processes that ask for that data. Building systems using this architecture
simplifies construction. Each component is the same whether it is being
run on the same computer as the client or not. It also allows specialization
and efficiency in server design, which can lead to improved performance.
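A client-server exchange within one machine can be sketched with two threads and a request/reply channel (a toy model in Python; the queues stand in for the real IPC mechanism, and all names and data are made up):

```python
import queue
import threading

# The "server" process owns the data; clients ask for it by sending
# requests over a shared channel and waiting for the reply.
requests = queue.Queue()
database = {"alice": 30, "bob": 25}

def server():
    while True:
        key, reply = requests.get()
        if key is None:                 # shutdown request
            break
        reply.put(database.get(key))    # look the answer up and send it back

def client_query(key):
    reply = queue.Queue()               # a private channel for the answer
    requests.put((key, reply))          # send the request to the server
    return reply.get()                  # block until the server replies

threading.Thread(target=server, daemon=True).start()
age = client_query("alice")             # the client never touches the data
requests.put((None, None))              # tell the server to stop
```

The client code is the same whether the server runs in the same machine (as here) or across a network; only the channel changes.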