Software Engineering

Software Lifetime

Software has a lifecycle like most things. Most of the total effort in a software project is spent on maintenance. I've read figures of 50% or more. The people who do the maintenance are often not the ones who did the original development, so the maintenance crew needs to understand the code before they can fix it.

Reducing the cost of maintenance may increase the development cost but should greatly reduce the overall cost of the project. One sign of a mature development effort is that it recognizes the long-term lifetime and costs of a project. Eventually, the code gets too hard to fix or to add new features to, so it is retired and a new program is written.

Think about the whole lifetime cost, not just the cost of developing the initial release. Small efforts during design and development can mean big reductions in maintenance cost. Documentation, modeling and processes are ways to help this.

Process Models

All development processes have at least these stages: requirements, design, implementation and testing. Each of these might involve several steps, and there might be more stages.

Build And Fix

The first, simple model of the development process is what the book calls build and fix. This is the technique beginners use. Get something working and ship it. Then fix bugs as they occur, add missing features and re-release. You end up with a badly constructed, kludgey mess that becomes harder to use and maintain as time goes on.

Waterfall

The waterfall model dates from the 1970s. Each stage must be completed before the next is begun. There is no provision for going backwards. This is a clear model that seems to control the problems in the build and fix process. However, there are some problems. Each stage requires a great deal of effort all at once. Each stage must be DONE before the next. The people working on each stage must try to know the future so they can be sure the stage is complete.

This rarely works in the real world, and since the process doesn't allow for going backward, it is expensive when you have to. It can take hundreds of times more effort to fix a design problem if it is caught in testing than if it is caught in design. The waterfall model has too much control, where the build and fix model has too little.

Iterative

This allows for going back to previous steps. It expects that each step will be completed but allows for surprises and prototyping. It doesn't expect the steps to overlap a lot. At the point where each stage is mostly done, some of the next stage can begin. When a problem is found, we go back to the stage where it originated and redo that part. Then we go on to the next stage again.

Evolutionary

This is related to the iterative process in that we can go back to earlier stages. It has the principles of object oriented programming in mind.

It is built on the idea that we refine the definition of the system from high levels of abstraction to low. We start by breaking the system down into pieces during architectural design. Each piece is defined, and its responsibilities and relationships with other pieces are determined. This is similar to defining classes.

Each piece goes through the refinement cycle. This takes the system piece through the steps from a high level design to code. This cycle can be repeated if problems are found. We can also run the system pieces through the process in parallel as we know the relationships between them. This is similar to how different classes can be coded separately once we have defined the methods. Let's look at the steps in the refinement cycle.

Establish refinement scope

What are the issues in this piece? This piece could be the system GUI, or the reporting subsystem or a set of support classes. It could be a prototyping effort to check out an algorithm or to test the feasibility of a requirement. It allows the developer to focus on one section of the system without the confusion or distraction of the whole system.

Identify classes and objects

We extract the requirements that describe this piece of the system and start associating them with software objects. One technique for this is to brainstorm the kinds of objects that exist in the system being modeled. Almost all programs are meant to represent or simulate some real world thing or process. Accounting software models the processes used by human accountants. On a side note, it's not always good to model the real world too closely. Computers don't do math the way people do. The reason for double entry accounting systems is to catch arithmetic errors that people make but machines don't.

Brainstorming is actually a formal process. It involves a group of people all contributing ideas about a problem. All suggestions are recorded and no discussion of them is allowed. We collect the ideas and analyze them later.

Objects can be physical things, like tables; rules, like gravity; actions, like falling; and abstract things, like errors. We can also have objects that represent roles for things, places, containers, events, and data sources and sinks. Look for nouns and verbs in the requirements.

Then start generalizing the objects into classes. You may eliminate objects at this time or even add them to represent abstractions that don't have a real existence. Remember that classes can exist in the hierarchy just to hold common methods and data. You may also split objects found in the brainstorming into several classes.

Identify relationships

We have figured out the objects and abstracted them into classes. Now, how are they related to each other? There are three basic kinds of relationships to consider.
Inheritance
Consider common characteristics of classes and objects. Push them up the tree as far as possible. Maybe you need to create placeholder classes or abstract classes. Perhaps some interfaces are called for.
Composition
Some classes will contain other classes. Note if there are numerical relationships. Perhaps one class contains a specific number of another, or maybe the number is not known. For example, bicycles have exactly 2 wheels, there is only one engine in a car, and buildings have a variable number of windows, anywhere between 0 and thousands. Do we need containers to hold these embedded objects, and what sort?
General Association
Objects can use one another. In a simulation, a car might use a road object to get information about traction but it doesn't have a road object in it nor does it inherit from road. We have used the methods of the System class without creating an instance of it.
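
As a rough illustration of the three kinds of relationships, here is a minimal Java sketch. All of the class names (Vehicle, Bicycle, Wheel, Building, Window, Road, Car) and the traction threshold are made up for this example. They only echo the bicycles, buildings, cars and roads mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, invented only to illustrate the three relationships.

// Inheritance: common characteristics are pushed up into an abstract class.
abstract class Vehicle {
    private final String name;

    Vehicle(String name) { this.name = name; }

    String getName() { return name; }

    abstract int wheelCount();
}

class Wheel { }

// Composition with a fixed count: a bicycle contains exactly two wheels.
class Bicycle extends Vehicle {
    private final Wheel front = new Wheel();
    private final Wheel rear = new Wheel();

    Bicycle() { super("bicycle"); }

    @Override
    int wheelCount() { return 2; }
}

class Window { }

// Composition with an unknown count: a container holds the parts.
class Building {
    private final List<Window> windows = new ArrayList<>();

    void addWindow(Window w) { windows.add(w); }

    int windowCount() { return windows.size(); }
}

class Road {
    private final double traction;

    Road(double traction) { this.traction = traction; }

    double getTraction() { return traction; }
}

// General association: a car uses a road but neither contains nor extends it.
class Car extends Vehicle {
    Car() { super("car"); }

    @Override
    int wheelCount() { return 4; }

    // The road is passed in when needed rather than stored as a field.
    boolean canBrakeSafely(Road road) {
        return road.getTraction() > 0.5; // assumed threshold for this sketch
    }
}
```

Note how the choice of container (a List here) follows from the numerical relationship: two named fields are enough for the bicycle, while the building needs something that can grow.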

There are design languages like UML (Unified Modeling Language) that are designed to help document the relationships between objects.

Detailed Design

Here we identify all the methods, both public and private. Note that we are not writing code yet, just the declarations. We may find that we have to add methods to meet the needs of other system pieces. In this step, we also define the data objects we will need. This may add more methods to provide ways to access and change them. We figure out the initialization needs of the data objects. If the methods have tricky algorithms, we might write pseudo code for them. Pseudo code is a kind of informal structured English that describes the algorithm more precisely than ordinary text would but without the detail of actual code. For example, the pseudo code for a linear search of an array might be:
for each element in the array
	is this the one we want?
	if yes
		print it and exit
	if not
		check the next one
end for
print that it wasn't there
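
For comparison, here is one way that pseudo code might turn into Java once we reach the coding step. This is only a sketch: the class name LinearSearch and the method name indexOf are invented for the example, and it assumes an array of int. Instead of printing inside the loop, the method returns the position (or -1) and leaves the printing to the caller.

```java
class LinearSearch {
    // Returns the index of the first element equal to target, or -1 if absent.
    static int indexOf(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;   // this is the one we want: stop searching
            }
            // otherwise the loop moves on to check the next one
        }
        return -1;          // fell out of the loop: it wasn't there
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 4};
        int where = indexOf(data, 9);
        if (where >= 0) {
            System.out.println("found at index " + where);
        } else {
            System.out.println("not found");
        }
    }
}
```

Returning a value rather than printing makes the method easier to reuse and to test, since a test can compare the result directly.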

Implementation

If the other steps went well, this should be the easy part. Each of the methods should be small and clearly described. Coding them should be easy. It is crucial that the code be clear and readable. Remember that most of the lifetime, and thus cost, of a project is spent in maintenance. Follow the project's coding guidelines if there are any. These will cover issues such as coding style (spacing, indentation, etc.), use of javadoc to create documentation, and naming conventions.

If the code in this piece uses code in other pieces, you may have to write stubs for them. Stubs are classes and methods that look like the real thing but aren't complete. For example, a method that is supposed to count the number of characters in a file might not have been written yet. So we write one that has the same signature but always returns 42. This allows us to test our part even if the rest isn't built yet. We should try to break the system into parts so this isn't needed.
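
A stub like the character counter described here might look something like the following. The interface CharacterCounter, the stub class and the ReportLine class that uses it are all hypothetical names chosen for this sketch. It assumes the two teams have agreed on the method signature in advance.

```java
import java.io.File;

// Hypothetical interface agreed on with whoever writes the real class.
interface CharacterCounter {
    long countCharacters(File file);
}

// Stub: same signature as the real thing, but it does no real work.
class CharacterCounterStub implements CharacterCounter {
    @Override
    public long countCharacters(File file) {
        return 42; // canned answer; the file is never opened
    }
}

// The code we actually want to test depends only on the interface.
class ReportLine {
    private final CharacterCounter counter;

    ReportLine(CharacterCounter counter) { this.counter = counter; }

    String describe(File file) {
        return file.getName() + ": " + counter.countCharacters(file) + " characters";
    }
}
```

Because ReportLine only knows about the interface, the stub can be swapped for the real counter later without changing ReportLine at all.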

Unit and Integration Testing

Unit testing is done on individual methods or classes. This is usually done by the developer and often requires stubs. Integration testing can be done by the developer, by someone else, or by both. It tests the interactions between components and between system pieces. System testing evaluates the complete system. It is usually driven by the requirements. The testing is done from the user's perspective, using the same interface the user will. The goal is to find errors, not to prove that the system works. It is usually done by people separate from the development team.

Prototypes

Prototypes are created to try something out. You make several in the course of a project to test out different concepts or algorithms. You might also make a small application to get familiar with a technique or a library. They can be used to test the performance of some subsystem, measure memory use or other resource consumption.

One common prototype is to build the GUI to test it on users. The prototype would have all the controls but none of the backend processing. Stubs are built to simulate the actions. Prototypes need not even be code. You could construct storyboards or hand-drawn pictures of the GUI components and present them to the users and get opinions.

Prototypes can be used to find inconsistencies in the requirements or design. They should be tossed out after the experiment is completed. You should resist the temptation to reuse the prototype code in the final project. Prototype code is usually written quickly without regard to quality, clarity or error handling.

More on Testing

Testing is more than just exercising the code. Reviews and inspections of various types are part of the testing process that begins well before the code is written. Each stage in the development process should include some sort of evaluation.

In the early, pre-code, stages this takes the form of reviews of the documents that are produced as part of the stage. For example, the document that contains the requirements is examined closely by a team of people who are familiar with the problem area. This often includes customers. Each requirement is examined in turn. You are checking whether the requirement is needed, clear and complete, and whether it conflicts with other requirements.

The design document is reviewed as well. Here we check that the requirements are met, or at least covered. We look for conflicts between components and examine feasibility, efficiency, fault handling, performance, etc.

A walkthrough is a process for reviewing code or design. The author(s) present the code or design and explain the decisions that led to it. They present examples and scenarios of how it would be used. The reviewers look for common mistakes, inconsistencies and other problems. A code inspection is more formal. The code is examined line by line, looking for errors in use and checking whether the code performs the actions it is supposed to.

In both cases, the problems are noted but not solved. The purpose is to find the errors; they will be solved later. Problems can be classified by the part of the process in which they were introduced or by severity.

Looking for Errors

There are two major kinds of testing. Black box testing involves testing the system from the outside, without knowing how it works inside. Generally test cases are created by examining the requirements. Testing is done using the user interfaces. System testing is usually done like this.

By contrast, white box testing involves detailed knowledge of the code. This is what we described above under unit and integration testing.

There are some techniques that can be used to help find errors in the code or design. One is to consider equivalence classes. For a given method, there is a range of values it can take as arguments. This range can be broken into classes. In each class, the response of the method is the same or similar for any member.

For example, if a method takes integer arguments, we don't (and probably can't) try every possible value or combination of values. But there are groups of integers that have a common effect on the method. It probably reacts the same way to all positive numbers, and the same way to all negative numbers, so we only need to try a couple of examples from each class. Include some small and large values from each class. The boundaries between classes are the most interesting, because boundaries and interfaces are where most errors occur. So we might use a set of values like -999999, -500, -1, 0, 1, 500, 999999.
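
To make the idea concrete, here is a small sketch of a test that picks a few values from each equivalence class plus the boundary. The method under test, classify, is a made-up example with three classes (negative, zero, positive). The test is plain Java with no testing framework assumed.

```java
class SignClassifier {
    // Hypothetical method under test, with three equivalence classes of input.
    static String classify(int n) {
        if (n < 0) return "negative";
        if (n == 0) return "zero";
        return "positive";
    }
}

class SignClassifierTest {
    // Runs every sample value and returns the number of failures.
    static int run() {
        // Large and small members of each class, plus values at the boundary.
        int[] inputs = {-999999, -500, -1, 0, 1, 500, 999999};
        String[] expected = {"negative", "negative", "negative", "zero",
                             "positive", "positive", "positive"};
        int failures = 0;
        for (int i = 0; i < inputs.length; i++) {
            String actual = SignClassifier.classify(inputs[i]);
            if (!actual.equals(expected[i])) {
                System.out.println("FAIL: " + inputs[i] + " -> " + actual);
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(run() + " failure(s)");
    }
}
```

The values -1, 0 and 1 sit right at the boundaries between classes, which is where an off-by-one mistake such as writing <= instead of < would show up.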

Measuring the completeness or effectiveness of testing is difficult. There are methods to determine the coverage of the tests. We can instrument the code (add extra code to record activities) and run the tests. Then we record which statements in the code got executed when the tests are run. We can try for complete statement coverage but this is not always possible. It can be difficult to simulate some system errors to test the error handling.
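
Instrumenting by hand might look roughly like this. Real projects would normally use a coverage tool instead. The class InstrumentedAbs and its hit array are invented for this sketch, which only shows the idea of recording which parts of the code were reached.

```java
class InstrumentedAbs {
    // One flag per probe; a flag is set when the code next to it runs.
    static final boolean[] hit = new boolean[3];

    static int abs(int n) {
        hit[0] = true;          // probe: method entered
        if (n < 0) {
            hit[1] = true;      // probe: negative case reached
            return -n;
        }
        hit[2] = true;          // probe: non-negative case reached
        return n;
    }

    // Fraction of probes reached by the tests run so far.
    static double coverage() {
        int count = 0;
        for (boolean b : hit) {
            if (b) count++;
        }
        return (double) count / hit.length;
    }
}
```

If the tests only ever call abs with positive numbers, the second probe is never set, and the coverage figure tells us that part of the method has not been exercised.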

Another measure is to determine if every path through the code has been followed. We have to have test cases to exercise both branches in if statements or all cases in a switch statement.
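
The difference from statement coverage shows up with an if that has no else. In the sketch below (Clamp and ClampTest are hypothetical names), a single negative input executes every statement, but a second test is needed to follow the path that skips the body of the if.

```java
class Clamp {
    // Hypothetical method: replaces negative values with zero.
    static int nonNegative(int n) {
        int result = n;
        if (n < 0) {
            result = 0;
        }
        return result;
    }
}

class ClampTest {
    static boolean run() {
        // Takes the "true" branch; on its own this already runs every statement.
        boolean trueBranch = Clamp.nonNegative(-5) == 0;
        // Takes the "false" branch, skipping the body of the if.
        boolean falseBranch = Clamp.nonNegative(5) == 5;
        return trueBranch && falseBranch;
    }
}
```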