Week1 Procedural Decomposition


CS 106B: Procedural Decomposition

Handout written by Marty Stepp

Procedural Design Heuristics

There are often many ways to divide a problem into functions, but some sets
of functions are better than others. Decomposition is a concept that is often
vague and challenging, especially for larger programs with complex
behavior. But the rewards are worth the effort, because a well-designed
program is more understandable and more modular. This is important when
programmers work together or when revisiting a past program to add new
behavior or modify existing code. There is no single perfect design, but in
this handout we will discuss several heuristics (guiding principles) for
effectively decomposing large programs into functions.

Consider the following poorly structured implementation of a program to
compute a person's body mass index, or BMI. We'll use this program as a
counterexample, highlighting places where it violates our heuristics and
reasons why it is worse than the previous complete version of the BMI
program.


#include <iostream>
#include "console.h"
#include "simpio.h"
using namespace std;

// A poorly designed version of a program to compute
// a user's body mass index (BMI).
void person(int num);
void readWeight(int num, double height);
void reportStatus(int num, double height, double weight);

int main() {
    cout << "This program reads data for two" << endl;
    cout << "people and computes their body" << endl;
    cout << "mass index and weight status." << endl;
    cout << endl;
    person(1);
    return 0;
}

// process one person
void person(int num) {
    cout << "Enter person #" << num << "'s information:" << endl;
    double height = getReal("height (in inches)? ");
    readWeight(num, height);
}

// read person's weight in pounds
void readWeight(int num, double height) {
    double weight;
    weight = getReal("weight (in pounds)? ");
    reportStatus(num, height, weight);
}

// tell if the person is under/overweight
void reportStatus(int num, double height, double weight) {
    double bmi = weight / (height * height) * 703;
    cout << "body mass index = " << bmi << endl;
    if (bmi < 18.5) {
        cout << "underweight" << endl;
    } else if (bmi < 25) {
        cout << "normal" << endl;
    } else if (bmi < 30) {
        cout << "overweight" << endl;
    } else {
        cout << "obese" << endl;
    }
    if (num == 1) {
        person(2); // handle second person
    }
}

The functions of a program are like workers in a company. The author of a
program acts like the director of a company, deciding what employee
positions to create, which employees should be grouped together, which
work to assign to which group, and how groups will interact. Suppose a
company director were to divide work into three major departments, two
being overseen by middle managers:

                          Director
                              |
        +---------------------+-------------------------------+
        |                     |                               |
    Marketing              Design                        Engineering
  Administrator            Manager                         Manager
                              |                               |
                     +--------+--------+             +--------+--------+
                     |                 |             |                 |
                 Secretary         Architect      Engineer       Administrator

A good structure gives each group clear tasks to complete, avoids giving any
particular person or group too much work, and provides a balance between
workers and management. This leads to the first of our procedural design
heuristics.

1. Each function should have a coherent set of responsibilities.

In the business analogy, each group must have a clear idea of what work it is
to perform. If each group does not have clear responsibilities, it's more
difficult for the company director to keep track of who is working on what
task. When a new job comes in, two departments might both try to claim it,
or a job might go unclaimed by any department.

The analogous concept in programming is that each function should have a
clear purpose and set of responsibilities. This is called cohesion.

cohesion
A desirable quality where the responsibilities of a function or process
are closely related to each other.

A good rule of thumb is that you should be able to summarize each of your
functions in a single sentence such as, "The purpose of this function is to ..."
Writing a sentence like this is a good way to comment a function's header. A
bad sign is when you have trouble describing the function in a single
sentence, or if the sentence is long and uses the word "and" several times.
This can mean that the function is too large, too small, or does not perform a
cohesive set of tasks.

The functions of the bad BMI example have poor cohesion. The person
function's purpose is vague, and readWeight is too trivial and probably
should not be its own function. The reportStatus function would be more
readable if the computation of the BMI were its own function, since the
formula is complex.

A subtler application of this first heuristic is that not every function must
produce output. Sometimes a function is more reusable if it simply
computes a complex result and returns it, rather than printing the result that
was computed. This leaves the caller free to choose whether to print the
result or to use it to perform further computations. In the bad BMI program,
the reportStatus function both computes and prints the user's BMI. The
program would be more flexible if it had a function to simply compute and
return the BMI value. Such a function might seem trivial because its body is
just one line in length, but it has a clear, cohesive purpose: capturing a
complex expression that is used several times in the program.
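
As a minimal sketch of this idea, the function below computes and returns a BMI value without printing anything, using the same formula as the handout's program. The helper bmiDifference and the assert precondition are assumptions added for illustration, not part of the handout; the helper shows a caller reusing the returned value for further computation.

```cpp
#include <cassert>
#include <cmath>
using namespace std;

// Computes and returns a BMI value from a height in inches and a weight
// in pounds. It prints nothing, so the caller decides how to use the result.
double computeBMI(double height, double weight) {
    assert(height > 0);  // assumption: a non-positive height is a caller bug
    return weight / (height * height) * 703;
}

// Hypothetical helper (not in the handout): returns how far apart two
// people's BMI values are. A print-only function could not be reused here.
double bmiDifference(double height1, double weight1,
                     double height2, double weight2) {
    return fabs(computeBMI(height1, weight1) - computeBMI(height2, weight2));
}
```

A caller that only wants output can still write cout << computeBMI(height, weight) << endl; the choice now belongs to the caller rather than to the function.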

2. No one function should do too large a share of the overall task.


One subdivision of a company cannot be expected to design and build the
entire product line for the year. This would overwork that subdivision and
would leave the other divisions without enough work to do. Also this would
make it difficult for the subdivisions to communicate effectively, since so
much important information and responsibility would be concentrated among
so few people.

Similarly, one function should not be expected to comprise the bulk of a
program. This follows naturally from Heuristic 1, because a function that
does too much cannot be cohesive. We sometimes refer to functions like
these as "do-everything" functions because they do nearly everything
involved in solving the problem. You may have written a "do-everything"
function if one of your functions is much longer than the others, hoards
most of the variables and data, or contains the majority of the logic and
loops.

In the bad BMI program, the person function is an example of a
do-everything function. This may seem surprising, since the function is not very
many lines long. But a single call to person leads to several other calls that
collectively end up doing all of the work for the program.

3. Coupling and dependencies between functions should be minimized.

A company is more productive if each of its subdivisions can largely operate
independently when completing small work tasks. Subdivisions of the
company do need to communicate and depend on each other, but such
communication comes at a cost. Inter-departmental interactions are often
minimized and kept to meetings at specific times and places.

In programming, we try to avoid functions that have tight coupling.

coupling
An undesirable state where two functions or processes rigidly depend
on each other.

Functions are coupled if one cannot easily be called without the other. One
way to determine how tightly coupled two functions are is to look at the set
of parameters one passes to the other. A function should accept a
parameter only if that piece of data needs to be provided from outside, and
only if that data is necessary to complete the function's task. In other words,
if a piece of data could be computed or gathered inside the function, or if the
data isn't used by the function, it should not be declared as a parameter to
the function.

An important way to reduce coupling between functions is by using returns
and reference "output" parameters to send information back to the caller. A
function should return a result value if it computes something that may be
useful to later parts of the program. Because it is desirable for functions to
be cohesive and self-contained, returning a result is often more desirable
than calling further functions and passing the result as a parameter to them.

None of the functions in the bad BMI program returns a value. Each function
passes parameters to the next function, but none of them returns anything
to its caller. This is a lost opportunity because several values (such as
the user's height, weight, or BMI) would be better handled as return values.
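
The sketch below illustrates both mechanisms mentioned above: a return value for a single result and reference "output" parameters for two results. It is an illustration under stated assumptions, not the handout's code: readNumber is a hypothetical stand-in for getReal that reads from plain standard streams instead of simpio.h, and readPerson is a hypothetical name. Neither function validates its input.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical stand-in for getReal: prints a prompt, then reads and
// returns one number. No input validation is done in this sketch.
double readNumber(istream& in, ostream& out, const string& prompt) {
    out << prompt;
    double value = 0.0;
    in >> value;
    return value;
}

// Sends two values back to the caller through reference parameters.
// Nothing is passed in that the function could gather for itself.
void readPerson(istream& in, ostream& out, double& height, double& weight) {
    height = readNumber(in, out, "height (in inches)? ");
    weight = readNumber(in, out, "weight (in pounds)? ");
}
```

A caller would declare double height and double weight, call readPerson(cin, cout, height, weight), and then decide what to do with the two values. The reading function never needs to know which function uses them next.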

4. The main function should be a concise summary of the overall program.

The top person in each major group or department of our company example
reports to the Director. By looking at the groups directly connected to the
Director at the top level of the company diagram, you can see a summary of
the overall work: design, engineering, and marketing. This helps the Director
stay aware of what each group is doing. Looking at the top-level structure
can also be useful if another employee wants a quick overview of the
company's goals.

A program's main function is like the director in that it begins the overall task
and executes the various subtasks. A main function should read as a
summary of the overall program's behavior. Programmers can understand
each other's code by looking at main to get a sense of what the program is
doing as a whole.

A common mistake that prevents main from being a good program summary
is when the program contains a "do-everything" function. The main function
will call the do-everything function, which will proceed to do most or all of
the real work.

Another mistake is when a program suffers from a property that we call
chaining.

chaining
An undesirable design where a "chain" of several functions call each
other, without returning the overall flow of control to main.

A program suffers from chaining if the end of each function simply calls the
next function. Chaining often occurs when a new programmer does not fully
understand returns and tries to work around this by passing more and more
parameters down to the rest of the program. The following figure shows a
hypothetical program with two designs. The flow of calls in a badly chained
program might look like the diagram on the left.

main                                    main
 |                                       |
 +--- function1                          +--- function1
       |                                 |
       +--- function2                    +--- function2
             |                           |     |
             +--- function3              |     +--- function3
                   |                     |
                   +--- function4        +--- function4
                         |                     |
                         +--- function5        +--- function5

The bad BMI program suffers heavily from chaining. Each function does a
small amount of work and then calls the next function, passing more and
more parameters down the chain. The main function calls person, which
calls readWeight, which calls reportStatus. Never does the flow of execution
return to main in the middle of the computation. So by reading main you
don't get a very clear idea what computations will be made.

One function should not call another simply as a way of moving on to the
next task. A more desirable flow of control is to let main manage the overall
execution of tasks in the program, as shown on the right side of the figure
above. This doesn't mean that it is always bad for one function to call
another function; it is okay for one function to call another when the second
is a subtask within the overall task of the first, such as in BMI3 when the
reportResults function calls reportStatus.
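
The difference between chaining and a legitimate subtask call can be sketched as follows. This is not the BMI3 program, which is not shown in this handout; the function bodies, the ostream parameters, and the runReports function (standing in for main so the output can go to any stream) are all assumptions made for illustration.

```cpp
#include <iostream>
#include <sstream>
using namespace std;

// Subtask: prints the weight status for one BMI value, then returns.
void reportStatus(ostream& out, double bmi) {
    if (bmi < 18.5) {
        out << "underweight" << endl;
    } else if (bmi < 25) {
        out << "normal" << endl;
    } else if (bmi < 30) {
        out << "overweight" << endl;
    } else {
        out << "obese" << endl;
    }
}

// Calls reportStatus because printing the status is part of reporting
// results. It does not go on to start the next person's processing.
void reportResults(ostream& out, int num, double bmi) {
    out << "Person #" << num << ": body mass index = " << bmi << endl;
    reportStatus(out, bmi);
}

// Stand-in for main: control returns here after each step, so this
// function reads as a summary of everything that happens.
void runReports(ostream& out, double bmi1, double bmi2) {
    reportResults(out, 1, bmi1);
    reportResults(out, 2, bmi2);
}
```

Compare this with the bad BMI program, where reportStatus ends by calling person(2): there, the only way to learn that a second person is processed is to read to the bottom of the chain.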

5. Data should be "owned" at the lowest level possible.

Decisions in a company should be made at the lowest possible level in the
organizational hierarchy. For example, a low-level administrator can decide
how to perform his/her own work without needing to constantly consult a
manager for approval. But the administrator does not have enough
information or expertise to design the entire fall product line; this goes to a
higher authority such as the manager. The key principle is that each work
task should be given to the lowest person in the hierarchy who can correctly
handle it.

This principle has two applications to programs. The first is that the main
function should avoid performing low-level tasks as much as possible. For
example, in an interactive program main should not read the majority of the
user input and output lots of cout statements.

The second application is that variables should be declared and initialized in
the narrowest possible scope. A bad design is to have main (or another high-
level function) read input and perform computations, and then pass this data
as parameters to the various low-level functions. A better design is to have
the low-level functions read and process the data, and return it to main only
if it is needed by a later subtask in the program.

A sign of poor data ownership is when the same parameter must be passed
down several function calls, such as the height variable in the bad BMI
program. If you are passing the same parameter down several levels of
calls, perhaps that piece of data should instead be read and initialized by one
of the lower-level functions.
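
A minimal sketch of this kind of data ownership is shown below. The function name readPersonBMI and the use of plain standard streams in place of getReal are assumptions for illustration. The height and weight variables live only inside the function that reads them, and only the BMI, the one value a later subtask needs, is returned to the caller.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
using namespace std;

// Computes and returns a BMI value from height (inches) and weight (pounds).
double computeBMI(double height, double weight) {
    assert(height > 0);  // assumption: a non-positive height is a caller bug
    return weight / (height * height) * 703;
}

// Reads one person's measurements and returns the resulting BMI.
// height and weight are local: no other function ever sees them.
double readPersonBMI(istream& in, ostream& out, int num) {
    out << "Enter person #" << num << "'s information:" << endl;
    out << "height (in inches)? ";
    double height = 0.0;
    in >> height;
    out << "weight (in pounds)? ";
    double weight = 0.0;
    in >> weight;
    return computeBMI(height, weight);
}
```

With this design, a caller such as main stores only the returned BMI and never declares or forwards height at all, which removes the repeated parameter seen in the bad BMI program.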

Improved version of BMI program

After applying all of the heuristics discussed in this handout, we arrive at the
improved version of the BMI program shown below. Notice
that the main function is a better concise summary of the overall execution
of the program, and that the functions have reduced coupling and chaining.

#include <iostream>
#include "console.h"
#include "simpio.h"
using namespace std;

// A better designed version of a program to compute
// two users' body mass index (BMI) values.
void intro();
void person(int num);
double computeBMI(double height, double weight);
void reportStatus(double bmi);

int main() {
    intro();
    person(1);
    person(2);
    return 0;
}

// A welcome message to introduce the program to the user.
void intro() {
    cout << "This program reads data for two" << endl;
    cout << "people and computes their body" << endl;
    cout << "mass index and weight status." << endl;
    cout << endl;
}

// Read information about one person and compute/display BMI.
void person(int num) {
    cout << "Enter person #" << num << "'s information:" << endl;
    double height = getReal("height (in inches)? ");
    double weight = getReal("weight (in pounds)? ");
    double bmi = computeBMI(height, weight);
    reportStatus(bmi);
}

// Computes/returns one person's BMI value based on height and weight.
double computeBMI(double height, double weight) {
    return weight / (height * height) * 703;
}

// Prints the person's BMI and over/under weight status.
void reportStatus(double bmi) {
    cout << "body mass index = " << bmi << endl;
    if (bmi < 18.5) {
        cout << "underweight" << endl;
    } else if (bmi < 25) {
        cout << "normal" << endl;
    } else if (bmi < 30) {
        cout << "overweight" << endl;
    } else {
        cout << "obese" << endl;
    }
}
