
Chapter 4: EXPERIMENTAL DESIGN

4.1. Concepts & Definitions
4.2. Screening of Important Variables
4.3. Tolerance Intervals and Prediction Intervals
4.4. Experimental Design

4.1. Concepts & Definitions

Statistical experimental design refers to the work plan for
manipulating the settings of the independent variables
that are to be studied.

Three pairs of questions that lead to different experimental designs:

1) If I observe the system without interference, what function best
predicts the output y? What happens to y when I change the inputs
to the process?
2) What is the value of θ in the mechanistic model y = xθ? What
smooth polynomial will describe the process over the range [x1, x2]?
3) Which of seven potentially active factors are important? What is
the magnitude of the effect caused by changing two factors that have
been shown important in preliminary tests?

 A clear statement of the experimental objectives will
answer questions such as the following:
1. What factors (variables) do you think are important? Are there other
factors that might be important, or that need to be controlled? Is the
experiment intended to show which variables are important or to estimate
the effect of variables that are known to be important?
2. Can the experimental factors be set precisely at levels and times of
your choice? Are there important factors that are beyond your control but
which can be measured?
3. What kind of model will be fitted to the data? Is an empirical model (a
smoothing polynomial) sufficient, or is a mechanistic model to be used?
How many parameters must be estimated to fit the model? Will there be
interactions between some variables?
4. How large is the expected random experimental error compared with
the expected size of the effects? Does my experimental design provide a
good estimate of the random experimental error? Have I done all that is
possible to eliminate bias in measurements, and to improve precision?
5. How many experiments does my budget allow? Shall I make an initial
commitment of the full budget, or shall I do some preliminary
experiments and use what I learn to refine the work plan?
 Principles of Experimental Design
Four basic principles of good experimental design are direct
comparison, replication, randomization, and blocking.

Comparative Designs:
If we add substance X to a process and the output improves, it is tempting
to attribute the improvement to the addition of X. But this observation may be
entirely wrong: X may have no importance in the process. A comparative design
runs the process with and without X so the two outcomes can be compared directly.
Replication:
Replication provides an internal estimate of random experimental error.
The influence of error in the effect of a factor is estimated by calculating
the standard error. All other things being equal, the standard error will decrease
as the number of observations and replicates increases.
This means that the precision of a comparison (e.g., difference in two means)
can be increased by increasing the number of experimental runs.
Increased precision leads to a greater likelihood of correctly detecting
small differences between treatments. It is sometimes better to increase
the number of runs by replicating observations instead of adding observations
at new settings
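The point above — that the standard error shrinks as replicates are added — can be sketched in a few lines. The SD value of 10 is illustrative, not from the text:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for n replicate observations: sd / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 10.0  # assumed standard deviation of a single observation (illustrative)
for n in (2, 4, 8, 16):
    # Doubling the number of replicates shrinks the SE by a factor of sqrt(2)
    print(n, round(standard_error(sd, n), 2))
```

Halving the standard error of a comparison therefore requires four times as many replicates, which is why replication budgets grow quickly.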
Randomization
To assure validity of the estimate of experimental error, we rely on the principle
of randomization. It leads to an unbiased estimate of variance as well as
an unbiased estimate of treatment differences. Unbiased means free of
systematic influences from otherwise uncontrolled variation.
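A minimal sketch of randomized assignment, in the spirit of the principle above — the unit labels and treatment names are hypothetical:

```python
import random

def randomize(units, treatments):
    """Randomly assign an equal number of experimental units to each treatment."""
    assert len(units) % len(treatments) == 0, "need equal group sizes"
    shuffled = units[:]          # copy so the caller's list is untouched
    random.shuffle(shuffled)     # the randomization step
    k = len(shuffled) // len(treatments)
    return {t: shuffled[i * k:(i + 1) * k] for i, t in enumerate(treatments)}

random.seed(1)  # fixed seed only so the example is reproducible
plan = randomize(list(range(12)), ["control", "treatment"])
```

Because every unit is equally likely to land in every group, systematic differences among units are turned into random error rather than bias.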


Blocking
Blocking is a means of reducing experimental error. The basic idea is to partition
the total set of experimental units into subsets (blocks) that are as homogeneous
as possible. In this way the effects of nuisance factors that contribute systematic
variation to the difference can be eliminated.
This will lead to a more sensitive analysis because the experimental error will be
evaluated in each block and then pooled over the entire experiment
Attributes of a Good Experiment

1. Use the basic principles of randomization, replication, and blocking.


2. Simple:
a. Require a minimum number of experimental points
b. Require a minimum number of predictor variable levels
c. Provide data patterns that allow visual interpretation
d. Ensure simplicity of calculation
3. Flexible:
a. Allow experiments to be performed in blocks
b. Allow designs of increasing order to be built up sequentially
4. Robust:
a. Behave well when errors occur in the settings of the x’s
b. Be insensitive to wild observations
c. Be tolerant to violation of the usual normal theory assumptions
5. Provide checks on goodness of fit of the model:
a. Produce balanced information over the experimental region
b. Ensure that the fitted value will be as close as possible to the true value
c. Provide an internal estimate of the random experimental error
d. Provide a check on the assumption of constant variance
[Figure: Successful strategies for blocking and randomization in three
experimental situations — average, maximum, and safety levels plotted
against lead-time, comparing overtime, variable production, constant
production, and inventories.]
4.2. SCREENING OF IMPORTANT VARIABLES
• A structured, organized method
– To determine whether some program or treatment
causes some outcome or outcomes to occur.
• If X, then Y
– Because there may be lots of reasons, other than
the program, for why you observed the outcome,
• If not X, then not Y needs to be addressed, too
• Identify the variable you will change – the independent variable
• Identify the variable you will measure – the dependent variable
• Write an experimental hypothesis – a statement predicting how
the X will affect the Y
• Is it directional or bidirectional? Why?
• Write a null hypothesis – a statement predicting that the X will
have no effect on the Y
Key terms
• Experiment: Process of collecting sample data
• Design of Experiment: Plan for collecting the sample
• Response Variable: Variable measured in the experiment (outcome, y)
• Experimental Unit: Object upon which the response y is measured
• Factors: Independent variables
• Level: The value assumed by a factor in an experiment
• Treatment: A particular combination of levels of the factors in an experiment
Factorial Designs
• Careful selection of the combinations of factor levels in
the experiment
• Provide information on factor interaction
• Regression model includes:
– Main effects for each of the k factors
– Two-way interaction terms for all pairs of factors
–…
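A factorial design simply takes every combination of factor levels as a treatment. A sketch of how the treatment list can be enumerated — the factor names and levels echo the salt-water/diet example later in the chapter and are illustrative:

```python
from itertools import product

def full_factorial(factors):
    """Enumerate every treatment: one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*factors.values())]

# Hypothetical two-factor experiment: 2 levels x 2 levels = 4 treatments
runs = full_factorial({"water": ["plain", "salt"],
                       "diet": ["normal", "high-fat"]})
```

With k two-level factors this gives 2^k runs, and because every level of one factor is seen with every level of the others, two-way interactions can be estimated.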
Process of Experimental Design
• To show that there is a causal relationship,
– Two “equivalent” groups
• The program or treatment group gets the program
• The comparison or control group does not
• The groups are treated the same in all other respects
– Differences in outcomes between two groups must be due to
“the program”
Steps in an Experiment
• Select factors to be included
• Choose the treatments
• Determine the number of observations to be made for
each treatment
• Plan how the treatments will be assigned to the
experimental units
Volume and “Noise”
• Volume: quantity of information in an experiment
– Increased by larger sample sizes and by selecting treatments
such that the observed values (y) provide information on the
parameters of interest
• Noise: experimental error
– Reduced by careful assignment of treatments to experimental units
Example:
Question: Does salted drinking water affect blood pressure (BP) in mice?
Experiment:
1. Provide a mouse with water containing 1% NaCl. (X)
2. Measure BP (Y)
Good experiments are comparative.
• Compare Y in a mouse fed salt water (case A) to Y in a mouse fed plain water
(Case B).
• Compare Y in strain A mice fed salt water to Y in strain B mice fed salt water.
Why replicate?
• Reduce the effect of uncontrolled variation
(i.e., increase precision).
• Quantify uncertainty.
A related point: An estimate is of no value without some
statement of the uncertainty in the estimate.

Stratification
• Suppose that some Y measurements will be made in the morning and some in
the afternoon.
• If you anticipate a difference between morning and afternoon measurements:
– Ensure that within each period, there are equal numbers of subjects
in each treatment group.
– Take account of the difference between periods in your analysis.
• This is sometimes called “blocking”.
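The morning/afternoon scheme above can be sketched as code: randomize within each block so every treatment appears equally often in every period. The subject labels and treatment names are hypothetical:

```python
import random

def stratified_assignment(subjects_by_block, treatments):
    """Within each block (e.g., morning/afternoon), shuffle the subjects and
    deal treatments out in turn, so each treatment appears equally often
    in each block."""
    plan = {}
    for block, subjects in subjects_by_block.items():
        shuffled = subjects[:]
        random.shuffle(shuffled)
        plan[block] = {s: treatments[i % len(treatments)]
                       for i, s in enumerate(shuffled)}
    return plan

random.seed(0)  # fixed seed only so the example is reproducible
plan = stratified_assignment(
    {"morning": ["m1", "m2", "m3", "m4"],
     "afternoon": ["m5", "m6", "m7", "m8"]},
    ["salt water", "plain water"])
```

Because the period difference now affects both treatment groups equally, it can be estimated and removed in the analysis instead of biasing the comparison.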
4.4. Experimental Design
Factorial experiments
Suppose we are interested in the effect of both salt water and a
high-fat diet on blood pressure.
Ideally: look at all 4 treatments in one experiment:
plain water + normal diet, salt water + normal diet,
plain water + high-fat diet, salt water + high-fat diet.

Interactions
– We can learn more.
– More efficient than doing all single-factor experiments.
Data presentation
[Figure: side-by-side comparison of a good plot and a bad plot of the
measurements for groups A and B.]

Good table                         Bad table

Treatment  Mean (SEM)              Treatment  Mean (SEM)
A          11.2 (0.6)              A          11.2965 (0.63)
B          13.4 (0.8)              B          13.49 (0.7913)
C          14.7 (0.6)              C          14.787 (0.6108)
Several samples
Split the sampled data into training data (80%) and validation data (20%);
the held-out validation data provide a blind test of the fitted model.
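A minimal sketch of the 80/20 split, using a shuffle so the validation set is a random sample rather than the last-collected points (the fractions match the text; the data are placeholders):

```python
import random

def train_validation_split(data, train_frac=0.8, seed=0):
    """Randomly split sampled data into training and validation sets."""
    rng = random.Random(seed)   # local RNG so the split is reproducible
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train, valid = train_validation_split(list(range(100)))
# 80 points for fitting, 20 held out for a blind test
```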
4.3. Tolerance Intervals and Prediction Intervals
Confidence intervals (CI)
• 6 mice, with mean = 103.6 and standard deviation (SD) = 9.7.
• We assume that Y in the underlying population follows a
normal (Gaussian) distribution.
• On the basis of these data, we calculate a 95% confidence
interval (CI) for the underlying average Y (BP):

103.6 ± 10.2 = (93.4 to 113.8)


• The plausible values for the underlying population average Y,
given the data on the six mice.
• There is a 95% chance of obtaining an interval that contains
the population average.
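The interval above can be reproduced directly: with n = 6 the margin is the t critical value times the standard error. The critical value 2.571 is the standard t-table entry for 5 degrees of freedom (97.5th percentile):

```python
import math

# Reproduce the chapter's 95% CI: n = 6 mice, mean = 103.6, SD = 9.7
n, mean, sd = 6, 103.6, 9.7
t_crit = 2.571                 # t table, 97.5th percentile, n - 1 = 5 df
se = sd / math.sqrt(n)         # standard error of the mean
margin = t_crit * se
lo, hi = mean - margin, mean + margin
# margin ≈ 10.2, interval ≈ (93.4, 113.8), matching the text
```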
CI for difference

95% CI for treatment effect = 12.6 ± 11.5


Confidence interval:
The plausible values for the effect of salt water on BP.
Test of statistical significance:
Answer the question, “Does salt water have an effect?”
Null hypothesis (H0): Salt water has no effect on BP.
Alt. hypothesis (Ha): Salt water does have an effect.

• Type I error (“false positive”): Conclude that salt water has an
effect on BP when, in fact, it does not have an effect.

• Type II error (“false negative”): Fail to demonstrate the effect of
salt water when salt water really does have an effect on BP.
                           The truth
Conclusion           No effect        Has an effect
Reject H0            Type I error     ✓ (correct)
Fail to reject H0    ✓ (correct)      Type II error


Conducting the test
• For example, we could look at the difference between the average BP in
the treated and control groups; let's call this D.

• If D is large, the treatment appears to have some effect.

• How large is large? We compare the observed statistic to its


distribution if the treatment had no effect.
Significance level
• We seek to control the rate of type I errors.

• Significance level (usually denoted α) = chance you reject H0, if
H0 is true; usually we take α = 5%.

• We reject H0 when |D| > C, for some C.

• C is chosen so that, if H0 is true, the chance that |D| > C is α.
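One way to see where C comes from is to simulate D under H0. This sketch assumes normally distributed BP with SD 9.7 (the value used elsewhere in the chapter) and 6 mice per group; the cutoff is read off as the 95th percentile of |D|:

```python
import random
import statistics

def simulate_null_D(n_per_group=6, sd=9.7, reps=20000, seed=0):
    """Simulate D = difference in group means when H0 is true (no effect)."""
    rng = random.Random(seed)
    ds = []
    for _ in range(reps):
        a = [rng.gauss(0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(0, sd) for _ in range(n_per_group)]
        ds.append(statistics.mean(a) - statistics.mean(b))
    return ds

ds = simulate_null_D()
abs_ds = sorted(abs(d) for d in ds)
C = abs_ds[int(0.95 * len(abs_ds))]   # chance that |D| > C is about 5% under H0
```

Any D whose magnitude exceeds this C would be declared significant at α = 5%.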


[Figure: distribution of D if salt has no effect vs. if salt has an effect.]
P-values
• A P-value is the probability of obtaining data as extreme as was
observed, if the null hypothesis were true (i.e., if the treatment
has no effect).

• If your P-value is smaller than your chosen significance level
(α), you reject the null hypothesis.

• We seek to reject the null hypothesis (we seek to show that
there is a treatment effect), and so small P-values are good.
• Test of statistical significance
Use the observed data to answer a yes/no question, such as
“Does the treatment have an effect?”
• P-value
– Summarizes the result of the significance test.
– Small P-value ⇒ conclude that there is an effect.
Never cite a P-value without a confidence interval.
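The P-value definition above can be made concrete with a permutation test, which computes "data as extreme as was observed, if the treatment has no effect" by re-splitting the pooled measurements at random. The BP readings below are hypothetical, not from the text:

```python
import random
import statistics

def permutation_p_value(treated, control, reps=5000, seed=0):
    """Two-sided permutation P-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(treated) - statistics.mean(control))
    pooled = treated + control
    n = len(treated)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # random re-split = what H0 (no effect) predicts
        d = abs(statistics.mean(pooled[:n]) - statistics.mean(pooled[n:]))
        if d >= observed:
            count += 1
    return count / reps

# Hypothetical BP readings for 6 treated and 6 control mice
p = permutation_p_value([112, 120, 108, 115, 118, 111],
                        [101, 99, 104, 97, 103, 100])
```

Because the treated readings here are clearly shifted upward, almost no random re-split reproduces so large a difference, and the P-value comes out small.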
Significance test
• Compare the BP of 6 mice fed salt water to 6 mice fed plain water.
• Δ = true difference in average BP (the treatment effect).
• H0: Δ = 0 (i.e., no effect)
• Test statistic, D.
• If |D| > C, reject H0.
• C chosen so that the chance you reject H0, if H0 is true, is 5%.
[Figure: distribution of D when Δ = 0.]
Statistical power
Power = The chance that you reject H0 when H0 is false (i.e.,
you [correctly] conclude that there is a treatment effect when
there really is a treatment effect).
Effect of sample size
[Figure: distributions of D for 6 per group vs. 12 per group.]

Effect of the effect
[Figure: distributions of D for Δ = 8.5 vs. Δ = 12.5.]
Various effects
• Desired power ↑ ⇒ sample size ↑
• Stringency of statistical test ↑ ⇒ sample size ↑
• Measurement variability ↑ ⇒ sample size ↑
• Treatment effect ↑ ⇒ sample size ↓
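These relationships can be checked by simulating the test. The sketch below estimates power for the two sample sizes pictured above, assuming normal BP with SD 9.7 and a normal-theory cutoff (1.96 standard errors) for the 5% test:

```python
import random
import statistics

def power(delta, n_per_group, sd=9.7, reps=2000, seed=0):
    """Estimated chance of rejecting H0 (|D| > C) when the true effect is delta."""
    rng = random.Random(seed)
    # Normal-approximation critical value for the difference in two means
    C = 1.96 * sd * (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(reps):
        a = [rng.gauss(delta, sd) for _ in range(n_per_group)]  # treated
        b = [rng.gauss(0, sd) for _ in range(n_per_group)]      # control
        if abs(statistics.mean(a) - statistics.mean(b)) > C:
            hits += 1
    return hits / reps

# Power grows with sample size (and likewise with the size of the effect)
p6 = power(delta=8.5, n_per_group=6)
p12 = power(delta=8.5, n_per_group=12)
```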


Determining sample size
• Structure of the experiment
• Method for analysis
• Chosen significance level, α (usually 5%)
• Desired power (usually 80%)
• Variability in the measurements – if necessary, perform a pilot
study, or use data from prior publications.
• The smallest meaningful effect

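The ingredients above combine in the standard normal-approximation formula for comparing two means, n per group = 2(z_(α/2) + z_β)² σ² / Δ². A sketch, with the SD and smallest meaningful effect chosen for illustration:

```python
import math

def n_per_group(sd, delta, z_alpha=1.96, z_beta=0.8416):
    """Approximate sample size per group for a two-sample comparison of
    means: alpha = 5% two-sided (z = 1.96), power = 80% (z = 0.8416)."""
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

# Hypothetical values: SD = 9.7, smallest meaningful effect Δ = 8.5
n = n_per_group(sd=9.7, delta=8.5)
```

Note how the formula encodes the arrows in the earlier list: larger variability or a more stringent test inflates n, while a larger smallest-meaningful effect shrinks it.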

Reducing sample size
• Reduce the number of treatment groups being compared.
• Find a more precise measurement (e.g., average time to effect
rather than proportion sick).
• Decrease the variability in the measurements.
– Make subjects more homogeneous.
– Use stratification.
– Control for other variables (e.g., weight).
– Average multiple measurements on each subject.

Final conclusions
• Experiments should be designed.
• Good design and good analysis can lead to reduced sample
sizes.
• Consult an expert on both the analysis and the design of your
experiment.
