Summary Data
Higher-order interactions, for example TPF (temperature × pressure × flow rate), are usually negligible, since they are rarely found in nature.
An effect = the difference between the average of all values with a + and the average of all values with a – (tutorial 1 question 2a)
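As a sketch, the effect calculation above is just a difference of two group averages (responses and sign column below are made up):

```python
from statistics import mean

def main_effect(responses, signs):
    """Effect of a factor in a 2-level design: average response
    at + minus average response at -."""
    plus = [y for y, s in zip(responses, signs) if s == '+']
    minus = [y for y, s in zip(responses, signs) if s == '-']
    return mean(plus) - mean(minus)

# hypothetical 2^2 experiment: 4 responses and the sign column of one factor
y = [45, 71, 48, 65]
signs = ['-', '+', '-', '+']
print(main_effect(y, signs))  # (71 + 65)/2 - (45 + 48)/2 = 21.5
```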
Standard deviation
An effect is usually significant if it is 2-3 times larger than the standard error.
1. (per experiment)
2. (per experiment)
3.
4.
Design matrix:
1. Fill in a, the intercept; it is represented by a column of +, so a is the average of all values
2. Fill in the main and interaction effects for b, c, d…
The regression model is the line with the values for a, b, c… filled in (tutorial 1 question 3a)
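A minimal sketch of the two steps above for a hypothetical 2² design. One caveat: the column-average formula below gives the least-squares coefficient in coded ±1 units, which is half the ± effect; depending on the course's convention, b, c… may be written as the full effect instead. The intercept a does come out as the average of all values, as stated above.

```python
from statistics import mean

# 2^2 full factorial in coded units; the first column of +1s gives the intercept a
X = [[1, -1, -1,  1],
     [1,  1, -1, -1],
     [1, -1,  1, -1],
     [1,  1,  1,  1]]   # columns: intercept, A, B, AB
y = [45, 71, 48, 65]    # made-up responses

# least-squares coefficients for an orthogonal +/-1 design:
# each coefficient is the column-wise average of sign * response
coeffs = [mean(x_i * y_i for x_i, y_i in zip(col, y))
          for col in zip(*X)]
print(coeffs)  # [57.25, 10.75, -0.75, -2.25] = [a, b_A, b_B, b_AB]
```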
In a fractional design, only the most important factors are selected by screening
1. Look at the number of experiments, then decide if you can use a full factorial design or a
fractional design
2. Make the design matrix, vary the combinations of + and -
Variable scales
(4.1)
Mean = average
Variance: (4.3)
(4.6)
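The mean, variance and standard deviation referenced in (4.1), (4.3) and (4.6) can be checked with Python's statistics module (made-up data):

```python
from statistics import mean, variance, stdev

data = [10.1, 10.3, 9.9, 10.2, 10.0]   # made-up measurements
print(mean(data))      # arithmetic mean
print(variance(data))  # sample variance, divides by n - 1
print(stdev(data))     # standard deviation = square root of the variance
```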
Random errors = inevitable errors, related to precision; they increase the spread around the central value.
Systematic errors = errors that influence the result in a specific direction, e.g. instrumental errors, impure solutions, or incorrectly reading off values
Distributions
Of discrete variables:
(5.3)
p = probability of finding exactly k blue balls in n attempts
- Poisson distribution for a low probability of success
The probability for k events to occur (observing a given number of events within a fixed interval of time)
with p the probability
- Continuous uniform: the probability for each possible value of the variable is equal
(5.5)
- Exponential (waiting times between events in a Poisson process)
Pdf: (5.6)
Cdf: (5.7)
- Normal
Pdf: (5.9)
- Lognormal: used when measurements are not symmetrically spread around the mean
Pdf: (5.10)
- Student's t: For ν → ∞ (an infinite number of degrees of freedom), the Student's t-distribution is exactly equal to the standardized normal distribution. The smaller the number of degrees of freedom, the larger the difference between the normal distribution and the Student's t-distribution
Confidence interval
HOW TO CALCULATE A CONFIDENCE INTERVAL
(6.6)
(or
68% -> t = 1
95% -> t = 2
99% -> t = 3)
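A sketch of the interval, assuming (6.6) has the usual form mean ± t·s/√n; t is the table value for the chosen confidence level and n − 1 degrees of freedom (data below are hypothetical):

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(data, t):
    # CI = mean +/- t * s / sqrt(n); t is the tabulated value for the
    # chosen confidence level and n - 1 degrees of freedom
    m, s, n = mean(data), stdev(data), len(data)
    half = t * s / sqrt(n)
    return m - half, m + half

data = [9.8, 10.2, 10.1, 9.9, 10.0]        # hypothetical replicates
lo, hi = confidence_interval(data, t=2.78)  # table value for 95%, 4 df
print(f"{lo:.3f} .. {hi:.3f}")
```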
Hypothesis tests
H0 = no significant difference
H1 = significant difference
One-sided = you test specifically for a smaller or specifically for a larger value -> decide from the question, not from the data
(7.1)
Look for utab in table B1 or B2 (two-sided test) or B3 or B4 (one-sided test)
(7.2)
(7.3)
(7.7)
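Assuming the t statistics referenced in (7.1)–(7.7) share the one-sample form tcal = |x̄ − μ0|·√n / s, a minimal sketch with hypothetical data:

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0):
    # one-sample form: tcal = |xbar - mu0| * sqrt(n) / s
    return abs(mean(data) - mu0) * sqrt(len(data)) / stdev(data)

data = [9.8, 10.2, 10.1, 9.9, 10.0]  # hypothetical measurements
tcal = t_statistic(data, mu0=10.3)   # mu0 = reference value
print(tcal)  # compare with the tabulated t value (n - 1 degrees of freedom)
```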
HOW TO DO AN F-TEST
While a t-test compares the averages of two groups, an F-test compares the standard deviations (variances) of two groups
1. (7.8)
The largest variance is the numerator, the smallest variance is the denominator
2. Look up Ftab in table B7 or B8; if Fcal < Ftab, H0 is accepted (the variances are not different) and you can pool the variances. If Fcal > Ftab, use Welch's test
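The F statistic of step 1, sketched with made-up series:

```python
from statistics import variance

def f_statistic(sample1, sample2):
    # largest variance in the numerator, so Fcal >= 1 (one-sided test)
    v1, v2 = variance(sample1), variance(sample2)
    return max(v1, v2) / min(v1, v2)

a = [10.1, 10.3, 9.9, 10.2, 10.0]   # made-up series 1
b = [10.4, 9.6, 10.8, 9.2, 10.5]    # made-up series 2
print(f_statistic(a, b))  # about 18; compare with Ftab from table B7/B8
```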
When no F-test is needed:
- Simple comparison of means (you can assume the variances are the same)
- Non-normal data
- One-sample t-test
Because you put the largest variance in the numerator and the smallest in the denominator, the F-test is one-sided
Non-parametric tests:
1. Calculate the differences between the observations and the reference value
2. Order the differences by absolute value, ignoring the minus sign
3. Assign ranks 1, 2, 3, 4… (when 2 values are equal, give both the average rank, _.5)
4. Restore the original minus sign to each rank
5. Calculate the sum of the absolute value for both the positive and negative ranks
6. The smallest of the 2 sums is Tcal
7. Table B9 lists Ttab; H0 is rejected when Tcal ≤ Ttab (note: this is the opposite of the usual rule)
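The steps above can be sketched as follows (data are hypothetical; dropping zero differences is a common convention, not stated in the notes):

```python
def wilcoxon_T(observations, reference):
    # step 1: differences from the reference; zero differences are dropped
    diffs = [x - reference for x in observations if x != reference]
    # step 2: order by absolute value, ignoring the sign
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    # step 3: assign ranks 1..n; tied |differences| share the average rank
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and abs(diffs[order[j]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    # steps 4-6: restore signs via the sign of each difference,
    # sum the positive and negative ranks, Tcal is the smaller sum
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(t_plus, t_minus)

obs = [51, 48, 53, 55, 47, 52, 50.5]   # hypothetical measurements
print(wilcoxon_T(obs, reference=50))    # smaller rank sum -> Tcal
```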
(8.1)
1. (8.2)
Ttab in table B5; the degrees of freedom are n-2 because both the slope and the intercept are estimated
2. Calculate the confidence interval of the slope and intercept, if 1 lies inside the confidence
interval of the slope and 0 in the one for the intercept, the methods give the same results
a = average y – b * average x
residuals:
(8.8)
(8.9)
(8.10)
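The fit behind (8.1)–(8.10), including a = average y − b · average x and the residuals, can be sketched as an ordinary least-squares line (hypothetical data; the exact notation in the referenced formulas may differ):

```python
from statistics import mean

def least_squares(x, y):
    # slope b from the usual sum-of-products formula,
    # intercept a = average y - b * average x
    xbar, ybar = mean(x), mean(y)
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b * xbar, b

x = [1, 2, 3, 4, 5]                  # hypothetical x values
y = [2.1, 3.9, 6.2, 7.8, 10.1]       # hypothetical y values
a, b = least_squares(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(a, b)  # a close to 0.05, b close to 1.99
```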
ANOVA
Compares more than 2 series of measurements
(9.2)
(9.3)
(9.4)
- ANOVA doesn't test whether the variances of the different groups are equal (it assumes they are)
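A sketch of the one-way ANOVA F ratio behind (9.2)–(9.4), assuming the standard between/within mean-square form (groups below are made up):

```python
from statistics import mean

def anova_F(groups):
    # F = (between-group mean square) / (within-group mean square),
    # with k - 1 and n - k degrees of freedom
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[10.0, 10.2, 9.8],   # made-up series of measurements
          [10.5, 10.7, 10.6],
          [9.5, 9.6, 9.4]]
print(anova_F(groups))  # compare with Ftab for (k - 1, n - k) df
```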