Statistics


Prelims

I. Research and Concepts

Statistic – a fact or piece of data from a study of large numerical data.

Index – an indicator, sign, or measure of something.

Why study statistics?

1. It helps us process data/information, data being the raw material of knowledge; and
2. To learn things and inform our lives.

Statistics – the study of how best to collect, analyze, and draw conclusions from data.

The Scientific Method:

1. Ask a question or address a problem
2. Research
3. Hypothesis
4. Experiment
5. Analysis
6. Conclusion

Social Science Research and Terminologies

A population (N) is an individual or group that represents all the members of a certain group or category of interest.

A value generated from, or applied to, a population is called a parameter.

A sample (n) is a subset drawn from a larger population.

A value generated from, or applied to, a sample is called a statistic.

Descriptive and Inferential Statistics

Descriptive statistics apply only to the members of a sample or population from which data have been collected. They exist to describe and simplify information.

Inferential statistics: using sample data to infer/conclude about the characteristics of the larger population. Inferential statistics bridge the known (sample) to the unknown (population).

Sampling Issues

Convenience Sampling: selects participants on the basis of proximity, ease of access, and willingness to participate.

II. Variables and Scales of Measurement

Variables and Constants

A variable is pretty much anything that can be codified and has more than a single value. For example: sex, age, height, attitudes about school, score on a test.

A constant, in contrast, has only a single score.

Quantitative and Qualitative Variables

A quantitative (continuous/discrete) variable is one that is scored in such a way that the numbers, or values, indicate some sort of amount.

E.g.: Height, Age, Number of Children, Year Level

Qualitative variables are those for which the assigned values do not indicate more or less of a certain quality.

E.g.: Class, Sex (dichotomous), Ethnicity, Religion, Year Level

Scales of Measurement

Qualitative Scales:

1. Nominal: Categories or labels. E.g., Sex (male/female), civil status (single/married/divorced), jersey number.

2. Ordinal: Categories or labels indicating rank or "order" (hence, ordinal), but with no meaningful distance between scores. Order matters, distance does not. E.g., Eldest/Middle/Youngest, 1st/2nd/3rd, Champion/Runner-up/Last.

Quantitative Scales:

Interval and Ratio Measures.

Note: Continuous – infinite numbers between measures. Discrete – absolute values without values in between.
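The population/sample and parameter/statistic distinction above can be sketched in Python; the population here is invented example data:

```python
# Sketch of Section I terms: a parameter comes from the population (N),
# a statistic from a sample (n). The population is invented data.
import random
import statistics

random.seed(0)                            # reproducible draw

population = list(range(1, 101))          # N = 100 members
parameter = statistics.mean(population)   # value from the population: 50.5

sample = random.sample(population, 30)    # n = 30, drawn without replacement
statistic = statistics.mean(sample)       # value from the sample

# the statistic lands near, but usually not exactly at, the parameter
```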
Interval Variables

Numbers have order (like ordinal), PLUS clear, equal, and meaningful intervals between values.

E.g., Grade: the difference between 100 and 99 (1 interval between 2 values) is the same as the difference between 82 and 81 (1 interval between 2 values).

Rating scales (1-5): 1 - Strongly Disagree, 2 - Disagree, 3 - Neutral, 4 - Agree, 5 - Strongly Agree.

Ratio Variables

Like interval (numerical, with clear, equal, and meaningful intervals), PLUS ratios are meaningful (twice as much) and there is a true zero point (zero means absence of what you are measuring).

E.g., Weight: 10 lbs. is twice as much as 5 lbs. (ratio); 0 lbs. means no weight or absence of weight (true, meaningful zero point).

No. of Children: 4 children is twice as many as 2 children (ratio); 0 children means absence of children (true zero point).

Comparing Scales: Ordinal vs. Interval

Interval: Grade; a one (1) point difference is the same at all points in the scale.

Ordinal: Place in a race; 1st, 2nd, 3rd. The difference between 1st and 2nd places may not be as close/far as the difference between 3rd and 4th places.

III. Measures of Central Tendency

Distribution – any collection of scores on a variable.

Mean, Median, and Mode

Mean = Average; central point in a set of data; balance point.

Median = "Middle value"; cuts the distribution into upper and lower halves.

Mode = Most frequent value; "peak(s)" of the distribution.

Ordinal data have a median and mode only*, and nominal data have only a mode.

* A consensus has not been reached among statisticians about whether the mean can be used with ordinal data.

Strengths and Weaknesses

The HOUSE OF MEAN

Hero (Central Measure): Mean
Sidekicks (Variability Measures): Standard Deviation and Variance
Strength: Precise/exact
Weakness: Prone to being influenced by outliers (not robust)

The HOUSE OF MEDIAN

Hero (Central Measure): Median
Sidekicks (Variability Measures): Minimum, Maximum, Inter-quartile Range
Strength: Not influenced by outliers (robust)
Weakness: Not as precise/exact as the mean

Outliers – data points that are very much bigger or smaller than the next nearest data point.

IV. Measures of Variability

Range

The difference between the largest and smallest values in a distribution.

Range = Max - Min

The range defines the broadness of the base of a distribution.

Variance

The variance measures how far each number in the set is from the mean.

Standard Deviation

Standard = average; Deviation = distance of values from the mean.

The standard deviation is the average distance of values from the mean.

V. Mean, Median, and Outliers

5 Number Summary

1. Median
2. Minimum
3. Maximum
4. Inter-quartile Range (IQR) – divides the distribution into 4 equal quarters.

IQR = Q3 - Q1, where:

Q1 – median of the lower half (25% of the data is below Q1)
Q3 – median of the upper half (75% of the data is below Q3)

Boxplots

Boxplots are standardized graphical representations of the 5 number summary.

Median vs. Mean

1. We still want the mean because of its exactness (for n) and precision (to the mean of N); and
2. The median helps us track and trim outliers.

VI. The Normal Distribution

The normal distribution refers to a family of continuous probability distributions described by the normal equation. The value of the random variable Y is:

Y = [1 / (σ√(2π))] · e^(−(x − μ)² / (2σ²))

The normal curve is a mere representation of probabilities.

Normal Distribution (Bell Curve; Gaussian Distribution)

All normal distributions are symmetric and have bell-shaped density curves with a single peak (unimodal).

Two specific measures:

1. The MEAN (center and peak of density)
2. The STANDARD DEVIATION (spread or girth of the bell curve)

The Mean and Standard Deviation

The Mean Defines the Center

In a normal distribution...

1. Mean = median = mode
2. Symmetry about the center

50% of values are less than the mean and 50% are greater than the mean.

The Standard Deviation Defines the Spread

- 68% of values are within 1 standard deviation of the mean
- 95% of values are within 2 standard deviations of the mean
- 99.7% of values are within 3 standard deviations of the mean

A.k.a.

- The empirical rule,
- The three-sigma rule, or
- The 68-95-99.7 rule

Normal Distributions and your world

1. It is the most common distribution in nature (as distributions go).

- Normal distributions are also called "natural distributions"
- Height, weight, intelligence, etc.

2. Statistical relationships become clear if one assumes the normal distribution.

- Does IQ lead to success?
- Will material things lead you to more happiness?
- Is studying important to career growth?

VII. Normal Distribution Concepts

The distribution curve is just a model that seeks to reflect reality.

Describing Distributions: Skewness and Kurtosis

Skewness

Skew: degree of deviation from the normal in terms of asymmetrical extension of the tails.

Normal distributions have a skew of 0.

Kurtosis

The shape of a distribution of scores in terms of its flatness or peakedness.

Normal distributions have a standard kurtosis of 3.

Platykurtic = flat; k < 3
Leptokurtic = thin; k > 3
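The descriptive measures from Sections III-V (central tendency, variability, and the 5-number summary) can be sketched together with Python's statistics module. The scores are invented, and Q1/Q3 follow the median-of-halves rule described in the notes:

```python
# Sketch of Sections III-V on invented example data; 120 is an outlier.
import statistics

scores = [70, 75, 80, 80, 85, 90, 120]

mean = statistics.mean(scores)            # balance point; pulled up by 120
median = statistics.median(scores)        # middle value; robust to 120
mode = statistics.mode(scores)            # most frequent value

value_range = max(scores) - min(scores)   # Range = Max - Min
variance = statistics.pvariance(scores)   # mean squared distance from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

# 5-number summary via the median-of-halves rule in the notes
data = sorted(scores)
minimum, maximum = data[0], data[-1]
q1 = statistics.median(data[:len(data) // 2])     # 25% of data below Q1
q3 = statistics.median(data[-(len(data) // 2):])  # 75% of data below Q3
iqr = q3 - q1
```

Note how the outlier pulls the mean above the median while leaving the median, Q1, and Q3 untouched, matching the "House of Mean" vs. "House of Median" contrast above.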

Locating Values via Standardization: Percentiles and Z-scores

Standardization: the process of converting a raw score into a standard score.

Raw scores: individual observed scores on measured variables.

Percentile (%ile): the percentage of scores below a certain value in the distribution.

Standard score (z-score): a raw score expressed as its distance from the mean in standard deviation units.
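A minimal sketch of standardization, on an invented set of scores; the percentile is taken from the standard normal CDF built with math.erf:

```python
# Sketch of standardization: raw score -> z-score -> percentile.
# The scores and the raw score of 85 are invented example data.
import math
import statistics

scores = [60, 65, 70, 75, 80, 85, 90]
mean = statistics.mean(scores)            # 75
sd = statistics.pstdev(scores)            # 10

raw = 85
z = (raw - mean) / sd                     # distance from the mean in SD units

# percentile of the z-score under the standard normal curve
percentile = 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))
```

A z of 1.0 sits at roughly the 84th percentile, consistent with the 68-95-99.7 rule (50% below the mean plus half of the middle 68%).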

VIII. Central Limit Theorem

1. Theorems are found in math.

2. They are statements that have been proved using previously proved statements (theorems).

3. Unlike theories, theorems are deductive.

The CLT (in simple terms)

1. Sample means will be distributed roughly as a normal distribution around the population mean.

All of this will be true no matter what the distribution of the underlying population looks like.

2. The distribution of means will approach a normal shape faster if...

a) The population from which the sample is taken is a normal distribution; OR

b) The sample size (n) is relatively large (30 or more).
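The CLT claim above can be checked with a small simulation; the skewed population below is invented:

```python
# Small simulation of the CLT: means of many samples cluster around the
# population mean even for a very skewed (non-normal) population.
import random
import statistics

random.seed(1)
population = [1] * 700 + [10] * 300            # skewed, clearly not normal
pop_mean = statistics.mean(population)         # 3.7

sample_means = [
    statistics.mean(random.sample(population, 30))   # n = 30 per sample
    for _ in range(1000)
]

grand_mean = statistics.mean(sample_means)     # lands very close to pop_mean
```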

Making Inferences

If we have detailed information about some population…

Then we can make powerful inferences about any properly drawn sample from that population.

If we have detailed information about a properly drawn sample (mean and standard deviation)...

We can make strikingly accurate inferences about the population from which that sample was drawn.

If we have data describing a particular sample, and data on a particular population…

We can infer whether or not that sample is consistent with a sample that is likely to be drawn from that population.

Last, if we know the underlying characteristics of two samples…

We can infer whether or not both samples were likely drawn from the same population.

Midterms

IX. Inferential Statistics

1. Samples allow you to make good inferences about the sample itself; and
2. Samples allow you to make good inferences about the population as a whole.

#1 is called internal validity (the state of being factual or logically sound).

#2 is called external validity.

Inferential statistics are techniques that allow us to use samples in making generalizations about the populations from which the samples were drawn.

The Standard Error

The standard error is the standard deviation of sample means. It is a measure of how representative a sample is likely to be of the population.

Interpreting Standard Errors

Large standard errors (relative to the sample mean): high variability between the means of different samples…

Some samples might not actually represent the population.

Small standard errors: most sample means are similar to the population mean…

Our samples are likely to be accurate reflections of the population.

Confidence Intervals

A confidence interval for the mean is a range of scores such that the population mean will fall within this range in 95% of samples.

In 100 samples, the confidence intervals of 95 samples would contain the true value of the mean in the population.

X. Hypothesis

Definition: A hypothesis is an educated guess that can be tested.

An educated guess is a guess that is based on theories.

"That can be tested" means that it is not a truism (e.g., I hypothesize that the sun will shine and set tomorrow).

Two Types

1. Alternative Hypotheses (H1) denote the presence of an effect.

A.k.a. Experimental Hypothesis if the methodology is experimental.

2. Null Hypotheses (H0) denote the absence of an effect.

The null hypothesis was made to be rejected. =(

We need H0 because we cannot prove the alternative hypothesis using statistics... but we can reject the null hypothesis.

If our data give us confidence to reject the null hypothesis, then this provides support (not proof) for our alternative/experimental hypothesis.

Science is not about proving an effect (accepting H1), but disproving the absence of an effect (rejecting H0).

Directional and Non-Directional Hypotheses

A directional hypothesis states that an effect will occur, and it also states the direction of the effect.

"Students will know more about research methods after taking EDP 211."

A non-directional hypothesis states that an effect will occur, but it doesn't state the direction of the effect.

"Students' knowledge of research methods will change after EDP 211."
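The standard error, and the 95% confidence interval built from it, can be sketched as follows (invented sample; 1.96 is the usual normal-approximation multiplier):

```python
# Sketch of the standard error and a 95% confidence interval for the mean.
import math
import statistics

sample = [12, 15, 14, 10, 13, 16, 11, 14, 15, 13]   # invented, n = 10
n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)             # sample standard deviation
se = sd / math.sqrt(n)                    # standard error of the mean

ci_low = mean - 1.96 * se
ci_high = mean + 1.96 * se                # 95% CI: (ci_low, ci_high)
```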
Hypothesis Testing

Steps in Hypothesis Testing

1. Determine H0 and H1;

2. Collect data and calculate the test statistic;

3. Check the p-value to determine if the effect just happened by chance; and

4. Given the p-value, make a decision:

a. If the effect is likely to have happened by chance, do not reject H0; OR

b. If the effect is unlikely to have happened by chance, reject H0.

P-Value

Developed by Ronald Fisher, the p-value is the probability that an effect happened by chance (a false alarm).

More precisely: the probability that the value observed (or a more extreme one) would happen by chance if the Null Hypothesis were true.

P-values and Directionality

One-tailed and Two-tailed Tests

One-tailed tests are statistical tests that look for an effect on one tail of the distribution (directional).

P-value significance level for a one-tailed test: 0.05

Two-tailed tests are statistical tests that look for an effect on both tails of the distribution (non-directional).

P-value significance level for a two-tailed test: 0.025 per tail.

Decision Errors: Type I and Type II

Reality 1: There is, in reality, an effect in the population; or

Reality 2: There is, in reality, no effect in the population.

Statistics will NOT tell us which reality is TRUE. But statistics can show us the probability of which reality is MORE LIKELY (and whether the effect is strong or not).

One of four things can happen:

1. You can say it ain't (null) and it don't (no effect in reality)

2. You can say it be (alternative) and it do (effect is present in reality)

3. You can say it ain't (null) and it do (effect is present in reality)

4. You can say it be (alternative) and it don't (no effect in reality)

#3 and #4 are "false alarms". False alarms are allowable at a rate set by the p-value.

Type I and Type II Errors

False Alarms:

1. You can say it ain't (null) and it do (effect is present in reality)

#1 is called a Type II error: when we believe that there is no effect in the population when, in reality, there is (denial).

2. You can say it be (alternative) and it don't (no effect in reality)

#2 is called a Type I error: when we believe that there is a genuine effect in our population when, in fact, there isn't (assumption).

Sampling issues and other statistical mistakes lead you to wrong generalizations (Type I or Type II errors).

EFFECT SIZE

p-values are never enough...

"Just because a test statistic is significant doesn't mean that the effect it measures is meaningful or important."

An effect size is a standardized measure (0 to 1) of the magnitude of an observed effect in a sample.

Statistical Power

Statistical power is the measure of a statistical test's ability to find effects in a population (assuming that the effect is present).

Low statistical power will under-report significance and effect size.

High statistical power will correctly report significance and effect size.
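The one-tailed vs. two-tailed p-values discussed under Hypothesis Testing can be sketched for a z statistic; the z value is an invented example, and the normal CDF is built from math.erf:

```python
# One- vs. two-tailed p-values for an invented z statistic.
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = 1.8                                        # hypothetical test statistic
p_one_tailed = 1 - normal_cdf(z)               # effect in one direction only
p_two_tailed = 2 * (1 - normal_cdf(abs(z)))    # effect in either direction

# p_one_tailed is about 0.036 (below 0.05), while p_two_tailed is about
# 0.072 -- the same z can be significant one-tailed but not two-tailed.
```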
0.8 statistical power is generally acceptable.

Two Uses

Statistical power helps us…

1. See how powerful your test statistics are; and

2. If you have a desired effect size, calculate the sample size necessary to achieve a given level of power for a test (using software such as G*Power or Cohen's tables).

XI. Correlation

Correlation: When is a relationship really a "relationship"?

The logic of statistical correlations

Suppose X and Y are two correlated variables…

Changes in X cause Y to change...

AND

Changes in Y cause X to change.

"Covariance": comparing how variables are in sync when they change.

Correlation: How do we describe relationships?

Two fundamental characteristics:

1. Direction

Positive (+) correlations: variables move in the same direction.

Negative (-) correlations: variables move in opposite directions.

2. Magnitude

The strength of correlations ranges from weak to perfect.

A perfect correlation indicates that the correlation is taking place in EVERY member of the sample or population.

A weak correlation indicates that the correlation is taking place in a "FEW" members of the sample or population.

Measures of Correlation

The Correlation Coefficient (r)

The correlation coefficient (r) is a standardized measure of how much variables are in sync when they change (covariance).

In short: r is a standardized measure of covariance.

The r can be anywhere between -1.00 and +1.00.

r denotes direction.

A positive (+) r means that the direction of the correlation is positive: both variables covary in the same direction.

A negative (-) r means that the direction of the correlation is negative: both variables covary in opposite directions.

The p-value in correlations

The p-value is the probability that the correlation/covariance happened by chance (random or accidental) if the null hypothesis were true.

If p > 0.05, the correlation is not significant (the effect may have happened by chance).

If p <= 0.05, the correlation is significant (the effect is unlikely to have happened by chance).

Reporting Correlations

"There was a significant correlation between <Variable X> and <Variable Y>, r = ____, p (one/two-tailed) = _____."

The direction and strength of the correlation is then explained and analyzed in the discussion of research results.

"There was no significant correlation between <Variable X> and <Variable Y>, r = ___, p (one/two-tailed) = ____."

Parametric and Non-Parametric Tests for Correlation

Chi-Square Test of Independence

H0 = Variables are independent of each other (no correlation).

Example research questions that use the Chi-Square:

1. Medicine - Are children more likely to get infected with virus A than adults?

2. Psychology - Are males likely to do better in exams than females?
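The correlation coefficient r from Section XI, computed as standardized covariance, can be sketched with invented data:

```python
# Sketch of r as standardized covariance; x and y are invented data.
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = statistics.mean(x), statistics.mean(y)
covariance = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
r = covariance / (statistics.stdev(x) * statistics.stdev(y))

# r is positive here because x and y tend to rise together,
# and it always falls between -1.00 and +1.00
```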

Spearman Ranks (Rho)

Kendall's Tau is used if there are many tied ranks.

Example research questions that use the Spearman Rho:

1. Sociology - Do people with a higher level of education have a stronger opinion of whether or not tax reforms are needed?

2. Psychology - Does one's general cognitive ability correlate to success in college?

Point Biserial Correlation

Point biserial correlations are used to correlate a BINARY (two values) nominal variable and a scale variable.

Example research questions that use the Point Biserial Correlation:

1. Sociology - Are males more likely to earn than females?

2. Social psychology - Is satisfaction with life higher the older (elderly vs. not elderly) you are?

Pearson Correlation

Example research questions that use the Pearson Correlation:

1. Medicine - Will increased water intake significantly bring down a fever?

2. Psychology - Will performance in a previous reading test affect the next reading test?

Partial Correlations and the "Third Variable Problem"

Spurious correlations: two effects that can be statistically linked by correlation despite having no clear causal relationship.

E.g.: Do storks bring babies? Statistics have shown that an increased number of storks is related to an increased number of births.

Spurious correlations are caused by a "third variable", and partial correlations control for these third variables.

XII. Regression: The Power to Predict

A correlation in statistics becomes a causation when there is a theory that accurately describes a link between the two.

Independent and Dependent Variables

A variable that we think is a cause is known as an independent variable (because its value does not depend on any other variables).

A variable that we think is an effect is called a dependent variable because the value of this variable depends on the cause (the independent variable).

Cause and Effect; IV -> DV

In experimental psych, we manipulate the IV (cause) to trigger a change in the DV (effect).

Predictions

Regression Analysis

Regression analysis is a way of…

Predicting an outcome variable from one predictor variable (simple regression),

Or several predictor variables (multiple regression).

Logistic Regression

Logistic regression is multiple regression but…

... with an outcome variable that is a categorical variable, and

... predictor variables that are continuous or categorical.

Binary and Multinomial (Polychotomous)

Binary Logistic Regression: when we are trying to predict membership of only two categorical outcomes.

Multinomial Logistic Regression: when we want to predict membership of more than two categories.

Interpreting Regression Analysis

R, R-Squared (R²), and goodness-of-fit

R-squared is a statistical measure of how close the data are to the fitted regression line.
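Simple regression and R-squared can be sketched with least squares on invented data:

```python
# Sketch of simple regression: predict y from x by least squares, then
# R-squared as the share of variability explained. Data are invented.
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = statistics.mean(x), statistics.mean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
    (a - mx) ** 2 for a in x
)
intercept = my - slope * mx

def predict(a):
    """Predicted outcome for a given predictor value."""
    return intercept + slope * a

ss_res = sum((b - predict(a)) ** 2 for a, b in zip(x, y))   # unexplained
ss_tot = sum((b - my) ** 2 for b in y)                      # total
r_squared = 1 - ss_res / ss_tot                             # 0 to 1
```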
A.k.a. the coefficient of determination, or the coefficient of multiple determination (for multiple regression).

R-squared is always between 0% and 100%.

100% indicates that the model explains all the variability of the response data around its mean.

R (the correlation coefficient) is the square root of R².

Methods in Regression

Regression Methods

1. Forced Entry - All variables are entered simultaneously.

2. Hierarchical - Variables are entered one by one in order of importance.

3. Stepwise - The order of predictors is decided by a computer, depending on the t-statistic.

Forward stepwise means the strongest predictors are chosen first, and backward stepwise means the weakest predictors are removed first.

Logistic Regression

In logistic regression, only the forced entry and stepwise methods are used.

Prefinals

XIII. Comparing Two Means

Differences

In experimental research, we try to manipulate what happens to people so that we can make causal inferences.

1. Manipulation of a variable (IV)

2. Measuring the effect (DV)

3. Looking at differences between groups to see if the effect was significant (or not) across the groups studied.

E.g., diet pills.

Groups: Control and Experimental

1. Control Group - the control group does not receive the treatment or intervention.

2. Experimental Group - the experimental group(s) receive the treatment or intervention.

Designing Tests of Difference

1. We can either expose different people to different experimental manipulations (between-group or independent design)...

2. or take a single group of people and expose them to different experimental manipulations at different points in time (repeated-measures design).

Between-Group or Independent Design

Pre-test Post-test Design

Control: Pretest > Post-test (OO)
Experimental: Pretest > Treatment > Post-test (OXO)

Post-test only Design

Control: Post-test (O)
Experimental: Treatment > Post-test (XO)

Repeated Measures Design

One group with two or more treatments.

Experimental Group: Pretest (optional) > Treatment > Post-test > Treatment > Post-test

Repeated measures designs are relatively more powerful than independent designs.

Independent Samples (Between Groups) T-Test

Conditions:

1. Scales: Interval/Ratio

2. Distribution: Normal

3. Equality of Variance: not assumed (checked with Levene's test)

4. Two different groups/samples

Interpreting the Independent Samples (Between Groups) T-Test

LEVENE'S TEST

F - ratio of variances (variance1/variance2)

Generally, an F-value far greater than 1 means that the variances of the two groups differ greatly.
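The comparison an independent-samples t-test makes can be sketched as a t statistic in the Welch form (difference between group means over the standard error of the difference); the group data are invented:

```python
# Sketch of an independent-samples t statistic (Welch form); data invented.
import math
import statistics

control = [70, 72, 68, 75, 71, 69]
experimental = [78, 80, 76, 82, 79, 77]

m1, m2 = statistics.mean(control), statistics.mean(experimental)
v1, v2 = statistics.variance(control), statistics.variance(experimental)
n1, n2 = len(control), len(experimental)

se_diff = math.sqrt(v1 / n1 + v2 / n2)   # standard error of the difference
t = (m1 - m2) / se_diff                  # near 0: no difference;
                                         # far from 0: a real difference
```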
T-test for Equality of Means

t-statistic - the ratio of the difference between the group means to the standard error of the difference.

- t near 0 means no difference (null hypothesis)

- A t-statistic far from 0 means one group's mean is significantly greater/lower than the other group's mean

Sig. - the significance of the difference between the means.

- Is the difference between the means of the groups happening by chance?

Repeated Measures (One-sample or Dependent Samples) T-test

Conditions:

1. Scales: Interval/Ratio

2. Distribution: Normal

3. Equality of Variance: not assumed

4. One sample, repeated measures

XIV. Comparing Several Means

If we have one control group and two experimental groups to study, how many t-tests do we need to compare the means?

ANOVA still compares means, but why does it "analyze" variances?

The Idea Behind Analysis of Variance

Variance:

- At the sample level: the square of the average distance of the data points from the mean (the square of the standard deviation).

- At the population level: the squared average distance of all samples from the population mean.

Simply put: analysis of variance is the analysis of how far samples are from the population mean.

Recap: Central Limit Theorem

CLT: Population characteristics (central tendency and variability) are generally reflected by its samples.

Example: DNA testing and the (little) variance we share in the genome. (Biology)

Membership of a certain culture. (Social Sciences)

Writing ANOVA Hypotheses

H0: There is no significant difference between the groups/measures.

x̄1 = x̄2 = x̄3

H1: There is a significant difference between the groups/measures.

- Three possible scenarios

One-Way Analysis of Variance (ANOVA)

Analysis of the variation within each group vs. the amount of variation between each group.

- If there's a lot of variation within each group and only a little bit of variation between each group, then it's harder to say that the result is "significant" -- it might be due to chance alone.

Example: Height of students from Manresa, La Storta, and Loyola.

Repeated Measures ANOVA

Analysis of the variation within each measure vs. the amount of variation between each measure.

- If there's little variation within each measure and a lot of variation between each measure, then it's easier to say that the result is "significant" -- it might NOT be due to chance alone.

Example: Prelim, Midterms, and Prefinals exams.

ANOVA: Analysis of Variance as a variability ratio

Variance between / variance within

- LARGE/SMALL = reject H0 = at least one mean is an outlier and each distribution is narrow; distinct from each other.

- SIMILAR/SIMILAR = fail to reject H0 = means are fairly close to the overall mean and/or distributions overlap a bit; hard to distinguish.

- SMALL/LARGE = fail to reject H0 = the means are very close to the overall mean and/or the distributions "melt" together.
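The variability ratio above can be sketched as a one-way ANOVA F computation; the three groups are invented:

```python
# Sketch of the one-way ANOVA ratio: F = variance between / variance within.
import statistics

groups = [          # three invented groups with clearly separated means
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = statistics.mean(x for g in groups for x in g)

ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)      # variance between groups
ms_within = ss_within / (n - k)        # variance within groups
f_ratio = ms_between / ms_within       # LARGE/SMALL here -> reject H0
```

Here the group means (5, 8, 11) sit far apart while each group is narrow, so the ratio is large: the LARGE/SMALL scenario.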

Some additional points:

1. ANOVA proceeds on the condition that the variances are equal. Generally, the variances will equalize given a good sample size and representativeness.

2. We can still do ANOVA for two samples/measures, but t-tests will generally be enough.

3. ANOVA is an omnibus test. It can only say whether there is a significant difference between the groups/measures, but not HOW the groups/measures are different.
