Describing Data Pt.2
STATISTICAL ANALYSIS
Presentation Outline
• Review
o Structure of Research
o Dimensions of Research
o Research Process
o Operationalization and Levels of Measurement
o Study Designs
• Statistical Analysis
o Sampling Distributions and z-scores
o Hypothesis Test
o Estimation
• Study Design Considerations
• Points of Confusion
The Structure of Research:
Deduction
The "Hourglass" Notion of Research
Figure labels: Scientific Method; Deductive Reasoning; Theories; Hypotheses / Propositions; Variables / Concepts; Measurement / Postulates; Analysis.
Data Analysis:
In the Big Picture of Methodology
Figure labels: Theory; Question to Answer; Hypothesis to Test; Study Design: Data Collection Method & Analysis; Collect Data: Measurements, Observations; Test Hypothesis, Conclusions, Interpretation, & Identification Relationships; Inferential Statistics; Causal Inference.
Note: Results of empirical scientific studies always begin with Descriptive Statistics; whether results conclude with Inferential Statistics depends on the Research Objectives/Aims.
• Descriptive Statistics
o Summarization & Organization of variable values/scores
for the sample
• Inferential Statistics
o Inferences made from the Sample Statistic to the
Population Parameter.
o Able to Estimate Causation or make Causal Inference
• Isolate the effect of the Experimental (Independent) Variable
on the Outcome (Dependent) Variable
Data Analysis:
Descriptive Statistics
• Descriptive Statistics are procedures used for organizing and summarizing
scores in a sample so that the researchers can describe or communicate the
variables of interest.
• Note: Descriptive Statistics apply only to the sample; they say nothing about how
accurately the data may reflect the reality in the population.
• Inferential Statistics attempt to rule out chance as an explanation for the results: that is, that the
results reflect real relationships that exist in the population and are not just random or due to chance.
• Before you can describe or evaluate a relationship using statistics, you must design
your study so that your research question can be addressed.
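As a minimal sketch of what "organizing and summarizing scores in a sample" can look like, the snippet below computes a few common descriptive statistics with Python's standard library. The quiz scores are hypothetical values chosen for illustration; they do not come from the slides.

```python
import statistics

# Hypothetical sample of quiz scores (illustrative values, not from the slides).
sample = [8, 9, 7, 10, 6, 9, 8, 7, 9, 8]

summary = {
    "n": len(sample),                        # size of the sample
    "mean": statistics.mean(sample),         # arithmetic average
    "median": statistics.median(sample),     # middle score
    "sample_sd": statistics.stdev(sample),   # sample SD (divides by n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

These numbers describe only the ten scores in the sample; on their own they say nothing about the population the sample was drawn from.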
Σ: summation
X: Independent Variable, typically
Y: Dependent Variable, typically
N = Size of the Population
n = Size of the Sample
≤ ≥ ≠ =: Equalities or Inequalities
± × ÷ + −: Mathematical Operators
α: alpha, refers to the constant/intercept
μ: mu, population mean
β: beta coefficient/standardized
σ: sigma, population standard deviation
σ²: sigma squared, population variance
Data Analysis:
Inferential Statistics & Types of Tests
Data Analysis:
• Class Intervals all have the same width: typically, a simple number such as 2, 5, 10,
and so on.
• Each Class Interval begins with a value that is a multiple of the Interval Width.
o The Interval Width is selected so that the distribution will have approximately 10 intervals.
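The class-interval rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed procedure: the function name `grouped_frequencies` and the scores are hypothetical, and the scores are assumed to be whole numbers.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Count whole-number scores in class intervals [lower, lower + width)."""
    # Each interval starts at a multiple of the interval width.
    counts = Counter((score // width) * width for score in scores)
    lowest, highest = min(counts), max(counts)
    # Include empty intervals so every interval has the same width.
    return [(lower, lower + width, counts.get(lower, 0))
            for lower in range(lowest, highest + width, width)]

# Hypothetical scores (not from the slides).
scores = [52, 57, 61, 64, 68, 70, 73, 75, 75, 79, 81, 84, 88, 93, 97]

for lower, upper, freq in grouped_frequencies(scores, width=5):
    print(f"{lower} to <{upper}: {freq}")
```

With these scores a width of 5 yields ten intervals, which matches the "approximately 10 intervals" guideline; a width of 2 or 10 would give too many or too few.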
Data Analysis: Grouped Frequency Distribution
• Choosing a width of 15 for the Class Intervals produces the following Frequency Distribution.
• Age is typically displayed as a Grouped Frequency Distribution:
o For Example:
• 45 to 54 Years
• 55 to 64 Years

Class Interval    Frequency    Relative Frequency
100 to <115            2            0.025
115 to <130           10            0.127
130 to <145           21            0.266
145 to <160           15            0.190
160 to <175           15            0.190
175 to <190            8            0.101
190 to <205            3            0.038
205 to <220            1            0.013
220 to <235            2            0.025
235 to <250            2            0.025
Total                 79            1.000
Copyright © 2005 Brooks/Cole, a
division of Thomson Learning, Inc.
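The Relative Frequency column is each interval's frequency divided by the total number of scores. The short sketch below recomputes it from the frequencies in the distribution above; the variable names are illustrative.

```python
# Frequencies copied from the grouped frequency distribution above;
# keys are the lower bounds of each class interval.
width = 15
frequencies = {100: 2, 115: 10, 130: 21, 145: 15, 160: 15,
               175: 8, 190: 3, 205: 1, 220: 2, 235: 2}

total = sum(frequencies.values())
# Relative frequency = interval frequency / total number of scores.
relative = {lower: count / total for lower, count in frequencies.items()}

for lower, count in frequencies.items():
    print(f"{lower} to <{lower + width}: {count:>2}  {relative[lower]:.3f}")
print(f"Total: {total}  {sum(relative.values()):.3f}")
```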
o The score categories (X values) are listed on the X axis and the
frequencies (number of scores in each category) are listed on the Y axis.
Histograms
A frequency distribution histogram: same set of quiz scores as a table and in a histogram.
Also see Age Distribution of Martians examples from Sampling PowerPoint
• The Smooth Curve emphasizes the shape of the distribution: not the exact
frequency for each category
• Negatively Skewed: the scores tend to pile up on the right side and the
tail points to the left.
Data Analysis: Percentiles, Percentile Ranks, &
Interpolation
• Percentiles and Percentile Ranks describe the relative location
of individual scores within a distribution: for example, the 90th
percentile of infant weight.
• The Percentile Rank for a particular X value is the percentage
of individuals with scores equal to or less than that X value.
• When an X value is identified by its Percentile Rank, that X value is called a Percentile.
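The percentile rank definition above translates directly into a count. The sketch below is a simplified illustration with hypothetical scores; it applies the "equal to or less than" rule to raw scores and does not perform the interpolation used with grouped data.

```python
def percentile_rank(scores, x):
    """Percentage of scores that are equal to or less than x."""
    at_or_below = sum(1 for score in scores if score <= x)
    return 100 * at_or_below / len(scores)

# Hypothetical scores (not from the slides).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 9]

# 7 of the 10 scores are <= 5, so X = 5 has a percentile rank of 70%.
print(percentile_rank(scores, 5))
```

Read the other way around, X = 5 is the 70th percentile of this small distribution.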
Data Analysis:
X to z and z to X
• The basic z-score definition is usually sufficient to complete most z-score
transformations.
z = (X − μ) / σ
X = μ + zσ
The relationship between z-score values
and locations in a population
distribution.
Why are z-scores important? Because if you know the distribution of your
scores, you can test hypotheses and make predictions.
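The two directions of the transformation can be sketched as a pair of small functions, assuming the usual definition z = (X − μ) / σ and its rearrangement X = μ + zσ. The function names and the values μ = 100 and σ = 15 are hypothetical, chosen only for illustration.

```python
def x_to_z(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def z_to_x(z, mu, sigma):
    """Raw score located z standard deviations from the mean."""
    return mu + z * sigma

mu, sigma = 100, 15  # hypothetical population mean and standard deviation

print(x_to_z(130, mu, sigma))   # 2.0: two standard deviations above the mean
print(z_to_x(-1.0, mu, sigma))  # 85.0: one standard deviation below the mean
```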
Data Analysis: Characteristics of z Scores
• Z scores tell you the number of standard deviation units a score is above or
below the mean
• The mean of the z score distribution = 0
• The SD of the z score distribution = 1
• The shape of the z score distribution will be exactly the same as the shape of
the original distribution
• Σz = 0
• Σz² = SS = N
• σ² = 1 = (Σz²/N)
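These properties can be checked numerically. The sketch below standardizes a small hypothetical set of scores, treating it as a whole population (so the variance divides by N), and confirms that the z-scores sum to 0, that their squares sum to N, and that their variance is 1.

```python
import math

def standardize(scores):
    """Convert scores to z-scores using population formulas (divide by N)."""
    n = len(scores)
    mu = sum(scores) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in scores) / n)
    return [(x - mu) / sigma for x in scores]

# Hypothetical population of scores (not from the slides).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(scores)
n = len(z)

sum_z = sum(z)                                   # should be 0
sum_z_squared = sum(value ** 2 for value in z)   # SS of the z-scores, should equal N
variance_z = sum_z_squared / n                   # should be 1

print(sum_z, sum_z_squared, variance_z)
```

Note that the properties hold exactly only when σ is computed with N in the denominator; standardizing with the sample formula (n − 1) would give Σz² = n − 1 instead.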
Data Analysis:
Sources of Error in Probabilistic Reasoning
• Type II Errors
o Failure to reject a false null hypothesis
o Sometimes called a “Beta” Error.
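To make the "Beta" error concrete, the sketch below computes the probability of a Type II error for one specific, assumed setting: a one-tailed (upper) z-test with a known population standard deviation. The function name, the effect size, σ, n, and α are all hypothetical choices for illustration, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_rate(effect, sigma, n, alpha=0.05):
    """Beta for a one-tailed (upper) z-test with known sigma.

    effect is the true mean minus the null-hypothesis mean (assumed >= 0).
    """
    standard_normal = NormalDist()
    # Reject the null hypothesis when the z statistic exceeds this value.
    z_critical = standard_normal.inv_cdf(1 - alpha)
    # Under the true mean, the z statistic is centered at effect * sqrt(n) / sigma.
    shift = effect * sqrt(n) / sigma
    # Beta = probability that the z statistic falls below the critical value.
    return standard_normal.cdf(z_critical - shift)

# Hypothetical values: a true effect of 5 points, sigma = 15, n = 36.
beta = type_ii_error_rate(effect=5, sigma=15, n=36)
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
```

In this setting, increasing the sample size or the true effect lowers beta, which is one reason sample size belongs among the study design considerations.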