1 BASICemm
COLLEGE OF AGRICULTURE
Adv. STATISTICS
Your professor:
[Figure: the research cycle. Research Design; Data Collection; Data Processing and Analysis; Data Interpretation and Discussion; Conclusions and Recommendations/Implications]
Introduction to Statistics
1. Overview
2. Types of Data
3. Critical Thinking
4. Design of Experiments
1. Overview
A common goal of surveys
and other data collecting
tools is to collect data from a
smaller part of a larger group
so we can learn something
about the larger group.
In the plural sense, statistics refers
to a set or sets of data.
Examples:
Sample: a sub-collection of
elements drawn from a population.
Survey: collection of data from a sample.
Key Concepts
Sample data must be collected in
an appropriate way, such as
through a process of random
selection.
• If sample data are not collected
in an appropriate way, the data
may be so completely useless
that no amount of statistical
torturing can salvage them.
2. Types of Data
Definitions
Parameter: a numerical
measurement describing some
characteristic of a population
population
parameter
[Diagram: a sample is a subset of a population]
• Populations have parameters.
• Samples have statistics.
Definitions
Statistic: a numerical
measurement describing some
characteristic of a sample.
sample
statistic
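The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the egg weights are made-up values, and the names `population`, `parameter_mean`, and `statistic_mean` are hypothetical, not part of the course material.

```python
import random
import statistics

# Hypothetical population: weights in grams of 200 eggs (made-up values).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [round(rng.gauss(60, 5), 1) for _ in range(200)]

# Parameter: a numerical measurement describing the whole population.
parameter_mean = statistics.mean(population)

# Statistic: the same measurement computed on a sample (a subset).
sample = rng.sample(population, 20)
statistic_mean = statistics.mean(sample)

print(f"population mean (parameter): {parameter_mean:.2f}")
print(f"sample mean (statistic):     {statistic_mean:.2f}")
```

The two printed means will usually be close but not identical, because the statistic is computed from only 20 of the 200 values.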
Definitions
Quantitative data: numbers
representing counts or
measurements of a quantitative
variable.
Examples are data on temperature,
weight of eggs, number of fruits,
scores in examination, income,
and speed.
Definitions
Qualitative (or categorical or
attribute) data
can be separated into different categories
that are distinguished by some
nonnumeric characteristics.
Examples:
- genders (male/female) of professional athletes
- design of machines
- ethnicity
Working with
Quantitative Data
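nominal level of measurement
characterized by data that consist of names,
labels, or categories only; the data cannot be
arranged in an ordering scheme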
Examples:
- survey responses (yes, no, undecided)
- gender (male, female)
- field of specialization (science, math, English)
- favorite food store (Jollibee, McDo, KFC)
ordinal level of measurement
involves data that may be arranged in
some order, but differences between data
values either cannot be determined or are
meaningless
Examples:
- Course grades (A, B, C, D, or E)
- Efficiency Ratings (O, VS, S, US, P)
- Places in contests (1st, 2nd, 3rd)
interval level of measurement
like the ordinal level, with the additional
property that the difference between any two
data values is meaningful. However, there is
no natural zero starting point (where none of
the quantity is present)
Examples:
- Years 1000, 2000, 1776, and 1492
- Temperature (in degrees Celsius or Fahrenheit)
- Frequency of visits
ratio level of measurement
the interval level modified to include
the natural zero starting point (where
zero indicates that none of the
quantity is present). For values at this
level, differences and ratios are
meaningful.
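A short worked example may help separate the interval and ratio levels. It assumes temperatures in degrees Celsius as the interval-level data and converts them to kelvins, a scale with a natural zero, for the ratio-level comparison; the variable names are illustrative only.

```python
# Temperatures in degrees Celsius (interval level: 0 degrees C is not "no heat").
c_low, c_high = 10.0, 20.0

# Differences are meaningful at the interval level.
difference = c_high - c_low              # 10 degrees warmer

# The naive ratio suggests "twice as hot", which is not a meaningful claim.
naive_ratio = c_high / c_low             # 2.0

# The Kelvin scale has a natural zero (ratio level), so ratios are meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
kelvin_ratio = k_high / k_low            # about 1.035

print(difference, naive_ratio, round(kelvin_ratio, 3))
```

The difference of 10 degrees holds on both scales, while the ratio changes from 2.0 to about 1.035 once the scale has a true zero.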
3. Critical Thinking
Success in Statistics
Success in the statistics course
typically requires more common sense
than mathematical expertise.
Figure 1
To correctly interpret a graph,
we should analyze the
numerical information given in
the graph instead of being
misled by its general shape.
Misuses of Statistics
Bad Samples
Small Samples
Misleading Graphs
Pictographs
Distorted Percentages
Loaded Questions
97% yes: “Are you in favor of the
no-to-drugs campaign of the
president?”
4. Design of Experiments
Major Points
If sample data are not collected
in an appropriate way, the data
may be so completely useless that
no amount of statistical torturing
can salvage them.
Randomness typically plays a
critical role in determining which
data to collect.
Observational Study
observing and measuring
specific characteristics
without attempting to
modify the subjects being
studied
Experiment
Some treatments are
applied and then effects
on the subjects are
observed.
Confounding
occurs in an experiment when
the experimenter is not able to
distinguish between the effects
of different factors
Try to plan the experiment so confounding
does not occur!
Controlling Effects of Variables
Blinding. Subject does not know whether he or she is
receiving a treatment or a placebo
Blocking. Grouping subjects with similar
characteristics
Replication and Sample Size
Replication. Repetition of a
treatment in an experiment. In
general, the greater the number of
replications, the better.
Methods of Randomization
- drawing lots
- drawing cards
- use of random numbers
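The "use of random numbers" method above can be sketched with Python's standard `random` module. The sampling frame of 120 numbered elements and the sample size of 12 are assumptions chosen for illustration.

```python
import random

# Hypothetical sampling frame: 120 population elements numbered 1 to 120.
population = list(range(1, 121))

rng = random.Random(2024)   # seed fixed only to make the example repeatable

# Simple random sample of n = 12 elements, drawn without replacement.
sample = sorted(rng.sample(population, 12))
print(sample)
```

Because the draw is without replacement, no element can appear in the sample twice, which mirrors drawing lots from a box without putting them back.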
Systematic Sampling. Some starting
point is selected, and then every kth
element in the population is selected.
Example with a population of N = 120 elements, numbered 1 to 120:
1. Determine the sample size, e.g., n = 12.
2. Determine the interval: i = N/n = 120/12 = 10.
3. Choose a random start from the first 10 elements.
[Figure: grid of the population elements numbered 1 to 120]
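The systematic sampling procedure can be written as a small function. This is a minimal sketch using the slide's numbers (N = 120, n = 12, interval 10); the function name `systematic_sample` is hypothetical, and the code assumes N is an exact multiple of n.

```python
import random

def systematic_sample(N, n, rng):
    """Return element numbers (1..N) chosen by systematic sampling.

    Assumes N is an exact multiple of n.
    """
    i = N // n                      # sampling interval, i = N / n
    start = rng.randint(1, i)       # random start among the first i elements
    return [start + k * i for k in range(n)]

rng = random.Random(7)              # fixed seed for a repeatable example
sample = systematic_sample(120, 12, rng)
print(sample)
```

Whatever the random start turns out to be, the selected elements are always exactly 10 apart, so a single random draw determines the whole sample.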
Convenience Sampling- selection based on
ease of access to respondents
and ease of collecting data
Stratification
Stratified Sampling- subdividing the population
into at least two different subgroups that share
the same characteristics, then drawing a sample
from each subgroup (or stratum)
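Stratified sampling can be sketched as follows. The strata (fields of specialization), their sizes, and the 10% sampling fraction applied to every stratum are all assumptions made for illustration; the definition above does not prescribe how many units to draw from each stratum.

```python
import random
from collections import defaultdict

# Hypothetical population of 120 (id, stratum) pairs; strata are made up.
population = (
    [(i, "science") for i in range(1, 61)]
    + [(i, "math") for i in range(61, 101)]
    + [(i, "English") for i in range(101, 121)]
)

def stratified_sample(units, fraction, rng):
    """Draw a simple random sample of the given fraction from each stratum."""
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)
    sample = {}
    for stratum, members in strata.items():
        k = max(1, round(len(members) * fraction))   # at least one per stratum
        sample[stratum] = sorted(rng.sample(members, k))
    return sample

rng = random.Random(1)              # fixed seed for a repeatable example
sample = stratified_sample(population, 0.10, rng)
for stratum, ids in sample.items():
    print(stratum, ids)
```

With these assumed sizes, the sample contains 6 science, 4 math, and 2 English units, so every subgroup is represented.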
Cluster Sampling- dividing the population into
sections (or clusters); randomly selecting some
of those clusters; choosing all members from
the selected clusters
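Cluster sampling can be sketched in the same spirit. The layout of 10 clusters with 12 members each, and the choice of 3 clusters, are assumptions for illustration only.

```python
import random

# Hypothetical population: 120 elements grouped into 10 clusters of 12.
clusters = {c: list(range(c * 12 + 1, c * 12 + 13)) for c in range(10)}

rng = random.Random(3)              # fixed seed for a repeatable example

# Randomly select some of the clusters (here, 3 of the 10).
chosen = sorted(rng.sample(sorted(clusters), 3))

# Every member of each selected cluster enters the sample.
sample = [member for c in chosen for member in clusters[c]]
print(chosen, len(sample))
```

Unlike stratified sampling, the randomness here is applied to whole clusters, and no further selection happens inside a chosen cluster.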
Methods of Sampling
Random Sampling     Non-Random Sampling
Simple Random       Convenience
Systematic          Quota
Multi-stage         Purposive
Stratified          Judgment
Cluster             Incidental
                    Snowball
Sampling Error
Sampling Error- the difference between
a sample result and the true population
result; such an error results from chance
sample fluctuations
Nonsampling Error- sample data that
are incorrectly collected, recorded, or
analyzed (such as by selecting a biased
sample, using a defective instrument, or
copying the data incorrectly)
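Sampling error can be illustrated by drawing several random samples from one population and comparing each sample mean with the true population mean. The exam scores below are made-up values, and the sample size of 30 is an assumption.

```python
import random
import statistics

rng = random.Random(11)             # fixed seed for a repeatable example

# Hypothetical population of 1,000 exam scores (made-up values).
population = [rng.randint(50, 100) for _ in range(1000)]
true_mean = statistics.mean(population)

# Draw several random samples; each sample mean misses the true mean
# by a different amount purely because of chance.
errors = []
for _ in range(5):
    sample = rng.sample(population, 30)
    errors.append(statistics.mean(sample) - true_mean)

for e in errors:
    print(f"sampling error: {e:+.2f}")
```

These differences arise even though every sample is collected correctly. A nonsampling error, by contrast, would come from a mistake such as a biased selection or a miscopied value, which this sketch does not simulate.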
Recap