Ecological Statistics Review: Sampling Strategies


Ecological Statistics Review

¨ Sampling strategies
¨ Descriptive statistics
¨ Comparing populations
¨ Goodness of fit
¨ Regression and correlation
Descriptive statistics

¤ Population average (mean) = µ
¤ Sample average
– x̄ = (Σx)/n = 3.2
– n = # of observations = 16
¤ Median
– Middle value = 2.75
¤ Range
– 8.8 - 0.8 = 8
¤ Histogram
– frequencies of values
– all points assigned a “bin”

Num  Facility               Infiltration Rate (in/hr)
1    Tryon Headwaters       0.8
2    Sandy & Davis          1
3    Glencoe Elementary     1.2
4    Sylvania & 29th        1.2
5    Glendoveer Commons     1.8
6    Pettygrove & 26th      1.8
7    Central & St Johns     1.9
8    Siskiyou & 35th        2.3
9    People’s Coop          3.2
10   Sandy & 21st           4.1
11   Fremont & 131st        4.3
12   Lambert & 17th         4.5
13   12th & Montgomery      4.6
14   56th & Ankeny          4.8
15   Belmont & 42nd         4.8
16   Willamette & Denver    8.8

[Figure: histogram of infiltration rates; x-axis: Infiltration rate (bin), 0 to 10; y-axis: Frequency, 0 to 8]
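The summary values on this slide can be reproduced from the 16 infiltration rates. The sketch below uses only the Python standard library. The bin width of 2 in/hr is an assumption read off the histogram axis, not something the slide states.

```python
import statistics
from collections import Counter

# Infiltration rates (in/hr) for the 16 facilities listed on the slide.
rates = [0.8, 1.0, 1.2, 1.2, 1.8, 1.8, 1.9, 2.3,
         3.2, 4.1, 4.3, 4.5, 4.6, 4.8, 4.8, 8.8]

n = len(rates)
sample_mean = sum(rates) / n                 # x-bar = (sum of x) / n
median = statistics.median(rates)            # average of the two middle values when n is even
value_range = max(rates) - min(rates)

# Histogram: every point is assigned to one bin.
BIN_WIDTH = 2  # assumed bin width in in/hr
bins = Counter(int(r // BIN_WIDTH) * BIN_WIDTH for r in rates)

print(f"n = {n}, mean = {sample_mean:.1f}, median = {median:.2f}, range = {value_range:.1f}")
for lower in range(0, 10, BIN_WIDTH):
    print(f"{lower}-{lower + BIN_WIDTH} in/hr: {'#' * bins[lower]}")
```

The printed bars are a text stand-in for the histogram: each `#` is one facility in that bin.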
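Descriptive statistics
Variance (s²): the spread of the data
Standard Deviation (s): the square root of the variance, in the same units as the data
[Figure: two distributions with the same average but different spread: average = 50, s = 5 and average = 50, s = 10]

Standard Error (SE = s/√n)
• compare averages using error bars of ± 2SE
• if the error bars do not overlap, the difference is likely significant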
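A minimal sketch of how variance, standard deviation, and standard error relate, using the infiltration rates from the earlier slide. The helper `error_bars_overlap` is a hypothetical name introduced here to illustrate the ± 2SE rule of thumb. It is not a formal significance test.

```python
import math
import statistics

# Infiltration rates (in/hr) from the descriptive statistics slide.
rates = [0.8, 1.0, 1.2, 1.2, 1.8, 1.8, 1.9, 2.3,
         3.2, 4.1, 4.3, 4.5, 4.6, 4.8, 4.8, 8.8]

variance = statistics.variance(rates)   # sample variance s^2 (divides by n - 1)
sd = math.sqrt(variance)                # standard deviation s
se = sd / math.sqrt(len(rates))         # standard error of the mean


def error_bars_overlap(mean_a, se_a, mean_b, se_b):
    """Return True if the mean +/- 2SE intervals of two groups overlap."""
    low_a, high_a = mean_a - 2 * se_a, mean_a + 2 * se_a
    low_b, high_b = mean_b - 2 * se_b, mean_b + 2 * se_b
    return low_a <= high_b and low_b <= high_a


print(f"s^2 = {variance:.2f}, s = {sd:.2f}, SE = {se:.2f}")
```

With 16 observations, SE is one quarter of s, which is why error bars shrink as sample size grows.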
Statistics

Distributions
• Uniform: all observations have about the same frequency.
– Ex: precipitation patterns in the tropics
• Normal: most observations are clustered around the average.
– Ex: tree heights, body temperatures
– Many statistical tests assume data are approximately normally distributed
– Transform/normalize data before analyzing
[Figures: example Uniform and Normal distribution curves]
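The slide does not name a specific transform. A log transform is one common choice for right-skewed data, and the sketch below assumes it. The data are made up for illustration, and `skewness` is a helper defined here (the standardized third moment), not a library function.

```python
import math


def skewness(values):
    """Standardized third moment: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed measurements (most values small, a few very large).
raw = [1, 2, 2, 3, 4, 8, 16, 64]
logged = [math.log10(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}, after log transform: {skewness(logged):.2f}")
```

Lower skewness after the transform suggests the data are closer to symmetric, though it does not by itself show that they are normally distributed.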
Measurements & Bias

¨ Measurements
¤ Accurate: reflects true value
¤ Precise: low variance
¤ Sample 1: low precision (P), high accuracy (A)
¤ Sample 2: high precision (P), low accuracy (A)
(Each sample represents 10 measurements of a single tree whose true height is 22 m)
¨ Error & Bias
¤ Random error
– Unknown/unpredictable changes
– Reduces precision
¤ Systematic bias
– Caused by flaw in equipment or experimental design
– Cannot be estimated by repeating experiment
– Reduces accuracy
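The difference between accuracy and precision can be put in numbers: bias (sample mean minus true value) measures accuracy, and standard deviation measures precision. The measurements below are invented to mimic the two samples described on the slide. Only the true height of 22 m comes from the source.

```python
import statistics

TRUE_HEIGHT_M = 22.0  # true tree height given on the slide

# Hypothetical measurements, 10 per sample.
sample_1 = [19.0, 25.0, 21.0, 24.0, 20.0, 23.0, 22.0, 18.0, 26.0, 22.0]  # scattered, centered on truth
sample_2 = [24.9, 25.1, 25.0, 25.2, 24.8, 25.0, 25.1, 24.9, 25.0, 25.0]  # tight, but offset

bias_1 = statistics.mean(sample_1) - TRUE_HEIGHT_M   # near 0 means accurate
bias_2 = statistics.mean(sample_2) - TRUE_HEIGHT_M
sd_1 = statistics.stdev(sample_1)                    # small means precise
sd_2 = statistics.stdev(sample_2)

print(f"Sample 1: bias = {bias_1:+.2f} m, s = {sd_1:.2f} m")
print(f"Sample 2: bias = {bias_2:+.2f} m, s = {sd_2:.2f} m")
```

Repeating the measurement narrows the estimate of the mean in both samples, but it cannot reveal the offset in Sample 2. That is why systematic bias cannot be estimated by repetition alone.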
Statistics: t-test

¨ Are the means of the two groups statistically different?


¨ What is the probability that the two groups (samples) come from the
same statistical population?
Statistics: t-test

¤ Null hypothesis, Ho
– Means of both categories are the same
¤ Alternative hypothesis, Ha
– Means are different (2-tailed)
– Mean of one category greater than mean of the other (1-tailed)
¤ P-value
– Reject null hypothesis if p < .05
– p < .05 means that, if the two categories were part of the same population, there would be less than a 5% chance of seeing a difference in means this large. With this cutoff there is only a 5% chance of a false positive (a false declaration of difference between means).
¤ Parametric test
– The t-test assumes approximately normally distributed data
– Use non-parametric tests for non-parametric data (e.g. counts, ranks, percentages).
– Transform data if necessary/possible to make it normally distributed
Statistics: t-test
• Assesses differences in means between 2 categories
– temperature change of wool vs. cotton socks
• Excel/Sheets
=ttest(A1:A4,B1:B4,1,2)
– 3rd argument (tails): 1 = one-tail (hypothesized that one category > other category); 2 = two-tail (hypothesized that categories will differ, but not sure which will be bigger)
– 4th argument (type): 2 = assumes data in each category has equal variance (spread)
– Returns the p value (after hitting return)
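To show what the spreadsheet formula computes, here is a sketch of the equal-variance (pooled) two-sample t statistic in plain Python. The sock temperature values are invented for illustration, and `pooled_t_statistic` is a name introduced here. Converting t and the degrees of freedom into a p value requires the t distribution (a t table, or a library such as SciPy), which this sketch leaves out.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for a two-sample t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Hypothetical temperature changes (deg C) after 5 minutes, four trials each.
wool = [-3.9, -3.6, -4.1, -3.8]
cotton = [-2.1, -2.6, -1.9, -2.4]

t_stat, df = pooled_t_statistic(wool, cotton)
print(f"t = {t_stat:.2f}, df = {df}")
```

A larger absolute t corresponds to a smaller p value for the same degrees of freedom.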


Statistics: t-test
[Figure: two bar charts of Ave. temperature reduction after 5 minutes (°C) for Wool vs. Cotton; left panel p = 0.01, right panel p = 0.3]

p = 0.01: if the two categories were samples from the same population, a difference this large would be expected only 1% of the time, therefore the difference is statistically significant.

p = 0.3: if the two categories were samples from the same population, a difference this large would be expected 30% of the time, therefore the difference is not statistically significant.
Statistics: Chi Squared

¨ Does observed data differ from the hypothesized distribution?
¨ χ² = ∑(O−E)²/E
¤ O = observed data
¤ E = expected distribution (assuming hypothesis is true)
¤ Count data (not continuous variables)
¤ Ex: Do fish have preferences for certain substrates?
– Ho = fish have no preference
– Expected count for a substrate covering 50% of habitat area: 30 fish × 50% = 15 fish
Statistics: Chi Squared

¨ χ² = ∑(O−E)²/E = 12.933
¨ Reject Ho if χ² > “critical value” at p < .05 with appropriate degrees of freedom
¨ Degrees of Freedom
¤ Number of categories - 1
– 3 substrates - 1 = 2
¨ 12.933 > 5.99, therefore reject the null hypothesis that fish have no substrate preference.
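The calculation can be sketched in a few lines of Python. The slide's table of observed counts did not survive extraction, so the counts, substrate names, and the 25% habitat shares below are hypothetical. They were chosen so that the statistic comes out at the slide's 12.933. Only the 30 fish, the 50% share, and the 5.99 critical value come from the source.

```python
# Hypothetical observed fish counts per substrate (must total 30 fish).
observed = {"sand": 10, "gravel": 16, "cobble": 4}
# Share of habitat area per substrate; only the 50% figure appears on the slide.
habitat_share = {"sand": 0.50, "gravel": 0.25, "cobble": 0.25}

total_fish = sum(observed.values())
# Under Ho (no preference), fish are distributed in proportion to habitat area.
expected = {s: total_fish * habitat_share[s] for s in observed}

chi_squared = sum((observed[s] - expected[s]) ** 2 / expected[s] for s in observed)
degrees_of_freedom = len(observed) - 1

CRITICAL_VALUE = 5.99  # from the slide: p = .05 with 2 degrees of freedom
reject_null = chi_squared > CRITICAL_VALUE

print(f"chi-squared = {chi_squared:.3f}, df = {degrees_of_freedom}, reject Ho: {reject_null}")
```

For a different number of categories, the critical value has to be looked up again for the new degrees of freedom.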
Statistics - Correlation

• Assesses linear association between two variables.
• Observational studies with continuous variables
• Excel/Sheets
– “=correl(A1:A4,B1:B4)” returns the R value
• R value
– 0 = no linear correlation
– 1 = perfect positive correlation
– -1 = perfect negative correlation
• P value
– socscistatistics.com/pvalues/pearsondistribution.aspx
– Enter your values for Pearson (R) and the number of pairs in your sample
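A sketch of the Pearson R calculation behind the spreadsheet function follows. The function name `pearson_r` and the canopy and temperature data are illustrative assumptions, not taken from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of paired deviations, scaled by the spread of each variable.
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    return co_deviation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical paired observations: canopy cover (%) and stream temperature (deg C).
canopy_cover = [10, 30, 50, 70]
stream_temp = [18.2, 17.1, 15.9, 15.0]

r = pearson_r(canopy_cover, stream_temp)
print(f"R = {r:.3f} from {len(canopy_cover)} pairs")
```

The R value and the number of pairs are the two inputs the p value calculator asks for.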
Statistics – Linear Regression
• Assesses linear effect of independent variable on
dependent variable. “Attempts to establish causation.”
• Experiments with continuous variables.
• Excel
– Read tutorial to install & use Excel Analysis ToolPak
– www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/

R² value
Amount of variation in the dependent
variable (temperature) explained by
changes in the independent variable.

p value
Statistics – Linear Regression
[Figure: two scatter plots of Dependent vs. Independent variable with fitted lines; left panel R² = 0.8, p = .03; right panel R² = 0.1, p = .49]

R² = 0.8, p = .03: 80% of the variation in the dependent variable is explained by the change in the independent variable, and there is only a 3% chance that randomly selected data could produce a linear relationship this strong.

R² = 0.1, p = .49: 10% of the variation in the dependent variable is explained by the change in the independent variable, and there is a 49% chance that randomly selected data could produce this level of linear relationship.
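For readers working outside Excel, this is a minimal least-squares sketch showing where the R² value comes from. The function name `fit_line` and the shade and temperature data are hypothetical. The p value is not computed here because it requires the F or t distribution.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation in the dependent variable)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_residual / ss_total


# Hypothetical experiment: shade (%) as independent variable, temperature (deg C) as dependent.
shade_percent = [0, 20, 40, 60, 80]
temperature = [24.1, 22.8, 21.9, 20.2, 19.5]

slope, intercept, r_squared = fit_line(shade_percent, temperature)
print(f"temperature = {intercept:.2f} + {slope:.3f} * shade, R^2 = {r_squared:.2f}")
```

An R² near 1 means the fitted line accounts for most of the variation in the dependent variable. An R² near 0 means it accounts for very little.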
