Geostatistics
[Figure: Cross section through effective porosity model, showing shale, gas, oil, and water zones with labels A1 to A6. Green = high effective porosity.]
Geostatistics Lecture (Kuliah Geostatistik)
GL-5043
Course Organization
Introduction
Geostatistics in Reservoir Management
Introduction to Important Concepts and
Vocabulary
Limitations of Geostatistics
Exploratory Data Analysis
Univariate and Bivariate Statistics
Univariate Spatial Statistics
Covariance and Semi-Variogram Function
Variogram Modeling of Continuous Data
Categorical Data and the Indicator
Transform
Random Functions and Spatial
Models
Introduction
Statement of Learning
Objectives
Definition of Geostatistics and
Important Assumptions
Brief History
Important Reference Books
Role of Geostatistics in Reservoir
Management
Limitations of Geostatistics
Introduction to Key Concepts
Deterministic Vs Stochastic
Estimation Vs Simulation
Key Steps in a Geostatistical
Study
Definition of
Geostatistics
Geostatistics
Branch of Statistics that Deals
with Spatially Correlated Data
Basic Assumptions
Sample Values are Not
Independent
Spatial Continuity Exists
Goal of Geostatistics
Model Spatial Continuity
Use Model for Estimation and/or
Simulation of Spatial Distribution
A Brief History
of Geostatistics
Geostatistics Term Coined by Hart (1952)
- Application of Statistics in a Geographic
Context
Matheron (1962, 1963) Used Term in a
Geological Context for Inferring Ore
Reserves from Data Spatially Distributed
Within an Ore Body
Developed Theory of Regionalized Variables
Formal Introduction of New Statistic - the
Semivariogram
Semi-Variogram Example
[Figure: example semi-variogram with model form = exponential, showing the data points, the sill, the nugget (may be zero), and the range. Horizontal axis: lag or separation distance.]
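The labels in this example (nugget, sill, range, exponential model form) can be tied together with a short sketch. The formula below is one common parameterization of the exponential model; the factor of 3, which makes `range_` the practical range where the curve reaches about 95% of the way to the sill, is an assumed convention, and all names and parameter values are hypothetical rather than taken from the slide.

```python
import math

def exponential_semivariogram(h, nugget, sill, range_):
    """Exponential model: gamma(h) rises from the nugget toward the sill.

    Assumptions: `sill` is the total sill (nugget included) and `range_` is
    the practical range, reached at about 95% of the nugget-to-sill rise.
    """
    if h == 0:
        return 0.0  # gamma(0) is zero; the nugget is a jump just above h = 0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / range_))

# Hypothetical parameters chosen only for illustration
for lag in (0, 50, 100, 200, 400):
    gamma = exponential_semivariogram(lag, nugget=0.1, sill=1.0, range_=200.0)
    print(lag, round(gamma, 3))
```

The printed values increase with lag and flatten near the sill, which is the shape the example figure describes.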
Reference
Books for
Geostatistics
Principal References - Textbooks
Isaaks, E. H. and Srivastava, R. M., 1989. Applied Geostatistics, Oxford University Press, New York.
Yarus, J. M. and Chambers, R. L., 1995. Stochastic Modeling and Geostatistics - Principles, Methods, and Case Studies, AAPG, Tulsa.
Hohn, M. E., 1999 (second edition). Geostatistics and Petroleum Geology, Kluwer Academic Publishers.
Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation, Oxford University Press, New York.
Deutsch, C. V. and Journel, A. G., 1997. GSLIB Geostatistical Software Library and User Guide, Oxford University Press, New York (includes FORTRAN code on CD-ROM).
Clark, I., 1979. Practical Geostatistics, Applied Science Publishers, Englewood, New Jersey.
Journel, A. G. and Huijbregts, C. J., 1978. Mining Geostatistics, Academic Press, London.
Cressie, N., 1991. Statistics for Spatial Data, John Wiley & Sons, New York.
Armstrong, M., 1998. Basic Linear Geostatistics, Springer.
Olea, R., 1999. Geostatistics for Engineers and Earth Scientists, Kluwer Academic Publishers.
Deutsch, C. V., 2002. Geostatistical Reservoir Modeling, Oxford University Press.
Webster, R. and Oliver, M. A., 2001. Geostatistics for Environmental Scientists, Wiley.
Application of Statistics,
Including Geostatistics,
to Reservoir
Characterization
A Description of a Reservoir is a
Necessary Part of Reservoir
Management
Description is Based on a Variety of
Data (Core Data, Well Log Data,
Seismic Data, Fluid Data, Production
Data, etc.)
Little, if Any, Direct Data
Indirect Data Generally Represents
Very Small Part of Reservoir Volume
Much Uncertainty in Reservoir
Description Due to Limited Data
Statistics Provides a Systematic Way
of Describing and Handling the
Uncertainty
Geostatistics in
Reservoir
Characterization
Well Spacing Vs Reservoir Volume Sampled by Core and Well Log Data*
[Figure: reservoir volume sampled (0 to 3.5E-05) versus well spacing (5 to 640 acres), for core data and well log data.]
Geostatistics in
Reservoir
Management
Some Reasons for Strong
Industry Interest in Geostatistics
Geostatistical Estimation and Simulation
Methods Allow Detailed Reservoir Property
(k, Sw) Distributions to be Generated.
Geostatistical Distributions Are Better than
Traditional Methods!
Realistic - Both Geologically and
Statistically
Easy to Generate Using Available
Software
Geostatistical Techniques Such As
Collocated Cokriging Offer Quick
Integration of Well Based Data and Seismic
Data
Stochastic Techniques Allow the
Uncertainty of a Reservoir Description to be
Quantified
Geostatistics in
Reservoir Management
Fundamental Goal is to Calculate
a Reservoir Property Distribution
Using the Available Well Log,
Core, and/or Seismic Data
[Diagram boxes: Raw Data; Geostatistical Analysis; Selection of Model; Appropriate Estimation or Stochastic Algorithm]
Geostatistics in
Reservoir Management
[Diagram: reservoir property (PHI, K) distributions, numbered 1 to n, are passed through a flow simulator to give a distribution of outcomes (recovery).]
Limitations of
Geostatistics
Geostatistics Does Not Create
Data or Eliminate the Value of
Obtaining Additional Good Data
Geostatistics Does Not Replace
Sound Qualitative Understanding
and Expert Judgment
Geostatistics Does Not
Necessarily Save Time, At Least in
the Short Term.
Geostatistics Does Not Work Well
as a Black Box
[Figure: black box producing the output "Porosity at X is 13.7%"]
Important Concepts and
Vocabulary
Deterministic Vs Stochastic
Estimation Vs Simulation
Deterministic Vs
Stochastic
Modeling Is Not
Magic!
Deterministic Vs
Stochastic
Deterministic Models Depend on Outside
Information Not Contained in the Data Values
(i.e. Quantitative Process Description) and the
Context of the Data
Deterministic Model Examples:
Distance a Ball Will Travel When Thrown
Information Needed
Equation
Velocity and Angle Ball Is Thrown
Gravitational Constant (g)
Diffusion of a Trace Element When a Pure
Metal Bar and a Contaminated Metal Bar
Are Joined
Information Needed:
Diffusion Constant and Temperature
Dependence of Diffusion Constant
Diffusion Equation
Initial Concentration of Trace Element in
Contaminated Bar
Concentration of Trace Element at Any Time
or Location in Metal Bar May Be Calculated
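The thrown-ball example can be written out as a minimal deterministic sketch: given the inputs, the equation returns one answer with no randomness. It assumes no air resistance and equal launch and landing heights; the function name and the input values are hypothetical.

```python
import math

def throw_distance(speed_m_s, angle_deg, g=9.81):
    """Horizontal distance of a thrown ball.

    Assumes no drag and that the ball lands at its launch height, so the
    range is v^2 * sin(2 * angle) / g.
    """
    theta = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2.0 * theta) / g

# Same inputs always give the same distance: the model is deterministic
print(round(throw_distance(20.0, 45.0), 2))
```

Everything needed here (the equation, the velocity and angle, and g) comes from outside the data, which is the point the slide makes about deterministic models.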
Deterministic Vs
Stochastic
Stochastic Models
Stochastic Models Are Useful When the
Process Responsible for the Distribution
of Values is Not Well Understood
A Stochastic Model is a Random Model
Controlled by a Spatial Correlation Model
Stochastic Models are a Useful Reservoir
Characterization Tool Because a
Reservoir is the End Product of Many
Poorly Understood Processes Including
Some or All of the Following:
Sedimentation Compaction
Bioturbation Diagenesis
Burial Erosion
Local Tectonics Regional Tectonics
Estimation Vs
Simulation
Estimation is Process of Obtaining the
Single Best Value of a Reservoir Property
at an Unsampled Location. Local
Accuracy Takes Precedence Over Global
Spatial Variability. Estimation Methods,
Therefore, Tend to Produce Smooth
Property Distributions.
Many Traditional Methods
Block Averages
Inverse Distance Weighted
Interpolation
Triangulation
Many Geostatistical Methods
Ordinary Kriging
Collocated Cokriging
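As an illustration of a traditional estimation method from the list above, the following is a minimal sketch of inverse distance weighted interpolation. The distance exponent of 2 and the three well locations and porosity values are assumptions made for the example only.

```python
import math

def idw_estimate(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: list of (xi, yi, value) tuples.
    power: distance exponent; 2 is a common choice and is assumed here.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # reproduce the data value at a sampled location
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical wells: (x, y, porosity as a fraction)
wells = [(0.0, 0.0, 0.10), (100.0, 0.0, 0.20), (0.0, 100.0, 0.15)]
print(idw_estimate(50.0, 50.0, wells))
```

Because each estimate is a weighted average of the data, it can never fall outside the range of the sampled values, which is one reason estimation maps look smooth.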
Estimation Vs
Simulation
[Figure: estimation map and simulation map of effective porosity.]
Note Smooth Contours on Estimation Map Compared to Simulation (Stochastic) Map.
Note that Areas of Greatest Difference Between the Two Maps Are in Areas of Little or No Well Control.
Principal Steps in a
Geostatistical Reservoir
Characterization Study
Basic Geological Study Provides
Structural and Stratigraphic
Framework
Data Quality Control and Clean-Up
(Univariate and Multi-Variate
Statistical Analysis)
Define Region(s) in which
Stationarity is Applicable
Characterize and Model Spatial
Continuity (Variogram
Modeling) in Selected Regions
Obtain Reservoir Property
Distribution(s) by Estimation
and/or Conditional Simulation
Using Model(s) of Spatial
Continuity
Document Results
Introduction
Learning Objectives
Definition of Geostatistics and
Important Assumptions
Brief History
Important Reference Books
Role of Geostatistics in Reservoir
Management
Limitations of Geostatistics
Introduction to Key Concepts
Deterministic Vs Stochastic
Estimation Vs Simulation
Key Terms
Attribute, Variable, Individual,
Population
Parameter, Statistic
Key Steps in a Geostatistical Study
Univariate Statistics
Statement of Learning Objectives
Review of Measures of Location and
Spread
Mean
Variance
Standard Deviation
Review of Univariate Plots
Histogram
Probability Density Function - pdf
Cumulative Distribution Function - cdf
Handling Outliers
Review of Types of Distributions
Parametric
Normal (Gaussian)
Log-Normal
Non-Parametric
Univariate Statistics
The Mean, Usually Denoted by m, is the Central Value of a Distribution. The Arithmetic Mean is Given by
$$ m = \frac{1}{n} \sum_{i=1}^{n} z(i) $$
where z(i) = sample value and n = number of data values
For Log-Normal Distributions, the Geometric Mean is a Better Measure of the Central Value. The Geometric Mean is Given by
$$ m_g = 10^{\frac{1}{n} \sum_{i=1}^{n} \log_{10} z(i)} $$
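A minimal sketch of the two means follows. The geometric mean is computed as the antilog of the average logarithm; natural logs are used here, and the choice of base does not change the result. The function names and the four permeability values are hypothetical.

```python
import math

def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)

def geometric_mean(values):
    """Antilog of the mean log value; every value must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical permeabilities (md) spanning several orders of magnitude
perm_md = [1.0, 10.0, 100.0, 1000.0]
print(arithmetic_mean(perm_md))
print(geometric_mean(perm_md))
```

For strongly skewed values like these, the arithmetic mean is pulled toward the largest value while the geometric mean stays near the middle of the distribution on a log scale.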
Univariate Statistics
Measures of Location
(continued)
Normal Distribution
Mode = Median = Mean
[Figure: frequency versus porosity for a normal distribution, with the mean, mode, and median marked at the same value.]
Univariate Statistics
Measures of Location
(continued)
Lognormal Distribution
Mode < Median < Mean
[Figure: frequency versus permeability for a lognormal distribution, with the mode, median, and arithmetic mean marked in that order from left to right.]
Univariate Statistics
Measures of Spread and
Shape (continued)
[Figure: two histograms illustrating spread and shape; one is annotated m = 0.128.]
Univariate Statistics
Example of Histogram, pdf, and
cdf Calculations for Example Data
Set #1 (Porosity)
Basic Steps Are:
Set Up Bins (Classes)
Determine Number of Data Values in
Each Bin
Normalized Frequency = Number of
Values in Each Bin Divided by Total
Number of Values
Cumulative Frequency is Obtained by
Summing the Normalized Frequency
of Occurrence of the Current Bin or
Class and all Lower Bins or Classes.
Bin   Raw Frequency   Normalized Frequency   Cumulative Frequency
0     1               0.05                   0.05
2     3               0.15                   0.20
4     2               0.10                   0.30
6     2               0.10                   0.40
8     1               0.05                   0.45
10    2               0.10                   0.55
12    1               0.05                   0.60
14    1               0.05                   0.65
16    2               0.10                   0.75
18    3               0.15                   0.90
20    2               0.10                   1.00
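The basic steps above can be sketched in a few lines, starting from the raw frequency per bin in the table. The function name is hypothetical; the bin labels and raw counts are the ones tabulated.

```python
from itertools import accumulate

def histogram_table(raw_counts):
    """Return (normalized, cumulative) frequencies for per-bin raw counts."""
    total = sum(raw_counts)
    normalized = [count / total for count in raw_counts]
    # Accumulate the integer counts first, then divide, to avoid float drift
    cumulative = [running / total for running in accumulate(raw_counts)]
    return normalized, cumulative

bins = list(range(0, 22, 2))              # bin labels 0, 2, ..., 20
raw = [1, 3, 2, 2, 1, 2, 1, 1, 2, 3, 2]   # raw frequencies from the table
normalized, cumulative = histogram_table(raw)
for b, r, nf, cf in zip(bins, raw, normalized, cumulative):
    print(f"{b:>3} {r:>3} {nf:.2f} {cf:.2f}")
```

The normalized column corresponds to the pdf plot and the cumulative column to the cdf plot.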
Univariate Statistics
Calculation of
Histogram, PDF, and
CDF Using
Spreadsheet Program
(Excel)
Enter Data Values in Column
Enter Bin Values in Different
Column
Select Tools Options
Select Data Analysis Option
Select Histogram Function
Execute Histogram Function
by Completing Pop-up at
Right
Histogram
CDF
Plot Results Using Graph
Wizard
Univariate Statistics
Histogram, PDF, and cdf Plots for Porosity Values in Example #1
[Figure: three plots versus porosity (0 to 20): histogram (raw frequency), PDF ("normalized" frequency), and cdf (cumulative frequency).]
Univariate Statistics
Univariate Distribution Plots (continued)
Selection of Bin or Class Size Is a Function of [...] and Distribution Information that Is Needed. Same Data in All Histograms.
[Figure: three histograms of raw frequency versus porosity for the same data, with class sizes of 5, 2, and 1.]
Data Set with Extreme Values
Depth   Permeability (md)
2701    11
2702    5
2703    6
2704    3
2705    5
2706    8
2707    12
2708    13
2709    256
2710    390
2711    44
2712    11
2713    2
2714    1
2715    2
2716    4
2717    5
2718    4
2719    2
2720    5
2721    7
2722    8
2723    14
2724    59
2725    389
2726    17
2727    452
2728    12
2729    11
2730    5
2731    6
2732    8
2733    6
2734    3
2735    2
2736    1
2737    5
2738    7
2739    6
2740    8
Summary Statistics
Number of Values     40.00
Arithmetic Average   45.38
Geometric Average    9.12
Variance             12778.04
Standard Deviation   113.04
Median               6.50
[Figure: histogram (raw frequency) and cdf (cumulative frequency) versus permeability (md), 0 to 500.]
Univariate Statistics
Outlier Handling
Delete Extreme Values
Treat Separately
Transform (next page)
Treating the same 40 permeability values as two separate groups:
Statistic            LOW Permeability (md)   HIGH Permeability (md)
Number of Values     36.00                   4.00
Arithmetic Average   9.11                    371.75
Geometric Average    6.06                    364.00
Variance             127.13                  6822.92
Standard Deviation   11.28                   82.60
Median               6.00                    389.50
The HIGH group holds the four extreme values (256, 390, 389, and 452 md); the remaining 36 values form the LOW group.
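The "treat separately" option can be sketched as follows, using the 40 permeability values of this data set. The 100 md cutoff is an assumption chosen because it isolates the four extreme values; the slides do not state how the split was made. The variance is the sample variance (n - 1 divisor).

```python
import statistics

perm_md = [11, 5, 6, 3, 5, 8, 12, 13, 256, 390, 44, 11, 2, 1, 2, 4, 5, 4, 2, 5,
           7, 8, 14, 59, 389, 17, 452, 12, 11, 5, 6, 8, 6, 3, 2, 1, 5, 7, 6, 8]

CUTOFF_MD = 100  # hypothetical cutoff separating the two groups

low = [k for k in perm_md if k <= CUTOFF_MD]
high = [k for k in perm_md if k > CUTOFF_MD]

def summarize(values):
    """Summary statistics for one group of values."""
    return {
        "n": len(values),
        "arithmetic_mean": statistics.mean(values),
        "geometric_mean": statistics.geometric_mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),
        "median": statistics.median(values),
    }

low_stats = summarize(low)
high_stats = summarize(high)
print(low_stats)
print(high_stats)
```

With this cutoff the two groups have 36 and 4 values, and each group's arithmetic mean and median are much closer together than they are for the combined data set.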
Univariate Statistics
Outlier Handling (continued)
Transform Examples: Log10 and Square Root
Depth   Permeability (md)   Log     Square Root
2701    11                  1.041   3.317
2702    5                   0.699   2.236
2703    6                   0.778   2.449
2704    3                   0.477   1.732
2705    5                   0.699   2.236
2706    8                   0.903   2.828
2707    12                  1.079   3.464
2708    13                  1.114   3.606
2709    256                 2.408   16.000
2710    390                 2.591   19.748
2711    44                  1.643   6.633
2712    11                  1.041   3.317
2713    2                   0.301   1.414
2714    1                   0.000   1.000
2715    2                   0.301   1.414
2716    4                   0.602   2.000
2717    5                   0.699   2.236
2718    4                   0.602   2.000
2719    2                   0.301   1.414
2720    5                   0.699   2.236
2721    7                   0.845   2.646
2722    8                   0.903   2.828
2723    14                  1.146   3.742
2724    59                  1.771   7.681
2725    389                 2.590   19.723
2726    17                  1.230   4.123
2727    452                 2.655   21.260
2728    12                  1.079   3.464
2729    11                  1.041   3.317
2730    5                   0.699   2.236
2731    6                   0.778   2.449
2732    8                   0.903   2.828
2733    6                   0.778   2.449
2734    3                   0.477   1.732
2735    2                   0.301   1.414
2736    1                   0.000   1.000
2737    5                   0.699   2.236
2738    7                   0.845   2.646
2739    6                   0.778   2.449
2740    8                   0.903   2.828
Back-Transformed Average    9.12    19.00
Log column: =10**(Arithmetic Average)
Square Root column: =(Arithmetic Average)**2
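The two back-transformed averages can be reproduced with a short sketch: average the transformed values, then invert the transform. The variable names are hypothetical; the data are the 40 permeability values of this data set.

```python
import math
import statistics

perm_md = [11, 5, 6, 3, 5, 8, 12, 13, 256, 390, 44, 11, 2, 1, 2, 4, 5, 4, 2, 5,
           7, 8, 14, 59, 389, 17, 452, 12, 11, 5, 6, 8, 6, 3, 2, 1, 5, 7, 6, 8]

# Log10 transform: average the logs, then raise 10 to that average
mean_log10 = statistics.mean(math.log10(k) for k in perm_md)
back_from_log = 10 ** mean_log10

# Square root transform: average the roots, then square that average
mean_sqrt = statistics.mean(math.sqrt(k) for k in perm_md)
back_from_sqrt = mean_sqrt ** 2

print(round(back_from_log, 2), round(back_from_sqrt, 2))
```

Both back-transformed averages sit well below the arithmetic average of the raw values, because each transform reduces the influence of the four extreme values; the log transform reduces it most.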
Univariate Statistics
Normal Score Transform
Gaussian Anamorphosis
Essentially Involves Transforming Any
Distribution to Its Corresponding
Normal or Gaussian Distribution Using
Percentile Ranks. That is, First One
Ranks the Data and then Assigns to
Every i-th Observation the Value of
the i-th Percentile in a Standard
Normal Distribution.
Important as Many Algorithms Assume
a Gaussian Data Distribution
Often, this Transformation Is Done
Automatically.
Note that for Log-Normal Distributions
the Normal Score Transform Is Equivalent
to a Logarithmic Transform Followed by
Standardization to Zero Mean and Unit Variance.
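The rank-and-assign procedure described above can be sketched as follows. The percentile assigned to rank i is taken as (i - 0.5) / n, which is one common convention and an assumption here; tied values simply receive consecutive ranks in this sketch. The function name and sample values are hypothetical.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices, smallest first
    standard_normal = NormalDist()                     # mean 0, std dev 1
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        percentile = (rank - 0.5) / n                  # assumed convention
        scores[index] = standard_normal.inv_cdf(percentile)
    return scores

# Hypothetical skewed sample
sample = [10, 1, 100, 5, 50]
scores = normal_scores(sample)
print([round(s, 3) for s in scores])
```

The scores keep the ordering of the original values but are symmetric about zero, whatever the shape of the input distribution.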
[Figure from Olea, 1999]
Univariate Statistics
Types of Distributions
Parametric Distributions Can Be
Completely Described by a Few
Parameters such as Mean and
Variance
Normal or Gaussian
Porosity (Sometimes)
Saturation (Usually)
Log-Normal
Permeability (Sometimes)
Non-Parametric Distributions Can Not
Easily Be Described by Parameters.
Frequently the Result of a Mixture of
Populations.
Bi-Modal
Multi-Modal
Univariate Statistics
Theoretical Normal (Gaussian)
Distribution
Some Distributions Have a Concise
Mathematical Description
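A Normal or Gaussian Distribution is Completely Described by
$$ g(z) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{z - m}{\sigma} \right)^2} $$
where m is the mean and \sigma is the standard deviation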
Univariate Statistics
Normal Distributions
Example Histogram and CDF
CDF for a Normal (Gaussian)
Distribution Has an S Shape
[Figure: histogram (raw frequency) and CDF (cumulative frequency) versus porosity, 10 to 25. Mean = 17.1.]
Univariate Statistics
Log-Normal Distribution
Example Histogram and CDF
Note Skewed S Shape of CDF
[Figure: histogram (raw frequency) and CDF (cumulative frequency) versus porosity, 1 to 16. Mean = 4.2, Geometric Mean = 3.6.]
Univariate Statistics
Non-Parametric Distribution
Example Histogram and cdf for Bi-
Modal Distribution
Note Step Shape of CDF
[Figure: histogram (raw frequency) and CDF (cumulative frequency) versus porosity for a bi-modal distribution. Mean = 13.8.]
Univariate Statistics
Non-Parametric Distribution
Example
Example Data Set (25 Core
Porosity Measurements)
6, 6, 8, 5, 4, 5, 7, 7, 6, 6, 6, 6, 5, 7, 12,
14, 17, 15, 17, 18, 16, 14, 17, 15, 16
Summary Statistics
Mean                 10.2
Median               7
Mode                 6
Standard Deviation   5.02
Variance             25.25
Range                14
Minimum              4
Maximum              18
Sum                  255
Count                25
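These summary statistics can be checked with Python's standard `statistics` module, as a minimal sketch. The variance and standard deviation use the sample (n - 1) divisor, which is what reproduces the tabulated variance of 25.25; the dictionary keys are hypothetical names.

```python
import statistics

porosity = [6, 6, 8, 5, 4, 5, 7, 7, 6, 6, 6, 6, 5, 7, 12,
            14, 17, 15, 17, 18, 16, 14, 17, 15, 16]

summary = {
    "mean": statistics.mean(porosity),
    "median": statistics.median(porosity),
    "mode": statistics.mode(porosity),
    "std_dev": statistics.stdev(porosity),      # sample standard deviation
    "variance": statistics.variance(porosity),  # sample variance (n - 1)
    "range": max(porosity) - min(porosity),
    "minimum": min(porosity),
    "maximum": max(porosity),
    "sum": sum(porosity),
    "count": len(porosity),
}
for name, value in summary.items():
    print(name, value)
```

The mean of 10.2 falls in the gap between the two clusters of values (4 to 8 and 12 to 18), so for a bi-modal sample like this one it describes neither group well.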
Univariate Statistics
Non-Parametric Distribution
Example (continued)
Histogram and CDF
[Figure: histogram (frequency) and cdf (0% to 100%) versus porosity (%), 1 to 20, for the example data set.]
Univariate Statistics
Non-Parametric Distributions
(continued)
Separating a Bi-Modal or Multi-Modal
Mixture (for Example, Effective
Porosity in a Sand-Shale Sequence)
May Result in Individual Distributions
that are Normal or near-Normal
Indicator Transform Often Provides a
Convenient Way of Splitting Mixed
Populations
Univariate Statistics
Session Summary
Measures of Location and Spread
Mean - Measure of Central Value
(Arithmetic Vs Geometric)
Median - Another Measure of Central
Value
Variance and Standard Deviation -
Measure of Spread
Common Displays
Histogram and Probability Density
Function (pdf)
Cumulative Distribution Function (cdf)
Outlier Handling
Exclusion
Separate Population
Transform
Types of Distribution
Parametric (Normal, Log-Normal)
Non-parametric (Bi-Modal)
Univariate Statistics
Problem 1 - Calculate Arithmetic Mean,
Geometric Mean, Median, and Mode for
the Porosity and Permeability Values
Given Below
Univariate Statistics
Problem 1 - Solution
Arithmetic Mean
Sum All Measurements
Divide Sum by Total Number of
Measurements
Median Value
Order Measurements from Smallest to
Largest Value
Middle Value = Median
Mode is the Value (or Values) that
Occurs Most Frequently
Geometric Mean
Calculate log10 of Each Value
Sum All log10s
Divide Sum by Total Number of
Measurements
Exponentiate to Get Geometric Mean
Univariate Statistics
Learning Objectives
Review of Measures of Location and
Spread
Mean
Variance
Standard Deviation
Review of Univariate Plots
Histogram
Probability Density Function - pdf
Cumulative Distribution Function - cdf
Handling Outliers
Review of Types of Distributions
Parametric
Normal (Gaussian)
Log-Normal
Non-Parametric
Bivariate Statistics
Learning Objectives
Bivariate Data Display
Scatterplot or Crossplot
Bivariate Measures
Covariance
Correlation Coefficient
Rank Correlation Coefficient
Brief Review of Linear
Regression
Procedure
Example
Limitations
Bivariate Statistics
Given the Data Shown in the
Table Below, What is the
Relationship Between the
Gamma Ray Log Trace and
the Porosity Values? Is the
Line Shown a Good Fit?
MD (FEET)   GR      POROSITY
3260.00     87.71   0.097
3261.00     67.55   0.098
3262.00     42.21   0.090
3263.00     47.33   0.093
3264.00     48.48   0.079
3265.00     46.05   0.061
3266.00     40.49   0.064
3267.00     28.05   0.071
3268.00     15.98   0.075
3269.00     7.61    0.074
3270.00     11.15   0.065
3271.00     24.89   0.058
3272.00     49.39   0.056
3273.00     72.20   0.068
3274.00     79.02   0.105
3275.00     83.61   0.139
3276.00     91.02   0.144
3277.00     94.37   0.149
3278.00     85.52   0.144
[Figure: scatterplot of POROSITY (0 to 0.2) versus GR (0 to 100), with the line Y = 0.00087(X) + 0.045.]
Bivariate Statistics
Basic Question Is - What Is the Relationship Between Two Variables?
Scattergram or Cross Plot Used to Summarize Relationship Between Two Variables
Used to Examine Data on Linear, Semi-Log, or Log-Log Scales
[Figure: three example scatterplots, on linear, semi-log, and log-log axes.]
Bivariate Statistics
Bivariate Measures
The Covariance, Usually Denoted by C or \sigma_{XY}, is the Measure of Joint Variation of Two Variables, X and Y, About Their Respective Means, m_X and m_Y. That is,
$$ \sigma_{XY} = \frac{1}{n} \sum_{i=1}^{n} (x_i - m_X)(y_i - m_Y) $$
Bivariate Statistics
Bivariate Measures (continued)
The Correlation Coefficient Is Unitless and Varies Between -1 and +1. Values Near Zero Indicate No Significant Linear Correlation.
Note: Variance (s^2) of X and Y
[Figure: three example scatterplots, each on axes from 0 to 10.]
Bivariate Statistics
[Table: ranked data example. Total Correlation Coefficient = 0.94]
Bivariate Statistics
[Table: ranked data example. Total Correlation Coefficient = 0.73]
Bivariate Statistics
Linear Regression
Goal is to Summarize a Relationship
Between Two Variables such as
Porosity and Permeability with a
Linear Equation of the Form
y = a + bx
Bivariate Statistics
Linear Regression
(continued)
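The Coefficient of Determination, r^2, is Used to Estimate How Much of the Overall Variation in Y is Explained by the Linear Model. That is,
$$ r^2 = \frac{\text{Explained Variation of Y}}{\text{Total Variation of Y}} $$
Mathematically, r^2 is Given by
$$ r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - m_Y)^2} $$
where \hat{y}_i = a + b x_i is the value predicted by the linear model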
Bivariate Statistics
Linear Regression
(continued)
[Figure: sample plot of the line y = a + bx and the mean m_y, showing for a data point (x(1), y(1)) the deviations (y_1 - m_y), (y_1 - \hat{y}_1), and (\hat{y}_1 - m_y).]
Bivariate Statistics
Linear Regression Example
Depth   Porosity (%)   Permeability (darcy)   Permeability (md)
1001    16             0.288                  288
1002    11             0.276                  276
1003    10             0.124                  124
1004    9              0.076                  76
1005    7              0.050                  50
1006    5              0.020                  20
1007    1              0.001                  1
1008    2              0.005                  5
1009    7              0.090                  90
1010    4              0.010                  10
1011    8              0.033                  33
1012    13             0.222                  222
1013    19             0.459                  459
1014    20             0.513                  513
1015    29             0.887                  887
1016    17             0.345                  345
1017    17             0.411                  411
1018    15             0.250                  250
1019    6              0.022                  22
1020    2              0.006                  6

Statistic                  Porosity (%)   Permeability (darcy)   Permeability (md)
Number of Values           20             20                     20
Column Sum                 218.00         4.09                   4088.0
Arithmetic Average         10.90          0.20                   204.4
Geometric Average          8.19           0.07                   74.0
Median                     9.50           0.11                   107.0
Variance                   52.83          0.05                   53566.8
Standard Deviation         7.27           0.23                   231.4
Covariance Between Porosity (X) and Permeability (Y)   1.537    1536.990
Correlation Coefficient                   0.962                  0.962
b                                         0.029                  29.092
a                                         -0.113                 -112.706
r2                                        0.925                  0.925
[Figure: permeability (md, -200 to 900) versus porosity (%, 0 to 30) with the fitted line Y = 29.09X - 112.7.]
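The bivariate measures in this example can be recomputed from the 20 porosity and permeability (md) pairs with a short sketch. Two observations from doing so are hedged assumptions about how the table was built, not statements from the slides: the tabulated covariance matches the 1/n form, and the tabulated slope b matches that covariance divided by the n - 1 variance of porosity. A least-squares slope with consistent divisors comes out somewhat higher, about 30.6, with an intercept near -129. All variable names are hypothetical.

```python
import math
import statistics

porosity_pct = [16, 11, 10, 9, 7, 5, 1, 2, 7, 4, 8, 13, 19, 20, 29, 17, 17, 15, 6, 2]
perm_md = [288, 276, 124, 76, 50, 20, 1, 5, 90, 10, 33, 222, 459, 513, 887,
           345, 411, 250, 22, 6]

n = len(porosity_pct)
mx = statistics.mean(porosity_pct)
my = statistics.mean(perm_md)

# Sums of squared deviations and cross products about the means
sxx = sum((x - mx) ** 2 for x in porosity_pct)
syy = sum((y - my) ** 2 for y in perm_md)
sxy = sum((x - mx) * (y - my) for x, y in zip(porosity_pct, perm_md))

cov_xy = sxy / n                    # covariance, 1/n form
r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
r_squared = r ** 2                  # coefficient of determination

b_ls = sxy / sxx                    # least-squares slope
a_ls = my - b_ls * mx               # least-squares intercept

# Assumed reconstruction of the tabulated slope and intercept
b_tab = cov_xy / statistics.variance(porosity_pct)
a_tab = my - b_tab * mx

print(round(cov_xy, 3), round(r, 3), round(r_squared, 3))
print(round(b_ls, 3), round(a_ls, 3))
print(round(b_tab, 3), round(a_tab, 3))
```

The correlation coefficient and r^2 do not depend on the divisor, since it cancels, which is why they agree with the table under either convention.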
Bivariate Statistics
Linear Regression
(continued)
[Figure: average weight of linemen at Texas, with fitted line y = -1108 + 0.66x.]
Limitations
Linear Regression Only Useful
if Data Trend is Linear.
Extrapolation Using Best-Fit
Line Outside of Data Range is
Often Outright Wrong or
Misleading.
Examine Your Data Carefully. A
Subset of the Data May Have
a Useful Linear Trend Even if
the Entire Data Set Does Not.
Bivariate Statistics
Section Summary
Scatterplot Is Basic Bivariate Display
Relationship Between Variables
Extreme Values Easily Spotted
May Be Linear, Semi-Log, or Log-Log
Bivariate Measures
Covariance
Correlation Coefficient
Unit-Less
Varies Between -1 and +1
Rank Correlation Coefficient
A Rank Correlation Coefficient that Exceeds the
Correlation Coefficient Indicates a Non-Linear
Relationship
A Rank Correlation Coefficient Less than the
Correlation Coefficient Suggests the Presence of
Extreme Values
Linear Regression
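The comparison between the two coefficients in this summary can be tried with a minimal sketch. The rank correlation is computed here as the ordinary correlation coefficient of the ranks, with tied values given their average rank; that tie rule, the function names, and the sample data are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Ordinary (linear) correlation coefficient."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_correlation(x, y):
    """Correlation coefficient of the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear relationship: y = x cubed
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(pearson(x, y), 3), round(rank_correlation(x, y), 3))
```

For this monotonic but curved relationship the rank correlation is higher than the ordinary correlation coefficient, which is the pattern the summary associates with a non-linear relationship.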
Bivariate Statistics
Learning Objectives
Bivariate Data Display
Scatterplot or Crossplot
Bivariate Measures
Covariance
Correlation Coefficient
Rank Correlation Coefficient
Brief Review of Linear
Regression
Procedure
Example
Limitations