Chapter 4 Data Management


Introduction
Statistics – Branch of Science that deals
with the COLLECTION, ORGANIZATION,
PRESENTATION, ANALYSIS and
INTERPRETATION of data.
Major Areas:
• Descriptive Statistics – concerned with collecting and
describing a set of data so as to yield meaningful observations.
• Inferential Statistics – deals with the analysis of a subset
of data, leading to predictions or inferences about the entire
set of data.
Definition of Terms:
Population – the totality of the observations.
Sample – a subset of the population.
Population and Sample (example):
Population: All First Year students
(BS Accountancy, BS Nursing, BS Psychology, etc.)
Sample: All First Year BS Agriculture students
Experimental Unit – the individual or object on which a
variable is measured; the object treated in an experiment.
Variable – an attribute or characteristic of a person or object
that can assume different values or labels. The value of a
variable can “vary” from one entity to another.
Example of variables observed on one experimental unit:
• Height
• Number of Leaves
• Fruit Bearing (Yes or No)
• Flower Bearing (Yes or No)
• Diameter of Branch
• Species
Dependent Variable – a variable whose value depends on
another variable. It is the measured variable (the effect).
Independent Variable – the variable the experimenter changes
or controls and that is assumed to have a direct effect on the
dependent variable. It is the controlled variable (the cause).
Types of Variables
• Qualitative Variable – an attribute or characteristic whose
values are assigned categorically.
• Quantitative Variable – an attribute or characteristic whose
values are assigned numerically.
Types of Variables (example):
• Qualitative: Fruit Bearing (Yes or No), Flower Bearing
(Yes or No), Species
• Quantitative: Height, Number of Leaves, Circumference of
Branch
Types of Quantitative Variables
• Discrete – quantitative variables whose values are countable.
• Continuous – quantitative variables whose values are
measurable.
Types of Quantitative Variables (example):
• Discrete: Number of Leaves
• Continuous: Height, Diameter of Branch
A. Data Gathering, Organizing,
Representing and Interpreting
1. Data Gathering (Methods):
• Direct or Interview – a person-to-person encounter between
the interviewer and the interviewee. It can be done in person,
by phone, or over the internet.
• Indirect or Questionnaire – a questionnaire is used to elicit
information from respondents.
• Registration – data are obtained from the records of a
government agency authorized by law to keep such data or
information and to make them available to researchers.
• Observation – data are obtained by observing, mostly, the
behavior of an individual or group in a given situation.
• Experimental – data are gathered from the results of a
series of experiments on some variables. Mostly used in
scientific inquiries.
A. Data Gathering, Organizing,
Representing and Interpreting
2. Data Organization and Presentation
• Levels of Measurement:
  • Nominal Scale – assigns names, labels, or categories.
  • Ordinal Scale – assigns names or labels with order or
  ranking.
  • Interval Scale – the unit of measurement is arbitrary and
  there is no “true zero point”.
  • Ratio Scale – has a “true zero point”.
• Ways to present data:
  • Textual – words, sentences, and paragraphs.
  • Tabular – rows and columns; data are classified in an array.
  • Graphical – pictorial form: graphs, symbols, and visual aids.
• A tabular presentation should:
  • Be simple.
  • Focus on the data.
  • Make the meaning and significance of the information being
  presented clear.
• A statistical table should have the following parts:
  • Heading (Table Number, Title, and Headnote)
  • Box Head
  • Stub
  • Footnote
  • Source Note
• A good chart should be:
  • Accurate
  • Simple
  • Clear
  • Attractive
• Types of Graphs:
  • Line Graph
  • Bar Graph
  • Pie Graph
  • Pictograph or Pictogram
• Two ways of organizing collected numerical data:
  • Array
  • Frequency Distribution Table (FDT)
• Parts of a Frequency Distribution Table:
  • Classes
  • Class Frequency
  • Class Mark or Class Midpoint
  • Cumulative Frequency
  • Relative Frequency
• Frequency Distribution Table (column headings):
Class Interval | Frequency | Class Mark | Class Boundary | Relative Frequency | Relative Frequency (%) | <CF | >CF
• Frequency Distribution Table – example raw data:

70 80 75 82 72 83 81 81 75 85
96 94 82 71 85 82 75 85 76 86
87 88 88 75 78 77 91 92 90 79
87 74 79 77 86 89 74 84 83 82
• The same data arranged in an array (ascending order):
70 71 72 74 74 75 75 75 75 76
77 77 78 79 79 80 81 81 82 82
82 82 83 83 84 85 85 85 86 86
87 87 88 88 89 90 91 92 94 96
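As a rough illustration, the 40 values listed above can be organized into a frequency distribution table with a few lines of Python. This is only a sketch: the class width of 5 and the starting lower limit of 70 are assumptions (the slides do not state them), the half-unit class boundaries assume whole-number data, and `frequency_table` is a hypothetical helper name.

```python
# Raw data from the slide (40 values).
scores = [
    70, 80, 75, 82, 72, 83, 81, 81, 75, 85,
    96, 94, 82, 71, 85, 82, 75, 85, 76, 86,
    87, 88, 88, 75, 78, 77, 91, 92, 90, 79,
    87, 74, 79, 77, 86, 89, 74, 84, 83, 82,
]

CLASS_WIDTH = 5    # assumption: not specified in the slides
LOWEST_LIMIT = 70  # assumption: start at the smallest value


def frequency_table(data, lowest_limit, width):
    """Build one row per class: interval, frequency, class mark,
    class boundaries, relative frequency, <CF and >CF."""
    n = len(data)
    rows = []
    lower = lowest_limit
    cumulative = 0
    while lower <= max(data):
        upper = lower + width - 1
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq
        rows.append({
            "interval": (lower, upper),
            "frequency": freq,
            "class_mark": (lower + upper) / 2,
            # Half-unit boundaries assume whole-number data.
            "boundaries": (lower - 0.5, upper + 0.5),
            "relative": freq / n,
            "less_than_cf": cumulative,
            "greater_than_cf": n - cumulative + freq,
        })
        lower += width
    return rows


table = frequency_table(scores, LOWEST_LIMIT, CLASS_WIDTH)
for row in table:
    low, high = row["interval"]
    print(f"{low}-{high}  f={row['frequency']:2d}  "
          f"mark={row['class_mark']:.1f}  rf%={row['relative'] * 100:5.1f}  "
          f"<CF={row['less_than_cf']:2d}  >CF={row['greater_than_cf']:2d}")
```

A different class width or starting limit gives a different, equally valid table; the choice is a matter of presentation.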
3. Data Analysis and Interpretation
• Descriptive Statistics:
  • Measures of Central Tendency
  • Measures of Dispersion
  • Measures of Skewness and Kurtosis
• Inferential Statistics – techniques in which samples are
used to make generalizations about the populations from
which the samples were drawn.
• Inferential statistics arises from the fact that sampling
naturally incurs sampling error, so a sample is not expected
to represent the population perfectly.
• Methods of Inferential Statistics: Estimation of Parameters
and Hypothesis Testing.
B. Measures of Central Tendency
Measures of central tendency indicate the center of a set of
data arranged in order of magnitude. A measure of central
tendency is the point about which the scores tend to cluster
and is therefore regarded as a sort of average of the series.
It is a single number that describes the totality of the data
collected.
1. Mean or Arithmetic Mean (or Average) – the most popular
and well-known measure of central tendency. It can be used
with both discrete and continuous data.
Weighted Mean – a mean in which the weight of each value is
considered in the computation.
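A minimal sketch of the two computations, using made-up numbers: the scores, grades, and credit units below are hypothetical and are not taken from the slides.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Weighted mean: each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data for illustration only.
scores = [70, 80, 75, 82, 72]
grades = [85, 90, 78]  # grade earned in each subject
units = [3, 2, 5]      # weight (credit units) of each subject

print(mean(scores))
print(weighted_mean(grades, units))
```

When every weight is equal, the weighted mean reduces to the ordinary mean.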
Properties of the Mean:
1. The sum of the deviations from the mean is zero.
2. The sum of the squared deviations of the observations
from the mean is a minimum.
3. The mean reflects the magnitude of every observation,
since every observation contributes to its value.
4. The mean is easily affected by outliers.
5. The means of subgroups may be combined when properly
weighted; the combined mean is called the weighted mean.
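The properties above can be checked numerically on a small data set. The sketch below uses hypothetical values chosen so that the arithmetic is exact; `sum_sq_dev` is an illustrative helper name.

```python
def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from a chosen center."""
    return sum((x - center) ** 2 for x in values)


data = [2, 4, 6, 8, 10]  # hypothetical data
m = mean(data)

# Property 1: deviations from the mean sum to zero.
total_deviation = sum(x - m for x in data)

# Property 2: squared deviations are smallest when taken about the mean.
ss_at_mean = sum_sq_dev(data, m)
ss_elsewhere = [sum_sq_dev(data, c) for c in (5, 7)]

# Property 4: a single outlier pulls the mean strongly.
mean_with_outlier = mean([2, 4, 6, 8, 100])

# Property 5: subgroup means combine when weighted by subgroup size.
group_a, group_b = data[:3], data[3:]
combined = (len(group_a) * mean(group_a) + len(group_b) * mean(group_b)) / (
    len(group_a) + len(group_b)
)

print(total_deviation, ss_at_mean, ss_elsewhere, mean_with_outlier, combined)
```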
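2. Median – the middle score of a set of data arranged in
order of magnitude. The median is best used when the data
have several extreme entries.

Grouped –
\[ Md = L_B + \left[ \frac{\frac{n}{2} - cf_b}{f} \right] i \]
where L_B is the lower boundary of the median class, n is the
total frequency, cf_b is the cumulative frequency (<CF) of the
class before the median class, f is the frequency of the
median class, and i is the class width.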
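A sketch of both cases, assuming the usual textbook interpolation for grouped data: lower class boundary plus the fraction of the class needed to reach n/2, times the class width. The grouping used here (width 5, lower limits 70 to 95, half-unit boundaries) is an assumption for illustration, and `grouped_median` is a hypothetical helper.

```python
import statistics

# The 40 values from the frequency distribution example.
scores = [
    70, 80, 75, 82, 72, 83, 81, 81, 75, 85,
    96, 94, 82, 71, 85, 82, 75, 85, 76, 86,
    87, 88, 88, 75, 78, 77, 91, 92, 90, 79,
    87, 74, 79, 77, 86, 89, 74, 84, 83, 82,
]

# Ungrouped: middle value of the ordered data (mean of the two
# middle values when the count is even).
ungrouped = statistics.median(scores)


def grouped_median(lower_limits, frequencies, width):
    """Interpolated median from class lower limits and frequencies.
    Assumes whole-number data, so the boundary is limit - 0.5."""
    n = sum(frequencies)
    cumulative = 0  # <CF of the classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:  # this is the median class
            return (lower - 0.5) + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one observation")


# Assumed grouping of the same data into classes of width 5.
lower_limits = [70, 75, 80, 85, 90, 95]
frequencies = [5, 10, 10, 10, 4, 1]
grouped = grouped_median(lower_limits, frequencies, 5)

print(ungrouped, grouped)
```

The grouped value is an estimate, so in general it only approximates the median of the raw data.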
Properties of the Median:
1. Not affected by outliers.
2. The sum of the absolute deviations from the median is a
minimum.
3. Not amenable to further computation, and hence cannot be
combined in the same manner as the mean.
4. The median of grouped data can be calculated even with
open-ended intervals, provided the median class is not
open-ended.
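3. Mode – the most frequent score in the data set. Sometimes
considered the most popular option.

Grouped –
\[ Mo = L_B + \left[ \frac{d_1}{d_1 + d_2} \right] i \]
where L_B is the lower boundary of the modal class, d_1 is the
difference between the frequency of the modal class and that
of the class before it, d_2 is the difference between the
frequency of the modal class and that of the class after it,
and i is the class width.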
C. Measure of Dispersion
Measures of dispersion identify how a set of values spreads
or fluctuates.
• Range
• Variance
• Standard Deviation
• Coefficient of Variation
• Coefficient of Skewness
• Boxplot
Range – the difference between the highest and lowest scores.
Ungrouped – R = |Max – Min|
Grouped – RG = |ULHC – LLLC|, where ULHC is the upper limit of
the highest class and LLLC is the lower limit of the lowest
class.
Properties of the Range:
1. A quick but rough measure of dispersion.
2. The larger the value, the more dispersed the data.
3. Considers only the highest and lowest values.
Variance – takes into account the variation, or spread, of all
items in a series about the point of central tendency. It is
the mean of the squared deviations from the mean, and should
not be confused with the mean absolute deviation, which
averages the absolute deviations instead.
σ² – Population variance
s² – Sample variance
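Formula (Population):
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} \] (Computational Formula)
Formula (Ungrouped, Sample):
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
\[ s^2 = \frac{n \sum x^2 - (\sum x)^2}{n(n - 1)} \] (Computational Formula)
Formula (Grouped, Sample):
\[ s^2 = \frac{\sum f (x_m - \bar{x})^2}{n - 1} \]
where x_m is the class mark, f is the class frequency, and
n = \sum f.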
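To make the variance computations concrete, here is a small sketch comparing the definitional and computational forms with Python's `statistics` module. The data are hypothetical, the sample forms divide by n - 1 (the usual textbook convention, assumed here), and the function names are illustrative.

```python
import statistics


def population_variance(data):
    """Mean of squared deviations from the mean (divide by N)."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def sample_variance(data):
    """Definitional form for a sample (divide by n - 1)."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


def sample_variance_computational(data):
    """Computational form: needs only the sum of x and the sum of x squared."""
    n = len(data)
    return (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))


def grouped_sample_variance(class_marks, frequencies):
    """Sample variance estimated from class marks and class frequencies."""
    n = sum(frequencies)
    xbar = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    return sum(f * (x - xbar) ** 2
               for x, f in zip(class_marks, frequencies)) / (n - 1)


data = [2, 4, 6, 8, 10]  # hypothetical data

print(population_variance(data), statistics.pvariance(data))
print(sample_variance(data), sample_variance_computational(data),
      statistics.variance(data))
```

The computational form avoids computing each deviation, which is convenient by hand, while the definitional form shows more directly what is being measured.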
Properties of the Variance:
1. Always non-negative.
2. The larger the value, the more dispersed the data.
3. Can easily be manipulated.
4. Each observation contributes to the magnitude of the
variance.
5. Its unit is the square of the unit of the original data.
Standard Deviation – based on the deviations of all the scores
in the series and always computed from the mean. It is the
positive square root of the variance.

\[ \sigma = \sqrt{\sigma^2}, \qquad s = \sqrt{s^2} \]
Properties of the Standard Deviation:
It has the same properties as the variance except for the
unit of measure: the unit of the standard deviation is the
same as that of the original data.
Coefficient of Variation – also known as the relative
dispersion, it is the ratio of the standard deviation to the
mean and is usually expressed in percent.
\[ CV = \frac{\sigma}{\mu} \times 100\% \qquad CV = \frac{s}{\bar{x}} \times 100\% \]
It has no unit of measure.
The higher the value of CV, the more dispersed the data.
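Because the coefficient of variation has no unit, it can compare the spread of variables measured on different scales. A minimal sketch with hypothetical heights and weights, using the sample standard deviation:

```python
import math
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100


# Hypothetical measurements in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 55, 60, 65, 70]

# The standard deviation is the positive square root of the variance
# and carries the unit of the original data (cm here).
sd_heights = statistics.stdev(heights_cm)

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"SD of heights: {sd_heights:.2f} cm")
print(f"CV of heights: {cv_heights:.2f}%  CV of weights: {cv_weights:.2f}%")
```

In this made-up example the weights are relatively more dispersed than the heights, even though their standard deviation is smaller in absolute terms.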
Skewness – measures how asymmetric the distribution of the
data is about the mean.
If Mean = Median = Mode, SK is zero.
If Mean > Median > Mode, SK is positive.
If Mean < Median < Mode, SK is negative.
Kurtosis – the peakedness or flatness of the distribution.
Peaked – Leptokurtic. K > 3
Normal – Mesokurtic. K = 3
Flat – Platykurtic. K < 3
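The slides do not say which formulas for SK and K are intended. One common choice, assumed in the sketch below, is the moment coefficients: the third central moment divided by the 1.5 power of the second for skewness, and the fourth central moment divided by the square of the second for kurtosis, for which a normal distribution gives K = 3. The data and function names are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: mean of (x - mean) ** k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Moment coefficient of skewness (assumed definition of SK)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Moment coefficient of kurtosis (assumed definition of K)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


def kurtosis_label(k):
    """Classify using the thresholds from the slide.
    Exact equality with 3 is rare for real data."""
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"


symmetric = [2, 4, 6, 8, 10]     # hypothetical, symmetric about 6
right_skewed = [1, 2, 2, 3, 12]  # hypothetical, mean > median >= mode

print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric), kurtosis_label(kurtosis(symmetric)))
```

Other coefficients of skewness exist, and they need not agree in magnitude with the moment coefficient, only in the direction of the skew for typical data.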
D. Measure of Relative Position
Percentile – Divides the whole data set into 100
equal parts.
Decile – Divides the whole data set into 10 equal
parts.
Quartile – Divides the whole data set into 4 equal
parts.
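As an illustration, Python's `statistics.quantiles` returns the cut points that split a data set into n equal parts: 3 quartiles, 9 deciles, or 99 percentiles. Several interpolation conventions exist; the sketch below uses the function's default method, which may differ slightly from a hand computation that follows another rule. The data are the 40 values from the frequency distribution example.

```python
import statistics

scores = [
    70, 80, 75, 82, 72, 83, 81, 81, 75, 85,
    96, 94, 82, 71, 85, 82, 75, 85, 76, 86,
    87, 88, 88, 75, 78, 77, 91, 92, 90, 79,
    87, 74, 79, 77, 86, 89, 74, 84, 83, 82,
]

quartiles = statistics.quantiles(scores, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)       # D1 ... D9
percentiles = statistics.quantiles(scores, n=100)  # P1 ... P99

# Q2, D5 and P50 all mark the same point: the median.
print("Q1, Q2, Q3:", quartiles)
print("D5:", deciles[4], " P50:", percentiles[49])
```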
