

LESSON# 1

INTRODUCTION TO STATISTICS

STAT101/MATH 107 (Statistical Methods)


Department of Statistics
FC College University, Lahore

WHAT IS STATISTICS?

The mathematical science that deals with the collection, presentation, analysis and interpretation of data.

There are two main branches of statistics: descriptive statistics and inferential statistics.
[Diagram: Statistics at the center, linked to Collection of Data, Presentation of Data, Analysis of Data, and Interpretation of Data]

WHO USES STATISTICS?

• Statistical techniques are used extensively by marketing, accounting, quality control, consumers, professional sports people, hospital administrators, educators, politicians, physicians, etc.

BRANCHES OF STATISTICS

DESCRIPTIVE STATISTICS
involves organizing, summarizing, and displaying data.
e.g. tables, charts, averages

INFERENTIAL STATISTICS
involves using sample data to draw conclusions about a population.


• Descriptive statistics and inferential statistics are interrelated.

• If the purpose of the study is to examine and explore information for its own intrinsic interest only, the study is descriptive.

• However, if the information is obtained from a sample of a population and the purpose of the study is to use that information to draw conclusions about the population, the study is inferential.

• Thus, a descriptive study may be performed either on a sample or on a population. Only when an inference is made about the population, based on information obtained from the sample, does the study become inferential.

EXAMPLE 3:

The following table displays the voting results for the 1948 Presidential election.

Ticket                              Votes        Percentage
Truman-Barkley (Democratic)         24,179,345   49.7
Dewey-Warren (Republican)           21,991,291   45.2
Thurmond-Wright (States' Rights)     1,176,125    2.4
Wallace-Taylor (Progressive)         1,157,326    2.4
Thomas-Smith (Socialist)               139,572    0.3

This study is descriptive. It is a summary of the votes cast by U.S. voters in the 1948 Presidential election. No inferences are made.
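
The percentage column in Example 3 is itself a descriptive summary: each ticket's votes divided by the total. A minimal sketch of that arithmetic follows. It assumes the five listed tickets make up the whole total, which the slide does not state, and the function name `percentage_shares` is made up for illustration.

```python
# Vote counts copied from the table in Example 3.
votes = {
    "Truman-Barkley (Democratic)": 24_179_345,
    "Dewey-Warren (Republican)": 21_991_291,
    "Thurmond-Wright (States' Rights)": 1_176_125,
    "Wallace-Taylor (Progressive)": 1_157_326,
    "Thomas-Smith (Socialist)": 139_572,
}


def percentage_shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())  # assumption: only the listed tickets are counted
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = percentage_shares(votes)
for ticket, share in shares.items():
    print(f"{ticket}: {share}%")
```

Nothing here goes beyond the data at hand, which is what makes the summary descriptive rather than inferential.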
EXAMPLE 4

Data from a sample of Americans yielded the following estimates of average TV viewing per month for all Americans 2 years old and older. The times are in hours and minutes; Q1 stands for first quarter.

Viewing method                                        Q1 2011   Q1 2010   Change (%)
Watching TV in the home                               158:47    158:25      0.2
Watching timeshifted TV (DVR playback)                 10:46      9:36     12.2
Using the internet on a computer                       25:33     25:54     -1.4
Watching video on the internet                          4:33      3:23     34.5
Mobile subscribers watching video on a mobile phone     4:20      3:37     20.0

[SOURCE: The Cross-Platform Report, Quarter 1, 2011. Published by The Nielsen Company, © 2011.]

This study is inferential. Data from a sample of Americans, collected in the first quarters of 2010 and 2011, are used to make an inference about the population of all TV viewers.
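
The "Change (%)" column in Example 4 can be checked by converting each time to minutes and computing the relative change. The sketch below is a rough check only; the helper names `to_minutes` and `percent_change` are made up for illustration. The first four rows reproduce the table, while the last row comes out as 19.8 rather than the published 20.0, possibly because the published figure was computed from unrounded times.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' string such as '158:47' to total minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def percent_change(new, old):
    """Percentage change from old to new, rounded to one decimal place."""
    new_min, old_min = to_minutes(new), to_minutes(old)
    return round(100 * (new_min - old_min) / old_min, 1)


# (Q1 2011, Q1 2010) pairs copied from the table in Example 4.
viewing = {
    "Watching TV in the home": ("158:47", "158:25"),
    "Watching timeshifted TV": ("10:46", "9:36"),
    "Using the internet on a computer": ("25:33", "25:54"),
    "Watching video on the internet": ("4:33", "3:23"),
    "Mobile video": ("4:20", "3:37"),
}

changes = {
    method: percent_change(q1_2011, q1_2010)
    for method, (q1_2011, q1_2010) in viewing.items()
}
for method, change in changes.items():
    print(f"{method}: {change}%")
```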
POPULATION AND SAMPLE

• THE ENTIRE GROUP OF INDIVIDUALS TO BE STUDIED IS CALLED THE POPULATION. AN INDIVIDUAL IS A PERSON OR OBJECT THAT IS A MEMBER OF THE POPULATION BEING STUDIED.
• A SAMPLE IS A SUBSET OF THE POPULATION THAT IS BEING STUDIED.

EXAMPLE 5

(i) The coach wants to know which uniform the basketball team wants to wear, but he only asks the starting five.

Population             Sample
The basketball team    The starting five

(ii) A record store manager asks customers who make a purchase how many hours of music they listen to each day.

Population               Sample
Music store customers    Customers who make a purchase

Parameter and Statistic

• A parameter is a descriptive measure computed from an entire population of data.
• A statistic (or estimate) is a descriptive measure computed from a sample of data.
• Examples

Parameter:
  Proportion of all students who attended the last home football game.
  Mean SAT score of entering freshmen.

Statistic:
  Mean height of a sample of NBA basketball players.
  Mean number of pepperoni slices on a 12" pizza from a sample of a certain brand of pepperoni pizzas.

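
The difference between a parameter and a statistic can be seen by computing the same measure twice, once on a whole population and once on a sample drawn from it. The sketch below uses an invented population of 500 player heights; the numbers are hypothetical and only illustrate the idea.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 500 players, invented for illustration.
population = [random.gauss(200, 8) for _ in range(500)]

# Parameter: computed from every member of the population.
parameter = statistics.mean(population)

# Statistic: computed from a random sample of 30 members.
sample = random.sample(population, 30)
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.1f} cm")
print(f"statistic (sample mean):     {statistic:.1f} cm")
```

The two values are usually close but not identical; the statistic changes from sample to sample, while the parameter is fixed.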
PRACTICE QUESTION 1

A scientist takes a big bucket of water from a lake and counts how many species of bacteria, bugs, and other creepy crawlies he finds in the bucket.

Identify the population, the sample, the parameter, and the estimate in this situation.

Population: all the species that live in the lake.
Sample: the species that are in the bucket.
The parameter is the number of species in the lake.
The estimate is the number of species found in the bucket.

Practice Question II

A school takes a poll to find out what students want to eat at lunch. 70 students are randomly chosen to answer the poll questions.

What are the population, the sample, the parameter, and the estimate of this study?

The population is all the students at the school, and the parameter is the lunch preferences of the whole school.

The sample is the 70 students polled, and their responses to the poll are the estimate.
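
The poll in Practice Question II can be simulated to show how the estimate relates to the parameter. This is only a sketch: the school size of 1,000, the three menu items, and the preferences themselves are invented for illustration.

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical school of 1,000 students, each with one preferred lunch (invented data).
menu = ["pizza", "salad", "sandwich"]
school = [random.choice(menu) for _ in range(1000)]


def proportions(responses):
    """Share of responses for each menu item."""
    counts = Counter(responses)
    return {item: counts[item] / len(responses) for item in menu}


parameter = proportions(school)      # preferences of the whole school
polled = random.sample(school, 70)   # the 70 randomly chosen students
estimate = proportions(polled)       # their responses

for item in menu:
    print(f"{item}: parameter {parameter[item]:.2f}, estimate {estimate[item]:.2f}")
```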

• The collection of information from the elements of a population or a sample is called a survey.

• A survey that includes every member of the population is called a census.

• The technique of collecting information from a portion of the population is called a sample survey.

• As an example, if we collect information on the 2009 incomes of all families in Connecticut, it will be referred to as a census. On the other hand, if we collect information on the 2009 incomes of 50 families from Connecticut, it will be called a sample survey.

Experimental Unit (Sampling Unit)

The experimental unit for a study refers to the object under study.

The first step in detailing the data collection protocol is to define the experimental unit. An experimental or sampling unit is the person or object that will be studied by the researcher.

This is the smallest unit of analysis in the experiment from which data will be collected.

For example, depending on the objectives, the experimental or sampling unit can be individual persons, students in a classroom, the classroom itself, an animal or a litter of animals, patients from a doctor's office, houses, etc.

EXAMPLE 6

Engineers with the University of Kentucky Transportation Research Program have collected data on accidents occurring at intersections in Lexington, Kentucky. One of the goals of the study was to estimate the rate at which left-turn accidents occur at intersections without left-turn-only lanes. This estimate will be used to develop numerical warrants (or guidelines) for the installation of left-turn lanes at all major Lexington intersections. The engineers collected data at each of 50 intersections without left-turn-only lanes over a 1-year period. At each intersection, they monitored traffic and recorded the total number of cars turning left that were involved in an accident.

Answer the following questions:

a) Identify the variable and experimental unit for this study.
Ans: Since the engineers collected data at each of 50 intersections, the experimental unit is an intersection without a left-turn-only lane. The variable measured is the total number of cars turning left that were involved in an accident.

b) Describe the target population and the sample.
Ans: The goal of the study is to develop guidelines for the installation of left-turn lanes at all major Lexington intersections; consequently, the target population consists of all major intersections in the city. The sample consists of the subset of 50 intersections monitored by the engineers.

c) What inference do the transportation engineers want to make?
Ans: The engineers will use the sample data to estimate the rate at which left-turn accidents occur at all major Lexington intersections. (We learn, in Chapter 7, that this estimate is the number of left-turn accidents in the sample divided by the total number of cars making left turns in the sample.)
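
The estimate described in answer (c) is a simple ratio, and a short sketch makes the arithmetic concrete. The three records below are invented numbers, not data from the Kentucky study, and `left_turn_accident_rate` is a hypothetical helper name.

```python
# Hypothetical records for three intersections (invented numbers, not study data):
# each pair is (left-turn accidents, total cars making left turns) over the year.
records = [(3, 12_000), (1, 8_500), (4, 15_500)]


def left_turn_accident_rate(data):
    """Total left-turn accidents divided by total left-turning cars in the sample."""
    accidents = sum(a for a, _ in data)
    left_turns = sum(t for _, t in data)
    return accidents / left_turns


rate = left_turn_accident_rate(records)
print(f"Estimated rate: {rate * 10_000:.2f} accidents per 10,000 left turns")
```

The rate pools all intersections in the sample before dividing, as the answer describes, rather than averaging a separate rate for each intersection.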
A COLLECTION OF FACTS FROM WHICH CONCLUSIONS MAY BE DRAWN IS REFERRED TO AS DATA.

Data: a collection of facts such as numbers, words, measurements, observations, or just descriptions of things.

Variable: A measurable quantity which can vary from one individual or object to another is called a variable.
Examples: height, age, number of siblings, marital status, eye color, etc.

Constant: A quantity which can assume only one value is called a constant.
Examples of constants are π = 3.14159, e = 2.71828, etc.

Quantitative Variable: A quantitative variable is one which can assume a numerical value, for example, balance in your checking account, minutes remaining in class, number of children in a family, height of a plant, weight of grains, number of students in a class, etc. Quantitative variables can further be placed into two types, depending upon the type of measurement possible: discrete variables and continuous variables.

Qualitative Variable: A qualitative variable, also known as a categorical variable, is one which is not capable of taking numerical measurements. For example, gender, religious affiliation, type of automobile owned, state of birth, eye color, general knowledge (poor, moderate, good), etc.

a) Continuous variable: A continuous variable is one that can take all possible values in an interval on the number line. For example, the pressure in a tire, the weight of a pork chop, the height of students in a class, atmospheric pressure, plant height, or temperature.

b) Discrete variable: A discrete variable, also known as a discontinuous variable, can only assume certain values, and there are usually "gaps" between values. EXAMPLE: the number of bedrooms in a house, the number of hammers sold at the local Home Depot (1, 2, 3, ..., etc.), number of students in a class, number of family members in a house, number of plants in a row, etc.

Measurement Scales:

The four scales of measurement are:

I. Nominal scale;
II. Ordinal scale;
III. Interval scale;
IV. Ratio scale

Nominal: unordered categories. This includes measurements of categories such as gender, religion, sport, etc.

Ordinal: ordered categories. This includes measurements of ordered categories such as size, behavior, etc.

Interval Scale: like the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.

EXAMPLE: Temperature on the Fahrenheit scale.

There is no zero point for IQ. We do not think of a person as having no intelligence. Here's the problem with interval scales: they don't have a "true zero." For example, 0° on the Fahrenheit scale does not mean "no temperature."

Ratio Scale: the interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.

EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturer's representatives per month.
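
One way to see why ratios are not meaningful on an interval scale is to change the unit and watch the ratio change. The sketch below compares two temperatures in Fahrenheit and Celsius, then does the same for an income in dollars and cents. The specific temperatures and incomes are made up for illustration.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion between the two temperature scales."""
    return (f - 32) * 5 / 9


# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
low_f, high_f = 40.0, 80.0
low_c, high_c = fahrenheit_to_celsius(low_f), fahrenheit_to_celsius(high_f)
ratio_f = high_f / low_f  # 2.0 in Fahrenheit
ratio_c = high_c / low_c  # about 6 in Celsius, for the same two temperatures

# Ratio scale: zero means "none", so the ratio survives a change of unit.
income_a, income_b = 2_000.0, 4_000.0  # hypothetical monthly incomes in dollars
ratio_dollars = income_b / income_a
ratio_cents = (income_b * 100) / (income_a * 100)

print(f"temperature ratio: {ratio_f:.1f} (F) vs {ratio_c:.1f} (C)")
print(f"income ratio: {ratio_dollars:.1f} (dollars) vs {ratio_cents:.1f} (cents)")
```

Saying that 80°F is "twice as hot" as 40°F is therefore not meaningful, while saying that one income is twice another is.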

Variable

Qualitative (categorical)
  Nominal: unordered categories, e.g. male/female, smoker/non-smoker, alive/dead
  Ordinal: ordered categories, e.g. severity of a disease (mild, moderate and severe), exercise intensity (low, moderate and high)

Quantitative (numeric)
  Discrete: whole numerical values, typically counts, e.g. number of visits to the dentist, the number of ants in ant colonies
  Continuous: can take any value within a range, e.g. height in cm, pocket depth in mm
Data can be classified as grouped or ungrouped.

• Ungrouped data (or raw data) are data that are not organized or, if arranged, are only ordered from highest to lowest or lowest to highest.

• Grouped data are data that are organized and arranged into different classes or categories.
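
Turning ungrouped data into grouped data amounts to counting how many raw values fall in each class. The sketch below uses twelve invented test scores and a class width of 10; both are assumptions made only for illustration, and `group_into_classes` is a hypothetical helper name.

```python
from collections import Counter

# Ungrouped (raw) data: hypothetical test scores, invented for illustration.
raw_scores = [52, 67, 71, 85, 90, 58, 64, 77, 73, 88, 95, 61]
class_width = 10  # assumed class width

# Raw data can at most be arranged from lowest to highest.
ordered = sorted(raw_scores)


def group_into_classes(data, width):
    """Count how many values fall in each class [start, start + width)."""
    counts = Counter((x // width) * width for x in data)
    return {start: counts[start] for start in sorted(counts)}


grouped = group_into_classes(raw_scores, class_width)
for start, frequency in grouped.items():
    print(f"{start}-{start + class_width - 1}: {frequency}")
```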

Source of Data

Primary source of data

Secondary source of data

Source of Data Collection:

PRIMARY DATA
• These are the data that are collected for the first time by an investigator for a specific purpose.
• Primary data are 'pure' in the sense that no statistical operations have been performed on them and they are original.
• An example of primary data is the Census of Pakistan.

SECONDARY DATA
• These are the data that are sourced from someplace that has originally collected them.
• This means that this kind of data has already been collected by some researchers or investigators in the past and is available either in published or unpublished form.
• This information is impure, as statistical operations may have been performed on it already.
• An example is information available on the websites of the Government of Pakistan or the Department of Finance, or in other repositories such as books, journals, etc.
METHODS FOR COLLECTION OF
PRIMARY DATA

1. Direct Personal Investigation

2. Indirect Personal Investigation

3. Questionnaire Method

4. Investigation through Enumerators

5. Registrations
1. DIRECT PERSONAL INVESTIGATION

• IN THIS METHOD, THE INVESTIGATOR INTERVIEWS THE PERSONS CONCERNED OR OBSERVES FACTS PERSONALLY.

• THE INVESTIGATOR MAY GO TO LIVE WITH THE PEOPLE, MIX WITH THEM FREELY AND GATHER THE FACTS.

• THE INFORMATION COLLECTED IN THIS WAY IS QUITE ACCURATE.

• THIS METHOD IS SLOW AND EXPENSIVE.

• IT IS SUITABLE ONLY IN LABORATORY EXPERIMENTS OR LOCALIZED INQUIRIES.

2. INDIRECT PERSONAL INVESTIGATION

• SOMETIMES, IT IS KNOWN THAT THE RESPONDENTS WOULD NOT DISCLOSE THE INFORMATION AT ALL OR WOULD INTENTIONALLY PROVIDE FALSE INFORMATION.

• FOR EXAMPLE, GOVERNMENT SERVANTS DO NOT DISCLOSE THEIR INCOME FROM PART-TIME WORK AND BUSINESSMEN SELDOM DISCLOSE THEIR TRUE INCOMES TO THE INCOME TAX AUTHORITIES.

• THIS METHOD IS USED WHEN THE INFORMATION TO BE COLLECTED IS COMPLEX OR THE RESPONDENTS ARE RELUCTANT TO DISCLOSE THE TRUE FACTS.


3. QUESTIONNAIRE METHOD

• THIS METHOD IS USED TO COLLECT INFORMATION FROM LITERATE PEOPLE.

• A QUESTIONNAIRE IS AN INSTRUMENT FOR RESEARCH, WHICH CONSISTS OF A LIST OF QUESTIONS, ALONG WITH THE CHOICE OF ANSWERS, PRINTED OR TYPED IN A SEQUENCE ON A FORM USED FOR ACQUIRING SPECIFIC INFORMATION FROM THE RESPONDENTS.

• IN GENERAL, QUESTIONNAIRES ARE DELIVERED TO THE PERSONS CONCERNED EITHER BY POST OR MAIL, REQUESTING THEM TO ANSWER THE QUESTIONS AND RETURN THEM.

• INFORMANTS ARE EXPECTED TO READ AND UNDERSTAND THE QUESTIONS AND REPLY IN THE SPACE PROVIDED IN THE QUESTIONNAIRE ITSELF.

• THE QUESTIONNAIRE IS PREPARED IN SUCH A WAY THAT IT TRANSLATES THE REQUIRED INFORMATION INTO A SERIES OF QUESTIONS THAT INFORMANTS CAN AND WILL ANSWER.


4. INVESTIGATION THROUGH ENUMERATORS

• THIS METHOD IS AN ALTERNATIVE WAY TO COLLECT PRIMARY DATA FROM RURAL AREAS.

• A NUMBER OF ENUMERATORS ARE SELECTED AND TRAINED. THEY ARE PROVIDED WITH A STANDARDIZED QUESTIONNAIRE.

• THESE ENUMERATORS GO TO THE RESPONDENTS ALONG WITH THE QUESTIONNAIRE AND

5. REGISTRATION

• IN THIS METHOD, INFORMATION IS REPORTED TO THE APPROPRIATE AUTHORITY WHEN OR SHORTLY AFTER AN EVENT OCCURS.

• FOR EXAMPLE, BIRTHS AND DEATHS ARE REGISTERED WITH THE MUNICIPAL COMMITTEE OR CORPORATION IN URBAN AREAS AND THE UNION COUNCIL IN RURAL AREAS.


METHODS OF COLLECTION OF SECONDARY DATA

1. OFFICIAL SOURCES, E.G. PUBLICATIONS OF FEDERAL BUREAU OF STATISTICS, MINISTRIES OF AGRICULTURE, FINANCE, COMMUNICATIONS AND RAILWAYS, PROVINCIAL BUREAUS OF STATISTICS AND PROVINCIAL DEPARTMENTS OF AGRICULTURE, HEALTH AND EDUCATION

2. SEMI-OFFICIAL SOURCES, E.G. PUBLICATIONS OF STATE BANK OF PAKISTAN, CENTRAL COTTON COMMITTEE, ECONOMIC RESEARCH INSTITUTES, DISTRICT COUNCIL, MUNICIPAL COMMITTEE, WAPDA, ETC.

3. PRIVATE SOURCES, E.G. PUBLICATIONS OF TRADE ASSOCIATIONS, CHAMBERS OF COMMERCE AND INDUSTRY, MARKET COMMITTEES, ETC.
