Null and Alternative Hypothesis Null Hypothesis Definition


NULL AND ALTERNATIVE HYPOTHESIS

Null hypothesis definition


The null hypothesis is a general statement that there is no
relationship between two phenomena under consideration, or
that there is no association between two groups.
 A hypothesis, in general, is an assumption that is yet to be
supported by sufficient evidence. A null hypothesis is thus
the hypothesis a researcher is trying to disprove.
 A null hypothesis is a hypothesis capable of being
objectively verified, tested, and even rejected.
 If a study is to compare method A with method B about
their relationship, and if the study is preceded on the
assumption that both methods are equally good, then this
assumption is termed as the null hypothesis.
 The null hypothesis should always be a specific hypothesis,
i.e., it should state an exact value rather than "about" or
"approximately" a certain value.
Null hypothesis symbol
 The symbol for the null hypothesis is H0, and it is read as H-
null, H-zero, or H-naught.
 The null hypothesis is usually written with just the ‘equals
to’ sign, because it states a single specific value or no
difference, which a test can then either reject or fail to reject.
Null hypothesis purpose
 The main purpose of a null hypothesis is to verify or disprove
the proposed statistical assumptions.
 Some scientific null hypotheses help to advance a theory.
 The null hypothesis is also used to verify the consistency of
results across multiple experiments. For example, a null
hypothesis stating that there is no relation between some
medication and the age of the patients, if not rejected, supports
the conclusion that the medication is generally effective and
allows recommendations.
Null hypothesis principle
 The principle of the null hypothesis is to collect data from a
random sample and determine how likely such data would be
if the null hypothesis were true.
 In studies where the collected data is inconsistent with what
the null hypothesis predicts, it is concluded that the data
doesn’t provide sufficient or reliable evidence to support
the null hypothesis, and thus it is rejected.
 The data collected is tested through some statistical tool
which is designed to measure the extent of departure of the
data from the null hypothesis.
 The procedure decides whether the observed departure
obtained from the statistical tool is larger than a defined
value so that the probability of occurrence of a high
departure value is very small under the null hypothesis.
 However, some data might not contradict the null
hypothesis which explains that only a weak conclusion can
be made and that the data doesn’t provide strong pieces of
evidence against the null hypothesis and the null hypothesis
might or might not be true.
 Under some other conditions, if the data collected is
sufficient and shows no departure from the null hypothesis,
the null hypothesis is retained as a working assumption that
there is no relationship between the phenomena, although it
is still not proven.
When to reject null hypothesis?
 When the p-value of the data is less than the significance
level of the test, the null hypothesis is rejected, indicating
the test results are significant.
 However, if the p-value is higher than the significance level,
the null hypothesis is not rejected, and the results are
considered not significant.
 The level of significance is an important concept while
hypothesis testing as it determines the percentage risk of
rejecting the null hypothesis when H0 might happen to be
true.
 In other words, if we take the level of significance at 5%, it
means that the researcher is willing to take as much as a 5
percent risk of rejecting the null hypothesis when it (H0)
happens to be true.
 The null hypothesis cannot be accepted because the lack of
evidence only means that the relationship is not proven. It
doesn’t prove that something doesn’t exist, but it just
means that there are not enough shreds of evidence and
the study might have missed it.
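The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the default level of 0.05 are assumptions made for the example, not part of the original notes.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < alpha:
        return "reject H0"
    # A large p-value is not proof of H0, so the wording is "fail to reject".
    return "fail to reject H0"


print(decide(0.03))              # reject H0
print(decide(0.20))              # fail to reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0
```

Note that the second outcome is worded as "fail to reject" rather than "accept", in line with the last point above.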
Null hypothesis examples
The following are some examples of null hypothesis:
1. If the hypothesis is that “the consumption of a particular
medicine reduces the chances of cardiac arrest”, the null
hypothesis will be “the consumption of the medicine
doesn’t reduce the chances of cardiac arrest.”
2. If the research question is, “If random test scores are collected
from men and women, does the score of one group differ
from the other?”, a possible null hypothesis will be that the
mean test score of men is the same as that of the women.
H0: µ1= µ2
H0= null hypothesis
µ1= mean score of men
µ2= mean score of women
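As a rough sketch of how H0: µ1 = µ2 could be tested, the example below computes a two-sample z statistic and a two-sided p-value using only the Python standard library. The scores are invented for illustration, the function name `two_sample_z_test` is hypothetical, and the normal approximation is an assumption; with samples this small a t-test would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(sample_1, sample_2):
    """Return (z, two-sided p-value) for H0: mu1 = mu2 using a normal approximation."""
    n1, n2 = len(sample_1), len(sample_2)
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(variance(sample_1) / n1 + variance(sample_2) / n2)
    z = (mean(sample_1) - mean(sample_2)) / standard_error
    # Two-sided p-value: probability of a departure at least this large in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical test scores, invented for illustration only.
men = [72, 65, 80, 77, 68, 74, 71, 69, 75, 73]
women = [74, 70, 79, 76, 72, 75, 78, 71, 77, 73]

z, p_value = two_sample_z_test(men, women)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The p-value is then compared with the chosen level of significance to decide whether H0 is rejected.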
Alternative hypothesis definition
An alternative hypothesis is a statement that describes that
there is a relationship between two selected variables in a
study.
 An alternative hypothesis is usually used to state that a new
theory is preferable to the old one (null hypothesis).
 This hypothesis can be simply termed as an alternative to
the null hypothesis.
 The alternative hypothesis is the hypothesis that the
researcher seeks to support. It indicates that the results of a
study are significant and that the sample observations result
not just from chance but from some non-random cause.
 If a study is to compare method A with method B about
their relationship and we assume that the method A is
superior or the method B is inferior, then such a statement
is termed as an alternative hypothesis.
 Alternative hypotheses should be clearly stated, considering
the nature of the research problem.
Alternative hypothesis symbol
 The symbol of the alternative hypothesis is either H1 or
Ha while using less than, greater than or not equal signs.
Alternative hypothesis purpose
 An alternative hypothesis provides the researchers with
some specific restatements and clarifications of the research
problem.
 An alternative hypothesis provides a direction to the study,
which then can be utilized by the researcher to obtain the
desired results.
 Since the alternative hypothesis is selected before
conducting the study, it allows the test to prove that the
study is supported by evidence, separating it from the
researchers’ desires and values.
 An alternative hypothesis provides a chance of discovering
new theories that can disprove an existing one that might
not be supported by evidence.
 The alternative hypothesis is important because support for it
indicates that a relationship exists between the two variables
selected and that the results of the study conducted are
relevant and significant.
Alternative hypothesis principle
 The principle behind the alternative hypothesis is similar to
that of the null hypothesis.
 The alternative hypothesis is based on the concept that
when sufficient evidence is collected from the data of
random sample, it provides a basis for proving the
assumption made by the researcher regarding the study.
 Like in the null hypothesis, the data collected from a
random sample is passed through a statistical tool that
measures the extent of departure of the data from the null
hypothesis.
 If the departure is large enough that it would be unlikely
under the null hypothesis at the selected level of
significance, the null hypothesis is rejected and the
alternative hypothesis is accepted.
 If the data collected are unlikely to have arisen from random
sampling alone and are instead explained by a relationship
between the variables of the study, the alternative
hypothesis is supported.
Alternative hypothesis examples
The following are some examples of alternative hypothesis:
1. If a researcher is assuming that the bearing capacity of a
bridge is more than 10 tons, then the hypothesis under this
study will be:
Null hypothesis H0: µ= 10 tons
Alternative hypothesis Ha: µ>10 tons
2. Under another study that is trying to test whether a medicine
is effective against cardiac arrest, the alternative hypothesis
will be that there is a relationship between the medicine and
chances of cardiac arrest.
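The first example, with H0: µ = 10 tons and Ha: µ > 10 tons, is a one-sided test. A minimal sketch follows, assuming a normal approximation and made-up load measurements; the function name `one_sided_z_test` and the data are hypothetical and serve only to show why just the upper tail is used.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sided_z_test(sample, mu_0):
    """Return (z, p-value) for H0: mu = mu_0 against Ha: mu > mu_0."""
    n = len(sample)
    z = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
    # Only the upper tail counts, because Ha uses the "greater than" sign.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical bearing-capacity measurements in tons (invented for illustration).
capacities = [10.4, 10.9, 10.1, 11.2, 10.6, 10.8, 10.3, 10.7]

z, p_value = one_sided_z_test(capacities, mu_0=10)
print(f"z = {z:.2f}, p = {p_value:.6f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Had Ha used the "not equal" sign instead, both tails would be counted and the p-value would double.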

Null hypothesis vs Alternative hypothesis


Basis of comparison | Null hypothesis | Alternative hypothesis
Definition | The null hypothesis is a general statement that states that there is no relationship between two phenomena under consideration or that there is no association between two groups. | An alternative hypothesis is a statement that describes that there is a relationship between two selected variables in a study.
Symbol | It is denoted by H0. | It is denoted by H1 or Ha.
Mathematical expression | It is followed by ‘equals to’ sign. | It is followed by […] ‘less than’ or ‘greater than’ sign.
Observation | The null hypothesis believes that the results are observed as a result of chance. | The alternative hypothesis believes that the results are observed as a result of real causes.
Nature | It is the hypothesis that the researcher tries to disprove. | It is a hypothesis that the researcher tries to prove.
Result | The result of the null hypothesis indicates no changes in opinions or actions. | The result of an alternative hypothesis causes changes in opinions and actions.
Significance of data | If the null hypothesis is accepted, the results of the study become insignificant. | If an alternative hypothesis is accepted, the results of the study become significant.
Acceptance | If the p-value is greater than the level of significance, the null hypothesis is accepted. | If the p-value is smaller than the level of significance, the alternative hypothesis is accepted.
Importance | The null hypothesis allows the acceptance of correct existing theories and the consistency of multiple experiments. | Alternative hypothesis is important as it […] relationship between […] variables, resulting in […] improved theories.

TYPE 1 &2 ERROR

Type 1 error definition


 Type 1 error, in statistical hypothesis testing, is the error
caused by rejecting a null hypothesis when it is true.
 Type 1 error is caused when the hypothesis that should
have been accepted is rejected.
 Type I error is denoted by α (alpha), known as the alpha
error, also called the level of significance of the test.
 This type of error is a false positive error, where a true null
hypothesis is rejected because of chance variation in the
sample.
 The null hypothesis is set to state that there is no
relationship between two variables and the cause-effect
relationship between two variables, if present, is caused by
chance.
 Type 1 error occurs when the null hypothesis is rejected
even when there is no relationship between the variables.
 As a result of this error, the researcher might end up
believing that the hypothesis works even when it doesn’t.

Type 1 error causes


 Type 1 error is caused when something other than the
variable under study, such as chance variation, affects the
outcome variable, which results in an outcome that supports
the rejection of the null hypothesis.
 Under such conditions, the outcome appears to have
happened due to some cause other than chance, when in fact
it is caused by chance.
 Before a hypothesis is tested, a probability is set as a level
of significance which means that the hypothesis is being
tested while taking a chance where the null hypothesis is
rejected even when it is true.
 Thus, type 1 error might be due to chance/ level of
significance set before the test without considering the test
duration and sample size.
Probability of type 1 error
 The probability of Type I error is usually determined in
advance and is understood as the level of significance of
testing the hypothesis.
 If Type I error is fixed at 5 percent, it means that there are
about 5 chances in 100 that the null hypothesis, H0, will be
rejected when it is true.
 The rate or probability of type 1 error is symbolized by α
and is also termed the level of significance in a test.
 It is possible to reduce type 1 error at a fixed size of the
sample; however, while doing so, the probability of type II
error increases.
 There is a trade-off between the two errors where
decreasing the probability of one error increases the
probability of another. At a fixed sample size, it is not possible
to reduce both errors simultaneously; only a larger sample
can lower both.
 Thus, depending on the type and nature of the test, the
researchers need to decide the appropriate level of type 1
error after evaluating the consequences of the errors.
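The claim that a 5 percent level of significance means about 5 false rejections in 100 can be checked with a small simulation. The sketch below is an illustration under stated assumptions: data are drawn from a normal population with mean 0 and known standard deviation 1, so H0 is true by construction, and a two-sided z-test is applied. The function name, sample size, number of trials and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def simulate_type_1_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known standard deviation 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))
        if abs(z) > critical:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials


rate = simulate_type_1_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, and lowering `alpha` lowers it accordingly.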
Type 1 error examples
 For this, let us take a hypothesis where a player is trying to
find the relationship between him wearing new shoes and
the number of wins for his team.
 Here, if the number of wins for his team when he
was wearing his new shoes is more than the number of wins
for his team otherwise, he might accept the alternative
hypothesis and determine that there is a relationship.
 However, the winning of his team might be influenced by
just chance rather than his shoes which results in type 1
error.
 In this case, he should not have rejected the null hypothesis
because the winning of a team might happen due to chance
or luck.

Image Source: AB Tasty.


Type II error definition
 Type II error is the error that occurs when the null
hypothesis is accepted when it is not true.
 In simple words, Type II error means accepting the
hypothesis when it should not have been accepted.
 The type II error results in a false negative result.
 In other words, type II is the error of failing to accept a true
alternative hypothesis, typically because the test doesn’t have
adequate power.
 The Type II error is denoted by β (beta) and is also termed
as the beta error.
 The null hypothesis is set to state that there is no
relationship between two variables and the cause-effect
relationship between two variables, if present, is caused by
chance.
 Type II error occurs when the null hypothesis is accepted,
on the assumption that the observed relationship between the
variables is because of chance or luck, even when there is a
real relationship between the variables.
 As a result of this error, the researcher might end up
believing that the hypothesis doesn’t work even when it
should.
Type II error causes
 The primary cause of type II error is the
low power of the statistical test.
 When the statistical test is not powerful enough, it fails to
detect a real effect and thus results in a Type II error.
 Other factors, like the sample size, might also affect the
results of the test.
 When small sample size is selected, the relationship
between two variables being tested might not be significant
even when it does exist.
 The researcher might assume the relationship is due to
chance and thus reject the alternative hypothesis even when
it is true.
 Therefore, it is important to select an appropriate size of the
sample before beginning the test.
Probability of type II error
 The probability of committing a Type II error is calculated by
subtracting the power of the test from 1.
 If Type II error is fixed at 2 percent, it means that there are
about two chances in 100 that the null hypothesis, H0, will be
accepted when it is not true.
 The rate or probability of type II error is symbolized by β
and is also termed as the error of the second type.
 It is possible to reduce the probability of Type II error by
increasing the level of significance.
 In this case, the probability of rejecting the null hypothesis
even when it is true also increases, in turn decreasing the
chances of accepting the null hypothesis when it is not true.
 However, because type I and Type II error are
interconnected, reducing one tends to increase the
probability of the other.
 Therefore, depending on the nature of the test, it is
important to determine which one of the errors is less
detrimental to the test.
 For example, if type I error involves the time and effort of
retesting the chemicals used in a medicine that should have
been accepted, whereas the type II error involves the
chance of a number of users of this medicine being
poisoned, it is wise to accept the type I error over type II.
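The relation β = 1 − power and the trade-off with α can be made concrete for a simple case. The sketch below assumes a one-sided z-test with known standard deviation, where the true mean exceeds the null value by a fixed amount; the function name `type_2_error` and the numbers used (effect 0.5, σ = 1, n = 20) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_2_error(alpha, effect, sigma, n):
    """Beta for a one-sided z-test of H0: mu = mu_0 when the true mean is mu_0 + effect."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)  # rejection threshold for the z statistic
    shift = effect * sqrt(n) / sigma          # how far the true mean moves the z statistic
    # Beta is the chance that z stays below the threshold although Ha is true.
    return std_normal.cdf(critical - shift)


for alpha in (0.01, 0.05, 0.10):
    beta = type_2_error(alpha, effect=0.5, sigma=1.0, n=20)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")

# A larger sample lowers beta without raising alpha.
print(type_2_error(0.05, effect=0.5, sigma=1.0, n=80))
```

Raising α lowers β at a fixed sample size, which is the trade-off described above, while increasing n is the way to reduce β without accepting more Type I errors.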
Type II error examples
 For this, let us take a hypothesis where a shepherd thinks
there is no wolf in the village and he stays awake all night for
five nights to determine the existence of the wolf.
 If he sees no wolf for five nights, he might assume that
there is no wolf in the village where the wolf might exist and
attack on the sixth night.
 In this case, when the shepherd accepts that no wolf exists,
a type II error results where he agrees with the null
hypothesis even when it is not true.

Figure: Graphical representation of type 1 and type 2 errors.


Image Source: https://doi.org/10.1175/BAMS-D-13-00115.1
Type I error vs Type II error
Basis for comparison | Type I error | Type II error
Definition | Type 1 error, in statistical hypothesis testing, is the error caused by rejecting a null hypothesis when it is true. | Type II error is the error that occurs when the null hypothesis is accepted when it is not true.
Also termed | Type I error is equivalent to false positive. | Type II error is equivalent to false negative.
Meaning | It is a false rejection of a true hypothesis. | It is the false acceptance of an incorrect hypothesis.
Symbol | Type I error is denoted by α. | Type II error is denoted by β.
Probability | The probability of type I error is equal to the level of significance. | The probability of type II error is equal to one minus the power of the test.
Reduced | It can be reduced by decreasing the level of significance. | It can be reduced by increasing the level of significance.
Cause | It is caused by luck or chance. | It is caused by a smaller sample size or a less powerful test.
What is it? | Type I error is similar to a false hit. | Type II error is […]
Hypothesis | Type I error is associated with rejecting the null hypothesis. | Type II error is associated with rejecting the alternative hypothesis.
When does it happen? | It happens when the acceptance levels are set too lenient. | It happens when the acceptance levels are set […]

QUESTIONNAIRE DESIGN
Questionnaire- Types, Format, Questions
 A questionnaire is defined as a document containing
questions and other types of items designed to solicit
information appropriate for analysis.
 The questionnaire may be regarded as a form of an
interview on paper.
 Procedure for the construction of a questionnaire follows a
pattern similar to that of the interview schedule.
 However, because the questionnaire is impersonal it is all
the more important to take care of its construction.
 Since there is no interviewer to explain ambiguities or to
check misunderstandings, the questionnaire must be
especially clear in its wording.
 The variety of possible answers to each question must be
anticipated more fully than for an interview.
The Essentials of the Questionnaire
Construction
 Questionnaire design is a very crucial and important part of
the research because an inappropriate questionnaire
misleads the research, academics, and policymaking.
 Therefore, a set of adequate and appropriate questions in a
sequential order is required in a questionnaire.
 The format of the questionnaire mostly depends on the
type of questionnaire used.


Types of Questionnaire
There are roughly two types of questionnaires, structured and
unstructured. A mixture of these both is the quasi-structured
questionnaire that is used mostly in social science research.
 Structured questionnaires include pre-coded questions
with well-defined skipping patterns to follow the sequence
of questions. Most of the quantitative data collection
operations use structured questionnaires. Fewer
discrepancies, ease of administration, consistency in answers,
and easier data management are advantages of such
structured questionnaires.
 Unstructured questionnaires include open-ended and
vague opinion-type questions. The questions may not be in
the format of interrogative sentences, and the moderator or
the enumerator has to elaborate the sense of the question.
Focus group discussions use such a questionnaire.
 Not all questions can easily be pre-coded with all possible
answer alternatives. For some questions in standard
questionnaires, one answer alternative is therefore left as
‘others (please specify)’. A common and pragmatic practice is
that most of the questions are structured; however, it is
convenient to have some unstructured questions whose
answers are not feasible to enumerate completely. Such a
type of questionnaire is called a quasi-structured
questionnaire.
The Format of Questionnaire
Size:
 It should be smaller in size than that of the schedule.

 The extent in length and breadth should be appropriate.

 It should not be more than two or three pages as to the

nature of the research.


Appearance:
 It should be constructed on a good quality paper and

printing.
 It should have an attractive layout.

Clarity:
 The questions should be short, clear in terms, tenure, and
expression.
Sequence:
 The question should be arranged according to the

importance and preference.


Communicability:
 The questions of the questionnaire should be able to keep

the interest of the     respondents


Span:
 The length of the questions of the questionnaire should be

as short as possible.
 The questionnaire should not be long in length.

Question Types in a Questionnaire


The questions asked can take two forms:
 Restricted questions, also called closed-ended, are the

ones that ask the respondent to make choices — yes or no,


check items on a list, or select from multiple choice answers.
Restricted questions are easy to tabulate and compile.
 Unrestricted questions are open-ended and allow

respondents to share feelings and opinions that are


important to them about the matter at hand.
Unrestricted questions are not easy to tabulate and compile,
but they allow respondents to reveal the depth of their
emotions.
 If the objective is to compile data from all respondents, then

sticking with restricted questions that are easily quantified is


better.
 If degrees of emotions or depth of sentiment are to be
studied, then develop a scale to quantify those feelings.
Characteristics of Good Questions in a
Questionnaire
General rules of question crafting:
 Clear objective

 Simple language

 Clear concepts

 Without bias

 Adequate answer options

 Shorter questions

 The single question at a time

 Affirmative sentences

 Mathematics not imposed

 Short/clear reference periods

 Avoid question reference

Question Types to be avoided in a


Questionnaire
1. Question without objective
 Each question should have an objective.

Example:
The proposed research is to assess the knowledge of
respondents on sexually transmitted diseases. If the proposed
analytical framework has no consideration of the educational
(by discipline) background of the respondent it is futile to ask:
“Which subject did you study at university before you joined
the recent job?”
2. Complex language
 The language of the questionnaire should not be

complicated to understand. The vocabulary of the


respondents should be used in the questionnaire.
 A simple language is preferred. The use of rhetorical and

elite language creates problems while the questionnaire is


administered.
Example:
Did you realize the complexities of life in a different way by
the behavior of your spouse when you were tested positive
with HIV/AIDS? Instead, use questions like “Does your spouse
know that you are HIV positive?” (If yes, “Do you find a change
in his/her behavior?” If yes, “What kind?”)
3. Ambiguous concepts
 Ambiguous concepts should not be incorporated into the

questions.
Example: What is your opinion about some medical
researches that pledge for the high prevalence of transmission
of HIV among the elite group of Nepal after the restoration of
a multiparty system? This question has three major elements
as medical research, HIV transmission, and restoration of
multiparty democracy. Elite group and high prevalence are
other minor elements. Respondent would not be able to
correctly form his/her opinion.
4. Reference of previous questions
It is strongly discouraged to ask questions like “As I
asked in Question number 12 above about ….. “. If a reference
or cue to a previous question is required, help the respondent
recall it by stating the full question and answer before
continuing the interview.
5. Longer and vague reference periods
Reference periods should be clear and preferably shorter.
Longer reference period causes recall lapse errors. These
errors mislead the research.
For example, avoid “After the year of the great earthquake, or in
these ten years, how many times did you visit the health post
for antenatal check-ups?” Instead, ask “How many times did you
visit the health post for a check-up during the period of your last
pregnancy (or three months)?”
6. Questions with calculations
 As far as possible, avoid all calculation seeking questions.

Respondents do hesitate to calculate and there is always the


possibility of receiving wrong answers.
 Respondents who cannot calculate may give wrong
answers to hide their ignorance, and those who can may
also calculate wrongly while trying to show their confidence
in calculations.
Example: What percent of your income is spent on the
treatment? Instead, use “What is your monthly income?” as a
preceding question of “How much do you spend in your
treatment?” and calculation should be performed in the data
processing and analysis phase.
 Do not strain the respondents.
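The advice to ask for the two raw figures and compute the percentage during data processing can be sketched as follows. The function name `treatment_share`, the response values and the handling of missing or zero income are assumptions made for illustration.

```python
def treatment_share(monthly_income, monthly_treatment_spend):
    """Percentage of income spent on treatment, or None when it cannot be computed."""
    if monthly_income is None or monthly_treatment_spend is None or monthly_income <= 0:
        return None  # missing answer or zero income: leave the derived value empty
    return 100 * monthly_treatment_spend / monthly_income


# Hypothetical responses: (monthly income, monthly spending on treatment).
responses = [(20000, 1500), (35000, 700), (0, 300)]
shares = [treatment_share(income, spend) for income, spend in responses]
print(shares)  # [7.5, 2.0, None]
```

The respondent only reports two amounts, and the calculation is done consistently by the analyst for every record.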
7. Double negative
 Double negatives must be avoided in the language of the

question.
 Double negative gives positive meaning but sounds like

negation to the statement. It also creates confusion for the


interviewers and respondents.
Example: “Do not you want to move from this place not to
expose yourself?” Instead, “Do you want to move from this
place to hide?” would be better.
8. Two in one Questions (Double-barrelled)
 Merging of two questions into one should be completely

avoided.
 Such merging often confuses the respondent and according

to the cognitive capacity, some respondents serve answers


to the latter and some to the former.
 Not all respondents provide answers to both parts.

Example: When did you visit your spouse and how many


nights did you spend there? There are clearly 2 questions and
they are to be segregated.
9. Leading and embarrassing questions (Wording, Leading
and threatening)
 Leading and embarrassing questions should be avoided because they are biased.

 People feel offended when asked to answer these questions.

 Such questions also lead towards biased answers, therefore

these are to be avoided.


Example: Don’t you agree that persons with HIV positive have
also rights to marry? Or suppose, you are suffering from HIV
positive, should not you have the right to marry? Such types
of questions insist the respondent provide answers that match
the positive or negative tone of the question itself.

CHI-SQUARE TEST

Chi-square test definition


A Chi-square test is performed to determine if there is a
difference between the theoretical population parameter
and the observed data.
 Chi-square test is a non-parametric test where the data is
not assumed to be normally distributed; instead, the test
statistic approximately follows a chi-square distribution when
the null hypothesis is true.
 It allows the researcher to test a number of
factors like the goodness of fit, the significance of
population variance, and the homogeneity or difference in
population variance.
 This test is commonly used to determine if a random
sample is drawn from a population with mean µ and
variance σ².

Chi-square test uses


Chi-square test is performed for various purposes, some of
which are:
1. This method is commonly used by researchers to determine
the differences between different categorical variables in a
population.
2. A Chi-square test can also be used as a test for goodness of
fit. It enables us to observe how well the theoretical
distribution fits the observed distribution.
3. It also works as a test of independence where it enables the
researcher to determine if two attributes of a population are
associated or not.
Chi-square test formula
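Chi-square test is symbolically written as χ² and the formula of
chi-square for comparing variance is given as:

$$\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}\,(n - 1)$$

where σs² is the variance of the sample,
σp² is the variance of the population, and
n is the number of items in the sample.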
Similarly, when chi-square is used as a non-parametric test for
testing the goodness of fit or for testing the independence,
the following formula is used:

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where Oij is the observed frequency of the cell in the ith row
and jth column, and
Eij is the expected frequency of the cell in the ith row
and jth column.
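The formula in terms of observed and expected frequencies translates directly into code. The sketch below is a plain Python illustration; the function name `chi_square_statistic` and the 2 x 2 frequencies are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all cells of two equally shaped tables."""
    total = 0.0
    for observed_row, expected_row in zip(observed, expected):
        for o, e in zip(observed_row, expected_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical 2 x 2 table of observed and expected frequencies.
observed = [[30, 20], [20, 30]]
expected = [[25, 25], [25, 25]]
print(chi_square_statistic(observed, expected))  # 4.0
```

Each of the four cells contributes (5)² / 25 = 1, so the statistic is 4.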
Conditions for the chi-square test
For the chi-square test to be performed, the following
conditions are to be satisfied:
1. The observations are to be recorded and collected on a
random basis.
2. The items in the samples should all be independent.
3. The frequencies of data in a group should not be less than
10. When a group has fewer items, regrouping should be
done by combining the frequencies of adjoining groups.
4. The total number of individual items in the sample should
also be reasonably large, about 50 or more.
5. The constraints in the frequencies should be linear and not
containing squares or higher powers.
Chi-square distribution
 Chi-square distribution in statistics is the distribution of a
sum of the squares of independent standard normal random
variables.
 This distribution is a special case of the gamma distribution
and is one of the most commonly used distributions in
statistics.
 This distribution is used for the chi-square test for testing
the goodness of fit or testing the independence.
 Chi-square distribution also appears in the definitions of the
t-distribution and the F-distribution, which are used for t-tests
and ANOVA.
Chi-square table
The following is the chi-square distribution table:
Chi-square test of independence
 When the chi-square test is used as a test of independence,
it allows the researcher to test whether the two attributes
being tested are associated or not.
 For this test, a null and alternative hypothesis is
formulated where the null hypothesis is that the two
attributes are not associated, and the alternative hypothesis
is that the attributes are associated.
 From the given data, the expected frequencies are then
calculated, followed by the calculation of chi-square value.
 Based on the calculated value of chi-square, either the null
or alternative hypothesis is accepted.
 Here, if the calculated value of chi-square is less than the
value in the table at the given level of significance, the null
hypothesis is accepted, indicating that there is no
relationship between the two attributes.
 However, if the calculated value of chi-square is found to be
higher than the value in the table, the alternative hypothesis
is accepted, indicating that there is a relationship between
the two attributes.
 The chi-square test only establishes the existence of a
relationship but not the degree of the relationship or its
form.
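The steps of the test of independence (expected frequencies, chi-square value, comparison with the table value) can be sketched as below. The counts are invented, the function names are hypothetical, and the expected frequency of each cell is taken as row total × column total / grand total, which is the usual assumption under independence. The table value 3.841 is the 5% critical value for 1 degree of freedom.

```python
def expected_frequencies(observed):
    """Expected cell counts if the row attribute and column attribute are independent."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


def independence_test(observed, critical_value):
    """Return (chi-square, degrees of freedom, whether H0 of no association is rejected)."""
    expected = expected_frequencies(observed)
    chi_square = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(observed, expected)
        for o, e in zip(observed_row, expected_row)
    )
    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, degrees_of_freedom, chi_square > critical_value


# Hypothetical counts: rows = medicine / no medicine, columns = fever / no fever.
table = [[20, 80], [40, 60]]
chi_square, df, reject = independence_test(table, critical_value=3.841)
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject}")
```

For these made-up counts the calculated value exceeds the table value, so the attributes would be judged associated; the statistic says nothing about how strong the association is.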
Chi-square test of goodness of fit
 Chi-square test is performed as a test of goodness of fit,
which helps the researcher to compare the theoretical
distribution with the observed distribution.
 When the calculated value of chi-square is found to be less
than the table value at a certain level of significance, the fit
between the data is considered to be good.
 A good fit indicates that the variation between the observed
and expected frequencies is due to fluctuations during
sampling.
 However, if the calculated value of chi-square is greater
than the table value, the fit is considered not to be as good.
Chi-square test examples
 A chi-square test performed to determine if a new
medication is effective against fever or not is an example of
a chi-square test as the test of independence to determine
the relationship between medicine and fever.
 Another example of the chi-square test is the testing of
some genetic theory that claims that children having one
parent of blood type A and the other of blood type B will
always have the blood group as one of three
types, A, AB, B, and that the proportion of three types will
on an average be as 1: 2: 1. On the basis of expected and
observed outcomes, the goodness of fit of the hypothesis
can be determined.
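The blood-type example can be worked through as a goodness-of-fit calculation. The observed counts below are invented for illustration and the function name `goodness_of_fit` is hypothetical; the expected counts follow the 1 : 2 : 1 ratio stated above, and 5.991 is the 5% table value for 2 degrees of freedom (three categories minus one).

```python
def goodness_of_fit(observed_counts, ratio):
    """Return (chi-square, expected counts) for observed counts against a theoretical ratio."""
    total = sum(observed_counts)
    expected_counts = [total * part / sum(ratio) for part in ratio]
    chi_square = sum(
        (o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts)
    )
    return chi_square, expected_counts


# Hypothetical sample of 300 children with blood types A, AB and B.
observed_counts = [70, 160, 70]
chi_square, expected_counts = goodness_of_fit(observed_counts, ratio=(1, 2, 1))

critical_value = 5.991  # chi-square table value for 2 degrees of freedom at the 5% level
print(expected_counts)  # [75.0, 150.0, 75.0]
print(f"chi-square = {chi_square:.3f}")
print("good fit" if chi_square < critical_value else "poor fit")
```

Here the calculated value is below the table value, so the deviation from 1 : 2 : 1 in this made-up sample would be attributed to sampling fluctuation.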
Chi-square test applications
 A Chi-square test is used in cryptanalysis to determine the
distribution of plain text and decrypted ciphertext.
 Similarly, it is also used in bioinformatics to determine the
distribution of different genes like disease genes and other
important genes.
 A Chi-square test is performed by various researchers of
different fields to test the minor or major hypothesis.

PRIMARY AND SECONDARY DATA

Primary Data
 It is the data collected by the investigator himself/ herself
for a specific purpose.
 Data gathered by finding out first-hand the attitudes of a
community towards health services, ascertaining the health
needs of a community, evaluating a social program,
determining the job satisfaction of the employees of an
organization, and ascertaining the quality of service
provided by a worker are the examples of primary data.
Advantages of using Primary data
 The investigator collects data specific to the problem under
study.
 There is no doubt about the quality of the data collected
(for the investigator).
 If required, it may be possible to obtain additional data
during the study period.
Disadvantages of using Primary data
1. The investigator has to contend with all the hassles of data
collection- 
 deciding why, what, how, when to collect

 getting the data collected (personally or through others)

 getting funding and dealing with funding agencies


 ethical considerations (consent, permissions, etc.)
2.   Ensuring the data collected is of a high standard-
 all desired data is obtained accurately, and in the format, it

is required in
 there is no fake/ cooked up data

 unnecessary/ useless data has not been included

3.   Cost of obtaining the data is often the major expense in


studies


Secondary Data
 Data collected by someone else for some other purpose
(but being utilized by the investigator for another purpose).
 Gathering information with the use of census data to obtain
information on the age-sex structure of a population, the
use of hospital records to find out the morbidity and
mortality patterns of a community, the use of an
organization’s records to ascertain its activities, and the
collection of data from sources such as articles, journals,
magazines, books and periodicals to obtain historical and
other types of information, are examples of secondary data.
Advantages of using Secondary data
 The data is already there- no hassles of data collection
 It is less expensive
 The investigator is not personally responsible for the quality
of data
Disadvantages of using Secondary data
 The investigator cannot decide what is collected (if specific
data about something is required, for instance).
 One can only hope that the data is of good quality
 Obtaining additional data (or even clarification) about
something is not possible (most often)
Primary Data vs Secondary Data
Primary data is original and unique data, which is directly
collected by the researcher from a source according to his
requirements. Secondary data, by contrast, is easily
accessible but is not pure, as it has undergone
many statistical treatments.
Character | Primary Data | Secondary Data
Definition | Primary data refers to the first-hand data gathered by the researcher himself. | Seco… by s…
Data | Real time data | Past…
Process | Very Involved | Quic…
Source | Surveys, observations, experiments, questionnaire, personal interview, etc. | Gov… boo… reco…
Cost-effectiveness | Expensive | Econ…
Collection time | Long | Shor…
Specificity | Always specific to the researcher’s needs. | May… rese…
Form | Available in the crude form | Avai…
Accuracy and Reliability | More | Less
