Primary Data Collection Methods


Data Collection

Definition: Data collection is the process by which the researcher collects
information from all the relevant sources to find answers to the research
problem, test the hypothesis and evaluate the outcomes.

While collecting the data, the researcher must identify the type of data to
be collected, the source of the data, and the method to be used to collect
it. The researcher should also settle who will collect the data, and when
and where it will be collected.

The choice of data collection method depends on the research problem under
study, the research design and the information gathered about the variable.
Broadly, the data collection methods can be classified into two categories:

Primary Data Collection Methods: Primary data are first-hand data, collected
by the researcher for the first time and original in nature. The researcher
collects fresh data when the research problem is unique and no related
research work has been done by anyone else. The results of the research
tend to be more accurate when the data are collected directly by the
researcher, but this approach is costly and time-consuming.

Secondary Data Collection Methods: Data that were collected by someone else
for their own research work and have already passed through statistical
analysis are called secondary data. Thus, secondary data are second-hand
data that are readily available from other sources. One advantage of
secondary data is that they are less expensive and easily available, but
the authenticity of the findings can be questioned.

Thus, the researcher can obtain data from either of the sources depending
on the nature of his study and the pursued research objective.

Primary Data Collection Methods


Definition: Data collected directly by the researcher for the first time are
called primary data. They are original in nature and specific to the
research problem under study.

Fresh data can be collected using the following methods:
1. Interview Method: This is the most widely used primary data collection
method, in which the interviewer asks the respondents questions in person,
by mail or by telephone to gain insight into the problem under study. The
researcher may visit the respondent at home or meet at a mutually agreed
central location. When a large group of respondents has to be contacted,
a mail or telephone survey can be used. In a mail survey, questionnaires
are sent to the respondents, who are expected to return their answers by
mail. In a telephone survey, the interviewees are called and asked
closed-ended questions specific to the research problem.
2. Delphi Technique: This is a forecasting technique in which the researcher
elicits information from a panel of experts, either in person or through
a questionnaire sent by mail. Each expert is asked to give an opinion on
the problem in his or her own field, and the consolidated view of the panel
is used to arrive at the most accurate answer.
3. Projective Techniques: Projective techniques are unstructured, indirect
interview methods used when respondents would be reluctant to answer if
the objective were disclosed. To deal with this situation, the respondents
are given an incomplete stimulus and asked to complete it, which reveals
their underlying motivations, attitudes, opinions and feelings about the
issue concerned. Some of the projective techniques used to discover the
‘whys’ of the market and of consumer behavior are:

- Thematic Apperception Test (TAT): The respondent is presented with
multiple pictures and asked to describe what he or she thinks the pictures
represent.
- Role Playing: The respondents are given imaginary situations and asked to
act as they would if the situation were real.
- Cartoon Completion: The respondents are shown cartoon pictures comprising
two or more characters and asked to give their ideas and opinions about the
characters.
- Word Association: The researcher gives the respondents a set of words and
asks them to say what comes to mind when they hear a particular word.
- Sentence Completion: The researcher gives the respondents incomplete
sentences and asks them to complete them. This is done to explore the ideas
of the respondents.
4. Focus Group Interview: This is a widely used data collection method in
which a small group of people, usually 6-12 members, come together to
discuss the common areas of the problem. Each individual is required to
give his or her insights on the issue concerned, and the group tries to
reach a unanimous decision. A moderator regulates the discussion among
the group members.

5. Questionnaire Method: The questionnaire is the most obvious method of data
collection. It comprises a set of questions related to the research
problem. This method is very convenient when the data are to be collected
from a diverse population. It mainly consists of a printed set of
questions, either open-ended or closed-ended, which the respondents are
required to answer on the basis of their knowledge of and experience with
the issue concerned.
Note: These primary data collection methods can be used to collect both
qualitative and quantitative data.

Secondary Data Collection Methods

Definition: Data collected by someone else for a purpose other than the
researcher’s current project, and which have already undergone statistical
analysis, are called secondary data.

Secondary data are readily available from other sources, and as such there
are no specific collection methods. The researcher can obtain data from
sources both internal and external to the organization. The internal
sources of secondary data are:
- Sales Report
- Financial Statements
- Customer details, like name, age, contact details, etc.
- Company information
- Reports and feedback from a dealer, retailer, and distributor
- Management information system

There are several external sources from where the secondary data can be
collected. These are:
- Government censuses, like the population census, agriculture
census, etc.
- Information from other government departments, like social
security, tax records, etc.
- Business journals
- Social Books
- Business magazines
- Libraries
- Internet, where wide knowledge about different areas is easily
available.

The secondary data can be both qualitative and quantitative. Qualitative
data can be obtained through newspapers, diaries, interviews, transcripts,
etc., while quantitative data can be obtained through surveys, financial
statements and statistics.

One advantage of secondary data is that they are easily available, so less
time is required to gather the relevant information. They are also less
expensive than primary data. However, the data might not be specific to the
researcher’s needs and may be too incomplete to support a conclusion. The
authenticity of the research results might also be open to doubt.
Data Collection Definition
Data collection is defined as the procedure of collecting, measuring and
analyzing accurate insights for research using standard validated techniques.
A researcher can evaluate their hypothesis on the basis of collected data. In
most cases, data collection is the primary and most important step for
research, irrespective of the field of research. The approach of data collection
is different for different fields of study, depending on the required information.

The most critical objective of data collection is ensuring that information-rich
and reliable data is collected for statistical analysis so that data-driven
decisions can be made for research.

Data Collection Methods: Phone vs. Online vs. In-Person Interviews
Essentially there are four choices for data collection – in-person interviews,
mail, phone and online. There are pros and cons to each of these modes.

In-Person Interviews
Pros: In-depth and a high degree of confidence in the data
Cons: Time consuming, expensive and can be dismissed as anecdotal

Mail Surveys
Pros: Can reach anyone and everyone – no barrier
Cons: Expensive, data collection errors, lag time

Phone Surveys
Pros: High degree of confidence in the data collected, can reach almost anyone
Cons: Expensive, cannot self-administer, need to hire an agency

Web/Online Surveys
Pros: Cheap, can self-administer, very low probability of data errors
Cons: Not all your customers might have an email address/be on the internet,
customers may be wary of divulging information online.

In-person interviews usually yield the richest data, but the big drawback is the
trap you might fall into if you don’t do them regularly. It is expensive to
conduct interviews regularly, and conducting too few interviews might give you
false positives. Validating your research is almost as important as designing and
conducting it. We’ve seen many instances where, after the research is conducted,
the results do not match the “gut-feel” of upper management and are dismissed as
anecdotal and a “one-time” phenomenon. To avoid such traps, we strongly recommend
that data collection be done on an “ongoing and regular” basis. This will help you
compare and analyze how perceptions change in response to the marketing done for
your products/services. The other issue here is sample size. To be confident in
your research, you have to interview enough people to weed out the fringe
elements.
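
As a rough illustration of what "enough people" can mean, the sketch below
applies the standard sample-size formula for estimating a proportion,
n = z^2 * p * (1 - p) / e^2. The formula is not part of the discussion above; the
95% confidence level, the margins of error and the function name are assumptions
chosen for the example, and the formula presumes a simple random sample drawn
from a large population.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence=0.95, proportion=0.5):
    """Interviews needed to estimate a proportion within the given margin.

    Assumes a simple random sample from a large population.
    proportion=0.5 is the most conservative (largest) choice.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Round up: a fraction of an interview still needs a whole respondent.
    return math.ceil(n)


# Illustrative margins of error of 5 and 3 percentage points.
print(required_sample_size(0.05))
print(required_sample_size(0.03))
```

Tightening the margin of error raises the required number of interviews
quickly, which is one reason regular, smaller waves of research need to be
planned against the budget.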

A couple of years ago there was quite a lot of discussion about online surveys
and their statistical validity. The fact that not every customer had internet
connectivity was one of the main concerns. Although some of those concerns
are still valid, the internet has become a vital means of communication in
the majority of customer interactions. According to the US Census Bureau, the
number of households with computers doubled between 1997 and 2001.
5 useful methods of collecting primary data in statistics

Statistical data, as we have seen, can be either primary or secondary.
Primary data are those which are collected for the first time and so are in
crude form, whereas secondary data are those which have already been
collected.
Primary data are always collected from the source, either by the
investigator himself or through his agents. There are different methods of
collecting primary data, and each has its relative merits and demerits. The
investigator has to choose a particular method to collect the information,
and the choice depends to a large extent on the preliminaries to data
collection. Some of the commonly used methods are discussed below.
1. Direct Personal Observation:
This is a very general method of collecting primary data. Here the
investigator directly contacts the informants, solicits their cooperation
and enumerates the data. The information is collected through direct
personal interviews.
The merit of this method is its simplicity. It is difficult neither for the
enumerator nor for the informants, because both are present at the spot of
data collection. The method can provide highly accurate information, as the
investigator collects it personally. But as the investigator alone is
involved in the process, his personal bias may influence the accuracy of
the data. It is therefore necessary that the investigator be honest,
unbiased and experienced; in such cases the data collected may be fairly
accurate. However, the method is quite costly and time-consuming, so it
should be used when the scope of enquiry is small.
2. Indirect Oral Interviews:
This is an indirect method of collecting primary data. Here information is
not collected directly from the source but by interviewing persons closely
related to the problem. This method is applied to apprehend culprits in
cases of theft, murder, etc. Information relating to one’s personal life,
or which the informant hesitates to reveal, is better collected by this
method. Here the investigator prepares a small list of questions relating
to the enquiry. The answers are collected by interviewing persons well
connected with the incident. The investigator should cross-examine the
informants to get correct information.
This method saves time and involves relatively little cost. The accuracy
of the information largely depends upon the integrity of the investigator.
It is desirable that the investigator be experienced and capable enough
to inspire confidence in the informants so as to collect accurate data.
3. Mailed Questionnaire Method:
This is a very commonly used method of collecting primary data. Here
information is collected through a questionnaire, a document prepared by
the investigator containing a set of questions that relate, directly or
indirectly, to the problem of enquiry. The questionnaires are first mailed
to the informants with a formal request to answer the questions and send
them back. For a better response, the investigator should bear the postal
charges. The questionnaire should carry a polite note explaining the aims
and objectives of the enquiry and defining the various terms and concepts
used in it. Besides this, the investigator should assure the informants
that the information, and their names if required, will be kept
confidential.
Success of this method greatly depends upon the way in which the
questionnaire is drafted. So the investigator must be very careful while
framing the questions. The questions should be
(i) Short and clear
(ii) Few in number
(iii) Simple and intelligible
(iv) Corroboratory in nature or there should be provision for cross check
(v) Impersonal, non-aggressive type
(vi) Simple alternative, multiple-choice or open-end type
(a) In the simple alternative type, the respondent has to choose between
alternatives such as ‘Yes or No’, ‘right or wrong’, etc.
For example: Is Adam Smith called the father of Statistics? Yes/No
(b) In the multiple-choice type, the respondent has to pick an answer from
the given alternatives.
Example: To which sector do you belong?
(i) Primary Sector
(ii) Secondary Sector
(iii) Tertiary or Service Sector
(c) In open-end or free-answer questions, the respondents are given
complete freedom in answering. Such questions look like:
What are the defects of our educational system?
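
The three question types can be sketched as a small data structure that checks
whether a recorded answer is admissible. This is only an illustration: the
class name, the function name and the kind labels are hypothetical, and the
sample questions are the ones used above.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    kind: str  # "alternative", "multiple_choice" or "open" (hypothetical labels)
    choices: list = field(default_factory=list)


def is_valid_answer(question, answer):
    """Return True if the answer is admissible for the question type."""
    if question.kind in ("alternative", "multiple_choice"):
        # Closed questions accept only one of the listed alternatives.
        return answer in question.choices
    if question.kind == "open":
        # Open-end questions accept any non-empty free text.
        return isinstance(answer, str) and answer.strip() != ""
    raise ValueError(f"unknown question kind: {question.kind}")


q1 = Question("Is Adam Smith called the father of Statistics?",
              "alternative", ["Yes", "No"])
q2 = Question("To which sector do you belong?", "multiple_choice",
              ["Primary Sector", "Secondary Sector", "Tertiary or Service Sector"])
q3 = Question("What are the defects of our educational system?", "open")

print(is_valid_answer(q1, "No"))
print(is_valid_answer(q2, "Quaternary Sector"))
print(is_valid_answer(q3, "Too much rote learning"))
```

Closed questions are easy to validate and tabulate, while open-end answers
can only be checked for presence and must be coded by hand afterwards.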
The questionnaire method is very economical in terms of time, energy and
money. The method is widely used when the scope of enquiry is large.
Data collected by this method are not affected by the personal bias of the
investigator. However, the accuracy of the information depends on the
cooperation and honesty of the informants. The method can be used only if
the informants are cooperative, conscientious and educated, which limits
its scope.
4. Schedule Method:
When the informants are largely uneducated and non-responsive, data cannot
be collected by the mailed questionnaire method. In such cases, the
schedule method is used. Here the questionnaires are carried by
enumerators, persons appointed by the investigator for the purpose, who
meet the informants directly. They explain the scope and objective of the
enquiry to the informants and solicit their cooperation. The enumerators
put the questions to the informants, record their answers in the
questionnaire and compile them. The success of this method depends on the
sincerity and efficiency of the enumerators, so they should be
sweet-tempered, good-natured, trained and well-behaved.
The schedule method is widely used in extensive studies. It gives fairly
correct results, as the enumerators collect the information directly. The
accuracy of the information depends upon the honesty of the enumerators,
who should be unbiased. This method is relatively more costly and
time-consuming than the mailed questionnaire method.
5. From Local Agents:
Sometimes primary data are collected from local agents or correspondents.
These agents are appointed by the sponsoring authorities. They are well
conversant with local conditions such as language, communication, food
habits and traditions. Being on the spot and well acquainted with the
nature of the enquiry, they are capable of furnishing reliable information.
The accuracy of the data collected by this method depends on the honesty
and sincerity of the agents, because they actually collect the information
on the spot. Information from a wide area can be collected by this method
at less cost and in less time. The method is generally used by government
agencies, newspapers, periodicals, etc. to collect data.
Information is like raw material or input in an enquiry, and the result of
the enquiry depends on the type of information used. Primary data can be
collected by employing any of the above methods. The investigator should
make a rational choice of method, because the collection of data forms the
beginning of the statistical enquiry.
You can collect primary data using the following methods:

1. Surveys: This method costs money, since a survey has to be designed and
implemented. Data collected from surveys may also contain many missing values
or values in the wrong format. For example, someone could enter their age as
‘twenty eight’ instead of 28, so a lot of work is needed to preprocess,
organize, clean and reshape survey data.
2. From websites: Sometimes you can scrape data from websites, although a lot of
work may be needed to clean, organize and reshape the data. Some websites,
however, provide data in a clean and structured format.
3. Purchase raw data from organizations or companies: This method is also
costly, but it can save time, because purchased data may already be in a
structured format that can be used directly for analysis without cleaning and
reshaping.
4. Simulate data: This method is mostly used for stochastic processes. For
example, you can use Monte Carlo simulation to generate data that follows a
given probability distribution such as the Poisson or Normal distribution.
This method of generating raw data is free.
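
Items 1 and 4 above lend themselves to a short sketch: normalising a survey age
field that mixes digits and words, and simulating draws from a Poisson and a
Normal distribution. The function names, the word lists, the seed and the
distribution parameters are all assumptions made for illustration. The Poisson
sampler uses Knuth's multiplication method, which is one simple choice among
several and is only practical for small means.

```python
import math
import random

# Word lists for spelled-out ages below 100 (assumption: English input only).
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def parse_age(raw):
    """Return an age as an int, or None if the entry cannot be interpreted.

    A deliberately simple parser: it sums recognised number words and does
    not check that they appear in a sensible order.
    """
    text = str(raw).strip().lower().replace("-", " ")
    if text.isdigit():
        return int(text)
    words = text.split()
    if not words:
        return None
    total = 0
    for word in words:
        if word in UNITS:
            total += UNITS.index(word)
        elif word in TENS:
            total += TENS[word]
        else:
            return None  # unrecognised token: treat the entry as missing
    return total


def poisson_draw(rng, lam):
    """One Poisson(lam) draw using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = [poisson_draw(rng, 3.0) for _ in range(10_000)]
heights = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

print(parse_age("twenty eight"), parse_age("28"), parse_age("n/a"))
print(sum(counts) / len(counts), sum(heights) / len(heights))
```

With enough draws, the sample means should land close to the parameters
passed in, which is a quick sanity check on any simulated data set.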
Data Collection Methods: What To Know for Statistics
Introduction

When faced with a research problem, you need to collect, analyze and interpret data to
answer your research questions. Examples of research questions that could require you
to gather data include how many people will vote for a candidate, what is the best
product mix to use and how useful is a drug in curing a disease. The research problem
you explore informs the type of data you’ll collect and the data collection method you’ll
use. In this article, we will explore various types of data, methods of data collection and
advantages and disadvantages of each. After reading our review, you will have an
excellent understanding of when to use each of the data collection methods we discuss.

Types of Data

Quantitative Data
Data that is expressed in numbers and summarized using statistics to give meaningful
information is referred to as quantitative data. Examples of quantitative data we could
collect are heights, weights, or ages of students. If we obtain the mean of each set of
measurements, we have meaningful information about the average value for each of
those student characteristics.
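
A minimal sketch of that summary step is shown below. The measurements are
made-up values for five hypothetical students, used only to show how the mean
condenses each set of numbers into one figure.

```python
from statistics import mean

# Hypothetical measurements for five students (illustrative values only).
measurements = {
    "height_cm": [150, 160, 165, 170, 155],
    "weight_kg": [45, 52, 58, 61, 49],
    "age_years": [14, 15, 15, 16, 14],
}

# One average per characteristic.
averages = {name: mean(values) for name, values in measurements.items()}

for name, value in averages.items():
    print(name, value)
```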
Qualitative Data
When we use data for description without measurement, we call it qualitative data.
Examples of qualitative data are student attitudes towards school, attitudes towards
exam cheating and friendliness of students to teachers. Such data cannot be easily
summarized using statistics.
Primary Data
When we obtain data directly from individuals, objects or processes, we refer to it
as primary data. Quantitative or qualitative data can be collected using this approach.
Such data is usually collected solely for the research problem you will study. Primary
data has several advantages. First, we tailor it to our specific research question, so
there are no customizations needed to make the data usable. Second, primary data is
reliable because you control how the data is collected and can monitor its quality. Third,
by collecting primary data, you spend your resources in collecting only required data.
Finally, primary data is proprietary, so you enjoy advantages over those who cannot
access the data.
Despite its advantages, primary data also has disadvantages of which you need to be
aware. The first problem with primary data is that it is costlier to acquire as compared to
secondary data. Obtaining primary data also requires more time as compared to
gathering secondary data.

Secondary Data
When you collect data after another researcher or agency that initially gathered it makes
it available, you are gathering secondary data. Examples of secondary data are
census data published by the US Census Bureau, stock prices data published by CNN
and salaries data published by the Bureau of Labor Statistics.
One advantage to using secondary data is that it will save you time and money,
although some data sets require you to pay for access. A second advantage is the
relative ease with which you can obtain it. You can easily access secondary data from
publications, government agencies, data aggregation websites and blogs. A third
advantage is that it eliminates effort duplication since you can identify existing data that
matches your needs instead of gathering new data.

Despite the benefits it offers, secondary data has its shortcomings. One limitation is that
secondary data may not be complete. For it to meet your research needs, you may
need to enrich it with data from other sources. A second shortcoming is that you cannot
verify the accuracy of secondary data, or the data may be outdated. A third challenge
you face when using secondary data is that documentation may be incomplete or
missing. Therefore, you may not be aware of any problems that happened in data
collection which would otherwise influence its interpretation. Another challenge you may
face when you decide to use secondary data is that there may be copyright restrictions.

Now that we’ve explained the various types of data you can collect when conducting
research, we will proceed to look at methods used to collect primary and secondary
data.
Methods Employed in Primary Data Collection

When you decide to conduct original research, the data you gather can be quantitative
or qualitative. Generally, you collect quantitative data through sample surveys,
experiments and observational studies. You obtain qualitative data through focus
groups, in-depth interviews and case studies. We will discuss each of these data
collection methods below and examine their advantages and disadvantages.

Sample Surveys
A survey is a data collection method where you select a sample of respondents from a
large population in order to gather information about that population. The process of
identifying individuals from the population who you will interview is known as sampling.
To gather data through a survey, you construct a questionnaire to prompt information
from selected respondents. When creating a questionnaire, you should keep in mind
several key considerations. First, make sure the questions and choices are
unambiguous. Second, make sure the questionnaire will be completed within a
reasonable amount of time. Finally, make sure there are no typographical errors. To
check if there are any problems with your questionnaire, use it to interview a few people
before administering it to all respondents in your sample. We refer to this process as
pretesting.

Using a survey to collect data offers you several advantages. The main benefit is time
and cost savings because you only interview a sample, not the large population.
Another benefit is that when you select your sample correctly, you will obtain
information of acceptable accuracy. Additionally, surveys are adaptable and can be
used to collect data for governments, health care institutions, businesses and any other
environment where data is needed.

A major shortcoming of surveys occurs when you fail to select a sample correctly;
without an appropriate sample, the results will not generalize accurately to the
population.
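
One common way to select a sample is simple random sampling, in which every
member of the sampling frame has the same chance of being chosen. The sketch
below assumes a hypothetical frame of 1,000 respondent IDs and a sample of 50;
the function name, the ID format and the seed are illustrative, and other
designs such as stratified or cluster sampling may suit a given study better.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size distinct members of the frame without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: respondent IDs R0001 to R1000.
frame = [f"R{i:04d}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50, seed=42)

print(len(sample), sample[:5])
```

Fixing the seed makes the draw repeatable, which helps when the selection has
to be documented or audited later.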
Ways of Interviewing Respondents

Once you have selected your sample and developed your questionnaire, there are
several ways you can interview participants. Each approach has its advantages and
disadvantages.

In-person Interviewing
When you use this method, you meet with the respondents face to face and ask
questions. In-person interviewing offers several advantages. This technique has
excellent response rates and enables you to conduct interviews that take a longer
amount of time. Another benefit is you can ask follow-up questions to responses that
are not clear.

In-person interviews do have disadvantages of which you need to be aware. First, this
method is expensive and takes more time because of interviewer training, transport,
and remuneration. A second disadvantage is that some areas of a population, such as
neighborhoods prone to crime, cannot be accessed which may result in bias.

Telephone Interviewing
Using this technique, you call respondents over the phone and interview them. This
method offers the advantage of quickly collecting data, especially when used with
computer-assisted telephone interviewing. Another advantage is that collecting data via
telephone is cheaper than in-person interviewing.

One of the main limitations of telephone interviewing is that it’s hard to gain the
trust of respondents. For this reason, you may not get responses or may introduce bias.
Since phone interviews are generally kept short to reduce the possibility of upsetting
respondents, this method may also limit the amount of data you can collect.

Online Interviewing
With online interviewing, you send an email inviting respondents to participate in an
online survey. This technique is used widely because it is a low-cost way of interviewing
many respondents. Another benefit is anonymity; you can get sensitive responses that
participants would not feel comfortable providing with in-person interviewing.
When you use online interviewing, you face the disadvantage of not getting a
representative sample. You also cannot seek clarification on responses that are
unclear.

Mailed Questionnaire
When you use this interviewing method, you send a printed questionnaire to the postal
address of the respondent. The participants fill in the questionnaire and mail it back.
This interviewing method gives you the advantage of obtaining information that
respondents may be unwilling to give when interviewing in person.

The main limitation with mailed questionnaires is you are likely to get a low response
rate. Keep in mind that inaccuracy in mailing address, delays or loss of mail could also
affect the response rate. Additionally, mailed questionnaires cannot be used to interview
respondents with low literacy, and you cannot seek clarifications on responses.

Focus Groups
When you use a focus group as a data collection method, you identify a group of 6 to 10
people with similar characteristics. A moderator then guides a discussion to identify
attitudes and experiences of the group. The responses are captured by video recording,
voice recording or writing—this is the data you will analyze to answer your research
questions. Focus groups have the advantage of requiring fewer resources and time as
compared to interviewing individuals. Another advantage is that you can request
clarifications to unclear responses.

One disadvantage you face when using focus groups is that the sample selected may
not represent the population accurately. Furthermore, dominant participants can
influence the responses of others.

Observational Data Collection Methods

In an observational data collection method, you acquire data by observing any
relationships that may be present in the phenomenon you are studying. There are four
types of observational methods that are available to you as a researcher:
cross-sectional, case-control, cohort and ecological.
In a cross-sectional study, you only collect data on observed relationships once. This
method has the advantage of being cheaper and taking less time as compared to case-
control and cohort. However, cross-sectional studies can miss relationships that may
arise over time.
Using a case-control method, you identify cases and controls and then compare them. A
case is a person who has the outcome of interest, while a control is a person who does
not. After identifying the cases and controls, you move back in time to compare how
often each group was exposed to the suspected cause. This is why case-control studies
are referred to as retrospective. For example, suppose a medical researcher suspects a
certain type of cosmetic is causing skin cancer. The researcher recruits people who have
skin cancer, the cases, and people who do not, the controls, and asks the participants
to remember the type of cosmetic they used and how often they used it. This method
is cheaper and requires less time as compared to the cohort method. However, this
approach has limitations when the individuals you are observing cannot accurately recall
information. We refer to this as recall bias because you rely on the ability of
participants to remember information. In the cosmetic example, recall bias would occur
if participants cannot accurately remember the type of cosmetic and the number of times
they used it.
In a cohort method, you follow people with similar characteristics over a period. This
method is advantageous when you are collecting data on occurrences that happen over
a long period. It has the disadvantage of being costly and requiring more time. It is also
not suitable for occurrences that happen rarely.
The three methods we have discussed so far collect data on individuals. When you
are interested in studying a population instead of individuals, you use
an ecological method. For example, say you are interested in lung cancer rates in Iowa
and North Dakota. You obtain the number of cancer cases per 1000 people for each state
from the National Cancer Institute and compare them. You can then hypothesize
possible causes of differences between the two states. When you use the ecological
method, you save time and money because data is already available. However, the data
collected may lead you to infer population relationships that do not exist.
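
The comparison in an ecological study comes down to putting counts from
populations of different sizes on a common scale. The sketch below computes
cases per 1000 people for two hypothetical regions; the region names, case
counts and population figures are invented for illustration and are not real
statistics for any state.

```python
def rate_per_1000(cases, population):
    """Number of cases per 1000 people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 1000


# Made-up (cases, population) pairs for two hypothetical regions.
regions = {
    "Region A": (1200, 3_000_000),
    "Region B": (450, 750_000),
}

rates = {name: rate_per_1000(cases, pop) for name, (cases, pop) in regions.items()}

for name, rate in rates.items():
    print(name, round(rate, 2))
```

Region A has more cases in absolute terms, yet Region B has the higher rate
once population size is taken into account, which is why rates rather than raw
counts are compared.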
Experiments

An experiment is a data collection method where you as a researcher change some
variables and observe their effect on other variables. The variables that you manipulate
are referred to as independent variables, while the variables that change as a result of
the manipulation are dependent variables. Imagine a manufacturer is testing the effect of
drug strength on the number of bacteria in the body. The company decides to test drug
strength at 10mg, 20mg and 40mg. In this example, drug strength is the independent
variable while the number of bacteria is the dependent variable. The drug administered is
the treatment, while 10mg, 20mg and 40mg are the levels of the treatment.
The greatest advantage of using an experiment is that you can explore causal
relationships that an observational study cannot. Additionally, experimental research
can be adapted to different fields like medical research, agriculture, sociology, and
psychology. Nevertheless, experiments have the disadvantage of being expensive and
requiring a lot of time.
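
To make the treatment levels concrete, the sketch below assigns subjects to the
10mg, 20mg and 40mg levels at random. Random assignment is not described above;
it is added here as an assumption because it is a common way to keep the groups
comparable. The subject IDs, the group size and the function name are
hypothetical.

```python
import random


def assign_treatments(subjects, levels, seed=None):
    """Randomly assign subjects to treatment levels in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin across the levels.
    return {subject: levels[i % len(levels)] for i, subject in enumerate(shuffled)}


# Twelve hypothetical subjects and the three drug strengths from the example.
subjects = [f"S{i:02d}" for i in range(1, 13)]
levels = ["10mg", "20mg", "40mg"]
assignment = assign_treatments(subjects, levels, seed=1)

for level in levels:
    group = sorted(s for s, assigned in assignment.items() if assigned == level)
    print(level, group)
```

The dependent variable, here the number of bacteria, would then be measured in
each group and compared across the three levels.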

Summary

This article introduced you to the various types of data you can collect for research
purposes. We discussed quantitative, qualitative, primary and secondary data and
identified the advantages and disadvantages of each data type. We also reviewed
various data collection methods and examined their benefits and drawbacks. Having
read this article, you should be able to select the data collection method most
appropriate for your research question. Data is the evidence that you use to solve your
research problem. When you use the correct data collection method, you get the right
data to solve your problem.
