This is the second article of a three-part series that continues the discussion on the fundamentals of writing research protocols
for quantitative, clinical research studies. In this editorial, the author discusses considerations for describing the study design
and approach of a research study in a research protocol. This series provides a guide for undergraduate
researchers interested in publishing their protocol in the Undergraduate Research in Natural and Clinical Sciences and
Technology (URNCST) Journal.
Keywords: protocol; proposal; study design; population; study setting; sampling; sample size; undergraduate research;
clinical research
descriptions of different types of observational and experimental study designs.

Randomization refers to the process of assigning research participants randomly to either the treatment or control groups to equally distribute the demographic and clinical variables in the study sample [6]. These variables can act as confounding factors, and an equal distribution of these variables through randomization reduces their risk of influencing the study results [8].

Blinding, on the other hand, is a methodological step that prevents research participants and the research team from having prior knowledge about the assignment sequence of research participants [6]. Such knowledge may unduly influence the study results. For example, some research participants who know that they are receiving a placebo treatment may experience worse clinical outcomes. This observation is also referred to as the placebo effect [9]. Moreover, some studies have observed a trial effect where research participants behave differently due to their involvement in a clinical trial [10].
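The randomization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular trial: the function name `randomize`, the participant IDs, and the equal two-arm split are hypothetical choices. In practice, the allocation list would be generated and held by someone outside the research team so that blinding is preserved.

```python
import random


def randomize(participant_ids, seed=None):
    """Randomly split participants into two equal-sized arms (hypothetical helper).

    Shuffling the whole list and cutting it in half gives every participant
    the same chance of landing in either arm.
    """
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    allocation = {pid: "treatment" for pid in shuffled[:half]}
    allocation.update({pid: "control" for pid in shuffled[half:]})
    return allocation


# Illustrative use with ten made-up participant IDs.
allocation = randomize(range(1, 11), seed=42)
n_treatment = sum(arm == "treatment" for arm in allocation.values())
print(n_treatment, len(allocation) - n_treatment)
```

With an even number of participants the two arms are the same size. Real trials often use more elaborate schemes, such as blocked or stratified randomization, which a statistician can help choose.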
Population of Interest

The population of interest is the target population that the study intends to study or treat. In clinical research studies, it is often not appropriate or feasible to recruit the entire population of interest. Instead, investigators will recruit a sample from the population of interest to include in their study. In such cases, the objective of the research study is to generalize the study findings from the sample to the population of interest [15].

In a research protocol of a clinical research study, it is important to describe the demographic characteristics of the population of interest, including age, ethnicity, socioeconomic status, education level, marital status, and work status. Reflecting on the characteristics of the “ideal” research participant is an important way to conceptualize the population of interest, eligibility criteria, study setting, and the sampling strategies that will optimize recruitment and retention.

The eligibility criteria determine whether or not an individual is qualified to be a participant in a research study. These criteria are determined a priori, before the submission of an ethics application and the start of data collection [16]. Eligibility criteria consist of inclusion and exclusion criteria. Inclusion criteria are the main characteristics of the population of interest. A potential research participant has to fulfill all inclusion criteria in order to participate in the study. Exclusion criteria, on the other hand, are characteristics that may interfere with data collection, follow-up, and safety of research participants [16]. If a potential participant fulfills any one of the exclusion criteria, then they are excluded from participation. Designing exclusion criteria requires investigators to examine the literature on the topic and discern important variables and confounding factors that have been shown to interfere with the study plan. Another way to develop exclusion criteria is to use the PICO(TS) components of the research study [2]. For example, in a research study looking at the effect of repetitive transcranial magnetic stimulation on patients with chronic pain, an exclusion criterion may be to exclude individuals who are older than 65 or younger than 20 because they may tolerate pain differently compared to the population of patients between ages 40 and 65. Eligibility criteria are usually formatted in a two-column table with inclusion criteria on the left side and exclusion criteria on the right. This table is usually accompanied by a rationale for choosing the inclusion and exclusion criteria, with appropriate citations of previous research studies that have used similar criteria.

Study Setting

The study setting is an important component of a research study. The nature, context, environment, and logistics of the study setting may influence how the research study is carried out. Investigators should record the characteristics, events, gatherings, and other features of a study setting before submitting their study for ethics review and beginning data collection. Observing a study setting before the start of data collection allows investigators to anticipate any practical challenges inherent in the organization, structure, or layout of the study setting. In turn, this allows investigators to address these challenges with appropriate strategies that can be included in the ethics applications, funding applications, and research protocols. Showing ethics officers and sponsors that the investigators have carefully considered possible problems and challenges in the study setting or design may increase the likelihood of passing an ethics review and obtaining funding for a research study. Some examples of study locations for clinical research studies are inpatient bedrooms, hospital wards, operating rooms, and rehabilitation clinics.

The characteristics of the study setting deserve a separate section in a research protocol. Pertinent information about the study setting includes the structure, layout, and organization of the setting; the rationale for choosing this setting over others; external or online links that describe the setting, if available; and any data from the literature on the setting. Keep in mind that a protocol’s discussion of the study setting has to be coherent with other parts of the research protocol. A protocol that appears incoherent is not considered good research practice and, in turn, may become an obstacle to obtaining ethics approval and funding.

Sampling

Sampling is the process of selecting a statistically representative sample of individuals from the population of interest [16]. Sampling is an important tool for research studies because the population of interest usually consists of too many individuals for any research project to include as participants. A good sample is statistically representative of the population of interest and is large enough to answer the research question [17].

In clinical research, there are different strategies that investigators can use to obtain a representative sample from the population of interest [16]. These strategies are referred to as sampling strategies, and the strategy employed in a research study depends on the characteristics of the population of interest, the desired power and significance level (discussed in the next section), and the research question. Table 2 describes some of the most commonly used sampling strategies in clinical research. The benefits and drawbacks of each sampling strategy are beyond the scope of this paper but can be found in other documents and articles published online.
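To make the eligibility and sampling steps above more concrete, the following Python sketch screens a made-up sampling frame and then draws a simple random sample and a stratified sample. Everything in it is an assumption for illustration: the field names (`age`, `chronic_pain`, `sex`), the helper names, and the simulated population are hypothetical, and the two strategies shown are common textbook examples rather than a reproduction of Table 2. The age cut-offs mirror the exclusion criterion in the transcranial magnetic stimulation example.

```python
import random
from collections import defaultdict


def is_eligible(person):
    """Hypothetical screen: one inclusion criterion and one exclusion criterion."""
    meets_inclusion = person["chronic_pain"]
    meets_exclusion = person["age"] > 65 or person["age"] < 20
    return meets_inclusion and not meets_exclusion


def simple_random_sample(frame, n, seed=None):
    """Draw n individuals so that each has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Sample the same fraction (at least one person) from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in frame:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Made-up population of 200 people; every value here is illustrative only.
rng = random.Random(0)
population = [
    {"id": i, "age": rng.randint(18, 80), "chronic_pain": rng.random() < 0.5,
     "sex": "F" if i % 5 < 3 else "M"}
    for i in range(200)
]

frame = [p for p in population if is_eligible(p)]  # apply eligibility first
srs = simple_random_sample(frame, 10, seed=1)
strat = stratified_sample(frame, lambda p: p["sex"], 0.2, seed=1)
print(len(frame), len(srs), len(strat))
```

Screening before sampling keeps ineligible individuals out of the sampling frame, so every selected person already satisfies the protocol's criteria.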
A Primer to Statistics in Epidemiology

An organized research study contains a good research question and hypothesis. A hypothesis can be simple, comprising one predictor and one outcome variable, or complex, with multiple predictor variables [18]. In the real world, a hypothesis can be true or false, and investigators evaluate it through the statistical significance of the results. When considering the significance, there is a null hypothesis (H0), which assumes that there is no association between the predictor and outcome variables, and the alternative hypothesis (HA), which assumes that there is an association between the predictor and outcome variables. The statistical objective of a research study would be to reject the H0 in favour of the HA. In other words, the investigators reject the assumption that there is no association (H0) in the population of interest, thereby making the conclusion that there is an association (HA).

In some cases, random variations in the sample may yield results that appear statistically significant but do not reflect real associations in the population. When the study findings reflect random variations, then a statistical error has occurred. There are two types of statistical errors that can occur in a research study, each described by the probability of making an incorrect conclusion. A type I error occurs when the investigators reject the H0 when it is true in the population of interest. The probability of a type I error is also referred to as the level of statistical significance (α). On the other hand, a type II error occurs when the investigator does not reject the null hypothesis when it is untrue in the population of interest; its probability is denoted β. The complement of the type II error probability (1 - β) is referred to as power, which is the probability of rejecting the null hypothesis given that it is untrue in the population of interest [17].

Before conducting a research study, the investigators must determine the probability at which they are willing to tolerate type I and II errors. In other words, they must establish the thresholds for significance and power for their research study. The significance level is often set to 0.05 [19], although this is an arbitrary number without a statistical or clinical rationale. Studies in some areas of health sciences use other thresholds for defining significance. For example, the significance level may be as low as 10⁻¹⁴ in some genetic epidemiological research [20]. In clinical research studies, the power level is often set between 0.80 and 0.95 [21]. The thresholds for significance and power depend on a variety of factors such as the discipline, the number of research questions and objectives, the nature of the phenomenon, and the research participants [15].

Example 1: Errors, Significance and Power
Type I Error (α): The probability that the investigator incorrectly rejects the null hypothesis when it is true
Type II Error (β): The probability that the investigator incorrectly fails to reject the null hypothesis when it is untrue
Power (1 - β): The probability that the investigator correctly rejects the null hypothesis when it is untrue

The p-value is the probability of obtaining results at least as extreme as the study results through random variation alone, that is, if the H0 were true in the population of interest. If this probability is less than the predetermined significance level (p < α), then the H0 can be rejected in favour of the HA. This conclusion assumes that there is an association that truly reflects the population of interest. On the other hand, if the p-value is higher than the predetermined significance level (p > α), the investigators cannot reject the H0. This conclusion does not mean that the investigators accept the H0 or reject the HA. Instead, it means that the study findings may be due to random variations and therefore may not reflect real associations in the population of interest.
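The meaning of α and power can be illustrated by simulating many hypothetical studies. The sketch below is a teaching aid under stated assumptions rather than a recommended analysis: it assumes two groups of 30 participants, a continuous outcome with a known standard deviation of 1, and a two-sided z-test, and the function names and the effect size of 0.8 are invented for the example. When the H0 is true, the share of simulated studies that reject it should sit near α. When the H0 is untrue, that share estimates the power.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p_value(group_a, group_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means (known common SD)."""
    standard_error = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (fmean(group_a) - fmean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_difference, alpha=0.05, n_per_group=30,
                   n_studies=2000, seed=1):
    """Fraction of simulated studies in which the H0 is rejected (p < alpha)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        if two_sided_p_value(control, treated) < alpha:
            rejections += 1
    return rejections / n_studies


# H0 true (no difference): rejections are type I errors, so the rate is near alpha.
type_i_rate = rejection_rate(true_difference=0.0)
# H0 untrue (difference of 0.8 SD): the rejection rate estimates the power.
estimated_power = rejection_rate(true_difference=0.8)
print(type_i_rate, estimated_power)
```

Because the results come from random simulation, the printed values vary slightly with the seed and the number of simulated studies.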
Example 2: P-value
Research Project 1: α = 0.05
H0: No association
HA: Association
p = 0.04
p < α; reject the H0

If the H0 were true, the probability of obtaining results at least this extreme through random variation alone would be 4%, which is lower than the predetermined significance level (α = 0.05). The results from the sample are unlikely to be due to random variations in the population of interest. Therefore, reject the H0 in favour of the HA.

Research Project 2: α = 0.05
H0: No association
HA: Association
p = 0.10
p > α; do not reject the H0

If the H0 were true, the probability of obtaining results at least this extreme through random variation alone would be 10%, which is higher than the predetermined significance level (α = 0.05). The results from the sample may be due to random variations in the population of interest. Therefore, do not reject the H0.

Sample Size

One of the objectives of sampling in epidemiological studies is to obtain a statistically representative sample from the population of interest such that the inferences and study findings from the sample represent real associations in the population of interest. The sample size of a research study should provide adequate power at the chosen significance level [22], allowing the investigators to be confident that the study findings cannot be attributed to random variations in the population of interest. In this way, computing the sample size becomes an important step in clinical, quantitative studies.

When computing the sample size of a research study, the first step is to consult a statistician to ensure that the computations use appropriate statistical methodologies. The next step is to set the significance and power levels depending on the characteristics of the research study. These are usually set to 0.05 and 0.80, respectively [18]; however, they may differ depending on the discipline, methodology, number of research participants, and the research question. The next step is to determine whether the research study needs a one- or two-sided statistical test. Generally, two-sided tests are employed because of statistical uncertainty about whether the results will go in the positive or negative direction. For example, after diagnosis of a chronic medical condition, patients may experience an increase or decrease in psychological well-being [23]. However, in studies where there is a logical rationale for the study results to deviate in only one direction, a one-sided test should be used [18]. For example, in a study of the deleterious effects of carbon monoxide exposure on the heart function of infants, the results will be in the negative direction because investigators can assume that no research participant will benefit from carbon monoxide exposure.

The next steps include discerning the types, nature, and quantity of clinical outcomes to be included in the statistical computation of sample size. The investigators need to determine whether or not each clinical outcome follows a normal distribution and whether it is binary or continuous. This information is often obtained from previous studies in the same or similar populations of interest or from pilot studies on the research question of interest. After making this decision, the investigators determine the size of the difference they hope to detect from their research study by answering:
1) How large a difference would impact patients’ lives and/or clinical practice?
2) How large a difference are we expecting from this research study?

Once these considerations are made, the investigators are ready to compute their sample size. Depending on the answers to the questions above, the formula for the sample size will be different [17]. Considering the factors that affect sample size while consulting a statistician is an important step for sample size determination. Some factors that may influence the sample size of a research study are shown in Table 3.
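As a rough illustration of how the significance level, power, sidedness of the test, and expected difference feed into a sample size, the sketch below uses one widely taught normal-approximation formula for comparing the means of a continuous outcome between two equal-sized groups: n per group = 2 × (z for α + z for power)² × σ² / δ². This is an assumption made for the example and is not necessarily the formula intended in [17]. The function name and the numbers (δ = 5, σ = 10) are hypothetical. Other outcome types and designs need different formulas, which is one reason to consult a statistician.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80, two_sided=True):
    """Participants needed per group to detect a mean difference `delta`.

    Normal-approximation formula for two equal groups with common SD `sigma`:
        n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_power = z(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up so the target power is still reached


# Hypothetical inputs: detect a 5-point difference when the SD is 10 points.
print(sample_size_per_group(delta=5, sigma=10))                   # 63
print(sample_size_per_group(delta=5, sigma=10, two_sided=False))  # 50
print(sample_size_per_group(delta=5, sigma=10, power=0.95))
```

Under this approximation, a one-sided test or a larger expected difference lowers the required sample size, while a higher power or a stricter significance level raises it.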
