Content Validity in Classroom Tests
students' learning and facilitates teachers' decision-making in favour of students' individual needs and
teaching strategies (Siddiek, 2010). For that reason, educational tests should be characterized by quality.
This essay will first present the conceptualization of validity from the psychometric
perspective and its applicability to the classroom setting. Test Content will then be highlighted as
the ideal source of validity evidence for classroom tests. The Table of Specifications will be presented
as an effective planning tool to ensure Content Validity and to link objectives, instruction and
assessment in tests constructed by teachers. Lastly, some conclusions and implications will be
mentioned.
Validity Concept
Validity is the most fundamental consideration in developing and evaluating tests. The
American Psychological Association (APA), American Educational Research Association (AERA) and
National Council on Measurement in Education (NCME) (2014), in their Standards for Educational and
Psychological Testing, which is the most relevant and orthodox reference for developing and evaluating
tests (Elosua, 2003), define Validity as a unitary concept: the degree to which accumulated evidence
and theory support the interpretations of test scores for the proposed uses of the test (APA, AERA,
NCME, 2014).
It is emphasised that what is validated is the test scores' interpretations for specific uses, not
the test itself (APA, AERA, NCME, 2014). Another traditional but still relevant definition of Validity is
"the degree to which a test is measuring what it is supposed to measure" (Maizam, 2005).
In this line, validation is the process of collecting significant evidence, which is used as
arguments for and against the intended interpretation of test scores and their relevance to the
proposed use (Elosua, 2003). This evidence can be obtained from various sources of information
such as the Test Content, Response Processes, Internal Structure, Relationship to Other Variables,
and Consequences of the Test (see Figure 1).
Figure 1. The five sources of validity evidence: Test Content, Response Processes, Internal
Structure, Relationship to Other Variables, and Consequences of the Test.
Having mentioned this, the basis of a Validity study is to specify the intended uses and the
proposed interpretation of test scores with its rationale, as well as to determine the
construct (knowledge, skills, abilities, processes, competencies) the test is intended to measure.
Having spelled out the definition of Validity, the following ideas will centre on Test
Content Validity evidence, which is an essential property in test development, vital to ensure the
overall Validity and prerequisite for other types of validity evidence (Weller, 2015; Yusoff, 2019;
Zamanzadeh et al., 2014). It refers to the degree to which the test faithfully measures the domain
it is intended to assess (Sireci, 1998). It includes logical and empirical analyses concerning the degree to
which a test's content (themes, wording and format of the items, administration and scores)
represents the construct and is congruent with testing purposes (APA, AERA, NCME, 2014; Sireci &
Faulkner-Bond, 2014).
a) Domain definition: how the construct is operationally defined, usually through a
table of specifications detailing the content areas and sub-areas that the test is
designed to measure, with the specific content standards and objectives from the
curriculum.
b) Domain representation: how adequately the test represents and measures the
content and cognitive areas as defined in the test specifications; that is, whether the items
fully cover the defined domain.
c) Domain relevance: the extent to which each item on a test is relevant to the specific
domain the test is intended to measure (Sireci & Faulkner-Bond, 2014).
The traditional psychometric method to evaluate Content Validity is to calculate the Content
Validity Index (CVI) (see Figure 2), a percentage (calculated by a statistical formula) of the
agreement of a panel of subject matter experts who analyse the representativeness, adequacy,
comprehensiveness, appropriateness, relevance and clarity of each test item (Sireci & Faulkner-
Bond, 2014). They also provide, in consensus, valuable recommendations to improve the test. The
panel of experts should be experienced and qualified individuals in the content domain of the test. It
is recommended that at least five experts participate with their viewpoints in the Content Validity
analysis.
Figure 2. Content Validity Index (CVI) calculation formula.
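The CVI calculation described above can be sketched in a few lines of code. The example below follows the common convention that each expert rates item relevance on a 4-point scale and that ratings of 3 or 4 count as agreement (Yusoff, 2019); the function names and rating data are illustrative, not part of any standard library.

```python
# Sketch of Content Validity Index (CVI) calculation. Assumes the common
# convention of a 4-point relevance scale where ratings of 3 or 4 count as
# "relevant". All names and data below are illustrative.

def item_cvi(ratings):
    """I-CVI: proportion of experts rating the item 3 or 4 (relevant)."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

def scale_cvi_average(items):
    """S-CVI/Ave: mean of the I-CVIs across all test items."""
    icvis = [item_cvi(r) for r in items.values()]
    return sum(icvis) / len(icvis)

# Five experts rate three items on relevance (1 = not relevant ... 4 = highly relevant)
ratings = {
    "item_1": [4, 4, 3, 4, 3],   # full agreement -> I-CVI = 1.0
    "item_2": [4, 2, 3, 4, 3],   # one dissenter  -> I-CVI = 0.8
    "item_3": [2, 2, 3, 1, 2],   # weak item      -> I-CVI = 0.2
}
for name, r in ratings.items():
    print(name, item_cvi(r))
print("S-CVI/Ave:", round(scale_cvi_average(ratings), 2))
```

Items with a low I-CVI (item_3 here) are candidates for revision or removal before the test is administered.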
Another procedure to gather content validity evidence is alignment analysis, which evaluates
the degree to which the test's content appropriately represents the construct, related to its depth,
breadth, and cognitive complexity, generally through the comparative analysis of standards or a
curriculum framework with the test items (APA, AERA, NCME, 2014; Sireci & Faulkner-Bond, 2014).
Based on the above considerations, the focus of this essay will be on Content Validity
applied to the classroom test context. First of all, it is worth mentioning that the statistical methods
for analysing validity evidence, from Classical Test Theory and Item Response Theory, are very
complex and not always relevant to classroom settings, since Psychometric Validity Theory was
primarily developed for the large-scale assessment context, which is evidently different from the
classroom. A classroom test is an instrument administered by teachers to collect information and
assess students' level of certain knowledge, skills and abilities, to be interpreted in standardized
conditions for different purposes. Academic testing is a complex activity, as latent constructs or
abstract ideas such as aptitudes or knowledge cannot be
directly measured. Therefore, inferences or conclusions are drawn from secondary observable
behaviours. Classroom tests serve different purposes:
a) Summative (assessment of learning) aims to measure and certify students' achievement.
b) Formative (assessment for learning) aims to improve learning and teaching practice
(Wesolowski, 2020b).
These different purposes of classroom tests challenge traditional validation methods, which
sometimes fail to meet classroom test needs (Wesolowski, 2020b). Yet, in spite of some
incongruencies in the applicability and utility of Psychometric Validity Theory to classroom
settings, and the lack of literature on validation methods for the context and purposes of classroom
tests, Validity is fundamental in these settings to ensure good inferences of student achievement are
made, since many decisions are taken based on the interpretation of scores, impacting students'
academic paths. Hence, the interpretations resulting from any classroom assessment should be
validated through a logical argument (Bonner, 2013). To the extent that tests are developed with
quality, educational decisions can be trusted on the quality of evidence gathered (DiDonato-Barnes
et al., 2014). In other words, it is the confidence that the teacher makes quality inferences about
students' learning performance. Therefore, the validation process would include gathering evidence
to support the inferences about students' learning.
In this regard, the teacher's responsibility is to define the construct, identify the specific
observable behaviours, and align teaching with the observable behaviours (Wesolowski, 2020b).
b) Level of thinking process: the cognitive rigour of the test concerning the cognitive
level expected in students' performance.
For the classroom context, the most significant validity evidence to be gathered is the Test
Content since tests constructed by teachers should accurately reflect the content being taught and
the learning objectives from a curricular framework. Test Content Validity analysis consists of
assessing the representativeness of the teaching time and lesson plans with the test content
(Bonner, 2013; Weller, 2015). It would include a deep analysis of the items, tasks, format, wording
and the cognitive ability achieved in each topic studied in class. A way to carry it out is by the
following steps:
a) Planning the test: specifying the content, learning objectives and cognitive levels to
be assessed, for instance through a Table of Specifications (Weller, 2015).
b) Designing and developing the test: the test content and the number of items have to
be according to the emphasis given in teaching; each test item has to correlate with
a learning objective (Fives & DiDonato-Barnes, 2013).
c) Test evaluation: it is recommended that the test draft be submitted to a pilot study.
Besides, a peer educator (judge) can evaluate whether each test item measures the
determined purpose and content, analysing item quality and suggesting any
modification; the suggestions are incorporated, and a final draft test may be
resubmitted to a last analysis (Weller, 2015).
Given the above, it is pertinent to bring up some common mistakes in classroom tests that
affect Content Validity, such as ambiguous questions with multiple interpretations, biased items that
favour specific subgroups of students, and the use of jargon unfamiliar to students or a level of
language too difficult for them. Extremely difficult or extremely easy items should also be avoided,
as they do not discriminate between stronger and weaker performances.
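One practical way to spot such extreme items is to compute a facility (difficulty) index for each item: the proportion of students who answered it correctly. The sketch below is illustrative; the 0.2 to 0.8 acceptance band is a common rule of thumb rather than a fixed standard, and all data are hypothetical.

```python
# Illustrative facility (difficulty) index check: p = proportion of students
# answering an item correctly. The 0.2-0.8 band is a common rule of thumb,
# not a fixed standard; adjust it to the test's purpose.

def facility_index(responses):
    """Proportion of correct answers (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

def flag_extreme_items(item_responses, low=0.2, high=0.8):
    """Return items whose difficulty falls outside the acceptance band."""
    flags = {}
    for item, responses in item_responses.items():
        p = facility_index(responses)
        if p < low:
            flags[item] = ("too difficult", p)
        elif p > high:
            flags[item] = ("too easy", p)
    return flags

# Ten students' scored answers on three hypothetical items
answers = {
    "q1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # p = 0.9 -> too easy
    "q2": [1, 0, 1, 0, 1, 1, 0, 1, 0, 1],  # p = 0.6 -> acceptable
    "q3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # p = 0.1 -> too difficult
}
print(flag_extreme_items(answers))
```

Flagged items can then be revised, replaced, or rewritten at a more appropriate level of difficulty for the group.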
As previously mentioned, there are sophisticated statistical methods to calculate the judges'
coefficient of agreement in a content validation process. However, although it would be useful
in a classroom setting, it may be neither practical to perform nor relevant to teachers' daily
interests, given the lack of statistical preparation, supporting tools, time and available experts
(Yusoff, 2019). Therefore, in the following lines, an excellent tool accessible to teachers is presented.
Test construction is a real challenge for many teachers, and teacher-made tests often lack
Content Validity (Siddiek, 2010; Weller, 2015). A mismatch and lack of coherence between the
content covered during the learning experience and the summative assessment is prevalent in the
classroom context (Fives & DiDonato-Barnes, 2013). Students can perceive when a test is not
congruent with teaching.
The shortfall of correspondence between teaching and assessment mentioned before leads
to invalid conclusions about students' performance. For that reason, the Table of Specifications is
recommended as a practical tool to strengthen Test Content evidence of Validity (Bonner, 2013;
Fives & DiDonato-Barnes, 2013). It helps to align and make clear connections between learning
objectives, cognitive level, teaching and assessment (Ing et al., 2015).
The Table of Specifications helps achieve a well-structured test, balanced across each topic's
size and cognitive levels (Siddiek, 2010). This is done by determining the content domain, mapping
the amount of teaching time spent on each learning objective with the cognitive level at which it
was taught, and deciding the type of items that should be included (Fives & DiDonato-Barnes, 2013;
Zamanzadeh et al., 2014).
The Table of Specifications is a two-way chart that contains an extensive description of the
list of topic that should be included in a test, the cognitive level to be assessed, the emphasis
proportion for each one and the learning objectives intended. It also specifies the item format used
to evaluate each topic, the number of questions and the marks awarded for each question (Maizam,
2005).
The steps to build it are:
a. Construct the two-way chart: content topics listed down the left side, and the test
elements (learning objective, cognitive level, item format, number of items and mark
awarded) listed across the top columns.
b. Determine the test content: the topics and subtopics that the test will cover. They
have to be exactly the same content seen in teaching, according to the lesson plan.
c. Define the learning objectives: behaviours that students should be able to perform
after teaching. Bloom's Taxonomy can be used to determine the appropriate level of
cognitive demand.
d. Determine the item format that will be used to assess the specific cognitive level; it
is important to mention that any type of item can be written to assess any cognitive
level, choosing the most suitable format in each case.
e. Distribute the weight of teaching time based on the relevance given in class.
f. Determine the number of items for each element according to the amount of time
spent teaching it.
g. Determine the marks students can achieve on each element, following a strict
correspondence with the relevance given and the cognitive level assessed (Gul,
2016).
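Steps e to g above amount to a simple proportional allocation, which can be sketched as follows. The topic names, teaching times and totals are hypothetical, and because of rounding a teacher may still need to adjust the final counts so they sum to the intended totals.

```python
# Minimal sketch of steps e-g: distribute test items and marks across topics
# in proportion to teaching time. Topic names, minutes, and totals below are
# hypothetical. Rounding can leave the allocated sums slightly off the
# intended totals, so a final manual adjustment may be needed.

def build_tos(teaching_minutes, total_items, total_marks):
    """Allocate items and marks to each topic proportionally to teaching time."""
    total_time = sum(teaching_minutes.values())
    tos = {}
    for topic, minutes in teaching_minutes.items():
        weight = minutes / total_time
        tos[topic] = {
            "weight": round(weight, 2),
            "items": round(weight * total_items),
            "marks": round(weight * total_marks),
        }
    return tos

minutes = {"Colonies location": 30, "Colonies history": 90, "Geography skills": 180}
for topic, row in build_tos(minutes, total_items=15, total_marks=25).items():
    print(topic, row)
```

A topic taught for 60% of the available time thus receives roughly 60% of the items and marks, which is exactly the teaching-assessment correspondence the Table of Specifications is meant to enforce.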
As has been discussed, the Table of Specifications improves Content Validity by providing
the opportunity to outline the link between objectives, teaching and assessment. In an interesting
experimental study, undergraduate education students trained in test planning and the Table of
Specifications created better classroom tests, reflected in higher scores on a Test Content Validity
analysis compared with a control group that did not receive any assessment planning tool.
Table 1. Example of a Table of Specifications. The full chart crosses each content element with the
cognitive level assessed (Remember, Comprehend, Apply, Analyse, Evaluate, Create); reproduced
here are the learning objectives, item formats, weights, number of items and marks awarded.

I. Southern Colonies
I.I Southern Colonies location: Identify the Southern Colonies on a map (matching image; 10%; 1 item; 1 mark).
I.II Maryland colonization: Identify who colonized Maryland and explain why people colonized Maryland (no item; 10%; 0 items; 0 marks).
II. Southern Colonies History
II.I People from the Southern Colonies: Predict how people in each of the Southern Colonies made a living (short answer; 15%; 3 items; 5 marks).
II.II Fact and opinion: Describe the difference between fact and opinion (true/false; 15%; 3 items; 5 marks).
III. Exploring Southern Colonies
III.I Sub-topic: Apply geographic tools, including legends and symbols, to collect, analyze, and interpret data (practical exercise; 25%; 4 items; 7 marks).
III.II Sub-topic: Explain the geographic factors that influenced the development of plantations in the Southern Colonies (essay; 25%; 4 items; 7 marks).
In conclusion, classroom tests should be carefully constructed and revised frequently, since
many academic decisions are taken based on test scores (Ing et al., 2015; Siddiek, 2010). For that
reason, the importance of Validity cannot be overemphasized, mainly Test Content Validity, given
that teaching and assessment should be seen as a reciprocal process in which the cognitive level is
aligned between them. Content Validity benefits decision-making and assures quality evaluative
judgments about students' learning (DiDonato-Barnes et al., 2014; Weller, 2015).
A lack of Content Validity in a test arouses anxiety and stress in both teachers and students.
On one side, students feel uncertain and fearful about which content, types of questions and
cognitive levels will be evaluated. They can experience less motivation to engage in the learning
activities, and their efforts revolve only around collecting good marks rather than learning. Thus, the
way teachers design the test will determine what and how students are going to study for their
exams (Siddiek, 2010). On the other hand, teachers' labour comes to nothing, as the inferences from
the test scores will not reflect students' actual learning (Siddiek, 2010).
It is teachers' responsibility to build valid assessment instruments, not merely to give marks
but also to promote students' best achievement (Siddiek, 2010). Notwithstanding, a core problem
arises when teachers are expected to be responsible for elaborating high-quality tests.
There is an evident concern that teachers give little attention and preparation time to
developing high-quality classroom tests. They often have poor knowledge of testing methods and
test quality properties such as Validity, Reliability, item analysis, and the taxonomy of educational
objectives. Many educational programmes do not include an extensive course on assessment
methods; therefore, teachers have fundamental gaps in their understanding (Bonner, 2013; Siddiek,
2010).
For that reason, appropriate assessment training needs to be provided so teachers can
formulate valid tests that collect reliable data to make accurate inferences of students' level of
achievement. Assessment courses should be incorporated into teaching programmes, particularly
on topics related to validity, reliability and test planning, so that teachers can outline coherent links
between learning objectives, teaching and assessment. Nonetheless, it should be mentioned that
teachers often have a low understanding of developing a Table of Specifications, even though it is
one of the most accessible tools for building valid assessment instruments. Also, there is a need to
provide appropriate training for teachers on designing and developing a Table of Specifications (Ing
et al., 2015). All the same, using a Table of Specifications does not guarantee an easy way to a
quality assessment instrument; the level of expertise in teaching and evaluation may improve the
development of the ToS (DiDonato-Barnes et al., 2014).
References
APA, AERA, NCME. (2014). Standards for Educational and Psychological Testing. American
Educational Research Association.
Bonner, S. M. (2013). Validity in classroom assessment: Purposes, properties, and principles. In J. H.
McMillan (Ed.), SAGE Handbook of Research on Classroom Assessment. SAGE.
https://doi.org/10.4135/9781452218649.n6
Fives, H., & DiDonato-Barnes, N. (2013). Classroom test construction: The power of a table of
specifications. Practical Assessment, Research & Evaluation, 18(3).
https://doi.org/10.5539/ass.v11n5p193
Maizam, A. (2005). Assessment of learning outcomes: Validity and reliability of classroom test.
World Transactions on Engineering and Technology Education, 4(2).
Siddiek, A. G. (2010). The impact of test content validity on language teaching and learning. Asian
Social Science, 6(12).
Sireci, S. (1998). Gathering and Analyzing Content Validity Data. Educational Assessment, 5(4), 299–
321. https://doi.org/10.1207/s15326977ea0504_2
Sireci, S., & Faulkner-Bond, M. (2014). Validity evidence based on test content. Psicothema, 26(1),
100–107. https://doi.org/10.7334/psicothema2013.256
Weller, L. D. (2015). Building Validity and Reliability into Classroom Tests. NASSP Bulletin, 85(622),
32–37. https://doi.org/10.1177/019263650108562205
Wesolowski, B. C. (2020). "Classroometrics": The validity, reliability, and fairness of classroom music
assessments. Music Educators Journal, 106(3). https://doi.org/10.1177/0027432119894634
Yusoff, M. S. B. (2019). ABC of content validation and content validity index calculation. Education
in Medicine Journal, 11(2), 49–54.
Zamanzadeh, V., Rassouli, M., Abbaszadeh, A., Alavi Majd, H., Nikanfar, A., & Ghahramanian, A.
(2014). Details of content validity and objectifying it in instrument development. Nursing
Practice Today, 1(3), 163–171.