Content Validity in Classroom Tests
students' learning and facilitates teachers' decision-making in favour of students' individual needs and
teaching strategies (Siddiek, 2010). For that reason, educational tests should be characterized by quality.
This essay will first present the conceptualization of validity from the psychometric
perspective and its applicability to the classroom setting. Test Content will then be highlighted as
the ideal source of validity evidence for classroom tests. The Table of Specifications will be presented
as an effective planning tool to ensure Content Validity and to link objectives, instruction and
assessment in tests constructed by teachers. Lastly, some conclusions and implications will be
mentioned.
Validity Concept
Validity is the most fundamental consideration in developing and evaluating tests. The
American Psychological Association (APA), American Educational Research Association (AERA) and
National Council on Measurement in Education (NCME) (2014), in their Standards for Educational and
Psychological Testing, which is the most relevant and orthodox reference for developing and evaluating
tests (Elosua, 2003), define Validity as a unitary concept: the degree to which accumulated evidence
and theory support the interpretations of test scores for the proposed uses of the test (APA, AERA,
NCME, 2014).
It is emphasised that what is validated is the test scores' interpretations for specific uses, not
the test itself (APA, AERA, NCME, 2014). Another traditional but still relevant definition of Validity is
"the degree to which a test is measuring what it is supposed to measure" (Maizam, 2005).
In this line, validation is the process of collecting significant evidence, which is used as
arguments for and against the intended interpretation of test scores and their relevance to the
proposed use (Elosua, 2003). This evidence can be obtained from various sources of information
such as the Test Content, Response Processes, Internal Structure, Relationship to Other Variables,
and Consequences of the Test (see Figure 1).
Figure 1. The five sources of validity evidence: Test Content, Response Processes, Internal
Structure, Relationship to Other Variables, and Consequences of the Test.
Having mentioned this, the basis of a Validity study is to specify the intended uses and the
proposed interpretation of test scores with its rationale, as well as to determine the
construct (knowledge, skills, abilities, processes, competencies) the test is intended to measure.
Having spelled out the definition of Validity, the following ideas will centre on Test
Content Validity evidence, which is an essential property in test development, vital to ensure the
overall Validity and prerequisite for other types of validity evidence (Weller, 2015; Yusoff, 2019;
Zamanzadeh et al., 2014). It refers to the degree to which the test faithfully measures the domain
it is intended to assess (Sireci, 1998). It includes logical and empirical analyses concerning the degree to
which a test's content (themes, wording and format of the items, administration and scores)
represents the construct and is congruent with testing purposes (APA, AERA, NCME, 2014; Sireci &
Faulkner-Bond, 2014).
a) Domain definition: how the construct is operationally defined, usually through a
table of specifications detailing the content areas and sub-areas that the test is
designed to measure, with the specific content standards and objectives from the
curriculum.
b) Domain representation: how adequately the test represents and measures the
content and cognitive areas as defined in the test specifications; that is, whether the items
fully cover the defined domain.
c) Domain relevance: the extent to which each item on a test is relevant to the specific
domain the test is intended to measure (Sireci & Faulkner-Bond, 2014).
The traditional psychometric method to evaluate Content Validity is to calculate the Content
Validity Index (CVI) (see Figure 2), a percentage (calculated by a statistical formula) of the
agreement of a panel of subject matter experts who analyse the representativeness, adequacy,
comprehensiveness, appropriateness, relevance and clarity of each test item (Sireci & Faulkner-
Bond, 2014). They also provide, in consensus, valuable recommendations to improve the test. The
panel of experts should be experienced and qualified individuals in the content domain of the test. It
is recommended that at least five experts participate with their viewpoints in the Content Validity
analysis.
Figure 2. Content Validity Index (CVI) calculation formula.
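The CVI calculation described above can be sketched in a few lines of code. The example below follows the common convention that each expert rates item relevance on a 4-point scale and that ratings of 3 or 4 count as agreement (Yusoff, 2019); the function names and rating data are illustrative, not part of any standard library.

```python
# Sketch of Content Validity Index (CVI) calculation. Assumes the common
# convention of a 4-point relevance scale where ratings of 3 or 4 count as
# "relevant". All names and data below are illustrative.

def item_cvi(ratings):
    """I-CVI: proportion of experts rating the item 3 or 4 (relevant)."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

def scale_cvi_average(items):
    """S-CVI/Ave: mean of the I-CVIs across all test items."""
    icvis = [item_cvi(r) for r in items.values()]
    return sum(icvis) / len(icvis)

# Five experts rate three items on relevance (1 = not relevant ... 4 = highly relevant)
ratings = {
    "item_1": [4, 4, 3, 4, 3],   # full agreement -> I-CVI = 1.0
    "item_2": [4, 2, 3, 4, 3],   # one dissenter  -> I-CVI = 0.8
    "item_3": [2, 2, 3, 1, 2],   # weak item      -> I-CVI = 0.2
}
for name, r in ratings.items():
    print(name, item_cvi(r))
print("S-CVI/Ave:", round(scale_cvi_average(ratings), 2))
```

Items with a low I-CVI (item_3 here) are candidates for revision or removal before the test is administered.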
Another procedure to gather content validity evidence is alignment analysis, which evaluates
the degree to which the test's content appropriately represents the construct, related to its depth,
breadth, and cognitive complexity, generally through the comparative analysis of standards or a
curriculum framework with the test items (APA, AERA, NCME, 2014; Sireci & Faulkner-Bond, 2014).
Based on the above considerations, the focus of this essay will be on Content Validity
applied to the classroom test context. First of all, it is worth mentioning that the statistical methods
for analysing validity evidence, from Classical Test Theory and Item Response Theory, are very
complex and not always relevant to classroom settings, since Psychometric Validity Theory was
primarily developed for the large-scale assessment context, which is evidently different from the
classroom. A classroom test is an instrument administered by teachers to collect information and
assess students' level of certain knowledge, skills and abilities, to be interpreted in standardized
conditions for different purposes. Academic testing is a complex activity, as latent constructs or
abstract ideas such as aptitudes or knowledge cannot be
directly measured. Therefore, inferences or conclusions are drawn from secondary observable
behaviours. Classroom tests serve different purposes:
a) Summative (assessment of learning) aims to measure and certify students' achievement.
b) Formative (assessment for learning) aims to improve learning and teaching practice
(Wesolowski, 2020b).
These different purposes of classroom tests challenge traditional validation methods, which
sometimes fail to meet classroom test needs (Wesolowski, 2020b). Yet, in spite of some
incongruencies in the applicability and utility of Psychometric Validity Theory to classroom
settings, and the lack of literature on validation methods for the context and purposes of classroom
tests, Validity is fundamental in these settings to ensure good inferences of student achievement are
made, since many decisions are taken based on the interpretation of scores, impacting students'
academic paths. Hence, the interpretations resulting from any classroom assessment should be
validated through a logical argument (Bonner, 2013). To the extent that tests are developed with
quality, educational decisions can be trusted on the quality of evidence gathered (DiDonato-Barnes
et al., 2014). In other words, it is the confidence that the teacher makes quality inferences about
students' learning performance. Therefore, the validation process would include gathering evidence
to support the inferences about students' learning.
In this regard, the teacher's responsibility is to define the construct, identify the specific
observable behaviours, and align teaching with the observable behaviours (Wesolowski, 2020b).
b) Level of thinking process: the cognitive rigour of the test concerning the cognitive
level expected in students' performance.
For the classroom context, the most significant validity evidence to be gathered is the Test
Content since tests constructed by teachers should accurately reflect the content being taught and
the learning objectives from a curricular framework. Test Content Validity analysis consists of
assessing the representativeness of the teaching time and lesson plans with the test content
(Bonner, 2013; Weller, 2015). It would include a deep analysis of the items, tasks, format, wording
and the cognitive ability achieved in each topic studied in class. A way to carry it out is by the
following steps:
a) Planning the test: specifying the content, learning objectives and cognitive levels to
be assessed, for instance through a Table of Specifications (Weller, 2015).
b) Designing and developing the test: the test content and the number of items have to
be according to the emphasis given in teaching; each test item has to correlate with
a learning objective (Fives & DiDonato-Barnes, 2013).
c) Test evaluation: it is recommended that the test draft be submitted to a pilot study.
Besides, a peer educator (judge) can evaluate whether each test item measures the
determined purpose and content, analysing item quality and suggesting any
modification; the suggestions are incorporated, and a final draft test may be
resubmitted to a last analysis (Weller, 2015).
Given the above, it is pertinent to bring up some common mistakes in classroom tests that
affect Content Validity, such as ambiguous questions with multiple interpretations, biased items that
favour specific subgroups of students, and the use of jargon unfamiliar to students or a level of
language too difficult for them. Extremely difficult or extremely easy items should also be avoided,
as they do not discriminate between stronger and weaker performances.
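One practical way to spot such extreme items is to compute a facility (difficulty) index for each item: the proportion of students who answered it correctly. The sketch below is illustrative; the 0.2 to 0.8 acceptance band is a common rule of thumb rather than a fixed standard, and all data are hypothetical.

```python
# Illustrative facility (difficulty) index check: p = proportion of students
# answering an item correctly. The 0.2-0.8 band is a common rule of thumb,
# not a fixed standard; adjust it to the test's purpose.

def facility_index(responses):
    """Proportion of correct answers (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

def flag_extreme_items(item_responses, low=0.2, high=0.8):
    """Return items whose difficulty falls outside the acceptance band."""
    flags = {}
    for item, responses in item_responses.items():
        p = facility_index(responses)
        if p < low:
            flags[item] = ("too difficult", p)
        elif p > high:
            flags[item] = ("too easy", p)
    return flags

# Ten students' scored answers on three hypothetical items
answers = {
    "q1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # p = 0.9 -> too easy
    "q2": [1, 0, 1, 0, 1, 1, 0, 1, 0, 1],  # p = 0.6 -> acceptable
    "q3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # p = 0.1 -> too difficult
}
print(flag_extreme_items(answers))
```

Flagged items can then be revised, replaced, or rewritten at a more appropriate level of difficulty for the group.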
As previously mentioned, there are sophisticated statistical methods to calculate the judges'
coefficient of agreement in a content validation process. However, although it would be useful
in a classroom setting, it may be neither practical to perform nor relevant to teachers' daily
interests, given the lack of statistical preparation, supporting tools, time and available experts
(Yusoff, 2019). Therefore, in the following lines, an excellent tool accessible to teachers is presented.
Test construction is a real challenge for many teachers, and teacher-made tests often lack
Content Validity (Siddiek, 2010; Weller, 2015). A mismatch and lack of coherence between the
content covered during the learning experience and the summative assessment is prevalent in the
classroom context (Fives & DiDonato-Barnes, 2013). Students can perceive when a test is not
congruent with teaching.
The shortfall of correspondence between teaching and assessment mentioned before leads
to invalid conclusions about students' performance. For that reason, the Table of Specifications is
recommended as a practical tool to strengthen Test Content evidence of Validity (Bonner, 2013;
Fives & DiDonato-Barnes, 2013). It helps to align and make clear connections between learning
objectives, cognitive level, teaching and assessment (Ing et al., 2015).
The Table of Specifications helps achieve a well-structured test, balanced across each topic's
size and cognitive levels (Siddiek, 2010). This is done by determining the content domain, mapping
the amount of teaching time spent on each learning objective with the cognitive level at which it
was taught, and deciding the type of items that should be included (Fives & DiDonato-Barnes, 2013;
Zamanzadeh et al., 2014).
The Table of Specifications is a two-way chart that contains an extensive description of the
list of topic that should be included in a test, the cognitive level to be assessed, the emphasis
proportion for each one and the learning objectives intended. It also specifies the item format used
to evaluate each topic, the number of questions and the marks awarded for each question (Maizam,
2005).
The steps to build it are:
a. Construct the two-way chart: content topics listed down the left side, and the test
elements (learning objective, cognitive level, item format, number of items and mark
awarded) listed across the top columns.
b. Determine the test content: the topics and subtopics that the test will cover. They
have to be exactly the same content seen in teaching, according to the lesson plan.
c. Define the learning objectives: behaviours that students should be able to perform
after teaching. Bloom's Taxonomy can be used to determine the appropriate level of
cognitive demand.
d. Determine the item format that will be used to assess the specific cognitive level; it
is important to mention that any type of item can be written to assess any cognitive
level, choosing the most suitable format in each case.
e. Distribute the weight of teaching time based on the relevance given in class.
f. Determine the number of items for each element according to the amount of time
spent teaching it.
g. Determine the marks students can achieve on each element, following a strict
correspondence with the relevance given and the cognitive level assessed (Gul,
2016).
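Steps e to g above amount to a simple proportional allocation, which can be sketched as follows. The topic names, teaching times and totals are hypothetical, and because of rounding a teacher may still need to adjust the final counts so they sum to the intended totals.

```python
# Minimal sketch of steps e-g: distribute test items and marks across topics
# in proportion to teaching time. Topic names, minutes, and totals below are
# hypothetical. Rounding can leave the allocated sums slightly off the
# intended totals, so a final manual adjustment may be needed.

def build_tos(teaching_minutes, total_items, total_marks):
    """Allocate items and marks to each topic proportionally to teaching time."""
    total_time = sum(teaching_minutes.values())
    tos = {}
    for topic, minutes in teaching_minutes.items():
        weight = minutes / total_time
        tos[topic] = {
            "weight": round(weight, 2),
            "items": round(weight * total_items),
            "marks": round(weight * total_marks),
        }
    return tos

minutes = {"Colonies location": 30, "Colonies history": 90, "Geography skills": 180}
for topic, row in build_tos(minutes, total_items=15, total_marks=25).items():
    print(topic, row)
```

A topic taught for 60% of the available time thus receives roughly 60% of the items and marks, which is exactly the teaching-assessment correspondence the Table of Specifications is meant to enforce.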
As has been discussed, the Table of Specifications improves Content Validity by providing
the opportunity to outline the link between objectives, teaching and assessment. In an interesting
experimental study, undergraduate education students trained in test planning and the Table of
Specifications created better classroom tests, reflected in higher scores on a Test Content Validity
analysis compared with a control group that did not receive any assessment planning tool.
Table 1. Example of a Table of Specifications. The full chart crosses each content element with the
cognitive level assessed (Remember, Comprehend, Apply, Analyse, Evaluate, Create); reproduced
here are the learning objectives, item formats, weights, number of items and marks awarded.

I. Southern Colonies
I.I Southern Colonies location: Identify the Southern Colonies on a map (matching image; 10%; 1 item; 1 mark).
I.II Maryland colonization: Identify who colonized Maryland and explain why people colonized Maryland (no item; 10%; 0 items; 0 marks).
II. Southern Colonies History
II.I People from the Southern Colonies: Predict how people in each of the Southern Colonies made a living (short answer; 15%; 3 items; 5 marks).
II.II Fact and opinion: Describe the difference between fact and opinion (true/false; 15%; 3 items; 5 marks).
III. Exploring Southern Colonies
III.I Sub-topic: Apply geographic tools, including legends and symbols, to collect, analyze, and interpret data (practical exercise; 25%; 4 items; 7 marks).
III.II Sub-topic: Explain the geographic factors that influenced the development of plantations in the Southern Colonies (essay; 25%; 4 items; 7 marks).
In conclusion, classroom tests should be carefully constructed and revised frequently, since
many academic decisions are taken based on test scores (Ing et al., 2015; Siddiek, 2010). For that
reason, the importance of Validity cannot be overemphasized, mainly Test Content Validity, given
that teaching and assessment should be seen as a reciprocal process in which the cognitive level is
aligned between them. Content Validity benefits decision-making and assures quality evaluative
judgments about students' learning (DiDonato-Barnes et al., 2014; Weller, 2015).
A lack of Content Validity in a test arouses anxiety and stress in both teachers and students.
On one side, students feel uncertain and fearful about which content, types of questions and
cognitive levels will be evaluated. They can experience less motivation to engage in the learning
activities, and their efforts revolve only around collecting good marks rather than learning. Thus, the
way teachers design the test will determine what and how students are going to study for their
exams (Siddiek, 2010). On the other hand, teachers' labour comes to nothing, as the inferences from
the test scores will not reflect students' actual learning (Siddiek, 2010).
It is teachers' responsibility to build valid assessment instruments, not merely to give marks
but also to promote students' best achievement (Siddiek, 2010). Notwithstanding, a core problem
arises when teachers are expected to be responsible for elaborating high-quality tests.
There is an evident concern that teachers give little attention and preparation time to
developing high-quality classroom tests. They often have poor knowledge of testing methods and
test quality properties such as Validity, Reliability, item analysis, and the taxonomy of educational
objectives. Many educational programmes do not include an extensive course on assessment
methods; therefore, teachers have fundamental gaps in their understanding (Bonner, 2013; Siddiek,
2010).
For that reason, appropriate assessment training needs to be provided so teachers can
formulate valid tests that collect reliable data to make accurate inferences of students' level of
achievement. Assessment courses should be incorporated into teaching programmes, particularly
on topics related to validity, reliability and test planning, so that teachers can outline coherent links
between learning objectives, teaching and assessment. Nonetheless, it should be mentioned that
teachers often have a low understanding of developing a Table of Specifications, even though it is
one of the most accessible tools for building valid assessment instruments. Also, there is a need to
provide appropriate training for teachers on designing and developing a Table of Specifications (Ing
et al., 2015). All the same, using a Table of Specifications does not guarantee an easy way to a
quality assessment instrument; the level of expertise in teaching and evaluation may improve the
development of the ToS (DiDonato-Barnes et al., 2014).
References
APA, AERA, NCME. (2014). Standards for Educational and Psychological Testing. American
Educational Research Association.
Bonner, S. M. (2013). Validity in classroom assessment: Purposes, properties, and principles. In J. H.
McMillan (Ed.), SAGE Handbook of Research on Classroom Assessment. SAGE.
https://doi.org/10.4135/9781452218649.n6
Fives, H., & DiDonato-Barnes, N. (2013). Classroom test construction: The power of a table of
specifications. Practical Assessment, Research & Evaluation, 18(3).
https://doi.org/10.5539/ass.v11n5p193
Maizam, A. (2005). Assessment of learning outcomes: Validity and reliability of classroom test.
World Transactions on Engineering and Technology Education, 4(2).
Siddiek, A. G. (2010). The impact of test content validity on language teaching and learning. Asian
Social Science, 6(12).
Sireci, S. (1998). Gathering and Analyzing Content Validity Data. Educational Assessment, 5(4), 299–
321. https://doi.org/10.1207/s15326977ea0504_2
Sireci, S., & Faulkner-Bond, M. (2014). Validity evidence based on test content. Psicothema, 26(1),
100–107. https://doi.org/10.7334/psicothema2013.256
Weller, L. D. (2015). Building Validity and Reliability into Classroom Tests. NASSP Bulletin, 85(622),
32–37. https://doi.org/10.1177/019263650108562205
Wesolowski, B. C. (2020). "Classroometrics": The validity, reliability, and fairness of classroom music
assessments. Music Educators Journal, 106(3). https://doi.org/10.1177/0027432119894634
Yusoff, M. S. B. (2019). ABC of content validation and content validity index calculation. Education
in Medicine Journal, 11(2), 49–54.
Zamanzadeh, V., Rassouli, M., Abbaszadeh, A., Alavi Majd, H., Nikanfar, A., & Ghahramanian, A.
(2014). Details of content validity and objectifying it in instrument development. Nursing
Practice Today, 1(3), 163–171.