
MSc Education (Neuroscience and Education)

Assessment in Schools EDUC M0037



Content Validity in Classroom Tests


Validity in Classroom Tests

Assessment is a fundamental activity in teaching; classroom tests promote students' learning and facilitate teachers' decision-making in response to students' individual needs and teaching strategies (Siddiek, 2010). For that reason, educational tests should exhibit quality properties such as validity, reliability and fairness.

This essay will first trace the conceptualisation of validity from the psychometric perspective through to its applicability in the classroom setting. Test Content will then be highlighted as the ideal source of validity evidence for classroom tests, and the Table of Specifications will be presented as an effective planning tool to ensure Content Validity and to link objectives, instruction and assessment in teacher-constructed tests. Lastly, some conclusions and implications will be discussed.

Validity Concept

Validity is the most fundamental consideration in developing and evaluating tests. The American Psychological Association (APA), American Educational Research Association (AERA) and National Council on Measurement in Education (NCME) (2014), in their Standards for Educational and Psychological Testing, the most relevant and authoritative reference for developing and evaluating tests (Elosua, 2003), define Validity as a unitary concept: the degree to which accumulated evidence and theory support the interpretations of test scores for the proposed uses of the test (APA, AERA, NCME, 2014).

It must be emphasised that what is validated is the interpretation of test scores for specific uses, not the test itself (APA, AERA, NCME, 2014). Another traditional but still relevant definition of Validity is "the degree to which a test is measuring what it is supposed to measure" (Maizam, 2005).

In this vein, validation is the process of collecting significant evidence to serve as arguments for and against the intended interpretation of test scores and their relevance to the proposed use (Elosua, 2003). This evidence can be obtained from various sources of information, such as the Test Content, Response Processes, Internal Structure, Relationship to Other Variables and Consequences of the Test (APA, AERA, NCME, 2014).

Figure 1

Validity evidence sources: Test Content, Response Processes, Internal Structure, Relationship to Other Variables and Consequences of the Test.

Having mentioned this, the basis of a Validity study is to specify the intended uses and the proposed interpretation of test scores, together with their rationale, as well as to determine the construct (knowledge, skills, abilities, processes, competencies) the test is intended to measure (APA, AERA, NCME, 2014; Sireci & Faulkner-Bond, 2014).

Test Content Validity evidence

Having spelled out the definition of Validity, the following ideas centre on Test Content Validity evidence, an essential property in test development, vital to ensuring overall Validity and a prerequisite for other types of validity evidence (Weller, 2015; Yusoff, 2019; Zamanzadeh et al., 2014). It refers to the degree to which the test faithfully measures the domain it is intended to assess (Sireci, 1998). It includes logical and empirical analyses concerning the degree to which a test's content (themes, wording and format of the items, administration and scoring) represents the construct and is congruent with the testing purposes (APA, AERA, NCME, 2014; Sireci & Faulkner-Bond, 2014).

Four main elements are considered in Content Validity analysis:

a) Domain definition: an operational definition of the construct measured, using a table of specifications that details the content areas and sub-areas the test is designed to measure, together with the specific content standards and objectives from the curricular framework attached to these contents.

b) Domain representation: how adequately the test represents and measures the content and cognitive areas defined in the test specifications; that is, whether the items are proportional to the construct.

c) Domain relevance: the extent to which each item on a test is relevant to the specific aspects of the test objectives.

d) Appropriateness of the development process: the procedures followed during test development, such as technical accuracy, quality of writing and sensitivity review of the items to avoid content that is offensive to, or advantages or disadvantages, particular subgroups of students (Sireci, 1998; Sireci & Faulkner-Bond, 2014; Yusoff, 2019).

The traditional psychometric method to evaluate Content Validity is to calculate the Content Validity Index (CVI) (see Figure 2), which is a percentage, calculated by a statistical formula, of the agreement among a panel of subject matter experts who analyse the representativeness, adequacy, comprehensiveness, appropriateness, relevance and clarity of each test item (Sireci & Faulkner-Bond, 2014).

The experts also provide, by consensus, valuable recommendations to improve the test. The panel should comprise experienced and qualified individuals in the content domain of the test, and it is recommended that at least five experts contribute their viewpoints to the Content Validity analysis (Sireci & Faulkner-Bond, 2014; Zamanzadeh et al., 2014).

Figure 2

The definition and formula of the CVI

(Yusoff, 2019, p. 53)
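As a concrete illustration, the following minimal sketch (in Python, with hypothetical expert ratings rather than data from any cited study) implements the CVI calculation Yusoff (2019) describes: each expert rates every item's relevance on a four-point scale, the item-level index (I-CVI) is the proportion of experts rating the item 3 or 4, and the scale-level index (S-CVI/Ave) is the mean of the I-CVIs.

from statistics import mean

# Hypothetical ratings: one row per item, one column per expert (scale 1-4).
ratings = [
    [4, 4, 3, 4, 3],  # item 1
    [3, 4, 4, 2, 4],  # item 2
    [2, 3, 1, 2, 3],  # item 3: low agreement, a candidate for revision
]

def item_cvi(item_ratings):
    """I-CVI: proportion of experts rating the item as relevant (3 or 4)."""
    relevant = sum(1 for r in item_ratings if r >= 3)
    return relevant / len(item_ratings)

i_cvis = [item_cvi(item) for item in ratings]
s_cvi_ave = mean(i_cvis)  # S-CVI/Ave: mean of the I-CVIs across all items

for i, cvi in enumerate(i_cvis, start=1):
    print(f"Item {i}: I-CVI = {cvi:.2f}")
print(f"S-CVI/Ave = {s_cvi_ave:.2f}")

With a panel of five experts, Yusoff (2019) suggests that items should ideally reach an I-CVI of 1.00; items falling well below that are returned to the panel for revision.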

In educational testing, an alignment study is emphasised as Content Validity evidence: the degree to which the test's content appropriately represents the construct in terms of its depth, breadth and cognitive complexity, generally established through a comparative analysis of the standards or curriculum framework against the test items (APA, AERA, NCME, 2014; Sireci & Faulkner-Bond, 2014).

Validity in the Classroom Test

Based on the above considerations, the focus of this essay now turns to Content Validity applied to the classroom test context. First of all, it is worth mentioning that the statistical methods for analysing validity evidence, from Classical Test Theory and Item Response Theory, are very complex and not always relevant to classroom settings, since psychometric Validity theory was primarily developed for the large-scale assessment context, which differs evidently from classroom test purposes (Elosua, 2003; Wesolowski, 2020b).


It is important to bear in mind that a classroom test is a systematic and uniform tool administered by teachers to collect information and assess students' level of certain knowledge, skills and abilities, to be interpreted under standardised conditions for different purposes. Academic testing is a complex activity because latent constructs, or abstract ideas such as aptitudes or knowledge, cannot be directly measured; therefore, inferences or conclusions are drawn from secondary observable behaviours through task performance (Maizam, 2005; Wesolowski, 2020b).

As mentioned before, classroom tests are student-centred, non-standardised and student-learning-based. The main purposes of classroom assessment are:

a) Summative (assessment of learning): aims to measure and draw inferences from scores to report students' level of learning. It sometimes carries highly consequential purposes such as promotion, certification and selection.

b) Formative (assessment for learning): aims to improve learning and teaching practice. It highlights feedback and students' involvement in the assessment process (Wesolowski, 2020b).

These different purposes of the classroom test challenge traditional validation methods, which sometimes fail to meet classroom testing needs (Wesolowski, 2020b). Yet, in spite of some incongruencies in the applicability and utility of psychometric Validity theory to classroom settings, and the lack of literature on validation methods suited to the context and purposes of classroom tests, Validity is fundamental in these settings to ensure that sound inferences of student achievement are made, since many decisions are taken based on the interpretation of scores, impacting students' academic lives, learning outcomes and motivation (Wesolowski, 2020a).

Therefore, any interpretation and use of the information resulting from a classroom assessment, whether qualitative or quantitative, should be validated through a logical argument (Bonner, 2013). That is, to the extent that tests are developed with quality, the educational decisions that follow will be of quality (Wesolowski, 2020a).


Thus, in the classroom test context, Validity is the degree to which the evaluation of students can be trusted, given the quality of the evidence gathered (DiDonato-Barnes et al., 2014). In other words, it is the confidence that the teacher is making quality inferences about students' learning performance. The validation process would therefore include gathering evidence to support the inferences about student achievement made from test performance (Wesolowski, 2020b).

In this regard, the teacher's responsibility is to define the construct, identify the specific observable behaviours, and align teaching with those observable behaviours (Wesolowski, 2020b). Wesolowski (2020a) mentions three important aspects of classroom test Validity:

a) Relevance: alignment between the national curriculum, objectives, content taught and the test content.

b) Level of thinking process: the cognitive rigour of the test relative to the cognitive rigour of the class content.

c) Congruency: the relationship of the outcome with previous patterns of student performance.

Test content-based evidence in Classroom Tests

For the classroom context, the most significant validity evidence to be gathered is the Test Content, since tests constructed by teachers should accurately reflect the content being taught and the learning objectives from a curricular framework. Test Content Validity analysis consists of assessing how representative the test content is of the teaching time and lesson plans (Bonner, 2013; Weller, 2015). It would include a deep analysis of the items, tasks, format, wording and level of cognitive processing required of students (DiDonato-Barnes et al., 2014).

To assure Content Validity in the development and administration process of a classroom test, the following elements should be taken into account:

a) Determine the test's objectives: the criterion used to determine the level of cognitive ability achieved for each topic studied in class. One way to carry this out is through taxonomies of educational objectives (Maizam, 2005; Siddiek, 2010).

b) Design and develop the test: the test content and the number of items have to accord with the emphasis given in teaching, and each test item has to correspond to an objective of the content taught. One important tool is a table of specifications, which provides a workspace to connect objectives, instruction and assessment (Fives & DiDonato-Barnes, 2013).

c) Evaluate the test: it is recommended that the test draft be submitted to a pilot study. In addition, a peer educator (judge) can evaluate whether each test item measures the determined purpose and content, analysing item quality and suggesting amendments to assure Validity. Any recommendation given by the judge should be incorporated, and the final draft may be resubmitted for a last analysis (Weller, 2015).

Given the above, it is pertinent to bring up some mistakes in classroom tests that affect Content Validity, such as ambiguous questions open to multiple interpretations, biased items that favour specific subgroups of students, and the use of unfamiliar jargon or a level of language too difficult for students. Items that are extremely difficult or extremely easy should also be avoided, as they do not distinguish good from poor performance (Maizam, 2005).
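To illustrate how such flawed items can be detected after a test administration, here is a minimal sketch (in Python, with hypothetical scored responses; the data and flagging thresholds are illustrative assumptions, not values from the cited literature) of a classical item analysis: item difficulty is the proportion of students answering correctly, and a simple discrimination index compares higher- and lower-scoring students.

# Hypothetical scored responses: one row per student, one column per item
# (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]

n_students = len(responses)
n_items = len(responses[0])
totals = [sum(row) for row in responses]

# Split students into upper and lower halves by total score.
order = sorted(range(n_students), key=lambda s: totals[s], reverse=True)
half = n_students // 2
upper, lower = order[:half], order[-half:]

for j in range(n_items):
    difficulty = sum(row[j] for row in responses) / n_students
    p_upper = sum(responses[s][j] for s in upper) / half
    p_lower = sum(responses[s][j] for s in lower) / half
    discrimination = p_upper - p_lower
    # Illustrative thresholds: extremely easy or hard items cannot separate
    # good from poor performance, so they are flagged for review.
    flag = "  <- review" if difficulty > 0.9 or difficulty < 0.2 else ""
    print(f"Item {j + 1}: difficulty = {difficulty:.2f}, "
          f"discrimination = {discrimination:.2f}{flag}")

In this toy data set, item 1 (answered correctly by everyone) and item 3 (answered correctly by no one) are flagged: both have zero discrimination, which is exactly the situation Maizam (2005) warns against.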

As previously mentioned, there are sophisticated statistical methods for calculating the judges' coefficient of agreement in a content validation process. However useful this might be in a classroom setting, it may be neither practical nor relevant to teachers' day-to-day work, given the lack of statistical preparation, tools to support the analysis, time and available experts (Yusoff, 2019). Therefore, in the following lines, an excellent tool accessible to teachers is proposed to assure Validity in classroom tests.


Table of Specification

Test construction is a real challenge for many teachers, and teacher-made tests often lack Content Validity (Siddiek, 2010; Weller, 2015). A mismatch and lack of coherence between the content of the learning experience and the summative assessment are prevalent in the classroom context (Fives & DiDonato-Barnes, 2013). Students can perceive when a test is not congruent with teaching, producing discontent and a sense of unfairness in assessment (Ing et al., 2015).

The shortfall of correspondence between teaching and assessment mentioned above leads to invalid conclusions about students' performance. For that reason, the Table of Specification becomes absolutely relevant in the classroom assessment context, as it provides a source of Test Content evidence of Validity (Bonner, 2013; Fives & DiDonato-Barnes, 2013). It helps to align and make clear connections between learning objectives, cognitive level, teaching and assessment (Ing et al., 2015).

The Table of Specifications helps achieve a well-structured test, balanced according to each topic's size, representative of the teaching objectives, and with an appropriate distribution of questions across cognitive levels (Siddiek, 2010). It does so by determining the content domain, mapping the amount of teaching time spent on each learning objective along with the cognitive level at which it was taught, and specifying the types of items that should be included (Fives & DiDonato-Barnes, 2013; Zamanzadeh et al., 2014).

The Table of Specifications is a two-way chart containing an extensive description of the list of topics to be included in a test, the cognitive level to be assessed, the proportion of emphasis for each one and the intended learning objectives. It also specifies the item format used to evaluate each topic, the number of questions and the marks awarded for each question (Maizam, 2005).

The steps to develop the Table of Specifications are as follows (steps e to g are illustrated in the sketch after the list):

a. Develop a two-way chart: the content areas are listed on the left, and the test specifications (learning objectives, teaching time, cognitive level, item format, number of items and marks awarded) are listed across the top columns.

b. Determine the test content: the topics and subtopics that the test will cover. They have to be exactly the same content seen in teaching, according to the lesson plan.

c. Define the learning objectives: the behaviours students should be able to perform after teaching. Bloom's taxonomy can be used to determine the appropriate level of cognitive processing for each objective.

d. Determine the item format that will be used to assess each specific cognitive level; it is important to mention that any type of item can be written to assess any cognitive level, depending on its appropriateness and the particularities of each case.

e. Distribute the weight of teaching time based on the relevance given in class.

f. Determine the number of items for each element according to the amount of time spent teaching it.

g. Determine the marks students can achieve on each element, in strict correspondence with the relevance given and the cognitive level assessed (Gul, 2016).
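As a concrete illustration of steps e to g, the following minimal sketch (in Python; the topics, weights and totals are hypothetical) distributes the number of items and the marks across learning objectives in proportion to the teaching time spent on each:

# Hypothetical test totals and blueprint:
# (topic, cognitive level, item format, share of teaching time).
total_items = 15
total_marks = 25

blueprint = [
    ("Colonies location",  "Remember", "Matching",           0.10),
    ("Making a living",    "Apply",    "Short answer",       0.15),
    ("Fact vs opinion",    "Analyse",  "True/False",         0.15),
    ("Geographic tools",   "Apply",    "Practical exercise", 0.25),
    ("Plantation factors", "Evaluate", "Essay",              0.35),
]

# Step e: the weights must account for all of the teaching time.
assert abs(sum(w for *_, w in blueprint) - 1.0) < 1e-9, "weights must sum to 100%"

print(f"{'Content':<20}{'Level':<10}{'Format':<20}{'Time':<7}{'Items':<7}{'Marks'}")
for topic, level, fmt, weight in blueprint:
    n_items = round(weight * total_items)   # step f
    marks = round(weight * total_marks)     # step g
    print(f"{topic:<20}{level:<10}{fmt:<20}{weight:<7.0%}{n_items:<7}{marks}")

Because the per-topic counts are rounded, they will not always add up exactly to the intended totals, so a final manual adjustment of one or two rows may be needed before the table is used.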

As has been discussed, the Table of Specifications improves Content Validity by providing the opportunity to outline the link between objectives, teaching and assessment. In an interesting experimental study, undergraduate education students trained in test planning and the Table of Specifications created better classroom tests, reflected in higher scores in a Test Content Validity analysis, compared with a control group that did not receive any training in assessment planning tools (DiDonato-Barnes et al., 2014).


Table 1

Table of Specification example

Content | Learning Objective | Cognitive Level | Item Format | Teaching Time (%) | No. of Items | Marks

I. Southern Colonies
I.I Southern Colonies location | Identify the Southern Colonies on a map. | Remember | Matching image | 10% | 1 | 1
I.II Maryland colonization | Identify who colonized Maryland and explain why people colonized Maryland. | Comprehend | — | 10% | 0 | 0

II. Southern Colonies History
II.I People from the Southern Colonies | Predict how people in each of the Southern Colonies made a living. | Apply | Short answer | 15% | 3 | 5
II.II Fact and opinion | Describe the difference between fact and opinion. | Analyse | True and false | 15% | 3 | 5

III. Exploring Southern Colonies
III.I Sub-topic | Apply geographic tools, including legends and symbols, to collect, analyze, and interpret data. | Apply | Practical exercise | 25% | 4 | 7
III.II Sub-topic | Explain the geographic factors that influenced the development of plantations in the Southern Colonies. | Evaluate | Essay | 25% | 4 | 7

Total | | | | 100% | 15 | 25

Modified from DiDonato-Barnes et al. (2014)

Implications and Conclusions

To summarise, assessment in educational settings should be well planned, designed and revised frequently, since many academic decisions are taken based on test scores (Ing et al., 2015; Siddiek, 2010). For that reason, the importance of Validity cannot be overemphasised, particularly Test Content Validity, given that teaching and assessment should be seen as a reciprocal process in which the cognitive level is aligned between them. Content Validity benefits decision-making and assures quality evaluative judgements about students' learning (DiDonato-Barnes et al., 2014; Weller, 2015).

A lack of Content Validity in a test arouses anxiety and stress in both teacher and student. On one hand, students feel uncertain and fearful about which content, types of questions and cognitive levels will be evaluated. They can experience less motivation to engage in learning activities, and their efforts may revolve only around collecting good marks rather than learning. Thus, the way teachers design the test will determine what and how students study for their exams (Siddiek, 2010). On the other hand, teachers' labour comes to nothing, as the inferences from the test scores will not reflect students' actual learning (Siddiek, 2010).

It is teachers' responsibility to build valid assessment instruments, not merely to give marks but also to promote students' best achievement (Siddiek, 2010). Notwithstanding, a core problem arises when teachers are expected to be responsible for constructing high-quality tests.

There is an evident concern that teachers devote too little attention and preparation time to developing high-quality classroom tests. Many have poor knowledge of testing methods and test quality properties such as Validity, Reliability, item analysis and the taxonomy of educational objectives. Many educational programmes do not include an extensive course on assessment methods; therefore, teachers have fundamental gaps in their understanding (Bonner, 2013; Siddiek, 2010).

For that reason, appropriate assessment training needs to be provided so that teachers can formulate valid tests that collect reliable data for accurate inferences of students' level of achievement. Courses in testing, educational evaluation and statistical techniques should be incorporated into teacher education programmes, particularly on topics related to validity, reliability and assessment tools (DiDonato-Barnes et al., 2014; Siddiek, 2010).


The Table of Specifications has been presented as a valuable source of Test Content Validity evidence with which teachers can outline coherent links between learning objectives, teaching and assessment. Nonetheless, it should be mentioned that many teachers have a limited understanding of how to develop a Table of Specifications, owing to the lack of awareness and training in this area.

Thus, it is essential to promote awareness of the importance of using assessment planning tools to build valid assessment instruments. There is also a need to provide appropriate training for teachers on designing and developing a Table of Specifications (Ing et al., 2015). All the same, using a Table of Specifications does not guarantee an easy route to a quality assessment instrument; the level of expertise in teaching and evaluation may improve the development of the ToS (DiDonato-Barnes et al., 2014).

References

APA, AERA, NCME. (2014). Standards for Educational and Psychological Testing. American

Educational Research Association.

Bonner, S. M. (2013). Validity in Classroom Assessment: Purposes, Properties, and Principles. Sage

Handbook of Research on Classroom Assessment, 87–106.

https://doi.org/10.4135/9781452218649.n6

DiDonato-Barnes, N., Fives, H., & Krause, E. S. (2014). Using a Table of Specifications to improve

teacher-constructed traditional tests: an experimental design. Assessment in Education:

Principles, Policy & Practice, 21(1), 90–108. https://doi.org/10.1080/0969594X.2013.808173

Elosua, P. (2003). Sobre la validez de los test. Psicothema, 15(2), 315–321.

Fives, H., & DiDonato-Barnes, N. (2013). Classroom Test Construction: The Power of a Table of Specifications. Practical Assessment, Research & Evaluation, 18(3). https://doi.org/10.7275/CZTT-7109


Gul, S. (2016). Integration of Table of Specification (ToS) in Academic Teaching and Evaluation.

Journal of Computing Technologies, 5(6).

Ing, L. M., Musah, M. B., Al-Hudawi, S. H., Tahir, L. M., & Kamil, N. M. (2015). Validity of Teacher-

Made Assessment: A Table of Specification Approach. Asian Social Science, 11(5).

https://doi.org/10.5539/ass.v11n5p193

Maizam, A. (2005). Assessment of learning outcomes: validity and reliability of classroom test. World

Transactions on Engineering and Technology Education, 4(2), 235–252.

Siddiek, A. G. (2010). The Impact of Test Content Validity on Language Teaching and Learning. Asian

Social Science, 6(12). https://doi.org/10.5539/ass.v6n12p133

Sireci, S. (1998). Gathering and Analyzing Content Validity Data. Educational Assessment, 5(4), 299–

321. https://doi.org/10.1207/s15326977ea0504_2

Sireci, S., & Faulkner-Bond, M. (2014). Validity evidence based on test content. Psicothema, 26(1),

100–107. https://doi.org/10.7334/psicothema2013.256

Weller, L. D. (2015). Building Validity and Reliability into Classroom Tests. NASSP Bulletin, 85(622),

32–37. https://doi.org/10.1177/019263650108562205

Wesolowski, B. C. (2020a). “Classroometrics”: The Validity, Reliability, and Fairness of Classroom

Music Assessments. Music Educators Journal, 106(3), 29–37.

https://doi.org/10.1177/0027432119894634

Wesolowski, B. C. (2020b). Validity, Reliability, and Fairness in Classroom Tests.

Yusoff, M. S. B. (2019). ABC of Content Validation and Content Validity Index Calculation. Education

in Medicine Journal, 11(2), 49–54. https://doi.org/10.21315/eimj2019.11.2.6

Zamanzadeh, V., Rassouli, M., Abbaszadeh, A., Majd, H. A., Nikanfar, A., & Ghahramanian, A. (2014).

Details of content validity and objectifying it in instrument development. Nursing Practice Today,

1(3), 163–171.
