Multiple Choice Items: Paper Nature of Student Assessment
by:
2010
CHAPTER I
INTRODUCTION
A. Background
The multiple-choice (MC) item is one of the most popular item formats used in
educational assessment. Most multiple-choice test questions are not as replete with errors as
this example, but you have probably seen many of the errors before. In addition to confusing
and frustrating students, poorly-written test questions yield scores of dubious value that are
inappropriate to use as a basis of evaluating student achievement. Compare the example
above with the following one: While this example may still leave room for improvement, it is
certainly superior to the first one. Well-written multiple-choice test questions do not confuse
students, and yield scores that are more appropriate to use in determining the extent to which
students have achieved educational objectives.
MC items can be constructed to assess a variety of learning outcomes, from simple
recall of facts to Bloom’s highest taxonomic level of cognitive skills – evaluation. It is
common knowledge that the correct answers should be distributed evenly among the
alternative positions of MC items, but there are many other important guidelines for writing
good items. For example, Haladyna (1999) describes 30 guidelines for writing MC items.
Space limitation precludes a discussion of all these guidelines here. We focus on eight
guidelines that we believe are generally not well recognized by chemistry teachers in
Indonesia. Illustrative examples are provided to demonstrate how these guidelines can be
applied to construct chemistry items.
B. Problems
Because multiple-choice items play such an important role in achievement testing,
several important issues have to be considered. Here, we discuss:
6. How to construct multiple-choice items that are well stated, relevant to important
learning outcomes, and free of defects?
C. Purposes
Based on the problems above, the purposes of this discussion are to:
CHAPTER II
Multiple-choice questions are selection-type items. Students are given three or more
possible answers and are asked to choose the correct answer or the "best" answer. A standard
multiple-choice test item consists of two basic parts: a problem (stem) and a list of suggested
solutions (alternatives). The stem may be in the form of either a question or an incomplete
statement, and the list of alternatives contains one correct or best alternative (answer) and a
number of incorrect or inferior alternatives (distracters).
The purpose of the distracters is to appear as plausible solutions to the problem for
those students who have not achieved the objective being measured by the test item.
Conversely, the distracters must appear as implausible solutions for those students who have
achieved the objective. Only the answer should appear plausible to these students.
In this paper, an asterisk (*) is used to indicate the answer.
Example:
Stem: What is the main function of the salt bridge in an electrochemical cell?
Alternatives:
*A. supply ions moving to the two half-cells (answer)
B. draw electrons from one half-cell to the other half-cell (distracter)
C. keep the levels of solutions equal in the two half-cells (distracter)
D. supply electrons to complete the circuit (distracter)
Multiple-choice items can be used to measure knowledge outcomes and various
types of complex learning outcomes. The single-item format is probably the most widely used for
measuring knowledge, comprehension, and application outcomes. The interpretive exercise,
which consists of a series of multiple-choice items based on introductory material (e.g., a
paragraph, picture, or graph), is especially useful for measuring analysis, interpretation, and
other complex learning outcomes. Here we confine the discussion to the use of single,
independent multiple-choice items.
1. Knowledge Items
Examples
2. Comprehension Items
Comprehension items typically measure understanding at the lowest level.
They involve students’ ability to read course content, extrapolate and interpret important
information, and put others’ ideas into their own words. Test questions focus on the use of
facts, rules and principles. Sample verbs for stating specific learning outcomes are:
classify, convert, describe, distinguish between, explain, extend, give examples,
illustrate, interpret, paraphrase, summarize, translate.
Example
In this last example, the student must recognize that increasing the number of
alternatives in the items produces the same effect as lengthening the test.
3. Application Items
Application items also measure understanding, but typically at a higher level
than that of comprehension. The students must take new concepts and apply them to
another situation. Thus, application items determine the extent to which students can
transfer their learning and use it effectively in solving new problems. Such items may
call for the application of various aspects of knowledge, such as facts, concepts,
principles, rules, methods and theories. Both comprehension and application items
are adaptable to practically all areas of subject matter, and they provide the basic
means of measuring understanding. Test questions focus on applying facts or
principles. To assess students, the teacher must specify the learning outcomes so that
the students can focus on what they are going to perform. Sample verbs for
stating specific learning outcomes are: apply, arrange, compute, construct,
demonstrate, discover, modify, operate, predict, prepare, produce, relate, show,
solve, use.
Example
Outcome: Distinguishes between properly and improperly stated outcomes.
Which one of the following learning outcomes is properly stated in terms of
student performance?
A. Develops an appreciation of the importance of testing.
* B. Explains the purpose of test specifications.
C. Learns how to write good test items.
D. Realizes the importance of validity.
Multiple-choice test items are not the best type of test item for every purpose. They have advantages and limitations just
as any other type of test item. Teachers need to be aware of these characteristics in order to
use multiple-choice items effectively.
1. Advantages
a. Versatility.
Multiple-choice test items are appropriate for use in many different subject-
matter areas, and can be used to measure a great variety of educational objectives.
They are adaptable to various levels of learning outcomes, from simple recall of
knowledge to more complex levels, such as the student’s ability to:
1) Analyze phenomena
2) Apply principles to new situations
3) Comprehend concepts and principles
4) Discriminate between fact and opinion
5) Interpret cause-and-effect relationships
6) Interpret charts and graphs
7) Judge the relevance of information
8) Make inferences from given data
9) Solve problems
The difficulty of multiple-choice items can be controlled by changing the
alternatives, since the more homogeneous the alternatives, the finer the distinction
the students must make in order to identify the correct answer. Multiple-choice
items are amenable to item analysis, which enables the teacher to improve the item
by replacing distracters that are not functioning properly. In addition, the
distracters chosen by the student may be used to diagnose misconceptions of the
student or weaknesses in the teacher’s instruction.
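The kind of item analysis described above can be sketched in a few lines. The following is only an illustration: the response data are hypothetical, the function names are our own, and the 5% cut-off for a "non-functioning" distracter is an assumed threshold, not a rule taken from the literature cited here.

```python
from collections import Counter

def distracter_counts(responses, options="ABCD"):
    """Count how many students chose each option of a single item."""
    counts = Counter(responses)
    return {opt: counts.get(opt, 0) for opt in options}

def non_functioning(responses, key, options="ABCD", threshold=0.05):
    """Return the distracters chosen by fewer than `threshold` of the students.

    The 5% default is an illustrative assumption only.
    """
    counts = distracter_counts(responses, options)
    total = len(responses)
    if total == 0:
        return []
    return [opt for opt in options
            if opt != key and counts[opt] / total < threshold]

# Hypothetical responses of 20 students to one item whose answer is A.
responses = ["A"] * 12 + ["B"] * 5 + ["C"] * 3

print(distracter_counts(responses))
print(non_functioning(responses, key="A"))
```

In this made-up data set no student chooses option D, so D is flagged as a candidate for replacement, while the popularity of B may point to a misconception worth discussing in class.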
b. Validity.
In general, it takes much longer to respond to an essay test question than it
does to respond to a multiple-choice test item, since the composing and recording
of an essay answer is such a slow process. A student is therefore able to answer
many multiple-choice items in the time it would take to answer a single essay
question. This feature enables the teacher using multiple-choice items to test a
broader sample of course content in a given amount of testing time. Consequently,
the test scores will likely be more representative of the students’ overall
achievement in the course.
c. Reliability.
Well-written multiple-choice test items compare favorably with other test item
types on the issue of reliability. They are less susceptible to guessing than are
true-false test items, and therefore capable of producing more reliable scores.
Their scoring is more clear-cut than short-answer test item scoring because there
are no misspelled or partial answers to deal with. Since multiple-choice items are
objectively scored, they are not affected by scorer inconsistencies as are essay
questions, and they are essentially immune to the influence of bluffing and writing
ability factors, both of which can lower the reliability of essay test scores.
d. Efficiency.
Multiple-choice items are amenable to rapid scoring, which is often done by
scoring machines. This expedites the reporting of test results to the student so that
any follow-up clarification of instruction may be done before the course has
proceeded much further. Essay questions, on the other hand, must be graded
manually, one at a time.
2. Limitations
a. Versatility.
Since the student selects a response from a list of alternatives rather than
supplying or constructing a response, multiple-choice test items are not adaptable
to measuring certain learning outcomes, such as the student’s ability to:
2) Articulate explanations
3) Display thought processes
4) Furnish information
5) Organize personal thoughts
6) Perform a specific task
7) Produce original ideas
8) Provide examples
Such learning outcomes are better measured by short answer or essay
questions, or by performance tests.
b. Reliability.
Although they are less susceptible to guessing than are true-false test items,
multiple-choice items are still affected to a certain extent. This guessing factor
reduces the reliability of multiple-choice item scores somewhat, but increasing
the number of items on the test offsets this reduction in reliability. The following
table illustrates this principle.
For example, if your test includes a section with only two multiple-choice
items of 4 alternatives each (a b c d), you can expect 1 out of 16 of your students to
correctly answer both items by guessing blindly. On the other hand if a section has
15 multiple-choice items of 4 alternatives each, you can expect only 1 out of 8,670
of your students to score 70% or more on that section by guessing blindly.
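The two figures in this example follow from the binomial distribution, assuming that every item is guessed independently and that each alternative is equally likely to be picked. A minimal sketch of the calculation (the function name is our own):

```python
from fractions import Fraction
from math import comb

def p_guess_at_least(n_items, n_alternatives, min_correct):
    """Probability of at least `min_correct` right answers by blind guessing.

    Assumes independent guesses with chance 1/n_alternatives per item.
    """
    p = Fraction(1, n_alternatives)
    q = 1 - p
    return sum(comb(n_items, k) * p**k * q**(n_items - k)
               for k in range(min_correct, n_items + 1))

# Two items, both correct.
both_correct = p_guess_at_least(2, 4, 2)

# Fifteen items: 70% of 15 is 10.5, so at least 11 items must be correct.
seventy_percent = p_guess_at_least(15, 4, 11)

print(both_correct)                # 1/16
print(round(1 / seventy_percent))  # 8670
```

Exact fractions are used so that the result can be compared directly with the "1 out of 16" and "1 out of 8,670" figures quoted above.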
c. Difficulty of Construction.
Good multiple-choice test items are generally more difficult and time-
consuming to write than other types of test items. Coming up with plausible
distracters requires a certain amount of skill. This skill, however, may be increased
through study, practice, and experience.
1. The stem should be meaningful by itself and should present a definite problem.
A common fault in MC item writing is to have a brief, meaningless stem with
problem definition revealed in the options. In such cases, it can be difficult to see the intent of
the item after reading the stem. To write a focused item, we should include the central idea in
the stem instead of the options. In Item 1, the stem does not present a definite problem.
ITEM 1
Non-metals
A. cannot exist as solids at room temperature.
B. can combine only with metals to form stable compounds.
*C. usually have more than three electrons in the outermost shell of the atom.
D. are usually found on the left hand side of the Periodic Table.
The correct answer is indicated with an asterisk. Students are faced with four true-
false options; each is about non-metals, but only option C is correct. Furthermore, the four
options cover a set of widely dissimilar chemical ideas so that evaluation by comparison is
not possible. The stem can be judged to be clearly presenting a problem if it forces the
options to be parallel in type of content.
Item 2 demonstrates one way to make the stem become a definite problem. Students
can think about the correct answer rather than figuring out what the problem is. Also, the
clearly stated problem in the stem has forced the four options to be parallel in content.
ITEM 2
How many electrons could be found in the outermost shell of a non-metal atom?
A. 1
B. 2
C. 3
*D. 4
Similarly, Item 3 is a poorly written MC item. The stem fails to present a definite
problem and the four options appear to be a hodgepodge of chemical ideas. Clearly, Item 4 is
more focused than Item 3. The stem of Item 4 poses a clear, definite problem and assesses a
single learning objective.
ITEM 3
Which of the following statements concerning electrochemical cells is correct?
*A. There is a spontaneous chemical reaction in each electrochemical cell.
B. The e.m.f. of an electrochemical cell is measured in joules.
C. The anode is labeled (+) while the cathode is labeled (–).
D. The salt bridge provides electrons to complete the circuit.
ITEM 4
What is the main function of the salt bridge in an electrochemical cell?
*A. supply ions moving to the two half-cells
B. draw electrons from one half-cell to the other half-cell
C. keep the levels of solutions equal in the two half-cells
D. supply electrons to complete the circuit
ITEM 5
______ have the molecular formula CnH2n.
A. Alkanes
*B. Alkenes
C. Alkanols
D. Alkanoic acids
ITEM 6
Which type of organic substance has the molecular formula CnH2n?
A. alkanes
*B. alkenes
C. alkanols
D. alkanoic acids
3. Use a negatively stated stem only when significant learning outcomes require it.
Most students have difficulty understanding the meaning of negatively phrased items.
They often read through the negative terms such as not, no, and least, and forget to reverse
the logic of the relation being tested. For example, Items 7 and 8 assess the same concept of
chemistry, but some students may answer Item 7 incorrectly merely because of the word
least. Since least and concentrated are opposites, the phrase least concentrated is more
difficult to understand than the phrase most concentrated. Research by Cassels and Johnstone
(1984) has confirmed that the change from least concentrated to most concentrated will
increase the percent of correct responses.
ITEM 7
Which of the following solutions is the least concentrated?
A. 50 g of calcium carbonate in 100 cm3 of water
*B. 60 g of sodium chloride in 200 cm3 of water
C. 65 g of potassium nitrate in 100 cm3 of water
D. 120 g of potassium sulphate in 200 cm3 of water
ITEM 8
Which of the following solutions is the most concentrated?
A. 50 g of calcium carbonate in 100 cm3 of water
B. 60 g of sodium chloride in 200 cm3 of water
*C. 65 g of potassium nitrate in 100 cm3 of water
D. 120 g of potassium sulphate in 200 cm3 of water
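Items 7 and 8 rest on the same comparison of the four options. A minimal sketch of that comparison, assuming that concentration is judged simply as grams of solute per cm3 of water (the variable names are illustrative):

```python
# (mass of solute in g, volume of water in cm3) for options A to D
solutions = {
    "A": (50, 100),
    "B": (60, 200),
    "C": (65, 100),
    "D": (120, 200),
}

# Grams of solute per cm3 of water for each option.
ratios = {label: mass / volume for label, (mass, volume) in solutions.items()}

least = min(ratios, key=ratios.get)
most = max(ratios, key=ratios.get)

print(ratios)  # A: 0.5, B: 0.3, C: 0.65, D: 0.6
print("least concentrated:", least)
print("most concentrated:", most)
```

On this basis option B has the lowest ratio and option C the highest, so the two stems require the same calculation and differ only in the direction of the comparison.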
Although negatively phrased stems should generally be avoided, they are useful if we
want to assess whether students can identify dangerous laboratory practices that may damage
expensive equipment or result in bodily injury, and which should not be carried out. Item 9 is
an example of such an item. However, when a negative term is used, it should be emphasized
by being underlined or capitalized. Replacing the negative term with the word except can
sometimes improve clarity, as illustrated in Item 10. Few students would overlook the
negative element in the stem because the word except is deliberately placed at the end of the
stem and is capitalized.
ITEM 9
Water-type extinguisher is not suitable for putting out fire caused by burning
*A. alcohol.
B. cotton.
C. paper.
D. wood.
ITEM 10
Water-type extinguisher is suitable for putting out fire caused by burning all of the following EXCEPT
*A. alcohol.
B. cotton.
C. paper.
D. wood.
ITEM 11
What volume of water should be added to 57.35 cm3 of 1.96 M NaCl in order to dilute it to 1.50 M?
*A. 17.59 cm3
B. 42.65 cm3
C. 74.94 cm3
D. 112.41 cm3
ITEM 12
What volume of water in cubic centimeters should be added to 60 cm3 of 2.0 M NaCl in
order to dilute it to 1.5 M?
*A. 20
B. 40
C. 80
D. 120
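Items 11 and 12 appear to test the same dilution relation, c1V1 = c2V2, and differ only in how heavy the arithmetic is. A short sketch, assuming that the volumes are additive (the function name is our own):

```python
def water_to_add(v1_cm3, c1_molar, c2_molar):
    """Volume of water (cm3) to add so that c1 * v1 = c2 * v2.

    Assumes the volumes of solution and added water are additive.
    """
    v2_cm3 = c1_molar * v1_cm3 / c2_molar  # final volume after dilution
    return v2_cm3 - v1_cm3

item_11 = water_to_add(57.35, 1.96, 1.50)
item_12 = water_to_add(60, 2.0, 1.5)

print(round(item_11, 2))  # 17.59
print(round(item_12, 2))  # 20.0
```

Both items require exactly the same reasoning; Item 12 simply uses numbers that can be handled mentally.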
Similarly, we should not attempt to increase the difficulty of an item by using
unnecessarily complex or unfamiliar vocabulary, such as the word Topaz in Item 13. This
item aims at testing students’ understanding of the types of ions that give a yellow color. But
if students do not know that Topaz is yellow, they are lost. Item 14 is a better measure of the
same learning objective. The purpose of chemistry MC tests is to assess students’ knowledge,
understanding and problem solving, not reading proficiency.
ITEM 13
Which ion below is responsible for the colour of the gemstone called Topaz?
A. Cr3+
B. Cu2+
*C. Fe3+
D. Mn3+
ITEM 14
Which ion below is probably responsible for the colour of yellow gemstones?
A. Cr3+
B. Cu2+
*C. Fe3+
D. Mn3+
ITEM 15
Ordinary soft drinks like Coca-Cola have a pH about
A. 1
B. 2
*C. 3
D. 4
ITEM 16
Ordinary soft drinks like Coca-Cola have a pH about
*A. 3.
B. 5.
C. 6.
D. 8.
For Items 17 and 18, the correct answer is CuS2. According to research (Schmidt,
1987), students have two common misconceptions. Some tend to use the mass-ratio strategy
(Cu:S = 1:1) and select option A in Item 18 as the answer. Other students like to employ the
molar-mass-ratio strategy (Cu:S = 64:32) and think that option C in Item 18 is the correct
answer. Thus, CuS and Cu2S are good distracters and useful for diagnosis of students’
learning difficulties. Arbitrary distracters such as CuS3, Cu2S3 and Cu3S should be avoided.
ITEM 17
2 g of a compound contains 1 g copper, the rest is sulphur. Which one of the following
formulae correctly represents this compound?
*A. CuS2
B. CuS3
C. Cu2S3
D. Cu3S
ITEM 18
2 g of a compound contains 1 g copper, the rest is sulphur. Which one of the following
formulae correctly represents this compound?
A. CuS
*B. CuS2
C. Cu2S
D. Cu2S3
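The answer to Items 17 and 18 can be checked by converting the masses to moles. The sketch below assumes the approximate molar masses implied by the 64:32 ratio mentioned above (Cu = 64 g/mol, S = 32 g/mol); the function name is our own.

```python
from fractions import Fraction

# Approximate molar masses in g/mol (assumed, matching the 64:32 ratio).
MOLAR_MASS = {"Cu": 64, "S": 32}

def mole_ratio(masses_g):
    """Return the simplest mole ratio for whole-number masses in grams."""
    moles = {el: Fraction(mass, MOLAR_MASS[el]) for el, mass in masses_g.items()}
    smallest = min(moles.values())
    return {el: n / smallest for el, n in moles.items()}

# 2 g of compound: 1 g of copper, the remaining 1 g is sulphur.
ratio = mole_ratio({"Cu": 1, "S": 1})

print(ratio)  # Cu : S = 1 : 2, i.e. CuS2
```

A student who stops at the mass ratio (1:1) arrives at CuS, and one who uses the molar-mass ratio (2:1) arrives at Cu2S, which is why these two formulae work as diagnostic distracters.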
ITEM 19
Which of the following chemicals is/are contained in town gas?
(1) hydrogen
(2) sulphur dioxide
(3) carbon monoxide
(4) gaseous naphtha
A. (3) only
B. (1) and (2) only
C. (2) and (4) only
*D. (1), (3) and (4)
ITEM 20
What is the major constituent of the town gas in Hong Kong?
A. carbon monoxide
B. gaseous naphtha
*C. hydrogen
D. methane
7. The relative length of the options should not provide a clue to the answer.
Teachers are mostly unaware of this item-writing principle (Rodriguez, 1997). It is
common to express the correct response more carefully and at greater length than the
distracters. However, research (Chase, 1964) has indicated that longer options tend to result
in higher response rates. In Item 21, test-wise students will notice that option D is much
longer than the other options. Even without a good understanding of the concept of sacrificial
protection, they will guess that the correct answer is D because it stands out from the others.
In Item 22, the correct answer is shortened and two distracters are rephrased to the desired
length. Although expanding the distracters can increase their specificity and plausibility,
teachers should not load them with irrelevant lengthiness or false technicality.
ITEM 21
Why is zinc better than tin if we want to protect a piece of iron from rusting by
electroplating?
A. Zinc is cheaper than tin.
B. Tin is toxic.
C. Zinc can prevent iron from contacting with water and air.
*D. Zinc is more reactive than iron and thus rusting is prevented even when the metal
ITEM 22
Why is zinc better than tin if we want to protect a piece of iron from rusting by
electroplating?
A. The cost of extraction of zinc from ores is lower than that of tin.
B. Tin is a toxic metal and causes incurable diseases.
C. Zinc can prevent iron from contacting with water and air.
*D. Rusting is prevented even when the zinc layer is broken.
ITEM 23
Which of the following substances would relight a glowing splint?
A. carbon dioxide
B. chlorine
C. nitrogen
*D. none of the above
ITEM 24
Which of the following substances would relight a glowing splint?
A. carbon dioxide
B. chlorine
C. nitrogen
*D. oxygen
The use of the option all of the above is also problematic. For example, Item 25 is
poorly constructed because a student may know that two of the three options offered are
correct and this information can clue the student into selecting all of the above. Thus, the
option format allows students to select the correct answer on the basis of only partial rather
than complete knowledge of the item. Item 26 shows an improved version.
ITEM 25
Which statement is true of most plastics?
A. They have no reaction with acids.
B. They can be molded easily.
C. They are flammable.
*D. All of the above.
ITEM 26
Which statement is true of most plastics?
*A. no reaction with acids
B. difficult to be molded
C. nonflammable
D. good conductors of heat
The option all of the above is still faulty even though it is not designed as the correct
answer in MC items. When a student recognizes that at least one option is incorrect, he or she
may immediately note that the all of the above option must be wrong. In such a case, the
option all of the above is not a functional distracter.
CHAPTER III
CONCLUSION
1. The multiple-choice item provides the most useful format for measuring achievement
at various levels of learning. When selection-type items are to be used
(multiple-choice, true-false, matching, check all that apply), an effective procedure is to start
each item as a multiple-choice item and switch to another item type only when the
learning outcome and content make it desirable to do so. In this sense, the multiple-choice
item is the most highly regarded and useful selection-type item.
2. Multiple-choice items consist of a stem and a set of alternative answers. They can be
designed to measure various intended learning outcomes, ranging from simple to
complex.
3. The multiple-choice item in single-item format is probably the most widely used for
measuring knowledge, comprehension, and application outcomes.
a). Knowledge items typically measure the degree to which previously learned
material has been remembered.
b). Comprehension items measure the extent to which students have grasped the
meaning of material.
c). Application items measure the extent to which students take new concepts and
apply them to another situation.
4. However, poorly written multiple choice items will introduce bias or distortion and
thus lower the dependability of test scores. More importantly, poorly written multiple
choice items cannot provide us with information to inform teaching and learning.
REFERENCES
Cros, D., Maurin, M., Amouroux, R., Chastrette, M., Leber, J. & Fayol, M. (1986).
Conceptions of first-year university students of the constituents of matter and the
notions of acids and bases. European Journal of Science Education, 8(3), 305-313.
Ebel, R. L. & Frisbie, D. A. (1991). Essentials of educational measurement. Englewood
Cliffs, NJ: Prentice Hall.
Haladyna, T. M. (1999). Developing and validating multiple-choice test items. Mahwah:
Lawrence Erlbaum.
Rodriguez, M. C. (1997). The art & science of item writing: A meta-analysis of
multiple-choice item format effects. Paper presented at the annual meeting of the American
Educational Research Association.
Schmidt, H. J. (1987). Secondary school students’ learning difficulties in stoichiometry. In J.
Novak (Ed.), Misconceptions & educational strategies in science and mathematics
(pp. 396-404). Ithaca: Cornell University.
From http://www3.fed.cuhk.edu.hk/chemistry/files/constructMC.pdf