Toolkit 6 - Item Analysis


Item analysis

A toolkit for teacher development


Item analysis
These materials provide an overview of item analysis and how to use
the results to inform our item writing practice.
This module is aimed at:
• In-service teachers of English as a foreign language at all stages of
education
Learning objectives
By the end of this module, you will have developed an
understanding of:

• Key concepts in item analysis
• Excel demonstration
• Results interpretation
What is item analysis?

As teachers, we spend a lot of time thinking about writing assessments, from choosing the right type of questions, to writing the question itself (which we'll call the stem), to creating suitable potential answers (which we'll call distractors). We ask ourselves questions like: How should I phrase this question? How am I going to measure students' comprehension of a 500-word reading text? Multiple choice? True/False? Essay?

An example item:
Where is ULIS located? (stem)
A. Ha Noi (answer)
B. Da Nang (distractor)
C. Hue (distractor)
D. Sai Gon (distractor)
What is item analysis?

• Our ability to write the questions in our assessments shapes the successes and failures of our students. But we may wonder: are our questions even any good?
• Fortunately, there is a process which we can use to examine how effective the questions and answers are.
• This process is called "item analysis".

Please note that this module looks at multiple-choice (MCQ) items only. In the next slides we look at item analysis in more detail.
Item analysis

Item analysis can tell us:
- How hard or easy a question is
- How high and low performers do on the question
- How students reacted to the distractors
Why do we need item analysis?
With our busy teaching schedules it can be difficult to find the time to conduct item analysis. However, this work is important: it tells us how good our tests are, and good tests are needed to treat our students fairly. Our test items should reward the students with a good level of knowledge, and item analysis can tell us if our items are too easy or too difficult.
In the following slides we discuss the reasons for doing item analysis in more depth.
Why do we need to look at item analysis?

We may all agree that:
- there should be a connection between what is taught or learned and what is assessed.
- knowledge should be measured on a continuum from basic to more advanced levels.
[Figure: students positioned along a continuum from basic to advanced level]
Why do we need to look at item analysis?

If our test is far too hard, our students may feel frustrated, and their confidence and motivation can be negatively impacted.
If our test is far too easy, it may not show whether someone actually knows the topic, because anyone can answer the questions that we have written.
[Figure: students positioned along a continuum from basic to advanced level]
Why do we need to look at item analysis?

• If we develop a pool of questions that measure knowledge and skills across many levels, from easy to hard, we can improve our assessments by drawing on the right question at the right time for a particular student.
[Figure: a continuum running from easy questions through medium questions to hard questions]
Why do we need to look at item analysis?

• By looking at item analysis, we can improve our items by:
- Identifying and editing poor items
- Removing poor items
- Rephrasing the question stem
- Reworking the distractors we've used
What do we look at in item analysis?

In this module, we look at:
- Item difficulty, also called facility value (FV)
- Discrimination index (DI)
- Distractor analysis (DA)

Please note that these measures are based on Classical Test Theory: the results are meaningful in the context of your current test population and may not be generalisable to a wider target test population.
Item analysis - formulas
• In the following slides we look at the formulas used to work out the three aspects of item analysis we are focusing on in this module.
• The formulas are straightforward and can be worked out using a calculator.
• Later in the module we will show you how you can use Excel to make the item analysis process smoother.
Item difficulty (FV)
• Item difficulty tells us how difficult a particular question was for our students,
based on how many got it right and how many got it wrong.
• It is reported between 0.0 (difficult) and 1.0 (easy)
• Good FV is 0.5 (Popham, 2000)
• Acceptable range is 0.3 – 0.7 (Bachman, 2004)
• This is the formula to calculate FV:
FV = R / N
where R is the number of students who answered the item correctly and N is the total number of students who took the test.
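
For teachers who prefer a quick script to a calculator, here is a minimal Python sketch of the FV calculation; the item scores are invented for illustration, and the flagging thresholds follow Bachman's (2004) acceptable range above.

# Scored responses for one item: 1 = correct, 0 = incorrect (invented data)
item_scores = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]

# FV is the proportion of students who answered correctly: FV = R / N
fv = sum(item_scores) / len(item_scores)
print(f"FV = {fv:.2f}")  # 0.60 for this invented data

# Flag items outside the acceptable 0.3-0.7 range (Bachman, 2004)
if fv < 0.3:
    print("Potentially too hard")
elif fv > 0.7:
    print("Potentially too easy")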
Next steps
• The FV tells us how difficult the question was for our students.
• Now we need to consider how well different groups of students responded to the question.
• The DI, which we look at in detail on the next slide, helps us to
understand if our question has allowed our good performers to show
their high level of knowledge. It also shows if too many of our low
performers answered the question correctly.
• This can also be thought of as the way an item discriminates between
different groups of learners.
Discrimination index (DI)

• DI tells us how well the question can tell the difference between high and low performers. The index can tell us who is understanding the topic and who is not.
[Figure: two groups of students - those with high total scores and those with low total scores]

Discrimination index (DI)

• DI ranges from -1.0 to 1.0
• 0.4 and above = very good (it separates the groups of learners well)
• 0.3 - 0.39 = reasonably good (it separates the groups of learners)
• 0.2 - 0.29 = marginal items (everyone is performing similarly on the question)
• 0.19 and below = poor items (need to be rewritten or discarded)
(Popham, 2000)
Distractor analysis

• The third part of item analysis is distractor analysis. This tells us if the distractor answers are working well. A distractor which nobody chooses, for example, is a poor one and should be revised. In the following slides we look at how to perform this analysis.
Distractor analysis

• First, we calculate how many students select each option given to them for each question.
• Based on this frequency, we can decide which distractor is working well and which is not.
Distractor analysis

• For example, in question 1, 27 students selected A (the correct answer) and 14 selected distractor B. Only 3 students selected distractor D and none selected C.

             A    B    C    D
Question 1   27   14   0    3

• This may tell us that we need to rewrite options C and D, as they did not do their job of distracting students from the correct answer (A).
Item analysis

• We have covered the first part of the module.
• Please feel free to take a break and come back later when you're ready.
Excel demonstration

In this part, we walk you through, step by step, the processes of preparing your data file and calculating item difficulty, the discrimination index and distractor frequencies.
Prepare your data set

1. Assign a number to each student in your class.
2. In an Excel spreadsheet, enter the student numbers in the first column (column A).
In this example, we number the students from 1 to 10.
Prepare your data set

3. Use the first row, starting with cell B1, to record item numbers. These are the questions asked in your test.
4. Enter your answer key into the second row.
Prepare your data set

5. Enter your data for closed-ended items as A-D, as selected by students, starting with the third row. This will enable you to do distractor analysis later on.
Prepare your data set

• The next step is converting the A-D values into 1s and 0s.
• This means 1 is given to the correct answer and 0 is given to
incorrect answers.
• For example, A is the correct answer for Item 1. We should convert A
into 1 whereas B, C and D should be converted into 0.
• This step is really important because without converting A, B, C, and
D into 1s and 0s, we cannot calculate FV and DI.
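
If you would rather script this conversion than do it in Excel, here is a minimal Python sketch; the answer key and responses are invented for illustration.

# Answer key for three items, and the letters each student selected (invented)
answer_key = ["A", "C", "B"]
responses = [
    ["A", "C", "B"],  # student 1
    ["B", "C", "D"],  # student 2
    ["A", "A", "B"],  # student 3
]

# Convert each letter to 1 (correct) or 0 (incorrect) against the key
scored = [
    [1 if choice == key else 0 for choice, key in zip(row, answer_key)]
    for row in responses
]
print(scored)  # [[1, 1, 1], [0, 1, 0], [1, 0, 1]]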
Prepare your data set

6. Create a separate tab and convert the A-D values into 1s and 0s.
a. The easiest way to do this is to use the Score Converter Excel template on this website. This will automatically convert A-D into 1 and 0.
b. You can also do it manually by highlighting a column and using the "Find + Replace" function to replace all correct answers with a 1 and all incorrect answers with a 0.

Don't worry about "TRUE" on the second row.
Prepare your data set

7. Now all A-D values are converted into 1s and 0s.
Select the whole data set, except the first two rows.
Copy and paste this into a new Excel file for our calculations of FV and DI.

Don't worry about "TRUE" on the second row.


Calculating FV values

1. Use the =AVERAGE(first cell:last cell) formula, then press Enter. Because correct answers are coded as 1 and incorrect answers as 0, the average of an item's column is exactly the proportion of students who answered correctly, i.e. the FV.
Calculating FV values

2. Click on the cell with the formula and drag it across the FV row. This will generate
facility values for the rest of your items.

3. Decide which items might be potentially 'too hard' (FV < 0.3) or 'too easy' (FV >
0.7).
Calculating discrimination index (DI)

1. Count total scores for each student by using the SUM formula
2. Sort in order of total scores
- Highlight the data set and select "Sort & Filter", then "Custom Sort"
- Sort by the Total column, from "largest to smallest"
Calculating discrimination index (DI)

3. Decide on the proportion of 'high scorers' and 'low scorers' (should be about 27% each).
In this example, there are 10 students in the class; 27% of this total would be 3 students. Therefore, the upper group would include students 8, 2 and 4, while the lower group includes students 5, 6 and 3.
We now need to look at the performance of these students on each item in order to find each item's discrimination index.
Calculating discrimination index (DI)

For item 1, all three students in the upper group answered it correctly while none
in the lower group answered it correctly.

4. Calculate the discrimination index (DI) for each item:

DI = (U - L) / n

where U is the number of students in the upper group who answered correctly, L is the number in the lower group who answered correctly, and n is the number of students in each group.

DI for item 1 will be: (3 - 0) / 3 = 1
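
For teachers who prefer scripting, here is a minimal Python sketch of steps 1-4; the totals and item scores are invented, but the grouping follows the 27% rule and reproduces the upper and lower groups from the example above.

# Total test scores per student (student number: total), invented data
totals = {1: 8, 2: 14, 3: 4, 4: 13, 5: 6, 6: 5, 7: 9, 8: 15, 9: 10, 10: 7}
# 1/0 scores on item 1 for each student, invented data
item1 = {1: 1, 2: 1, 3: 0, 4: 1, 5: 0, 6: 0, 7: 1, 8: 1, 9: 0, 10: 1}

# Rank students by total score, highest first
ranked = sorted(totals, key=totals.get, reverse=True)

# Take roughly 27% of the class as the upper and lower groups (3 of 10 here)
n = max(1, round(0.27 * len(ranked)))
upper, lower = ranked[:n], ranked[-n:]  # students 8, 2, 4 and 5, 6, 3

# DI = (correct in upper group - correct in lower group) / group size
di = (sum(item1[s] for s in upper) - sum(item1[s] for s in lower)) / n
print(f"DI = {di:.2f}")  # (3 - 0) / 3 = 1.00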


Calculating discrimination index (DI)

5. Decide which items are apparently 'discriminating well' / 'not discriminating well':
• 1 = the item is working perfectly
• 0.3 and above = the item is working well enough
• 0 = no discrimination
• negative value = negative discrimination (i.e. weaker students are performing better than stronger ones on this item)
Calculating discrimination index (DI)

6. Try to interpret the results in conjunction with other item analysis measurements (e.g. facility values). Possible reasons for low discrimination indices include:
• The item was so easy that it couldn't discriminate correctly
• The item was so difficult that it couldn't discriminate correctly
• A negative discrimination index may indicate that the item is measuring something other than what the rest of the test is measuring
• It may be a sign that the item has been mis-keyed (wrong key) or double-keyed (two correct answers)
Distractor analysis

Distractor analysis is based on counting the frequency of occurrences of the distractors in the responses provided by the group.
If a distractor is not selected by many candidates, it means it is not attractive enough and might need to be reviewed or replaced.
Distractor analysis

In Excel you can use the =COUNTIF formula to calculate distractor frequency:
1. Go to the Excel tab where you have the data recorded as A-D.
2. Scroll down to the bottom of your data set and add labels A-D to the 4 rows below it.
Distractor analysis

3. Go to the bottom of the column for your first item, for example column B. Add the formula =COUNTIF(first cell:last cell, "A"), for example: =COUNTIF(B3:B12, "A"). This will count all the occurrences of distractor A in the responses provided for item 1.
Distractor analysis

4. Drag the formula down to the rows labelled B, C and D, and amend the letter in the formulae from A to B, C and D respectively. For example: =COUNTIF(B3:B12, "B") counts the frequency for distractor B.
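
The same counts can be produced in a few lines of Python; this sketch mirrors the COUNTIF steps and also flags rarely chosen distractors using the rule of thumb given on the next slide (the responses and the keyed answer are invented).

from collections import Counter

# Letters chosen by ten students for item 1, invented for illustration
item1_choices = ["A", "B", "A", "A", "D", "A", "B", "A", "A", "B"]
correct = "A"  # keyed answer for item 1 in this invented example

# Count how many students selected each option (like COUNTIF per letter)
freq = Counter(item1_choices)
for option in "ABCD":
    print(option, freq[option])  # A 6, B 3, C 0, D 1

# Flag distractors chosen fewer than 3 times (or by under 5% of students)
threshold = max(3, 0.05 * len(item1_choices))
for option in "ABCD":
    if option != correct and freq[option] < threshold:
        print(f"Review distractor {option}: chosen {freq[option]} time(s)")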
Distractor analysis

There is no specific rule about how many times an effective distractor should be selected.
For a small pilot data set, it might be worth reviewing distractors with frequencies lower than 3 (or 5% of the total number of students), as the low frequency may mean that the distractor is not attractive enough and might need to be reviewed or replaced.
How can we move forward with the results of item analysis?

We have walked you through the process of calculating the item difficulty index and the item discrimination index, and of performing distractor analysis.
By looking at these results, we can understand more about the quality of the items we have written.
We can examine why an item is too hard for our students. Was it because:
- its wording is not clear?
- it is beyond the scope of understanding of our students?
- our students did not fully understand what was taught?
We may need to consider rephrasing the question or re-teaching the topic.
How can we move forward with the results of item analysis?

• If the discrimination index is too low, it means that students in the low-performing group got the answer correct at a rate close to, or even higher than, the high-performing group.
• When more students who performed poorly on the overall exam were able to answer the question correctly than students who performed well, the item should be reviewed.
How can we move forward with the results of item analysis?

• When both the difficulty and discrimination indices are outside of the normal range, questions should be reviewed.
• Is the item well constructed?
• Are some of the distractors non-functioning?
• Does the question fairly represent concepts and content taught in the course?
How can we move forward with the results of item analysis?

• We can look at the frequency of the distractors selected to examine:
- Why our students did not select a given option
- Why they selected a wrong answer

From these insights, we may revise the distractors or adjust our lessons to make sure that students will not select the wrong answer next time.
Key take-aways

• That’s certainly a lot about item analysis, from how hard questions
are to which students are answering correctly, to how they are
choosing their answers.
• However, we think this is all vital for teachers to understand. There’s
more to a test than a score that students receive.
• There’s lots of thoughts that need to go into our end in writing a test.
• Understanding results from item analysis helps us write or choose the
best options to tell us which students know the topics and which ones
don’t.
References
• Bachman, L. F. (2004). Statistical Analyses for Language Assessment. Cambridge University Press.
• Brown, H. D. (2010). Language Assessment: Principles and Classroom Practices. New York: Longman. Ch. 10.
• Carr, N. (2011). Designing and Analyzing Language Tests: A hands-on introduction to language testing theory and practice. Oxford University Press.
• Hughes, A. (2003). Testing for Language Teachers. Cambridge University Press.
• Popham, W. J. (2000). Modern Educational Measurement: Practical Guidelines for Educational Leaders. Boston: Allyn & Bacon.
