Detecting Cheating in Lecture Halls

ritvikmath Blog About
drilling into data
Using Statistics to Ensure Academic Honesty

Read Time: 10 min
I recently got to thinking about how a college professor could detect whether or not students were
cheating on a class exam. Of course, the only way to definitively accuse a student of cheating is to
observe them cheating in the moment. That is, observe them looking at another student’s exam, for
example. Still, there had to be some statistical ways to determine whether cheating had occurred
during the exam even if you can’t prove which students were responsible.
The idea would be fairly simple. From here on out, pretend you’re the professor of your favorite
course (and comment that course below!).
First, you would randomly assign students to seats on exam day, perhaps by seating them alphabetically
or by displaying a randomized seating chart for students to follow.
Second, after the exam, you would count how many pairs of students (students sitting right next to
each other) had the exact same score on their exams.
Third, you would ask yourself if that many pairs with identical scores was “normal” or if it was too
high to be a coincidence.
Seems simple enough? Cool! Let’s talk about some of the details and run through an example.
For this to work, you’ll need three assumptions to hold. Good news is that these assumptions aren’t
far-fetched.
First off, you (the professor) need an adequate number of students in the course so that you
don’t run into issues with too small a sample size. In reality, most lower division and some upper
division courses at many colleges can get quite large. For our example we will assume a lecture
hall with 15 rows of 20 seats each, so that it accommodates 300 students on exam day.
Second, we will assume that you have taught this course multiple times in the past and
understand well what the grade distribution should look like. This, again, is realistic for a professor
who has been teaching a course for multiple terms. For our example exam, we will assume that the
grades are usually distributed normally (as in a bell-curve) with mean 70% and standard
deviation 15% with a maximum possible score of 100%. This distribution is shown below.
Expectation of Exam Distribution

Third, we assume that cheating can only happen between students sitting right next to each
other. Although it is possible that students can cheat off of the person sitting ahead of them, the fact
that there is a body in the way makes it much harder.
Before the Exam

Before the exam, you’ll want to understand how many pairs of students (sitting right next to each
other) you should expect to have the same score on their exams. Indeed, just because two
students sitting right next to each other got the same score doesn’t at all mean either one cheated.
This can and will happen by chance even with randomized seating.
To understand this, we can run a simulated exam. Assign each of the 300 students in the lecture hall a
random grade from our well-understood grade distribution above. We will end up with a lecture hall
score heatmap like this one.
In the red boxes, we have identified which seating pairs got the same exam score.
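A single simulated exam of this kind might be sketched as follows. This is only an illustration, not the code behind the figures: it assumes scores are rounded to whole percents and clipped to the 0 to 100 range, and the names `simulate_exam` and `count_same_score_pairs` are made up for this example.

```python
import random

ROWS, SEATS_PER_ROW = 15, 20   # 15 rows of 20 seats = 300 students
MEAN, STD = 70.0, 15.0         # grade distribution from the post


def simulate_exam(rng):
    """Return a ROWS x SEATS_PER_ROW grid of simulated exam scores."""
    def draw():
        # Assumption: whole-percent scores, clipped to [0, 100].
        return min(100, max(0, round(rng.gauss(MEAN, STD))))
    return [[draw() for _ in range(SEATS_PER_ROW)] for _ in range(ROWS)]


def count_same_score_pairs(grid):
    """Count side-by-side neighbours (same row) with identical scores."""
    return sum(
        row[i] == row[i + 1]
        for row in grid
        for i in range(len(row) - 1)
    )


rng = random.Random(0)                      # fixed seed so the run is repeatable
grid = simulate_exam(rng)
pairs = count_same_score_pairs(grid)
total = sum(len(row) - 1 for row in grid)   # 15 rows * 19 pairs = 285
print(f"{pairs}/{total} = {pairs / total:.2%}")
```

Only students in the same row are compared, which matches the assumption that cheating happens between students sitting right next to each other.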
Now we count how many pairs of students get the same score (7 in the picture above). Divide
this count by the total number of possible adjacent seating pairs of students (285 in the picture
above, since each of the 15 rows of 20 seats contains 19 side-by-side pairs). We then get the
percentage of pairs that got the same score (7/285 = 2.5%).
Record this percentage in a list and then run another fresh simulation. Repeat this over and over again
(we do 10000 such simulations). In the end, we get a new distribution, this time telling us how likely
each same-score percentage is when no one cheats. We show this distribution below.
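The repeated simulation could look roughly like the sketch below. It makes the same assumptions as before (whole-percent scores clipped to 0 to 100, pairs counted within a row only), the function names are hypothetical, and it runs 1000 simulations rather than 10000 simply to keep the run short.

```python
import random
from statistics import mean

ROWS, SEATS_PER_ROW = 15, 20
MEAN, STD = 70.0, 15.0
TOTAL_PAIRS = ROWS * (SEATS_PER_ROW - 1)   # 285 adjacent pairs


def same_score_rate(rng):
    """Simulate one honest exam; return the share of adjacent pairs that match."""
    matches = 0
    for _ in range(ROWS):
        # Assumption: whole-percent scores, clipped to [0, 100].
        row = [min(100, max(0, round(rng.gauss(MEAN, STD))))
               for _ in range(SEATS_PER_ROW)]
        matches += sum(a == b for a, b in zip(row, row[1:]))
    return matches / TOTAL_PAIRS


def simulate_rates(n_sims, seed=0):
    """Run n_sims independent simulated exams and collect their rates."""
    rng = random.Random(seed)
    return [same_score_rate(rng) for _ in range(n_sims)]


rates = simulate_rates(1000)
print(f"mean same-score rate: {mean(rates):.2%}")
print(f"largest simulated rate: {max(rates):.2%}")
```

A histogram of `rates` is the no-cheating distribution that the observed exam is later compared against.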
Percentage of all Student Seating Pairs Receiving Same Score

We see that this seems to be distributed in something like a bell-curve but with a long right tail. We
see that on average, around 2% of all student seating pairs receive the same exam score when
there is no cheating going on.
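As a rough cross-check on the 2% figure, the chance that two independent students land on the same score can be computed directly instead of simulated. The sketch below again assumes whole-percent scores clipped to 0 to 100; for a normal distribution with standard deviation σ that is much wider than one percentage point, the match probability is approximately 1/(2σ√π), which is about 1.9% for σ = 15.

```python
import math

MEAN, STD = 70.0, 15.0


def normal_cdf(x):
    """CDF of the assumed Normal(MEAN, STD) grade distribution."""
    return 0.5 * (1.0 + math.erf((x - MEAN) / (STD * math.sqrt(2.0))))


def score_probability(k):
    """Probability that a rounded, clipped score equals k (k in 0..100)."""
    # Scores below 0.5 collapse to 0 and scores above 99.5 collapse to 100.
    lo = -math.inf if k == 0 else k - 0.5
    hi = math.inf if k == 100 else k + 0.5
    return normal_cdf(hi) - normal_cdf(lo)


probs = [score_probability(k) for k in range(101)]

# Two independent students match when both land on the same k.
match_probability = sum(p * p for p in probs)

# Continuous approximation: integral of the squared normal density.
continuous_approx = 1.0 / (2.0 * STD * math.sqrt(math.pi))

print(f"exact (discretised): {match_probability:.3%}")
print(f"approximation:       {continuous_approx:.3%}")
```

Because the expected share of matching pairs equals the match probability for a single pair, this lines up with the simulated average of around 2%.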
During the Exam

During the exam, as we mentioned earlier, we will randomly assign students to seats, allow the exam
to progress, look out for any clear signs of cheating and then record all student exam scores along with
their assigned seat.
It is worth noting that we randomly assign students to seats to control for other reasons for adjacent
seated students to have similar scores, such as the possibility that they are friends who studied
together in a very similar way.
After the Exam
After the exam, we will create a heatmap of the student scores and identify which seating pairs of
students received the same score. Such a map is shown below.
We count 11 seating pairs with the same score in the map above. As a percentage, this means that
11/285 = 3.86% of seating pairs got the same score. Given that the average was more like 2%, this
seems suspicious but how do we tell for sure?
We just need to ask ourselves one question: if there were no cheating during the exam, how likely
would it be to see a same score seating percentage of 3.86% or more?
If the answer is that this is fairly likely, then we have no particular reason to suspect cheating.
However, if the answer is that this is extremely unlikely, we should be fairly confident that there
was some cheating going on.
To visually answer this question, let’s see what part of the student seating pair same score
distribution lies past 3.86%.
Adding up the height of the bars past the 3.86% vertical red line, we get 2.2%. To put that in words, if
there was indeed no cheating during our exam, there is a 2.2% chance of seeing the 3.86% same
score rate or an even higher rate. Put another way, a result this extreme would be quite unlikely if
no cheating had occurred, which gives us good reason to suspect that some did.
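The tail calculation above amounts to counting how often an honest simulated exam produces at least as many matching pairs as the real one. A minimal sketch follows, under the same assumptions as the earlier simulation (whole-percent scores clipped to 0 to 100, hypothetical function names, fewer simulations than the 10000 used in the post). It compares raw pair counts rather than percentages to sidestep floating-point comparisons.

```python
import random

ROWS, SEATS_PER_ROW = 15, 20
MEAN, STD = 70.0, 15.0


def simulated_match_count(rng):
    """Number of same-score adjacent pairs in one simulated honest exam."""
    matches = 0
    for _ in range(ROWS):
        # Assumption: whole-percent scores, clipped to [0, 100].
        row = [min(100, max(0, round(rng.gauss(MEAN, STD))))
               for _ in range(SEATS_PER_ROW)]
        matches += sum(a == b for a, b in zip(row, row[1:]))
    return matches


def empirical_p_value(observed_pairs, n_sims, seed=0):
    """Share of honest simulations with at least observed_pairs matches."""
    rng = random.Random(seed)
    hits = sum(simulated_match_count(rng) >= observed_pairs
               for _ in range(n_sims))
    return hits / n_sims


# 11 matching pairs out of 285 were observed on the real exam.
p = empirical_p_value(observed_pairs=11, n_sims=2000)
print(f"estimated chance of 11 or more matching pairs without cheating: {p:.1%}")
```

The exact value will vary with the seed and the number of simulations, so it should only be expected to land in the neighbourhood of the figure read off the histogram.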
Note that this still doesn’t give us a clue which of these student pairs actually cheated, so we
cannot use this insight to start accusing students. But, we can use this insight to better prepare for
the next exam by printing different test versions, having more proctors to watch out for cheating, or
including a required signature on an academic honesty agreement on the exam’s first page.
Limitations
This technique can help determine when some forms of cheating have occurred but it still has its
limitations, namely the following three.
First off, it only factors in students with the same score sitting next to each other. We can
imagine a slightly modified version of this technique where we count a student seating pair if their
grades are within plus or minus 2% for example.
Second, this technique will not detect global cheating such as if half the class got access to the
answer key before the exam. For that we would need to analyze the overall grade distribution more closely.
Third, this technique will not detect single-student cheating such as the classic case where a
student brings in an illegal note sheet. In this case, an analysis of this student’s past
performance in the class versus his/her current exam grade may be in order.
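The modified version mentioned under the first limitation, where a pair counts if the two grades are within a couple of points of each other, might be sketched like this. The function name and the tiny hand-made grid are purely illustrative.

```python
def count_close_pairs(grid, tolerance=2):
    """Count side-by-side neighbours whose scores differ by at most `tolerance`."""
    return sum(
        abs(a - b) <= tolerance
        for row in grid
        for a, b in zip(row, row[1:])
    )


# Hypothetical two-row seating chart of whole-percent scores.
example_grid = [
    [70, 72, 80],
    [55, 55, 90],
]

exact_matches = count_close_pairs(example_grid, tolerance=0)   # only (55, 55)
close_matches = count_close_pairs(example_grid, tolerance=2)   # adds (70, 72)
print(exact_matches, close_matches)
```

If this looser rule is used on the real exam, the no-cheating distribution would need to be re-simulated with the same rule, since far more pairs will match by chance.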
Thanks for reading and please leave comments below!
Written on January 6, 2018
