A Novel Approach to Predict Students' Performance in Online Courses through Machine Learning



S Jayasree¹, MD Asim²
¹M.Tech Student, Dept. of CSE, Dr. K.V. Subba Reddy College of Engineering for Women, Kurnool, A.P.
²Assistant Professor, Dept. of CSE, Dr. K.V. Subba Reddy College of Engineering for Women, Kurnool, A.P.

Abstract— Forecasting student success can enable teachers to prevent students from dropping out before final examinations, identify those who need additional help, and boost institutional ranking and prestige. Machine learning techniques in educational data mining aim to develop models that discover meaningful hidden patterns and explore useful information in educational settings. The key traditional characteristics of students (demographic, academic background and behavioral features) are the essential factors that form the training dataset for supervised machine learning algorithms.
The overall goal is to give an overview of the artificial intelligence systems that have been used to predict academic learning. This research also focuses on how to classify the most relevant attributes in student data using prediction algorithms. With educational machine learning methods, the performance and progress of students could potentially be improved more efficiently, and students, educators and academic institutions could all benefit.
In this paper, two predictive models are designed: a students' assessments grades model and a final students' performance model. The models can be used to detect the factors that influence students' learning achievement in MOOCs. The results show that both models achieve feasible and accurate outcomes. The lowest RMSE for the students' assessments grades model, a value of 8.131, is obtained by RF, while GBM yields the highest accuracy for the final students' performance model, with a value of 0.868.
Keywords— Massive Open Online Courses (MOOCs), Machine Learning, Receiver Operating Characteristic (ROC), Student Performance.

I. INTRODUCTION

There are many studies in the learning field that have investigated ways of applying machine learning techniques for various educational purposes. One focus of these studies is to identify high-risk students, as well as the features which affect student performance. With rapid advancements in technology, artificial intelligence has recently become an effective approach to the evaluation and testing of student performance in online courses.
Many researchers have applied machine learning to predict student performance [7]; however, few works have examined performance trajectories [8]. As a result, educators could not monitor students' learning curves in real time.
The rapid development of information technology (IT) has greatly increased the amount of data in different institutions. Huge warehouses contain a wealth of data and constitute a valuable information goldmine, yet this dramatic inflation in the amount of data has not been matched by efficient ways of exploiting it. Thus, a new challenge has recently emerged: transitioning from traditional databases, which store and retrieve information only through questions asked by a researcher, to techniques that extract knowledge by exploring the prevailing patterns of data for decision making, planning and future vision.
Two sets of experiments are conducted in this work. In the first set of experiments, regression analysis is implemented to estimate students' assessment scores; the students' past and current activities, in addition to past performance, are employed to predict student outcomes. In the second set of experiments, supervised machine learning methods are utilized to predict long-term student performance, considering three types of candidate predictors: behavioral, temporal and demographic features. The proposed models offer new insight into determining the most critical learning activities and assist educators in keeping track of timely student performance. To the best of our knowledge, student performance in online courses has so far been evaluated using only two targets: "success" and "fail". Our model predicts performance with three class labels: "success", "fail" and "withdrew".
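To make the shape of the first experiment concrete, the following is a minimal sketch, not the authors' pipeline, assuming scikit-learn and placeholder feature and score arrays: a random forest regressor for assessment scores, evaluated with the RMSE metric reported in the abstract.

```python
# Minimal sketch of the first experiment's setup (an assumption, not the
# authors' code): predict assessment scores and evaluate with RMSE.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder stand-in for past/current activity and past-performance features.
X, y = make_regression(n_samples=1000, n_features=20, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))  # root mean squared error
print(f"RMSE: {rmse:.3f}")
```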
II. SYSTEM ANALYSIS

Data Description

The dataset used in this work is the Open University Learning Analytics Dataset (OULAD). The Open University in the UK delivered online courses on various topics for undergraduate and postgraduate students in the period between 2013 and 2014. The main composite table, called "studentInfo", is linked to all other tables and includes information relevant to students' demographic characteristics [15].
The information related to student performance is collected in the "Assessments" and "Student Assessment" tables. "Assessments" contains information about the number, weight and type of assessments required for each module. In general, each module involves a set of assessments, followed by the final exam. The assessments are Tutor Marked Assessments (TMA) and Computer Marked Assessments (CMA). The final average grade is computed as the weighted sum of all assessments (50%) and the final exam (50%). The "Student Assessment" table holds information relating to student assessment results, such as the date of the submitted assessment and the assessment mark [15].
The "Student Registration" table contains information about the dates on which students registered and unregistered from a particular module. The overall date is measured by counting the number of unique days on which students interact with a course until the course ends. In Open University online courses, students are able to access a module even before becoming a student of the course; however, it is not possible to access the course after its closure date. The students' information related to their interaction with digital learning material is stored in the Virtual Learning Environment dataset.
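As an illustration of how these tables combine, here is a minimal sketch (an assumption based on the public OULAD CSV layout, not the authors' code) that merges "Student Assessment" with "Assessments" and computes the 50/50 final average grade described above:

```python
# A minimal sketch of assembling the final average grade from OULAD,
# assuming the public OULAD CSV files and column names.
import pandas as pd

assessments = pd.read_csv("assessments.csv")        # id_assessment, assessment_type, weight, ...
submissions = pd.read_csv("studentAssessment.csv")  # id_student, id_assessment, score, ...

merged = submissions.merge(assessments, on="id_assessment")
# TMA/CMA rows form the continuous-assessment component; "Exam" rows the exam.
merged["component"] = merged["assessment_type"].map(
    lambda t: "exam" if t == "Exam" else "assessment")
merged["weighted"] = merged["score"] * merged["weight"]

parts = merged.groupby(["id_student", "component"]).agg(
    weighted=("weighted", "sum"), weight=("weight", "sum"))
mean_score = (parts["weighted"] / parts["weight"]).unstack()

# Final average grade = 50% assessments + 50% final exam, as described above.
final_grade = 0.5 * mean_score["assessment"] + 0.5 * mean_score["exam"]
print(final_grade.head())
```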
The field of machine learning has gained the attention of computer science and IT researchers, and the data analysis field has become more essential than ever, owing to the increasing amounts of data processed every day. The three basic types of machine learning are supervised, unsupervised and semi-supervised learning [19]. In supervised learning, the training dataset consists only of labelled data; a supervised function is trained during the learning process, with the aim of predicting the labels of future unseen data. The two basic supervised problems are classification and regression: classification for discrete outputs and regression for continuous ones [10]. Unsupervised learning aims to find meaningful, regular patterns in unlabelled data without human intervention; its training set is made up of unlabelled data, and no instructor is present to help identify these patterns. Some popular unsupervised methods include clustering, novelty detection and dimensionality reduction [4, 11]. Semi-supervised learning is a combination of the supervised and unsupervised learning processes; it is used to achieve enhanced results with few labelled examples, and its training dataset consists of both labelled and unlabelled data. DT, NB, Logistic Regression, SVMs, KNN, SMO and Neural Networks are well-known supervised techniques with accurate results in different scientific fields.
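The sketch below illustrates the supervised setting described above, using scikit-learn stand-ins for several of the techniques named (DT, NB, logistic regression, KNN); the synthetic data is a placeholder assumption, not the OULAD features:

```python
# A minimal supervised-learning sketch: fit several well-known classifiers
# on labelled data and score them on held-out data (placeholder dataset).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for model in (DecisionTreeClassifier(random_state=0), GaussianNB(),
              LogisticRegression(max_iter=1000), KNeighborsClassifier()):
    model.fit(X_tr, y_tr)                     # learn from labelled examples
    print(type(model).__name__, round(model.score(X_te, y_te), 3))
```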
III. SYSTEM CONSTRUCTION

Simulation Results: Final Students' Performance Model

The classification analysis results for the second experiment are presented as follows. The same set of machine learning classifiers as in the previous experiment is used in this case study. As can be seen in Table 3, all classifiers obtain similarly good results; the highest accuracy, 0.868, is achieved by Gbm, while RF and Nnet achieve the lowest, with values of 0.854. Table 4 shows that the class "Withdrawn" acquired the best accuracy across all classifiers, reaching an average value of 0.99, whereas the class "Fail" gives the lowest performance, with accuracies between approximately 0.76 and 0.80.

Table 3: Accuracy Results for the Final Students' Performance Model

Classifier   Accuracy
Mlp          0.858
RF           0.854
Rpart        0.862
Gbm          0.868
Nnet         0.854

The sensitivities are high across all classifiers for the classes "Withdrawn" and "Pass"; the best sensitivities, 0.99 and 0.92 respectively, are achieved by Rpart. The class "Fail" obtained very low sensitivities across all classifiers. This is expected, since the number of records with target class "Fail" is limited and hence the algorithms cannot learn this class well. With regard to true negative instances, Gbm and Nnet produce the best result, a specificity of 0.998 for the class "Withdrawn", while the poorest specificity, 0.81, is obtained by Rpart for the class "Pass". The best F1-measures, 0.993, 0.864 and 0.772 for the classes "Withdrawn", "Pass" and "Fail" respectively, are achieved by Gbm. The lowest F1-measure, 0.67, is shown by Rpart for the class "Fail".

Table 4: Results for the Final Students' Performance Prediction Model

Classifier  Class      ACC    F1     Sens   Spec   AUC
Mlp         Pass       0.858  0.850  0.892  0.824  0.916
Mlp         Fail       0.782  0.690  0.631  0.932  0.886
Mlp         Withdrawn  0.993  0.992  0.989  0.996  0.996
RF          Pass       0.855  0.843  0.844  0.866  0.924
RF          Fail       0.808  0.713  0.712  0.904  0.892
RF          Withdrawn  0.995  0.993  0.991  0.990  0.995
Rpart       Pass       0.866  0.865  0.923  0.810  0.867
Rpart       Fail       0.767  0.671  0.582  0.952  0.821
Rpart       Withdrawn  0.997  0.991  0.996  0.992  0.997
Gbm         Pass       0.872  0.864  0.903  0.841  0.925
Gbm         Fail       0.802  0.722  0.665  0.939  0.900
Gbm         Withdrawn  0.994  0.993  0.991  0.998  0.997
Nnet        Pass       0.856  0.847  0.870  0.843  0.925
Nnet        Fail       0.795  0.704  0.670  0.920  0.900
Nnet        Withdrawn  0.994  0.993  0.991  0.998  0.998
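For reference, the per-class, one-vs-rest metrics reported in Table 4 can be computed from a confusion matrix as in the following sketch (an illustration with toy labels, not the authors' evaluation code):

```python
# Sketch of per-class, one-vs-rest metrics: accuracy, F1, sensitivity, specificity.
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_metrics(y_true, y_pred, labels):
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    for i, label in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i].sum() - tp
        fp = cm[:, i].sum() - tp
        tn = cm.sum() - tp - fn - fp
        sens = tp / (tp + fn)                  # recall / true positive rate
        spec = tn / (tn + fp)                  # true negative rate
        acc = (tp + tn) / cm.sum()             # one-vs-rest accuracy
        f1 = 2 * tp / (2 * tp + fp + fn)
        print(f"{label}: ACC={acc:.3f} F1={f1:.3f} Sens={sens:.3f} Spec={spec:.3f}")

# Toy ground truth and predictions with the paper's three class labels.
y_true = np.array(["Pass", "Fail", "Withdrawn", "Pass", "Fail", "Pass"])
y_pred = np.array(["Pass", "Pass", "Withdrawn", "Pass", "Fail", "Fail"])
per_class_metrics(y_true, y_pred, ["Pass", "Fail", "Withdrawn"])
```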
ROC analysis is used in this study to choose a decision threshold value for the true and false positive rates of each class; Figure 2 shows the ROC curves. Overall, AUC values between 0.82 and 0.99 were obtained across all classes. As previously mentioned, the demographic, behavioral and temporal features were combined in the classification analysis, so the total number of variables in this model is 35. As a result, the predictive model may suffer from overfitting. In this case, we compare the classifiers' results in terms of training and test error, which can give an indication of the overfitting problem.
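A minimal sketch of the threshold-selection step follows, assuming scikit-learn and a one-vs-rest score per class; Youden's J statistic is used here as one common heuristic, since the paper does not state which criterion it applied:

```python
# Sketch: choose a per-class decision threshold from the ROC curve.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# y_bin: 1 for the class of interest, 0 otherwise; y_score: predicted
# probability of that class (synthetic placeholders below).
rng = np.random.default_rng(0)
y_bin = rng.integers(0, 2, size=500)
y_score = np.clip(y_bin * 0.6 + rng.normal(0.3, 0.25, size=500), 0, 1)

fpr, tpr, thresholds = roc_curve(y_bin, y_score)
best = np.argmax(tpr - fpr)                    # Youden's J = TPR - FPR
print("AUC:", round(roc_auc_score(y_bin, y_score), 3))
print("chosen threshold:", round(thresholds[best], 3))
```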
Figure 3 displays the results of the overfitting evaluation. It can be observed that the training and test errors are low for all classifiers. The lowest test and training errors were obtained by Gbm. RF and Nnet obtained similar test errors, at approximately 14%, with slightly higher training errors. The largest error was acquired by the Mlp model. Although most models fit well, Mlp suffers from overfitting.
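The train-versus-test comparison described above can be sketched as follows (again an assumption using scikit-learn stand-ins for the paper's Gbm and RF models rather than the authors' code):

```python
# Sketch of the overfitting check: compare training and test error per model.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder three-class dataset with 35 features, mirroring the model size.
X, y = make_classification(n_samples=2000, n_features=35, n_informative=12,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for model in (GradientBoostingClassifier(random_state=0),
              RandomForestClassifier(random_state=0)):
    model.fit(X_tr, y_tr)
    train_err = 1 - model.score(X_tr, y_tr)   # error = 1 - accuracy
    test_err = 1 - model.score(X_te, y_te)
    # A large gap between test and training error signals overfitting.
    print(f"{type(model).__name__}: train={train_err:.3f} test={test_err:.3f}")
```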
IV. CONCLUSION

Predicting student performance is important in the educational domain because student status analysis helps improve the performance of institutions. Different sources of information, such as traditional (demographic, academic background and behavioural features) and multimedia databases, are often accessible in educational institutions. These sources help administrators find information (e.g. admission requirements), predict the scale of class enrolment and help students decide how to choose courses depending on how well they will do in the chosen courses. The proposed model predicted student performance.
The final student performance predictive model revealed that student engagement with digital material has a significant impact on their success in the entire course. The findings also demonstrate that the long-term student performance model achieves better accuracy than the students' assessments grades prediction model, due to the exclusion of temporal features from the regression analysis. The date of student deregistration from the course is a valuable predictor that is significantly correlated with student performance; for the regression analysis, however, the data does not provide the last date of student activity prior to undertaking assessments. The findings therefore recommend taking temporal features into account when predicting subsequent assessment grades.
Future research directions involve the use of temporal features in the students' assessments grades model. With temporal features, time series analysis can be undertaken, and more advanced machine learning might be utilized.
V. REFERENCES

[1] J. Xu, K. H. Moon, and M. Van Der Schaar, "A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs," IEEE J. Sel. Top. Signal Process., vol. 11, no. 5, pp. 742–753, 2017.
[2] K. P. Shaleena and S. Paul, "Data mining techniques for predicting student performance," in 2015 IEEE International Conference on Engineering and Technology (ICETECH), 2015.
[3] A. M. Shahiri, W. Husain, and N. A. Rashid, "A Review on Predicting Student's Performance Using Data Mining Techniques," in Procedia Computer Science, 2015.
[4] Y. Meier, J. Xu, O. Atan, and M. Van Der Schaar, "Predicting grades," IEEE Trans. Signal Process., vol. 64, no. 4, pp. 959–972, 2016.
[5] P. Guleria, N. Thakur, and M. Sood, "Predicting student performance using decision tree classifiers and information gain," in Proc. 2014 3rd Int. Conf. Parallel, Distributed and Grid Computing (PDGC), pp. 126–129, 2015.
[6] P. M. Arsad, N. Buniyamin, and J. L. A. Manan, "A neural network students' performance prediction model (NNSPPM)," in 2013 IEEE Int. Conf. Smart Instrumentation, Measurement and Applications (ICSIMA), pp. 26–27, 2013.
[7] K. F. Li, D. Rusk, and F. Song, "Predicting student academic performance," in Proc. 2013 7th Int. Conf. Complex, Intelligent, and Software Intensive Systems (CISIS), pp. 27–33, 2013.
[8] G. Gray, C. McGuinness, and P. Owende, "An application of classification models to predict learner progression in tertiary education," in 2014 IEEE International Advance Computing Conference (IACC), 2014.
[9] N. Buniyamin, U. Bin Mat, and P. M. Arshad, "Educational data mining for prediction and classification of engineering students achievement," in 2015 IEEE 7th Int. Conf. Engineering Education (ICEED), pp. 49–53, 2016.
[10] Z. Alharbi, J. Cornford, L. Dolder, and B. De La Iglesia, "Using data mining techniques to predict students at risk of poor performance," in Proc. 2016 SAI Computing Conference, pp. 523–531, 2016.
[11] B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu, "Secure multidimensional range queries over outsourced data," VLDB J., vol. 21, no. 3, pp. 333–358, 2012.
[12] J. Mullan, "Learning Analytics in Higher Education," London, 2016.
[13] R. Al-Shabandar, A. J. Hussain, P. Liatsis, and R. Keight, "Detecting At-Risk Students With Early Interventions Using Machine Learning Techniques," IEEE Access, vol. 7, pp. 149464–149478, 2019.
[14] S. Jiang, A. E. Williams, K. Schenke, M. Warschauer, and D. O'Dowd, "Predicting MOOC Performance with Week 1 Behavior," in Proc. 7th International Conference on Educational Data Mining (EDM), pp. 273–275, 2014.
[15] Learning Analytics Community Exchange, "OU Analyse: Analysing at-risk students at The Open University," in Proc. 5th International Learning Analytics and Knowledge Conference (LAK), 2015.
[16] R. Alshabandar, A. Hussain, R. Keight, A. Laws, and T. Baker, "The Application of Gaussian Mixture Models for the Identification of At-Risk Learners in Massive Open Online Courses," in 2018 IEEE Congress on Evolutionary Computation (CEC), 2018.
[17] J.-L. Hung, M. C. Wang, S. Wang, M. Abdelrasoul, Y. Li, and W. He, "Identifying At-Risk Students for Early Interventions—A Time-Series Clustering Approach," IEEE Trans. Emerg. Top. Comput., vol. 5, no. 1, pp. 45–55, 2017.
[18] C. Yun, D. Shin, H. Jo, J. Yang, and S. Kim, "An Experimental Study on Feature Subset Selection Methods," in 7th IEEE Int. Conf. Computer and Information Technology (CIT), pp. 77–82, 2007.
[19] G. Chandrashekar and F. Sahin, "A survey on feature selection methods," Comput. Electr. Eng., vol. 40, no. 1, pp. 16–28, 2014.
