Abstract
Maintaining large information systems, especially those of an educational institute, is fundamental and highly challenging.
Processes such as admission or examination cannot afford time delays, inefficiencies, and inaccuracies. Process mining has recently
gained enormous research attention and is a growing technology for evaluating the processes of information systems via event log
data recorded over numerous time stamps. The goal of process mining is to find, evaluate, fix, and improve real-world business
processes based on the behaviour of an information system as recorded in an event log. The main objective of this research work is to
apply process mining techniques for examining and analyzing educational processes using real-time log data of the MUET admission
information system. The ProM toolkit is used for process model discovery and for detecting deviations. The event data was collected
from Mehran UET’s admission department. Log files comprising 2 months of access log data collected during the admission process
were examined. The process models were discovered using different plugins of the ProM software. The results show that, with the
help of event log files, an actual process model can be generated and bottlenecks and inefficiencies can be identified.
Keywords: Process Mining; Education Institution; Admission Process; Conformance Checking; ProM
Citation: Sania Bhatti., et al. “Application of Process Mining in Education: A Case Study". Acta Scientific Computer Sciences 4.7 (2022): 03-11.
Application of Process Mining in Education: A Case Study
Information systems (software) usually keep two kinds of logs: access logs and error logs. The access log keeps track of the requests that come into the web server. This data can include which pages visitors are looking at, the status of requests, and how long the server took to respond. This research focuses on deducing information from actual process executions, which are captured in these so-called event logs. It is a case study of an admission portal that aims to analyze its processes and develop a process model so that the admission process can be improved.

The admission system is a portal designed to provide an online admission facility for the undergraduate degree programs. Students must first register on the portal by creating an account, after which the process of filling in the admission form starts. Once the student logs in to the account, the system generates a bank challan; the student must submit it to the bank and upload a scanned printout to the portal. The student then adds basic information and educational information, and uploads a profile picture and scanned documents. After adding the complete information and choosing the preference for a self-finance seat, students receive a verified status and are provided with an admit slip for the admission test. This is the last step of the online submission of the admission form. If, during this process, any information or document is not properly added, or a blurred picture of a document is uploaded, the student receives an email and a message asking to upload or add that information again.

The server processes all the requests and stores the data in log files. Hence, we can apply process mining methods and techniques to gain insights into process models. Event data and log data are used as the input files for this research study. For developing the model of the admission process, ProM software has been used with process mining algorithms. To decide which algorithm to use, one must know the questions to be answered and the condition of the data. Furthermore, applying any process mining method without any preprocessing is usually ineffective.

The rest of this paper is organized as follows: the existing literature is reviewed in section 2. Section 3 details the phases of the research methodology. Process model discovery is elaborated in section 4. Section 5 highlights conformance checking and bottlenecks. Results and recommendations are given in section 6 and, finally, section 7 concludes the research work.

Literature Review
The field of process mining has been investigated by different scholars worldwide and its applicability, in terms of case studies, has been realized in different domains. The research study in [8] utilized process mining techniques to examine the life pattern of a library information resource/facility. Seattle Public Library’s data was used as log data. The study constructed a diagrammatic model of the life cycle by using ProM software with the Inductive visual Miner plugin.

Work in [9] presented the outcomes of process mining in an actual case study conducted in San Carlo di Nancy hospital in Rome (Italy). Event log data was collected from the clinical data stored by the hospital information system. Three data sets were analyzed: outpatient clinic with 299685 records, emergency room with 22043 records, and hospitalization with 10843 records. This data was analyzed from different views using the process mining tool ProM, and the results were presented. Different plugins were used in the performance analysis phase, such as Inductive Visual Miner, Transition System Miner, and Simple Log Filter. Work in [10] applied process mining techniques to enhance accident rescue methods for fatal gas explosion accidents in China. They took 50 gas explosion accidents during the period 2006-2014 as log data. Disco and ProM were both employed in their study. For the final process model, the Inductive Miner algorithm in ProM was used. The work in [11] set out to uncover students’ self-regulated learning during e-learning. They applied process mining methods to achieve this. 101 university students were given a course in one semester on the Moodle 2.0 platform, and the students’ pass and fail ratio was analyzed. Log data was taken from this platform’s event log files.

Research work in [12] gives a review of the current state of educational data mining and provides a framework for analyzing educational data provided by platforms such as the Moodle learning management system. The framework will help in taking decisions or in building decision support systems. To analyze the basic patterns, latent class analysis (LCA) and sequential pattern mining approaches were utilized; for process mining, heuristic and fuzzy approaches were used to obtain workflows and statistics; and, finally, social-network analysis was used to uncover collaborations. They used
the ProM tool’s Heuristic Miner to discover the student learning process, and the Disco tool for the fuzzy miner. The study in [1] discusses cyberattacks, the evaluation of alarms, and the limitations of attack detection systems. The authors put forward a model that uses process mining methods to improve present attack detection systems; with the help of that model, suspicious activity can be detected in real situations. If there is any intrusion, the model detects it and keeps the data safe. They used one hospital’s data for the audit, and the ProM tool was used for process mining. In the survey paper [2], the authors discuss educational process mining and how this technology is applied in the education field. They note that the starting point of process mining is an event log file, which can be any file that contains a sequence of activities. Log files can be collected from learning environments such as learning management systems, massive open online courses (MOOCs), adaptive hypermedia systems, etc. There are many PM tools for processing these log files, such as Disco, ProM, Celonis Discovery, Perceptive Process Mining, XMAnalyzer, etc. The authors discuss the techniques and algorithms of each step of process mining.

in educational domain leaving behind a research gap. To the best of our knowledge and based on the literature review, this is the first study that analyzes the admission process in an engineering university using real-time event log data.

Methodology
In this work we look at a case study that aims to analyze and develop a process model of MUET’s admission portal. Conformance checking has also been carried out. To achieve these goals, various data analysis and process mining techniques were used.

The overall layout and phases of research methodology are given below.
...
1234  [31/Aug/2021:03:07:35 +0200]  GET /dashboard.php HTTP/1.1  301  250
1235  [31/Aug/2021:06:47:11 +0200]  GET /forgetPaswordConformation.php HTTP/1.1  301  269
1236  [31/Aug/2021:07:42:54 +0200]  GET /login.php HTTP/1.1  301  249
1237  [31/Aug/2021:08:56:50 +0200]  GET /status_candidate_form.php?verifyID=22406 HTTP/1.1  301  280
1238  [31/Jul/2021:21:26:12 +0200]  GET /data-form.html?error-msg2=Intermediate%20Percentage%20shouldn%27t%20be%20less%20than%2060. HTTP/1.1  301  330
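Entries like those listed above can be split into fields with a regular expression. The following is only a minimal sketch, not the parsing procedure used in this study: it assumes that each entry consists of a bracketed timestamp, the request method, the requested path, the protocol, the status code, and the response size, in that order. The names `ENTRY` and `parse_entry` are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout of one entry: [timestamp] METHOD path PROTOCOL status size
ENTRY = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+"
    r"(?P<protocol>HTTP/\d\.\d)\s+(?P<status>\d{3})\s+(?P<size>\d+)"
)


def parse_entry(line):
    """Return the fields of one access-log entry as a dict, or None if no match."""
    m = ENTRY.search(line)
    if m is None:
        return None
    return {
        # Timestamp format as it appears in the sample, e.g. 31/Aug/2021:03:07:35 +0200
        "timestamp": datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z"),
        "method": m["method"],
        "path": m["path"],
        "status": int(m["status"]),
        "size": int(m["size"]),
    }


entry = parse_entry("[31/Aug/2021:03:07:35 +0200] GET /dashboard.php HTTP/1.1 301 250")
```

Lines that do not match the assumed layout return `None`, so malformed entries can be counted or skipped explicitly instead of being dropped silently.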
The log files need to be parsed, and through text mining we extracted the information shown in table 2.

Activity (event)             Functionality
Register                     Registration on portal
Login                        Login
Dashboard                    Dashboard of portal
Student form                 Student admission form
Basic info student details   Filling of student data/filling basic details
Education details            Adding education details on portal
Upload challan               Upload admission fee challan
Upload docs                  Uploading basic documents
Candidate status verified    Checking student form and giving green signal
Download admit slip          Downloading admit slip for entry test
Table 2: Depiction of the functionality of the website/portal, referred to as activities or events.

The log files were analyzed, parsed, and then converted to csv format, because the ProM tool cannot accept ‘.gz’ files or unstructured files.

Process model discovery
Aside from event logs, a process model is another common input for process mining approaches. A process model is a blueprint for how a procedure should be carried out. A process model encodes (or should express) the set of permissible executions in a more formal fashion than a textual description in natural language. Using a formal process modelling notation provides multiple advantages, including the ability to evaluate various model quality features, such as the lack of deadlocks. It also enables software tools to reason about the modelled process behavior automatically [3].

We get a process model like the one shown in figure 2 when we apply automated process discovery techniques directly to raw logged click-stream data. Clearly, such a complicated process model makes it difficult to achieve the overall purpose of process mining, which is to gain a better understanding of the process.

The model is clearly not human interpretable. For this reason, the log file is filtered so that only the most frequent events are retained, and the filtered file is then analyzed. Figure 3 shows the most frequent events as reported by the ProM tool.
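The filtering and csv conversion described in this section could be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the column names `case_id`, `activity`, and `timestamp`, the threshold parameter `min_count`, and both function names are hypothetical.

```python
import csv
import io
from collections import Counter


def keep_frequent(events, min_count):
    """Keep only the events whose activity occurs at least min_count times."""
    counts = Counter(e["activity"] for e in events)
    return [e for e in events if counts[e["activity"]] >= min_count]


def to_csv(events):
    """Serialize events as csv text with one row per event."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["case_id", "activity", "timestamp"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(events)
    return buf.getvalue()


# Made-up events, for illustration only
events = [
    {"case_id": "c1", "activity": "Login", "timestamp": "2021-08-31T07:42:54"},
    {"case_id": "c1", "activity": "Dashboard", "timestamp": "2021-08-31T07:43:10"},
    {"case_id": "c2", "activity": "Login", "timestamp": "2021-08-31T08:01:02"},
]
csv_text = to_csv(keep_frequent(events, min_count=2))
```

A frequency threshold is only one possible filtering criterion; the appropriate cut-off depends on the log and on the questions being asked.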
Activity graph
After the preprocessing, we were able to develop a causal activity graph of the processes. Figure 4 shows the activity graph, which depicts how the events follow each other in sequence.

Figure 4: Causal Activity Graph.

Heuristic model
A heuristic model can be used to depict the log’s frequency characteristics. The heuristic model is a directed graph with the events as its vertices. The number of traces that contain an event is displayed for each vertex (event). If two events follow each other directly in the event log, the corresponding two graph vertices are connected by an arc. A frequency parameter (the number of traces having the associated dependency) is assigned to each arc.

The model is initially discovered using the default heuristic parameters. The admission process starts with the register event, and the student form activity is followed by the login event. The analysis shows that the most frequently occurring events in the log are <Register>, <Login>, <Student Form>, <Basic Info student details>, <education details>, <Upload docs>, and <Test Slip Download>. The discovered heuristic process net is shown below in figure 5.
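The vertex and arc frequencies described for the heuristic model can be counted directly from a set of traces. The sketch below follows only the description given in this section (number of traces containing each event, and number of traces containing each directly-follows pair); it is not ProM's Heuristic Miner, and the function name and sample traces are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count, per trace, which events and which directly-follows pairs occur.

    Returns (vertex_counts, arc_counts): the number of traces that contain
    each event, and the number of traces in which event a is directly
    followed by event b.
    """
    vertex_counts = Counter()
    arc_counts = Counter()
    for trace in traces:
        # Sets ensure each trace contributes at most once per event or pair
        vertex_counts.update(set(trace))
        arc_counts.update(set(zip(trace, trace[1:])))
    return vertex_counts, arc_counts


# Made-up traces, for illustration only
traces = [
    ["Register", "Login", "Student form"],
    ["Register", "Login", "Login", "Dashboard"],
]
vertices, arcs = directly_follows(traces)
```

Each key of `arcs` corresponds to one arc of the graph, and its value is the frequency parameter attached to that arc.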
Conformance checking
For conformance checking there are two parameters: fitness and appropriateness. Fitness is the extent to which the log traces can be associated with valid execution paths specified by the process model. Appropriateness is the degree of accuracy with which the process model describes the observed behavior, combined with the degree of clarity with which it is represented.

As the portal does not have a documented or formal process model, there was only an informal idea of how the portal should work. Through detailed analysis of the developed process models, we have identified the differences and bottlenecks in the real processes, as well as some processes that took more of the users’ time.
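The notion of fitness defined above can be illustrated with a deliberately simplified check. The sketch assumes that the model is given as a set of allowed directly-follows pairs together with one start and one end activity, and it reports the fraction of traces that are valid paths through that model. This is a much coarser measure than the replay-based fitness reported by ProM; all names and sample data are hypothetical.

```python
def trace_fits(trace, allowed_pairs, start, end):
    """True if the trace starts and ends correctly and uses only allowed steps."""
    if not trace or trace[0] != start or trace[-1] != end:
        return False
    return all(pair in allowed_pairs for pair in zip(trace, trace[1:]))


def log_fitness(traces, allowed_pairs, start, end):
    """Fraction of traces that correspond to a valid path in the model."""
    if not traces:
        return 0.0
    fitting = sum(trace_fits(t, allowed_pairs, start, end) for t in traces)
    return fitting / len(traces)


# Made-up model and traces, for illustration only
allowed = {("Register", "Login"), ("Login", "Student form")}
sample = [
    ["Register", "Login", "Student form"],  # valid path
    ["Register", "Student form"],           # skips Login
]
fitness = log_fitness(sample, allowed, start="Register", end="Student form")
```

A trace-level measure like this counts a trace as either fitting or not; replay-based measures additionally give partial credit to traces that deviate in only a few steps.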
For checking fitness, the log is replayed on the model as shown in figure 7. In figure 7, activities with a higher frequency are drawn in a darker blue, so the activities <register>, <login>, <dashboard>, <student form>, <Basic info student details>, <education details>, <Upload challan>, <upload docs>, <candidate status verified>, and <test slip download> are executed more often. The green/purple bar at the bottom of each transition indicates the percentage of correct/incorrect executions. When an area is highlighted in yellow, it means there was an occurrence in the log that could not be explained when a token was present. A red border means that the number of model moves is greater than zero, a green border shows that no model move occurred, and the frequency of synchronous moves is also shown.

The element statistics show that, out of 543, the model move value is 289, and the global statistics show a trace fitness of 60%. This is because, during the data preprocessing phase, we filtered the log and chose only 545 records, so that less than 40% of the data, covering the most common events, remained.

Results and Recommendations
The analysis of the process models shows that students often forget their passwords, as the forget password activity is seen most in both models. Many students also got errors when submitting documents to the portal, as the Upload docs error event is also seen in the log file. Most of the time, students leave the portal after the registration process and fill in the student form later that day or the next day. We gathered two months of data, containing 4531 and 650 records respectively; after filtering the log data, 544 records were chosen in data preprocessing, because only these selected events were seen repeatedly in the log file of the admission portal. So, the process model would be the same if many other cases were included. Due to heavy traffic on the portal, events/activities seem to have taken more time than usual.

As a result of this research, the following recommendations are given:
• Due to heavy traffic, students are unable to upload the required documents in one click.
• This step can also be removed: most of the user’s time is wasted due to heavy traffic and, moreover, document verification is done manually, so there is no point in uploading all the documents.
• Improve the bandwidth of the channel between the user interface and the portal server or database.
• There is no need to upload the challan document; the administration can get the status of submitted challans from the banks. Students just need to enter the challan number.
• Hardware performance must also be improved.
• Administration must update the content of the website/portal.
• The forget password activity is seen most in the log data; students should get a message on their phone about their password.
• The user cannot delete the account if he/she has registered with a wrong CNIC, so a delete account option should be provided.

Conclusion
The extensive use of process mining in industry has revealed plenty of new difficulties, one of which is applying this technique in the education sector. The analysis methods were general, and the problems are typical for a portal system. Process mining can help in improving business processes because the analysis of the system is done via real event log data. Since the log data captures user activities, the actual behavior of the system can be analyzed and improved from this kind of data. The crucial question was how to make this e-portal better so that students do not find filling in the admission form online difficult or annoying. Event log files were the main data for this work. For analyzing a system, the data preprocessing step is the most important: the system captures every click by the user, so events were repeated, due to which more than 60% of the data was eliminated in log filtering. Process mining techniques are well suited to analyzing and improving processes. It is critical to obtain the help of a domain specialist while analyzing procedures. The results must be interpreted to assess the situation. The findings obtained by software are worthless without adaptation and judicious method selection for a specific circumstance. Interpretation is a crucial phase as well. To interpret the findings correctly, one must exercise extreme caution. The correctness of the evaluation results, for example, is highly dependent on the software, data, and settings employed. The results obtained in this study will be highly beneficial if applied in a real-time setting.
4. P He., et al. “An Evaluation Study on Log Parsing and Its Use in Log Mining”. Presented at the 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (2016).