Educational Data Mining For Improving Learning Outcomes in Teaching Accounting Within Higher Education


The current issue and full text archive of this journal is available on Emerald Insight at:

Educational data mining for
improving learning outcomes
in teaching accounting within
higher education
Julian Chamizo-Gonzalez
Universidad Autónoma de Madrid, Madrid, Spain
Elisa Isabel Cano-Montero
Universidad de Castilla La Mancha, Toledo, Spain, and
Elena Urquia-Grande and Clara Isabel Muñoz-Colomina
Universidad Complutense de Madrid, Madrid, Spain

Purpose – The purpose of this paper is twofold: first, to determine whether accounting students who
participate more in online activities proposed by the teacher achieve better learning outcomes; second,
to identify which virtual learning activities lead to improved outcomes.
Design/methodology/approach – Data mining is a computer-based tool devoted to analyzing massive
data sources, generating information and discovering deeper knowledge and links among variables.
Findings – There were differences between universities and subjects in the association of level of
activity and learning outcomes. These findings will help teachers adjust their teaching guide, schedule
and explanations.
Research limitations/implications – Further developments should include the level of online
engagement of the lecturers and the correspondence of the online activity with the activities designed
in the teaching guide, in order to identify the value-added activities performed by the students to
achieve better deep learning outcomes.
Practical implications – Higher Education should provide students with cognitive and transversal
skills for successful incorporation into the labor market. In this sense, teaching methodology combined
with online tools facilitates the process of teaching and learning with the implementation of different
multimedia resources.
Originality/value – Recently, the impact of virtual platform usage on students’ learning outcomes
has started to be analyzed using “data mining” techniques. Educational data mining is a new approach to
uncovering existing links among students, lecturers and their activity.
Keywords Learning outcomes, Accounting, Educational data mining, Virtual learning environment
Paper type Research paper

The International Journal of Information and Learning Technology
Vol. 32 No. 5, 2015
pp. 272-285
© Emerald Group Publishing Limited
DOI 10.1108/IJILT-08-2015-0020

1. Introduction
Lifelong meaningful learning should prepare students for professional
development. Cognitive and transferable skills must be developed within the Higher Education
system for successful incorporation into the labor market. Therefore, universities must
provide professionals trained in accordance with the qualified role required. In this
context, it is assumed that the students’ acquisition of skills and abilities goes through a
process of self-learning guided by the lecturer with the essential active participation of
the student (Oliveros, 2006). In this sense, teaching methodology combined with
e-learning tools facilitates the process of teaching and learning, including the use and
application of different multimedia resources such as interactive videos and blackboards,
simulations and virtual platforms (Liaw et al., 2007; Paechter et al., 2010; Sun et al., 2008;
Urquía et al., 2012). In this trend, virtual platform tools such as dynamic environment,
modular or object-oriented learning offer many possibilities as a device for supporting
innovation in teaching-learning and research (Correa Gorospe, 2005). Additionally,
students must combine aspects of constructivism such as knowledge generated through
students’ experience, interaction with the virtual environment and “learning by doing”
within the social constructivism pedagogy theory. Consequently, virtual learning
environments (VLE) are an essential support for the implementation of active
teaching-learning methodologies in the twenty-first century.
The appropriateness and contribution of virtual learning or e-learning in
teaching-learning methodologies were made clear in the last decade, as pointed out by
several researchers (Liaw et al., 2007). These researchers analyzed types of teaching
methodology from several points of view: the impact this teaching has on
student learning (deep or surface), on students’ satisfaction (Alexander and Golja,
2007; Coates et al., 2005) and on teacher satisfaction, and its value as an aid for designing
teaching-learning best practices (Paechter et al., 2010). Moreover, Sun et al. (2008) described the
success of virtual learning based on variables such as the ability and motivation of
the teacher toward Information and Communication Technologies (ICT), the quality and
diversity of the course and the usefulness and ease of use perceived by the student
(López Pérez et al., 2013).
Recently the impact of virtual teaching on student learning outcomes has started to
be analyzed using “data-mining” techniques. Data mining (DM) is a computer-based
information system devoted to scanning huge data sources, generating information and
discovering deeper knowledge and new links within the data. DM seeks to
find data patterns, organize information about hidden relationships, structure
association rules, estimate unknown items’ values to classify objects, compose clusters
of homogeneous objects and reveal many kinds of findings that are not easily produced otherwise.
Thereby, DM outcomes represent a valuable support for decision making
(Peña-Ayala, 2014). DM is used in many other professional sectors such as banking,
insurance and commercial companies to deepen customer information and, accordingly,
to adjust their products or services to customer requirements.
Education is a novel DM application target for knowledge discovery, decision making and
uncovering links among students, between students and teachers, and between learning and teaching.
Nowadays, the use of DM in education is emerging and is known as the educational
data mining (EDM) research area (Romero et al., 2008, 2009; Martin-Blas and
Serrano-Fernández, 2008). EDM designs models, tasks, methods and algorithms for
exploring data from educational scenarios in order to find patterns and make
predictions that characterize learners’ behaviors and achievements,
domain knowledge content, assessments, educational functionalities and applications.
DM applied to Higher Education can collect information about the behavior of students
(either participating or just consulting) in activities that have been carried out through
the VLE Moodle (blogs, forums, exercises, tests, etc.). One of the objectives of using
EDM in this area is to link this behavior with the final learning performance
of students in a particular subject (Abdous et al., 2012; Agudo-Peregrina et al., 2014;
López-Pérez et al., 2011, 2013; Romero et al., 2010). Three DM processes are most
commonly applied in educational research: first, statistics and visualization (Heiner
et al., 2004; Nilakant and Mitrovic, 2005; Zorrilla et al., 2005); second, web
mining, which encompasses clustering or classification of students based on their behavior
or activity, association rule mining and sequential pattern mining (the search for sequential
behavior patterns); and third, text mining. Virtual education provided through
web platforms records the activities of students and teachers in numerous
interaction records: server log files, client proxy log files and log files of activities,
with access to certain information limited.
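As a rough illustration of the kind of aggregation this implies, the sketch below counts how often each student performed each VLE action. The record layout (a student identifier paired with an action name) and the sample identifiers are assumptions made for illustration; they are not the actual Moodle log schema or the study's data.

```python
from collections import Counter


def count_actions(log_records):
    """Count how many times each student performed each VLE action.

    log_records is an iterable of (student_id, action) pairs; this layout
    is a simplifying assumption, not the real Moodle log format.
    """
    counts = Counter()
    for student_id, action in log_records:
        counts[(student_id, action)] += 1
    return counts


# Hypothetical records for two students.
sample_log = [
    ("s1", "forum add post"),
    ("s1", "forum add post"),
    ("s1", "resource view"),
    ("s2", "resource view"),
]
activity_counts = count_actions(sample_log)
```

Each (student, action) count can then serve as one variable per student when relating activity to the final grade.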
One contribution of DM to Higher Education is that it collects complex data and
gives deeper insight into the behavior of students regarding the tasks proposed by the teacher.
DM also provides large amounts of data on student activities, such as which
documents published on the VLE are the most read, which assignments have been
delivered and from which computer they have come (at the university or outside it),
in which forums students have participated and even whether they communicate
with their peers through these forums (Mazza and Milani, 2005;
Mostow, 2004). There are nine categories of EDM models, tasks, methods, techniques and
algorithms, classified by Peña-Ayala (2014) as neural networks, algorithm architecture,
dynamic prediction, analysis of system architecture, intelligent agent systems, modeling,
knowledge-based systems, systems optimization and information systems. Regarding
EDM, Romero et al. (2008) identify statistics and visualization, and web mining, as two
families of DM techniques for classifying the application of DM to Higher Education.
Several statistics and visualization tools are identified, while web mining is split into
three kinds of tasks: clustering, classification and outlier detection; association rules
and sequential patterns; and text mining. As a future
trend, EDM tools must be designed for non-technical users. The objectives of using this
methodology in the educational context are several. First, to improve active teaching
methodology where teachers guide students in learning improvement (Romero et al., 2004;
Hamalainen et al., 2004; Tang and McCalla, 2005). Second, to motivate students’ use of the
VLE, improving their significant, deep learning (Kim, 2008; Paechter et al., 2010). Finally, to
assist decision making by institutional authorities toward improving the quality of
teaching (Grob et al., 2004).
Thus, the actors of a virtual teaching and learning system when collecting data
through DM are represented in Figure 1 together with a learning-teaching feedback
proposal in order to renew continuously and improve Higher Education methodologies.
Some EDM targets are student models, models of domain knowledge, pedagogical
support and impacts on learning. Regarding EDM, several models, tasks and techniques
are identified; where mathematical, rules-based and soft computing techniques are the
target of analysis. As for EDM works, they are organized into four functionalities:
student modeling, tutoring, content and assessment. Concerning EDM applications, they
are classified into 11 educational categories: analysis and visualization of data; providing
feedback for supporting instruction; recommendations for students; predicting students’
performance; student modeling; detecting undesirable student behaviors; grouping
students; social network analysis; developing concept maps; constructing courseware;
planning and scheduling.
In the case of Spanish universities there are few research studies that have focussed
on the impact of virtual teaching on students’ significant learning (Mondejar et al., 2006).

Figure 1. Virtual teaching and learning process
(Diagram labels: Institution and system responsibles; Students; Educational virtual learning
platform; Data mining; Feedback; Review syllabus; Review quality systems; Review and redesign
teaching material and methodologies; Motivate; Improve learning strategies)

This research applies a DM methodology, developed and proven in previous studies
(Chaparro Peláez et al., 2010; Lara et al., 2014; Romero and Ventura, 2007; Romero et al., 2010),
to 71,518 data records, combined with correlation and regression analysis.
The methodology and use of the VLE are compared between two
universities and lecturers who teach different Accounting subjects in several degree
courses. Collection, processing and analysis of information from the Moodle platform
are investigated through DM and multivariate analysis in relation to the monitoring
and evaluation of students. This analysis can additionally help students to develop
competencies and teachers to assess learning outcomes.
Specifically, the objectives of this research are twofold: first, to analyze which
online activities are performed by accounting students, using DM techniques; second,
to identify the specific online activities that most positively influence students’
learning outcomes.
The research questions defined are:
RQ1. Does accounting students’ online VLE activity (collected through DM) influence
learning outcomes?
RQ2. Which specific online activities in accounting subjects positively influence
students’ learning outcomes?

2. Sample design and methodology

2.1 Methodology
Quantitative DM was applied to the e-learning activities undertaken in the Accounting lectures of
two Spanish universities: Management Accounting and Analytical Accounting, taught at the
Faculty of Economics and Business Administration of Universidad Complutense de Madrid (UCM),
and Cost Accounting, taught at the Faculty of Social Sciences of Talavera de la Reina
at Universidad de Castilla-La Mancha (UCLM).
To conduct this research, the lecturers who taught these subjects reported on
the type of activities performed and the students’ competences and skills to which they
related. Activities related to specific subject knowledge were the lecturers’
presentations, readings and students’ delivery of assignments. Activities related to the
application of Information Technology to the field of cost and management accounting
involved the use of specific accounting software programs. Activities related to reporting
and communication were the design and submission of portfolios. Finally, activities related to
critical knowledge were the forum discussions. Given the characteristics of accounting
subjects, where deep learning is promoted, these activities were classified in a
simplified way as active or passive, depending on student participation based on their
knowledge and life experience, as shown in Table I and in line with Agudo-Peregrina et al. (2014).
Activities classified as active include students’ contributions, which involve an
interactive relation between the lecturer and the student or among students through the
platform. Passive activities involve little student involvement, limited to a mere spectator’s
role. In a third column, the relation of the activities to the learning objectives of
accounting and the VLE has been defined in line with Romero et al. (2010).
However, in relation to all the activities undertaken through the VLE
(Moodle), it was found that all lecturers who taught these subjects had
managed most of these activities through the platform and had accumulated such an
amount of data that the DM technique chosen to classify the activities was visualization
and statistical patterns of behavior. The DM approach used is based on previous studies such
as Romero and Ventura (2005, 2006), Shen et al. (2003) and Agudo-Peregrina et al. (2014).

Table I. VLE activities and continuous assessment

Activities in the VLE    Students’ participation   Relation from main activities to accounting learning
Assignment upload        Active                    Solving practical cases, presenting group or individual work.
                                                   Portfolios. Preparation of reports
Forum add post           Active                    Online search and discussion of news that deals with the subject
Forum update post        Active                    Feedback online search and discussion of news that deals with
                                                   the subject
Forum view discussion    Active                    Gathering the required information to undertake online
                                                   research and discussion of news that deals with the subject
Assignment view          Passive                   The student only views the assignment
Assignment view all      Passive                   No relevant relationship
Blog view                Passive                   The student consults the blog
Course recent            Passive                   No relevant relationship
Course view              Passive                   The student views all the subject contents
Forum unsubscribe all    Passive                   The student unsubscribes the forum
Forum user report        Passive                   No relevant relationship
Forum view forum         Passive                   The student views the forum
Forum view forums        Passive                   No relevant relationship
Resource view            Passive                   Obtaining material to carry out other evaluable activities
Resource view all        Passive                   No relevant relationship

In addition to descriptive statistics, correlation analysis and logistic regression were
used to identify the variables that determine the students’ performance. Logistic, or
logit, regression is a type of probabilistic statistical classification model that is also
used to predict a binary response variable from one or more predictor variables.
In particular, the probabilities describing the possible outcomes of a single trial
are modeled, as a function of the independent variables, using a logistic function,
hence its name.
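To make the logistic function mentioned above concrete, the following minimal sketch maps a linear combination of predictor values to a probability. It is only an illustration of the general technique, not the authors' fitted model: the function names, coefficients and activity counts are hypothetical.

```python
import math


def logistic(z: float) -> float:
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def outcome_probability(predictors, coefficients, intercept=0.0):
    """Probability of the binary outcome given predictor values.

    The linear predictor is intercept + sum(b_j * x_j); the logistic
    function then turns it into a probability.
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return logistic(z)


# Hypothetical example: two activity counts and made-up coefficients.
p = outcome_probability([2, 1], [0.5, -1.0])
```

With these made-up numbers the linear predictor is zero, so the modeled probability is exactly one half; larger positive predictors push it toward one.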

2.2 Sample design

The sample comprised 129 students who participated in the VLE activities. Table II shows
the distribution of students across the different accounting subjects in
both universities, together with the 71,518 records of students’ activity
collected in the VLE. For coherent results, only students who obtained a final
grade were included in the sample.

Table II. Students’ activity by groups

Group           Subject                Degree                                  University  Records    %
AA-ECO-UCM (a)  Analytical Accounting  Economics                               UCM         11,003.00  20.16
CA-BA-UCLM (b)  Cost Accounting        Business Administration and Management  UCLM        28,792.00  27.91
MA-FBI-UCM (a)  Management Accounting  Finance, Banking and Insurance          UCM         2,010.00   6.98
MA-BA-UCM (a)   Management Accounting  Business Administration and Management  UCM         29,713.00  44.96
Total                                                                                      71,518.00
Notes: (a) UCM, Complutense de Madrid University; (b) UCLM, Castilla-La Mancha University

The assessment percentages for each activity contemplated in each subject analyzed are
published in the teaching guide of the corresponding university.
Regarding the assessment scheme, in Management Accounting and Analytical
Accounting the student’s final grade consists of 60 percent from a final exam, with a
theoretical part and an exercise, and 40 percent from continuous
assessment. This continuous assessment is carried out throughout the semester:
students must submit answers to exercises (10 percent), participate in the online search
and discussion of news that deals with the subject (10 percent), and work in teams to
present a budgeting case for a company (10 percent). The last 10 percent corresponds to seminars
held in the computer lab, where students work on computers with databases of real
companies, videos and discussion of cases (Camacho et al., n.d.; Urquia et al., 2012).
In the subject of Cost Accounting, the final grade is composed of 70 percent from a
final test, which includes the theoretical and practical content, and the remaining
30 percent from continuous evaluation of several pieces of coursework
(20 percent course portfolio and 10 percent participation in the search and discussion
of relevant topics). In the Faculty of Talavera all field coursework was carried out in
the computer room, which facilitated the delivery of reports and feedback on them, and the
use of spreadsheets and other tools through the VLE.
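The grade composition described above amounts to a weighted sum, which can be sketched as follows. The weights come from the text; the component names, the example scores and the assumption that every component is marked on the same scale are illustrative only.

```python
# Weights as described in the text; component names are hypothetical labels.
MANAGEMENT_AND_ANALYTICAL_WEIGHTS = {
    "final_exam": 0.60,
    "exercises": 0.10,
    "news_discussion": 0.10,
    "team_budgeting_case": 0.10,
    "computer_lab_seminars": 0.10,
}
COST_ACCOUNTING_WEIGHTS = {
    "final_test": 0.70,
    "course_portfolio": 0.20,
    "topic_discussion": 0.10,
}


def final_grade(scores, weights):
    """Weighted sum of component scores (all assumed to share one scale)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    if set(scores) != set(weights):
        raise ValueError("scores and weights must name the same components")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical Cost Accounting student: 0.7*5 + 0.2*8 + 0.1*10.
example = final_grade(
    {"final_test": 5, "course_portfolio": 8, "topic_discussion": 10},
    COST_ACCOUNTING_WEIGHTS,
)
```

Checking that the weights add up to one guards against a scheme that silently inflates or deflates the final grade.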
However, some of the activities carried out in the VLE are not really valued by
some lecturers; therefore, we have classified the VLE activities as active or passive
according to the students’ participation in each activity (e.g. blog view).

3. Results discussion
A deeper analysis of the different activities revealed that not all the activities achieve
similar student participation. For example, “Course view,” which means the student has
logged in and viewed the subject, showed 100 percent participation (meaning that
every student has participated in the activity at least once). This result led us to remove
this variable because it is a compulsory activity that every student performs when
logging into the VLE in accounting and hence is a non-voluntary activity. Other variables
with fewer than 20 percent of students participating were also rejected (see Table III).
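The selection rule just described can be sketched in a few lines, using the participation percentages reported in Table III. The function name and the way the thresholds are passed are illustrative assumptions; the rule itself (drop the compulsory 100 percent activity and anything below 20 percent) is the one stated in the text.

```python
# Participation percentages as reported in Table III.
PARTICIPATION = {
    "Course view": 100.00,
    "Forum view discussion": 93.02,
    "Assignment view": 92.25,
    "Forum view forum": 91.47,
    "Assignment upload": 89.15,
    "Resource view": 86.82,
    "Forum add post": 80.62,
    "Forum view forums": 75.97,
    "Assignment view all": 70.54,
    "Forum update post": 38.76,
    "Course recent": 34.88,
    "Forum user report": 21.71,
    "Resource view all": 20.93,
    "Forum subscribe": 14.73,
    "Blog view": 10.85,
    "Forum unsubscribe": 10.08,
    "Forum search": 5.43,
    "Forum subscribe all": 2.33,
    "Forum unsubscribe all": 0.78,
}


def select_activities(participation, minimum=20.0, compulsory=100.0):
    """Keep activities with at least `minimum` percent participation,
    excluding those every student performed (the compulsory case)."""
    return [
        name
        for name, percent in participation.items()
        if minimum <= percent < compulsory
    ]


kept = select_activities(PARTICIPATION)
```

Applied to these figures, the rule drops “Course view” and the activities below the 20 percent threshold and keeps the rest for further analysis.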
To examine whether the final grade is influenced by the activities carried
out in the different accounting subjects and universities, a Pearson correlation
matrix was computed (see Table IV) as a preliminary exploratory analysis. In this step, although the
researchers defined and registered 15 value-added activities that the students
performed through the virtual platform, only 11 were significantly correlated with
students’ academic performance in at least one of the accounting subjects analyzed.
Some of the non-relevant activities were “resource view all,” “forum user report,”
“forum unsubscribe all,” “course recent” and “blog view.” These results will be taken
into account in the design of future accounting subjects.
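For readers who want to reproduce this kind of exploratory step, the sketch below computes a Pearson correlation coefficient and the t statistic commonly used to test it against zero (with n − 2 degrees of freedom). It is a generic illustration in plain Python, not the authors' procedure; the function names are hypothetical, and the only figures taken from the paper are the coefficient 0.409 and group size 26 reported in Table IV.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant variable has no defined correlation.
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Coefficient and group size reported in Table IV for assignment upload
# in AA-ECO-UCM; the resulting t value is about 2.2.
t_aa_upload = t_statistic(0.409, 26)
```

The constant-variable check mirrors the table note stating that some coefficients cannot be calculated because at least one variable is a constant.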
In the subject of Management Accounting in the Business Administration Degree at
the UCM (MA-BA-UCM), the performance of the group (final grade) showed, in general,
a significant correlation with what were considered to be “active” activities. In this
sense, students who took part in the VLE by handing in
exercises and cases (assignment upload) and who took part in the forum by posting
subject-related news and discussing it with their classmates (forum add post and
forum update post) obtained the highest marks. These results are in line with
Agudo-Peregrina et al. (2014) and Chaparro Peláez et al. (2010), who affirm that there is a
small correlation between students’ active participation and academic performance.

Table III. Students’ participation per activity

Activity                     Students’ participation in the activities (%)
Course view (a)              100.00
Forum view discussion        93.02
Assignment view              92.25
Forum view forum             91.47
Assignment upload            89.15
Resource view                86.82
Forum add post               80.62
Forum view forums            75.97
Assignment view all          70.54
Forum update post            38.76
Course recent                34.88
Forum user report            21.71
Resource view all            20.93
Forum subscribe (b)          14.73
Blog view (b)                10.85
Forum unsubscribe            10.08
Forum search (b)             5.43
Forum subscribe all (b)      2.33
Forum unsubscribe all (b)    0.78
Notes: (a) Activity removed from further analysis due to its compulsory character; (b) activity
removed from further analysis due to low participation

On the other hand, these results are contrary to Abdous et al. (2012), who state that
students’ VLE activity does not correlate with academic performance, although they
admit that jointly applying EDM and traditional statistical analysis facilitates the
understanding of students’ learning behaviors and experiences. In contrast, the
worst marks were those of students who had made scant and merely passive
participation in VLE activities. Nonetheless, students who
looked at their teachers’ exercise corrections and the most up-to-date information
(through the passive activity resource view) performed well.
In the subject of Analytical Accounting in the Economics Degree at the UCM
(AA-ECO-UCM) the results obtained are slightly different: the only activity classified as
active which correlates with the students’ final grade is the handing in of exercises
through the VLE (assignment upload). However, the forum and participation in it (forum view
discussion and forum add post) show no significant correlation, contrary to López et al.
(2012). As for passive activities, student performance correlated only with the
reading of teacher-corrected exercises published in the VLE (resource view). In this
case the group showed itself to be much more passive in learning and their final marks
were lower.
In the subject Cost Accounting in the Business Administration Degree at the UCLM
(CA-BA-UCLM) it has been observed that the participation level is positively related to
contributions through forums (forum add post and forum view discussion), given that
most exercises and other contributions were submitted through forums
so that every student could see each other’s work. This could explain the lack of
significant correlation for exercises handed in as tasks (assignment
upload). In passive participation, the lower resource view value is due to the teacher’s
intensive use of the forum, including for handing out materials and links to tutorials, in
line with Lopez-Perez et al. (2013).

Table IV. Activities which correlate significantly with students’ final grade

                                                     Final grade
Activities                      Statistic            AA-ECO-UCM  CA-BA-UCLM  MA-FBI-UCM  MA-BA-UCM   All groups
Active – assignment upload      Pearson correlation  0.409*      0.228       a           0.760**     0.363**
                                Sig. (bilateral)     0.038       0.181                   0.000       0.000
                                n                    26          36          9           58          129
Active – forum add post         Pearson correlation  0.138       0.562**     0.096       0.716**     0.281**
                                Sig. (bilateral)     0.500       0.000       0.806       0.000       0.001
                                n                    26          36          9           58          129
Active – forum update post      Pearson correlation  0.323       0.247       a           0.383**     0.280**
                                Sig. (bilateral)     0.107       0.147                   0.003       0.001
                                n                    26          36          9           58          129
Active – forum view discussion  Pearson correlation  0.102       0.367*      0.484       0.646**     0.249**
                                Sig. (bilateral)     0.619       0.028       0.187       0.000       0.005
                                n                    26          36          9           58          129
Passive – assignment view       Pearson correlation  0.499**     0.281       a           0.564**     0.360**
                                Sig. (bilateral)     0.010       0.097                   0.000       0.000
                                n                    26          36          9           58          129
Passive – assignment view all   Pearson correlation  0.313       −0.158      a           0.361**     0.207*
                                Sig. (bilateral)     0.119       0.357                   0.005       0.018
                                n                    26          36          9           58          129
Passive – course view           Pearson correlation  0.596**     0.461**     0.707*      0.589**     0.411**
                                Sig. (bilateral)     0.001       0.005       0.033       0.000       0.000
                                n                    26          36          9           58          129
Passive – forum subscribe all   Pearson correlation  a           a           0.656       0.099       0.181*
                                Sig. (bilateral)                             0.055       0.462       0.040
                                n                    26          36          9           58          129
Passive – forum view forum      Pearson correlation  0.194       0.393*      0.686*      0.663**     0.181*
                                Sig. (bilateral)     0.343       0.018       0.041       0.000       0.040
                                n                    26          36          9           58          129
Passive – forum view forums     Pearson correlation  −0.102      0.080       0.453       0.328*
                                Sig. (bilateral)     0.619       0.643       0.221       0.012       0.257
                                n                    26          36          9           58          129
Passive – resource view         Pearson correlation  0.075       0.309       −0.106      0.628**     0.291**
                                Sig. (bilateral)     0.714       0.067       0.787       0.000       0.001
                                n                    26          36          9           58          129
Notes: a, cannot be calculated as at least one variable is a constant. *,** Correlations are
significant at 0.05 and 0.01 levels (bilateral), respectively

Finally, in Management Accounting in the Finance, Banking and Insurance Degree
at the UCM (MA-FBI-UCM) we can observe a lack of correlation between the final grade
and most of the activities, which can be explained by the characteristics of the group:
a small number of students and little need to use the VLE, because the close
teacher-student relationship makes it possible for each student to be followed
individually in the classroom. Therefore, the effect of the VLE is complemented by
face-to-face teaching, in line with López-Pérez et al. (2011, 2013).
Additionally, for each accounting group, the share of activities that have a
significant relation with the final grade has been
calculated (see Table V). It is interesting to highlight that, when all the groups are
analyzed together, 56.52 percent of the activities carried out through the VLE correlate
significantly with the final grade, exactly the same rate as in MA-BA-UCM. AA-ECO-UCM
and CA-BA-UCLM have rates of 21.74 percent and 30.43 percent, respectively, while
MA-FBI-UCM has only 8.70 percent of significantly correlated activities.
For this reason, the lecturers of every group were surveyed about the perceived
level of intensity in using the VLE activities with the students in their accounting
subjects, on a scale from 1 (reduced) to 4 (very high). The results are disclosed in
Table VI; notably, the teacher of MA-FBI-UCM reported the lowest intensity
in the usage of the VLE.
In addition to the information analyzed in this matrix, the researchers decided to run
a linear regression to confirm which of the 11 activities carried out on the VLE had a
significant impact on learning accounting. The analysis was conducted first for each of the
groups described in Table IV and then for all students together.
A further step was to conduct an Ordinary Least Squares (OLS) regression analysis
with the remaining variables in line with Agudo-Peregrina et al. (2014) and López-Pérez
et al. (2013), following a step-wise methodology where several variables were dismissed
because of multicollinearity. The dismissed variables were course view, forum update
post, course recent, forum user report, resource view all, forum subscribe, blog view,
forum unsubscribe, forum search, forum subscribe all and forum unsubscribe all.
Hence, the variables included in the OLS regression were “assignment view,” “forum
view forum,” “assignment upload,” “resource view,” “forum add post,” “forum view
forums,” “assignment view all.” After several attempts with variables we also dismissed the
following variables “assignment view,” “forum view forum.” The final variables included in
the OLS regression were “assignment upload,” “resource view” and “forum add post.”

Table V. Activities linked with academic performance

Group         Activities related to final grade (%)
AA-ECO-UCM    21.74
CA-BA-UCLM    30.43
MA-FBI-UCM    8.70
MA-BA-UCM     56.52
All groups    56.52

Table VI. Teachers use intensity of VLE by groups

Group         Usage of VLE
CA-BA-UCLM    4
MA-FBI-UCM    1
MA-BA-UCM     4
Notes: 1 = reduced; 2 = moderate; 3 = intense; 4 = very high

The constant of the OLS regression was dropped because it would have prevented a
predicted grade of zero, which, although undesirable, is a grade that it is
possible to obtain. The resulting model is shown in Table VII.
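A regression without a constant can be sketched by solving the least-squares normal equations directly. The code below is a generic plain-Python illustration under stated assumptions, not the authors' Gretl workflow: the function name and the toy data are hypothetical, and the robust standard errors reported in the paper are not reproduced.

```python
def fit_through_origin(rows, grades):
    """Least-squares coefficients for grade ~ sum_j b_j * x_j, no intercept.

    rows is a list of predictor tuples (one per student); grades is the
    list of observed final grades. Solves (X'X) b = X'y by Gauss-Jordan
    elimination with partial pivoting.
    """
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, grades)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical toy data generated from known coefficients (0.5, 1.0, 0.1).
toy_rows = [(1, 0, 2), (0, 1, 1), (2, 1, 0), (3, 2, 5)]
toy_grades = [0.5 * a + 1.0 * b + 0.1 * c for a, b, c in toy_rows]
coefficients = fit_through_origin(toy_rows, toy_grades)
```

Because the model has no constant, a student with zero counts for every activity is assigned a predicted grade of exactly zero, which is the property the text appeals to.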
We can observe that the activities that have a significant influence on the results in
the subject of Accounting are handing in exercises through the VLE and participation
in the forum, as well as consulting the resources provided by the lecturer. Participation in
forums and its positive effect on learning outcomes have also been demonstrated by
López et al. (2012). According to the R2 of the model, the independent variables explain
84.14 percent of the values of the dependent variable. There are many different
ways to calculate the R2 goodness of fit for logistic regression, and no consensus on
which one is best (Mittlbock and Schemper, 1996). Also, pseudo R2 values have to be
interpreted with great caution because they do not represent the proportion of variance
explained by the predictors, as the R2 value does in OLS regression; nevertheless, the
values obtained here (McFadden, Cox and Snell, and Nagelkerke) are acceptable.
Accordingly, teachers should increase activities in the VLE that include active assignment
uploads and active forum posts, and publish learning resources for students to consult, in
order to improve learning outcomes in accounting.
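As a worked illustration of how the reported model can be read, the sketch below applies the coefficients from Table VII to a hypothetical student and recomputes the R2 from the summary statistics in the same table. The function names and the student's activity counts are assumptions for illustration. The R2 check assumes the reported standard deviation uses an n − 1 denominator; under that assumption the uncentered definition, which is the usual one for a regression without a constant, reproduces the published value.

```python
# Coefficients reported in Table VII (model without a constant).
COEFFICIENTS = {
    "assignment_upload": 0.192065,
    "forum_add_post": 0.263127,
    "resource_view": 0.0102302,
}


def predicted_grade(counts):
    """Predicted final grade for a dict of activity counts."""
    return sum(COEFFICIENTS[name] * counts.get(name, 0) for name in COEFFICIENTS)


def uncentered_r2(sum_squared_resid, n, mean_y, sd_y):
    """R2 measured against the raw sum of squares of y (no mean removed).

    sum(y^2) is rebuilt from the mean and the n - 1 standard deviation:
    sum(y^2) = (n - 1) * sd^2 + n * mean^2.
    """
    total = (n - 1) * sd_y ** 2 + n * mean_y ** 2
    return 1.0 - sum_squared_resid / total


# Hypothetical student: 10 uploads, 5 forum posts, 50 resource views.
grade = predicted_grade(
    {"assignment_upload": 10, "forum_add_post": 5, "resource_view": 50}
)

# Summary statistics from Table VII (n = 86 usable observations).
r2 = uncentered_r2(315.8730, 86, 4.157326, 2.438738)
```

An uncentered R2 is not directly comparable with the R2 of a model that includes a constant, so it is best read alongside the individual coefficients rather than on its own.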

4. Conclusions and future work

Research on improving the quality of education with the support of e-learning through
DM is a new research area, driven by the recent continuous use of virtual teaching in
universities and by the complexity of observing and interpreting data, which
stimulates the use of DM techniques.
This research is the result of a quantitative EDM analysis aimed at the continuous
improvement of university teaching. The research group consists of several accounting
professors at the Complutense University of Madrid and the University of Castilla-La
Mancha, most of them very active in virtual learning. The students belong to various degree
courses in Economics, Business Administration and Finance, Banking and Insurance.
The activities take on a special role as inducers of learning outcomes and quality
indicators as established in the curriculum. Therefore, the analysis of their contribution to
the achievement of learning outcomes through the VLE is of particular interest.
In general, activities classified as “active” correlate positively with the student’s final
grade and, therefore, with students’ meaningful learning.

Table VII. Ordinary least squares regression

Variable                    Coefficient   SE           t-ratio   p-value
Active assignment upload    0.192065      0.0397343    4.8337    <0.00001***
Active forum add post       0.263127      0.0465055    5.6580    <0.00001***
Passive resource view       0.0102302     0.00271113   3.7734    0.00030***

Mean dependent var   4.157326     SD dependent var   2.438738
Sum squared resid    315.8730     SE of regression   1.950820
R2                   0.841421     Adjusted R2        0.837600
F(3, 83)             140.2859     P-value(F)         2.11e−32
Log-likelihood       −177.9714    Akaike criterion   361.9428
Schwarz criterion    369.3059     Hannan-Quinn       364.9061

Notes: Model: OLS, using observations 1-129 (n = 86); Missing or incomplete observations dropped:
43; Dependent variable: Final Grade; Heteroskedasticity-robust standard errors, variant HC1; Analysis
conducted by the use of GNU Regression, Econometrics and Time-series Library, Gretl 1.9.91 cvs
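As a reading aid, the summary statistics in Table VII can be cross-checked against one another with standard formulas. The sketch below is not part of the original analysis. It assumes that the model has three regressors and no constant term (an inference from F(3, 83) with n = 86, not something the table states) and that R2 is therefore the uncentered version.

```python
import math

# Values copied from Table VII: (coefficient, robust SE, reported t-ratio).
rows = {
    "Active assignment upload": (0.192065, 0.0397343, 4.8337),
    "Active forum add post": (0.263127, 0.0465055, 5.6580),
    "Passive resource view": (0.0102302, 0.00271113, 3.7734),
}
n, k = 86, 3                 # k = 3 assumes no constant term (inferred)
ssr = 315.8730               # sum of squared residuals
r2 = 0.841421
loglik = -177.9714
mean_y, sd_y = 4.157326, 2.438738

# t-ratio = coefficient / standard error
t_ratios = {name: coef / se for name, (coef, se, _) in rows.items()}

df_resid = n - k
se_regression = math.sqrt(ssr / df_resid)
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid

# Uncentered R2: total sum of squares measured from zero, not from the mean.
tss_uncentered = sd_y ** 2 * (n - 1) + n * mean_y ** 2
r2_uncentered = 1 - ssr / tss_uncentered

# Information criteria from the log-likelihood.
aic = -2 * loglik + 2 * k
bic = -2 * loglik + k * math.log(n)
hqc = -2 * loglik + 2 * k * math.log(math.log(n))

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.4f} (reported {rows[name][2]})")
print(f"SE of regression {se_regression:.6f}, adjusted R2 {adj_r2:.6f}")
print(f"uncentered R2 {r2_uncentered:.6f}")
print(f"AIC {aic:.4f}, BIC {bic:.4f}, HQC {hqc:.4f}")
```

Under these assumptions the recomputed figures agree with the reported ones to rounding. The reported F statistic is not recomputed here, because with robust standard errors it need not follow from R2 alone.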
There were differences between universities and subjects that will help the teachers
adjust their teaching guide, schedule, explanations in class and in the VLE, and assessments,
so that students make better use of, and learn, the basic theoretical framework of
accounting needed to understand the subject matter. In addition, lecturers gave different
importance to the activities carried out and submitted through the VLE, and still place more
value on face-to-face forums and discussions. The results presumably also differ because of
differences in the importance perceived by the students. The feedback received through EDM
helps lecturers to review, reflect on and adapt their teaching methodology and to discover
students’ learning behavior. In the future, lecturers will use EDM when starting the subjects,
to form better working groups and to improve the training of teams for student
participation in forums. Ultimately, EDM will help lecturers to identify the value-added
activities performed by accounting students in order to achieve better deep
learning outcomes. The regression analysis suggests to accounting teachers that if
they increase the activities in the VLE that involve active assignment uploads and active
forum posts, and publish learning resources for students to consult, they will possibly
improve students’ learning outcomes in accounting.
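The practical reading of the regression can be illustrated with a short sketch that applies the Table VII coefficients to a single student. It assumes that the predictors are counts of logged VLE events and that the model has no constant term (inferred from the table, not stated in it). The student profile is hypothetical, and the result describes an association in this sample rather than a guaranteed effect.

```python
# Coefficients copied from Table VII.
COEFFICIENTS = {
    "active_assignment_upload": 0.192065,
    "active_forum_add_post": 0.263127,
    "passive_resource_view": 0.0102302,
}


def predicted_final_grade(activity_counts):
    """Linear prediction without a constant term (an assumption)."""
    return sum(
        coef * activity_counts.get(name, 0)
        for name, coef in COEFFICIENTS.items()
    )


# Hypothetical student: 5 uploads, 4 forum posts, 120 resource views.
student = {
    "active_assignment_upload": 5,
    "active_forum_add_post": 4,
    "passive_resource_view": 120,
}
baseline = predicted_final_grade(student)

# The same student with one additional forum post.
one_more_post = dict(student, active_forum_add_post=student["active_forum_add_post"] + 1)
delta = predicted_final_grade(one_more_post) - baseline

print(f"predicted grade: {baseline:.3f}")
print(f"change for one extra forum post: {delta:.3f}")
```

In this fitted model one additional forum post carries a larger weight than one additional assignment upload, and both weigh far more than a single resource view.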
This study opens new possibilities for Higher Education to predict students’ success
rate and to encourage students to increase their work effort, in line with
Abdous et al. (2012) and López-Pérez et al. (2011). On the theoretical
level, this study advocates the possibilities of DM supported by the VLE for detecting
which interactive activities add value to students’ learning. In this context,
lecturers can decide to promote the types of activity that are associated with
high marks and to eliminate those that are associated with
low marks, in line with Romero et al. (2009). On the empirical level, the research
data and results showed that there are interrelations between the activities done in the VLE
and students’ final grades. This research also included a self-evaluation
by the lecturers of the intensity of their own VLE usage, which is very important for student
motivation and participation in the VLE and for the DM records used to observe the
relation with learning outcomes. Additionally, these findings constitute a guide for
decision makers in Higher Education Institutions, in line with Paechter et al. (2010),
and for lecturers to identify problems regarding student success. Finally, the OLS model
makes it possible to analyze and define which activities help improve students’
success rate. This research can be extended to internationally comparable
research on student data sets taken from different universities (Jyväskylä University,
Berlin School of Economics and Business, Coimbra Institute) using different VLE and
DM technologies. A comparative study of student final grades across different student
data sets could greatly expand the research potential and the use of DM
technology as an important tool for the development of European Higher Education
Area knowledge.
As a future trend, EDM tools must be designed for non-technical users. The
objectives of using this methodology in the educational context are several:
to improve active teaching methodology, in which teachers guide students to improve
their learning; to motivate students to use the VLE, improving their meaningful, deep
learning; and to assist decision making by institutional authorities aimed at improving
the quality of teaching.
Therefore, future research could examine the influence of VLEs in different
subjects and degrees, as well as the planned activities within the “verified reports”[1] of
the Spanish accreditation institution (ANECA). Another future research trend is to
compare the performance and usefulness of different DM techniques for classifying
students using a DM tool, since some algorithms improve their classification
performance when pre-processing tasks such as discretization and
data rebalancing are applied, but others do not. We have also indicated that a good classifier
model has to be both accurate and comprehensible for instructors. In future
experiments, we want to measure the comprehensibility of each classification model and
use data with more information about the students (i.e. profile and curriculum) and of
higher quality (complete data about students that have done all the course activities).
In this way, we could measure how the quantity and quality of the data can affect the
performance of the algorithms. Finally, we also want to test the use of the tool by
teachers in real pedagogical situations in order to prove its acceptability.
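The two pre-processing tasks mentioned above can be sketched as follows. The text does not say which discretization or rebalancing methods would be used, so equal-width binning and random oversampling are chosen here purely for illustration. The function names and the data are hypothetical.

```python
import random
from collections import Counter


def discretize_equal_width(values, n_bins=3):
    """Map each numeric value to a bin index 0..n_bins-1 of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the largest value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical forum-post counts and pass/fail labels for eight students.
forum_posts = [0, 1, 2, 3, 5, 8, 12, 20]
labels = ["fail", "fail", "pass", "pass", "pass", "pass", "pass", "pass"]

bins = discretize_equal_width(forum_posts, n_bins=3)
balanced_rows, balanced_labels = oversample_minority(bins, labels, seed=0)

print("bins:", bins)
print("class counts before:", dict(Counter(labels)))
print("class counts after:", dict(Counter(balanced_labels)))
```

Whether such steps help depends on the classifier, which is the comparison proposed above.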

Note
1. The regulatory guidelines document for Spanish degrees.

References
Abdous, M., He, W. and Yen, C.-J. (2012), “Using data-mining for predicting relationships between
online question theme and final grade”, Educational Technology & Society, Vol. 15 No. 3,
pp. 77-88.
Agudo-Peregrina, A., Iglesias-Pradas, S., Conde-González, M.A. and Hernández-García, A. (2014),
“Can we predict from log data in VLEs? Classification of interactions for learning analytics
and their relation with performance in VLE-supported F2F and online learning”,
Computers in Human Behavior, Vol. 31, February, pp. 542-550.
Alexander, S. and Golja, T. (2007), “Using students’ experiences to derive quality in an e-learning
system: an institution’s perspective”, Educational Technology & Society, Vol. 10 No. 2,
pp. 17-33.
Camacho, M., Urquía, E., Rivero, M.J. and Pascual, D. (n.d.), “Recursos multimedia para el
aprendizaje de la contabilidad financiera en los grados bilingües”, Revista Educación XX1
(in press).
Chaparro Peláez, J., Iglesias Pradas, S. and Pascual Miguel, F. (2010), “Uso del registro de
actividad de Moodle para un estudio del rendimiento académico de alumnos en entornos en
línea y presencial”, 4th International Conference on Industrial Engineering and Industrial
Management. XIV Congreso de Ingeniería de Organización, Donostia-San Sebastián, Spain,
September 8-10.
Coates, J., James, R. and Baldwin, G. (2005), “A critical examination of the effects of learning
management systems on university teaching and learning”, Tertiary Education and
Management, Vol. 11 No. 1, pp. 19-36.
Correa Gorospe, J.M. (2005), “La integración de plataformas de e‐learning en la docencia
universitaria: Enseñanza, aprendizaje e investigación con Moodle en la formación inicial
del profesorado”, Revista Latinoamericana de Tecnología Educativa, Vol. 4 No. 1, pp. 37-48,
available at: (accessed June 24, 2015).
Grob, H.L., Bensberg, F. and Dewanto, B.L. (2004), “Developing, deploying, using and evaluating
an open source learning management system”, Journal of Computing and Information
Technology, Vol. 12 No. 2, pp. 127-134, available at:
CIT/article/viewFile/1537/1241 (accessed June 24, 2015).
Hamalainen, W., Suhonen, J., Sutinen, E. and Toivonen, H. (2004), “Data mining in personalizing
distance education courses”, World Conference on Open Learning and Distance Education,
Hong Kong.
Heiner, C., Beck, J. and Mostow, J. (2004), “Lessons on using ITS data to answer educational
research questions”, Proceedings of the ITS2004 Workshop on Analyzing Student–Tutor
Interaction Logs to Improve Educational Outcomes, pp. 1-9.
Kim, W. (2008), “Using technologies to improve e-learning”, Journal of Object Technology, Vol. 7
No. 8, pp. 51-56.
Lara, J.A., Lizcano, D., Martínez, M.A., Pazo, J. and Riera, T. (2014), “A system for knowledge
discovery in e-learning environments within the European higher education area –
application to student data from Open University of Madrid, UDIMA”, Computers &
Education, Vol. 72, March, pp. 23-36.
Liaw, S., Huang, H. and Chen, G. (2007), “Surveying instructor and learner attitudes toward
e-learning”, Computers & Education, Vol. 49 No. 4, pp. 1066-1080.
López, M.I., Luna, J.M., Romero, C. and Ventura, S. (2012), “Classification via clustering for
predicting final marks based on student participation in forums”, Proceedings of the 5th
International Conference on Educational Data Mining, Chania, June 19-21.
López-Pérez, M.V., Pérez-López, M.C. and Rodríguez-Ariza, L. (2011), “Blended learning in higher
education: students’ perception and their relation to outcomes”, Computers & Education,
Vol. 56, pp. 818-826.
López-Pérez, M.V., Pérez-López, M.C., Rodríguez-Ariza, L. and Argente-Linares, E. (2013), “The
influence of the use of technology on student outcomes in a blended learning context”,
Educational Technology Research and Development, Vol. 61, pp. 625-638.
Martin-Blas, T. and Serrano-Fernández, A. (2008), “The role of new technologies in the learning
process: Moodle as a teaching tool in Physics”, Computers & Education, Vol. 52, pp. 35-44.
Mazza, R. and Milani, C. (2005), “Exploring usage analysis in learning systems: gaining insights
from visualizations”, workshop on Usage Analysis in Learning Systems at 12th
International Conference on Artificial Intelligence in Education, Amsterdam, July 18,
pp. 65-72.
Mittlbock, M. and Schemper, M. (1996), “Explained variation in logistic regression”, Statistics in
Medicine, Vol. 15, pp. 1987-1997.
Mondéjar, J., Mondéjar, J.A.Y. and Vargas, M. (2006), “Implantación de la metodología e-learning
en la docencia universitaria: una experiencia a través del proyecto Campus virtual
(Implementation of e-learning methodology in university teaching: an experience through
the virtual Campus project)”, Revista Latinoamericana de Tecnología Educativa, Vol. 5
No. 1, pp. 59-71.
Mostow, J. (2004), “Some useful design tactics for mining ITS data”, Proceedings of the ITS2004
Workshop on Analyzing Student-Tutor Interaction Logs to Improve Educational Outcomes,
Maceio, August.
Nilakant, K. and Mitrovic, A. (2005), “Application of data mining in constraint-based
intelligent tutoring systems”, Proceedings of the Artificial Intelligence in Education, AIED,
pp. 896-898.
Oliveros, L. (2006), “Identificación de competencias: una estrategia para la formación en el
Espacio Europeo de Educación Superior”, Revista Complutense de Educación, Vol. 17 No. 1,
pp. 101-118.
Paechter, M., Maier, B. and Macher, D. (2010), “Students’ expectations of and experiences in
e-learning: their relation to learning achievements and course satisfaction”, Computers &
Education, Vol. 54, pp. 222-229.
Peña-Ayala, A. (2014), “Educational data mining: a survey and a data mining-based analysis of
recent works”, Expert Systems with Applications, Vol. 41, pp. 1432-1462.
Romero, C. and Ventura, S. (2006), Data Mining in E-Learning, WIT Press, Southampton.
Romero, C. and Ventura, S. (2007), “Educational data mining: a survey from 1995-2005”, Expert
Systems with Applications, Vol. 33 No. 1, pp. 135-146.
Romero, C., Ventura, S. and Bra, P.D. (2004), “Knowledge discovery with genetic programming for
providing feedback to courseware author”, User Modeling and User-Adapted Interaction:
The Journal of Personalization Research, Vol. 14 No. 5, pp. 425-464.
Romero, C., Ventura, S. and García, E. (2008), “Data mining in course management systems:
Moodle case study and tutorial”, Computers & Education, Vol. 51 No. 1, pp. 368-384.
Romero, C., Espejo, P., Zafra, A., Romero, J.R. and Ventura, S. (2010), “Web usage mining for
predicting final marks of students that use Moodle courses”, Computer Applications in
Engineering Education, Vol. 21 No. 1, pp. 135-146. doi: 10.1002/cae.20456.
Romero, C., Gonzalez, P., Ventura, S., Del Jesus, M.J. and Herrera, F. (2009), “Evolutionary
algorithms for sub-group discovery in e-learning: a practical application using Moodle
data”, Expert Systems with Applications, Vol. 36, pp. 1632-1644.
Shen, R., Han, P., Yang, F., Yang, Q. and Huang, J. (2003), “Data mining and case-based reasoning
for distance learning”, Journal of Distance Education Technologies, Vol. 1 No. 3, pp. 46-58.
Sun, P., Tsai, R., Finger, G., Chen, Y. and Yeh, D. (2008), “What drives successful e-learning?
An empirical investigation of the critical factors influencing learner satisfaction”,
Computers & Education, Vol. 50, pp. 1183-1202.
Tang, L. and McCalla, G. (2005), “Smart recommendation for an evolving e-learning system”,
International Journal on E-Learning, Vol. 4 No. 1, pp. 105-129.
Urquía, E., Muñoz, C.I. and Cano, E. (2012), “Generating knowledge in management accounting
for the EHEA: using a simulation to learn about the balanced scorecard”, International
Journal of Critical Accounting, Vol. 4 No. 5, pp. 452-477.
Zorrilla, M.E., Menasalvas, E., Marin, D., Mora, E. and Segovia, J. (2005), “Web-usage
mining project for improving web-based learning sites”, Web Mining workshop

Further Reading
Evanschitzky, H., Iyer, G., Hesse, J. and Ahlert, D. (2004), “E-satisfaction: a re-examination”,
Journal of Retailing, Vol. 80 No. 3, pp. 239-247.

Corresponding author
Dr Julian Chamizo-Gonzalez can be contacted at: [email protected]
