Using Machine Learning Classifiers To Predict Stock Exchange Index
International Journal of Machine Learning and Computing, Vol. 7, No. 2, April 2017
doi: 10.18178/ijmlc.2017.7.2.614

Abstract—Predicting the stock exchange index is an attractive research topic in the field of machine learning. Numerous studies have been conducted using various techniques to predict stock market volume. This paper presents the first detailed study on data of the Karachi Stock Exchange (KSE) and the Saudi Stock Exchange (SSE) to predict the stock market volume of ten different companies. In this study, we have applied and compared salient machine learning algorithms to predict stock exchange volume. The performance of these algorithms has been compared using accuracy metrics on the dataset, collected over a period of six months by crawling the KSE and SSE websites.

Index Terms—Stock exchange prediction, machine learning, SVM, neural networks, Bayesian network, Ada-boost.

Manuscript received February 15, 2017; revised March 23, 2017.
Mustansar Ali Ghazanfar is with University of Engineering and Technology Taxila, Pakistan (e-mail: [email protected]).
Saad Ali Alahmari is with Department of Computer Science, Shaqra University, Saudi Arabia.
Yasmeen Fahad Aldhafiri is with Department of Business Administration, Jubail University College, Saudi Arabia.
Anam Mustaqeem, Muazzam Maqsood, and Muhammad Awais Azam are with University of Engineering and Technology Taxila, Pakistan.

I. INTRODUCTION

Forecasting of the stock market has been a vital topic in different fields of computational sciences because of its possible monetary profit. The stock market is a place where high capital is invested and companies trade their shares. Stock market forecasting poses the challenge of disproving the Efficient Market Hypothesis, which states that the market is efficient and cannot be predicted. Researchers have worked hard to show that financial markets are predictable. With the advancement and availability of technology, stock markets are now more accessible to investors. Various models have been proposed, both in industry and academia, for stock market prediction, ranging from machine learning, to data mining, to statistical models.

In this paper, we have predicted the stock exchange volume of the Karachi Stock Exchange (www.kse.com.pk) by crawling the real-time data of ten different companies (of different sectors) and applying salient machine learning classifiers such as SVM, KNN, Ada-boost, Naïve Bayes, Bayesian Networks, Multilayer Perceptron and RBF. The performance of these classifiers has been compared using different metrics that include mean absolute error, root mean square error and accuracy.

II. LITERATURE REVIEW

Stock rate is predicted by Dase R. K. and Pawar D. D. in [1] with the help of neural networks, which can extract useful information from a larger set of data. Halbert White reported some results of research using neural network techniques to model asset price movements [2]. A step-by-step model based on artificial neural networks for classification and prediction is proposed by Jing Tao Yao and Chew Lim Tan in [3]. Stock index (Taiwan) is predicted by Tiffany Hui-Kuang and Kun-Huang Huarng in [4] by using a novel fuzzy time series model. Stock market price (Nigeria) is forecasted by Akinwale et al. in [5] by implementing error back propagation and regression analysis. The predictive relationship of numerous economic and financial variables is evaluated by David Enke and Suraphan Thawornwong in [6].

Abdulsalam Sulaiman et al. in [7] extracted values of variables from a database by using the moving average (MA) method. These values are used to predict the future values of other variables through the use of time series data. Kuang Yu Huang and Chuen-Jiuan Jane in [8] proposed a hybrid model that uses an autoregressive exogenous (ARX) prediction model, grey system and rough set theory to predict the stock market. Md. Rafiul Hassan et al. in [9] predicted the behavior of the financial market by using a mixed model of Hidden Markov Model, Artificial Neural Network and Genetic Algorithms.

Yi-Fan Wang et al. in [10] improved accuracy in prediction of stock indexes by incorporating Markov chain concepts into fuzzy stochastic prediction. Hsien-Lun Wong et al. in [11] claimed that, for short-term periods, fuzzy time series performs better for prediction. They used fuzzy time series employing ARIMA and vector ARMA models for prediction.

Johan Bollen et al. in [12] used public sentiment to improve the accuracy of prediction algorithms by analyzing Twitter posts with tools like GPOMS and Opinion Finder. Shunrong Shen and Tongda Zhang in [13] proposed a new prediction algorithm that used the notion of temporal correlation among global markets and various important products to predict the trend of the next day's stock. SVM was used as a classifier in this study. Osman Hegazy et al. in [14], [15] proposed a model to predict stock market prices by using Particle Swarm Optimization (PSO) and Least Square Support Vector Machine.

III. CLASSIFICATION APPROACHES EMPLOYED FOR PREDICTING KSE AND SAUDI STOCK DATA

In this section, we briefly describe various machine-learning algorithms used for forecasting. For their detailed implementation (and parameter setup), refer to our previous work [16].
A. Support Vector Machine (SVM)

SVM finds the optimal separating hyper-plane between two classes by solving a linearly constrained quadratic optimization problem, whose solution is globally optimal. Researchers have claimed that SVM offers superior performance to other approaches [17]. Because stock market data vary over time and are non-linear in nature, it is assumed that the best performance will be achieved by using a non-linear kernel. We have applied a polynomial kernel and used the 1v1 (1 versus 1) approach for multiclass classification.

B. Naïve Bayes Classifier

The Naïve Bayes classifier is a statistical classifier that predicts class membership based on probabilities. Naïve Bayes classifiers make use of class conditional independence, which makes them computationally faster. Class conditional independence means that, within a given class, every attribute is independent of the other attributes. The Naïve Bayes classifier works as follows:

Let T represent a training set of samples. Suppose there are k classes, so the class labels are $C_1, C_2, \ldots, C_k$. Each record is represented by an n-dimensional vector, $X = \{X_1, X_2, \ldots, X_n\}$, which holds the n measured values of the n attributes $A_1, A_2, \ldots, A_n$, respectively. The classifier predicts the class of X based on the highest a-posteriori probability. Thus we find the class that maximizes $P(C_i \mid X)$. By Bayes' theorem, we have:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

a) Fit the machine learning algorithm $\phi_m(\cdot)$ using weights $w_i$ on the training data.
b) Compute $err_m = E_w[I(y \neq \phi_m(\cdot))]$ and $c_m = \log((1 - err_m)/err_m)$.
c) Update $w_i \leftarrow w_i \exp[c_m \cdot I(y_i \neq \phi_m(\cdot))]$, $i = 1, \ldots, n$, and renormalize so that $\sum_i w_i = 1$.
3) Output $\phi(\cdot) = \mathrm{sign}\left[\sum_{m=1}^{M} c_m \phi_m(\cdot)\right]$.

E. Multilayer Perceptron

The approach used to train the multilayer perceptron is explained below.

The open, high, low, current and change in volume are inputs to the network. The number of inputs to the network is 4. The output is the prediction of volume of each company. The architecture of the multilayer perceptron is shown in Fig. 1.

Fig. 1. Structure of multilayer perceptron.
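The polynomial kernel and the 1-versus-1 multiclass scheme mentioned in Section III-A can be illustrated with a small sketch. It is not the authors' implementation: the kernel parameters `degree` and `coef0`, the function names, and the hand-written threshold rules standing in for trained pairwise SVMs are all assumptions made for illustration. With k classes, the 1-versus-1 scheme needs k(k-1)/2 binary classifiers and predicts by majority vote.

```python
from collections import Counter
from itertools import combinations


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel K(x, y) = (x . y + coef0) ** degree (assumed parameters)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def one_vs_one_predict(sample, classes, pairwise_classifiers):
    """Majority vote over one binary classifier per pair of classes.

    pairwise_classifiers maps a (class_a, class_b) tuple to a callable that
    returns class_a or class_b for a sample (hypothetical interface).
    """
    votes = Counter()
    for pair in combinations(classes, 2):
        votes[pairwise_classifiers[pair](sample)] += 1
    return votes.most_common(1)[0][0]


# Toy usage: threshold rules stand in for trained pairwise SVMs.
classes = ["down", "flat", "up"]
rules = {
    ("down", "flat"): lambda s: "down" if s[0] < -0.5 else "flat",
    ("down", "up"): lambda s: "down" if s[0] < 0.0 else "up",
    ("flat", "up"): lambda s: "flat" if s[0] < 0.5 else "up",
}
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0]))  # (3 + 8 + 1) ** 2
print(one_vs_one_predict([1.0], classes, rules))
```

In a real setting each entry of `rules` would be a binary SVM trained only on the samples of its two classes.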
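The Naïve Bayes decision rule of Section III-B can be sketched for categorical attributes as follows. This is a minimal illustration under stated assumptions, not the authors' setup: probabilities are estimated by simple counting without smoothing, and the function names and toy records are hypothetical. Because P(X) is the same for every class, the sketch compares only the product of the prior and the per-attribute conditionals.

```python
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(class_label, attribute_index)][value] = count
    value_counts = defaultdict(Counter)
    for record, label in zip(records, labels):
        for j, value in enumerate(record):
            value_counts[(label, j)][value] += 1
    return class_counts, value_counts


def predict_naive_bayes(record, class_counts, value_counts):
    """Return the class maximizing P(C_i) * prod_j P(x_j | C_i)."""
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(C_i)
        for j, value in enumerate(record):
            # Conditional P(x_j | C_i), assuming class conditional independence.
            score *= value_counts[(label, j)][value] / count
        if score > best_score:
            best_class, best_score = label, score
    return best_class


# Toy usage with made-up categorical records (not data from the paper).
records = [("up", "high"), ("up", "low"), ("down", "low"), ("down", "low")]
labels = ["rise", "rise", "fall", "fall"]
model = train_naive_bayes(records, labels)
print(predict_naive_bayes(("up", "low"), *model))
```

Without smoothing, a value never seen with a class drives that class's score to zero; continuous inputs such as prices would need a density estimate (for example a Gaussian per attribute) instead of counts.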
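Steps a) to c) and the output rule in step 3) above match the usual discrete boosting loop, which can be sketched as below. Several parts are assumptions that the surviving text does not state: the uniform initial weights, labels coded as +1 and -1, the one-feature decision stump used as the base learner, and the clamping of the error to avoid a division by zero. All function names are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Base learner: predict `polarity` if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best[1], best[2]


def adaboost_fit(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n  # assumed initialisation: uniform weights
    ensemble = []
    for _ in range(rounds):
        # Step a: fit the base learner on the weighted training data.
        threshold, polarity = fit_stump(xs, ys, weights)
        miss = [stump_predict(x, threshold, polarity) != y for x, y in zip(xs, ys)]
        # Step b: weighted error (weights sum to 1) and coefficient c_m.
        err = sum(w for w, m in zip(weights, miss) if m)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard, not part of the source
        c = math.log((1 - err) / err)
        ensemble.append((c, threshold, polarity))
        # Step c: up-weight misclassified samples, then renormalize.
        weights = [w * math.exp(c) if m else w for w, m in zip(weights, miss)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(x, ensemble):
    """Step 3: sign of the weighted vote of the base learners."""
    score = sum(c * stump_predict(x, t, p) for c, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate (made up for illustration).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost_fit(xs, ys, rounds=3)
print([adaboost_predict(x, ensemble) for x in xs])
```

On this toy set one stump misclassifies three points, while the weighted vote of three stumps reproduces all ten labels, which is the effect the reweighting in step c) is meant to achieve.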
Fig. 3. Structure of RBF.

IV. EXPERIMENTAL SETUP

We conducted experiments on historical dataset, collected over the period of 6 months (Apr 2013 to Sep 2013) for ten

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |d_i - z_i| \tag{11}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (d_i - z_i)^2} \tag{12}$$

$$\mathrm{Accuracy} = \frac{N_c}{N} \tag{13}$$

where $N$ denotes the total number of samples forecasted, $d_i$ denotes the actual value of a sample, $z_i$ denotes the forecasting value of a sample, and $N_c$ denotes the total number of correctly classified samples.
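Equations (11) to (13) translate directly into code. The sketch below is a plain-Python illustration with hypothetical function names and made-up numbers; it assumes the actual and forecast sequences have the same non-zero length.

```python
import math


def mean_absolute_error(actual, forecast):
    """Eq. (11): MAE = (1/N) * sum(|d_i - z_i|)."""
    n = len(actual)
    return sum(abs(d - z) for d, z in zip(actual, forecast)) / n


def root_mean_square_error(actual, forecast):
    """Eq. (12): RMSE = sqrt((1/N) * sum((d_i - z_i)^2))."""
    n = len(actual)
    return math.sqrt(sum((d - z) ** 2 for d, z in zip(actual, forecast)) / n)


def accuracy(actual_labels, predicted_labels):
    """Eq. (13): Accuracy = N_c / N, with N_c the number of correct labels."""
    n = len(actual_labels)
    n_correct = sum(1 for d, z in zip(actual_labels, predicted_labels) if d == z)
    return n_correct / n


# Toy values (not results from the paper).
actual = [1, 2, 3, 4]
forecast = [1, 2, 3, 8]
print(mean_absolute_error(actual, forecast))     # 1.0
print(root_mean_square_error(actual, forecast))  # 2.0
print(accuracy(actual, forecast))                # 0.75
```

RMSE penalizes the single large miss more heavily than MAE does, which is why the two values differ on the same data.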
[Figure: Comparison of Accuracies]
APPENDIX A:
TABLE A1: A COMPARISON OF THE BEST THREE PERFORMING CLASSIFIERS IN TERMS OF ERROR AND ACCURACY FOR DIFFERENT SECTORS
As future work, we will focus on using social media analysis, for instance, using Tweets' sentiments for specific stocks in addition to historical data for stock trend prediction. We reckon the resultant hybrid framework of these two approaches will further improve the results. A cross-platform mobile application of this work is also in progress. We can also use more Saudi stock data to develop a model that can run on both Pakistani and Saudi stock datasets.

REFERENCES

[1] R. K. Dase and D. D. Pawar, "Application of artificial neural network for stock market predictions: A review of literature," International Journal of Machine Intelligence, vol. 2, issue 2, pp. 14-17, 2010.
[2] H. White, "Economic prediction using neural networks: The case of IBM daily stock returns," Department of Economics, University of California, San Diego.
[3] J. T. Yao and C. L. Tan, Guidelines for Financial Prediction with Artificial Neural Networks.
[4] H.-K. Y. Tiffany and K.-H. Huarng, "A neural network-based fuzzy time series model to improve forecasting," Elsevier, pp. 3366-3372, 2010.
[5] A. T. Akinwale, O. T. Arogundade, and A. F. Adekoya, "Translated Nigeria stock market price using artificial neural network for effective prediction," Journal of Theoretical and Applied Information Technology, 2009.
[6] D. Enke and S. Thawornwong, "The use of data mining and neural networks for forecasting stock market returns," 2005.
[7] A. S. Olaniyi et al., "Stock trend prediction using regression analysis – A data mining approach," AJSS Journal, 2010.
[8] K. Y. Huang and C.-J. Jane, "A hybrid model stock market forecasting and portfolio selection based on ARX, grey system and RS theories," Expert Systems with Applications, pp. 5387-5392, 2009.
[9] M. R. Hassan, B. Nath, and M. Kirley, "A fusion model of HMM, ANN and GA for stock market forecasting," Expert Systems with Applications, pp. 171-180, 2007.
[10] Y.-F. Wang, S. M. Cheng, and M.-H. Hsu, "Incorporating the Markov chain concepts into fuzzy stochastic prediction of stock indexes," Applied Soft Computing, pp. 613-617, 2010.
[11] H.-L. Wong, Y.-H. Tu, and C.-C. Wang, "Application of fuzzy time series models for forecasting the amount of Taiwan export," Expert Systems with Applications, pp. 1456-1470, 2010.
[12] B. Johan, H. Mao, and X. Jun Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011.
[13] S. R. Shen, H. M. Jiang, and T. D. Zhang, Stock Market Forecasting Using Machine Learning Algorithms, 2012.
[14] H. Osman, O. S. Soliman, and M. A. Salam, "A machine learning model for stock market prediction," International Journal of Computer Science and Telecommunications, vol. 4, issue 12, December 2013.
[15] W. E. N. Fenghua et al., "Stock price prediction based on SSA and SVM," Procedia Computer Science, vol. 31, pp. 625-631, 2014.
[16] G. M. Ali and P.-B. Adam, "The advantage of careful imputation sources in sparse data-environment of recommender systems: generating improved SVD-based recommendations," Informatica, vol. 37, no. 1, pp. 61-92, 2013.
[17] S. Ying, Z. Fengting, and Z. Tao, "China's stock index futures regression prediction research based on SVM," China Journal of Management Science, vol. 3, pp. 35-39, 2013.
[18] G. M. Ali and P.-B. Adam, "Exploiting context in kernel-mapping recommender system algorithms," in Proc. Sixth International Conference on Machine Vision (ICMV 13), Nov. 2013, Italy.
[19] G. M. Ali, P.-B. Adam, and S. Sandor, "Kernel mapping recommender systems," Information Sciences, vol. 208, pp. 81-104, 2012.

Mustansar Ali Ghazanfar holds a BSc in software engineering (Gold Medalist) from UET-Taxila, Pakistan; an MSc in software engineering and a PhD in machine learning from University of Southampton UK. His areas of research include recommender systems, prediction, stock market, and socio-economical and healthcare modelling.

Saad Ali Alahmari holds a PhD in artificial intelligence and semantic web from University of Southampton UK. His areas of research include recommender systems, Web services, Semantic Web, Big data, and data mining.

Yasmeen Fahad Aldhafiri holds an MSc from University of Illinois at Urbana-Champaign. Her areas of research include classification and prediction, stock exchange prediction and data mining.

Anam Mustaqeem is a PhD scholar in the Department of Software Engineering at UET-Taxila. Her areas of interest are machine learning, medical imaging, software quality assurance, wireless networks and ad hoc networks.

Muhammad Awais Azam holds a BSc in computer engineering (Gold Medalist) from UET-Taxila, Pakistan; an MSc in computer engineering and a PhD in machine learning from University of Queen Mary UK. His areas of research include recommender systems, ubiquitous computing, and Internet of things.

Muazzam Maqsood is a PhD scholar in the Software Engineering Department, UET Taxila. His areas of research include speech classification and prediction.