Machine Learning S
The use of credit cards on e-commerce and banking sites has increased rapidly in recent times.
Modernization has both positive and negative effects: credit cards have made online transactions
simple, but they have also led to an increase in the number of fraudulent transactions.
E-commerce sites and banks are therefore advised to run an automatic fraud detection system,
because credit card fraud can result in huge financial losses. Among the solutions considered for
credit card fraud, machine learning techniques are promising. The proposed system applies a
random forest to the problem, with the aim of attaining higher accuracy than the algorithms used
so far. In a standard random forest all base classifiers carry the same weight, even though some
may deserve relatively high weights and others relatively low ones, because the randomization of
bootstrap sampling and attribute selection cannot guarantee that all of them are equally stable
in decision making.
OVERVIEW
In everyday life, many transactions are made through credit card payments and through cardless
services such as Google Pay, PhonePe, Samsung Pay, and PayPal. Fraud is an ongoing concern and
leads to a great loss of money every year. If fraud continues at this rate, it is said that by
the year 2020 it will reach double digits. Nowadays the card does not need to be physically
present to complete a transaction, which is leading to more and more fraudulent transactions [1].
Fraud has a serious impact on the economy, so fraud detection is fundamental and vital.
Financial institutions have to employ various fraud detection techniques to tackle this problem
[2, 3]. Given time, however, fraudsters find ways to overcome the techniques these institutions
have established.
Despite all the preventive measures taken by financial institutions, the strengthening of the
law, and the government's best efforts to eradicate fraud, fraud continues to rise and remains a
major concern in society [4, 5]. Credit cards are widely used in the growth of e-commerce and
mobile applications, primarily for online transactions. Credit cards make online transactions and
online payments easier and more convenient [6, 7]. Fraudulent transactions have a great influence
on enterprises [8]. Machine learning techniques are widely used and have become very important in
many areas; spam classifiers, for example, protect our e-mail accounts. Fraud detection systems
learn from extracted features and help to control fraud.
EXISTING SYSTEM
In the existing system, a case study on credit card fraud detection applied data normalization
before cluster analysis. Its results, obtained with cluster analysis and artificial neural
networks, showed that clustering attributes can minimize the number of neuronal inputs, and that
promising results can be obtained when normalized data is used to train a multilayer perceptron
(MLP). This research was based on unsupervised learning, and its significance was to find new
methods for fraud detection and to increase the accuracy of results. The data set for this paper
is based on real-life transactional data from a large European company, and personal details in
the data are kept confidential. The accuracy of the algorithm is around 50%. The significance of
this paper was to find an algorithm that reduces the cost measure. The algorithm found was Bayes
minimum risk, and the cost measure was reportedly reduced by 23%.
DISADVANTAGES
1. That paper proposes a new comparison measure that reasonably represents the gains and losses
due to fraud detection.
2. It presents a cost-sensitive method, based on Bayes minimum risk, that uses the proposed cost
measure.
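As a rough illustration of the cost-sensitive idea in item 2, the sketch below applies a Bayes minimum risk rule: a transaction is flagged when the expected cost of flagging it is no higher than the expected cost of letting it pass. The cost matrix used here (a fixed review cost `admin_cost` for every flagged transaction, the full transaction amount for every missed fraud, and zero for a legitimate transaction that passes) is an assumption made for illustration and is not the cost measure of the cited paper; the function name and default values are likewise hypothetical.

```python
def bayes_min_risk(p_fraud, amount, admin_cost=10.0):
    """Return 1 (flag as fraud) if flagging has the lower expected cost, else 0.

    Assumed cost matrix (illustrative, not taken from the cited paper):
      flag a transaction (fraud or not) -> admin_cost for the manual review
      miss a fraudulent transaction     -> the full transaction amount
      pass a legitimate transaction     -> 0
    """
    # Expected cost when the transaction is flagged.
    risk_if_flagged = admin_cost * p_fraud + admin_cost * (1.0 - p_fraud)
    # Expected cost when the transaction is allowed through.
    risk_if_passed = amount * p_fraud + 0.0 * (1.0 - p_fraud)
    return 1 if risk_if_flagged <= risk_if_passed else 0


# The same estimated fraud probability can lead to different decisions,
# because the potential loss depends on the transaction amount.
large = bayes_min_risk(p_fraud=0.05, amount=1000.0)
small = bayes_min_risk(p_fraud=0.05, amount=50.0)
print(large, small)
```

Under these assumed costs, the decision depends on the amount at stake as well as on the estimated fraud probability, which is what distinguishes a cost-sensitive rule from a fixed probability threshold.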
PROPOSED SCHEME
In the proposed system, we apply the random forest algorithm to classify the credit card dataset.
Random forest is an algorithm for classification and regression; in short, it is a collection of
decision tree classifiers. Random forest has an advantage over a single decision tree because it
corrects the tree's habit of overfitting its training set. To train each individual tree, a
subset of the training set is sampled randomly and a decision tree is built on it; each node then
splits on a feature selected from a random subset of the full feature set. Because each tree is
trained independently of the others, training is extremely fast even for large data sets with
many features and data instances. The random forest algorithm has been found to provide a good
estimate of the generalization error and to be resistant to overfitting.
ADVANTAGES OF PROPOSED SYSTEM
Random forest ranks the importance of variables in a regression or classification problem in a
natural way.
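The mechanism behind this ranking can be sketched with a toy forest written in plain Python. This is a simplified illustration under stated assumptions, not the system's implementation: each "tree" is only a depth-1 split (a stump), the data is synthetic with three features of which only feature 0 determines the label, and all function names are hypothetical. Each stump is trained on a bootstrap sample and may only split on a random subset of features; importance is the normalized total impurity decrease credited to each feature, and the rows left out of each bootstrap sample give an out-of-bag accuracy estimate.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels):
    """Most common 0/1 label (ties go to 0)."""
    return 1 if 2 * sum(labels) > len(labels) else 0


def fit_stump(rows, labels, candidate_features):
    """Depth-1 tree: best threshold split among the candidate features."""
    fallback = majority(labels)
    stump = {"feature": None, "gain": 0.0, "left": fallback, "right": fallback}
    parent = gini(labels)
    for f in candidate_features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if parent - child > stump["gain"]:
                stump = {"feature": f, "threshold": t, "gain": parent - child,
                         "left": majority(left), "right": majority(right)}
    return stump


def predict_stump(stump, row):
    if stump["feature"] is None:
        return stump["left"]
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


def fit_forest(rows, labels, n_trees=25, n_candidates=2, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        feats = rng.sample(range(n_features), n_candidates)   # random feature subset
        stump = fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        forest.append((stump, set(idx)))                      # remember in-bag rows
    return forest


def oob_accuracy(forest, rows, labels):
    """Accuracy where each row is voted on only by trees that never saw it."""
    correct = scored = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = [predict_stump(s, row) for s, used in forest if i not in used]
        if votes:
            scored += 1
            correct += majority(votes) == y
    return correct / scored


def feature_importance(forest, n_features):
    """Total impurity decrease per feature, normalized to sum to 1."""
    totals = [0.0] * n_features
    for stump, _ in forest:
        if stump["feature"] is not None:
            totals[stump["feature"]] += stump["gain"]
    total = sum(totals)
    return [t / total for t in totals] if total else totals


# Synthetic data (an assumption): only feature 0 determines the label.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
labels = [1 if r[0] > 0.8 else 0 for r in rows]

forest = fit_forest(rows, labels)
importance = feature_importance(forest, 3)
print("importance:", [round(v, 3) for v in importance])
print("out-of-bag accuracy:", round(oob_accuracy(forest, rows, labels), 3))
```

In practice a library implementation with full-depth trees would be used instead; the point of the sketch is only that the importance ranking and the generalization estimate both fall out of how the forest is built, without a separate validation procedure.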
The 'amount' feature is the transaction amount. Feature 'class' is the target class for the binary
classification and it takes value 1 for positive case (fraud) and 0 for negative case (not fraud