Credit Card Fraud Detection Using SVM


Credit Card Fraud Detection using SVM
B13
Introduction
● Payments using credit cards have increased in recent years. Nowadays, credit card payments are both necessary and convenient.
● Credit card fraud can be of two types: offline fraud and online fraud.
● The fraud begins either with the theft of the credit card or with the compromise of data associated with the account, including the card account number or other information that would routinely and necessarily be available to a merchant during a legitimate transaction.
● The key concept in fraud detection is to analyze the spending behavior of the user.
● Support vector machine (SVM) is a method used in pattern recognition and classification. Here it acts as a classifier that assigns each transaction to one of two categories: fraudulent or non-fraudulent.
PROBLEM VS. SOLUTION

Problem: Due to the increase in fraudulent credit card transactions, there is a need to find an efficient fraud detection model.
Solution: SVM is used for classification, which can detect credit card fraud.
Why SVM?
Support Vector Machine (SVM) is an active research area and successfully solves classification problems in noisy and complex domains.

SVM has played a major role in machine learning due to its excellent generalization performance in a wide range of learning problems, such as handwritten digit recognition, classification of web pages and face detection.

It is a classifier that assigns patterns to one of two categories: fraudulent or non-fraudulent. It is well suited for binary classification.
Overview of Support Vector Machine
Support Vector Machine, or SVM, is one of the most popular supervised learning algorithms. It is used for classification as well as regression problems, but in machine learning it is primarily used for classification.

The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that a new data point can easily be put in the correct category in the future. This best decision boundary is called a hyperplane.

SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support Vector Machine.
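As a rough illustration of the hyperplane described above, the sketch below classifies a point by the sign of w·x + b and measures its distance to the boundary. The weight vector `w`, the offset `b` and the sample points are made-up values, and mapping the positive side to fraud (1) is an assumption, not something stated in the slides.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return 1 (assumed: fraud) on the positive side of the hyperplane, else 0."""
    return 1 if decision_value(w, b, x) >= 0 else 0

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm

# Hypothetical hyperplane in a 2-feature space: x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0

label_a = classify(w, b, [3.0, 2.0])   # lies on the positive side
label_b = classify(w, b, [0.5, 1.0])   # lies on the negative side
dist_a = distance_to_hyperplane(w, b, [3.0, 2.0])

print(label_a, label_b, dist_a)
```

Training an SVM amounts to choosing `w` and `b`; once they are fixed, prediction is just this sign check.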
The working of the SVM algorithm can be understood by using an example. Suppose we have a dataset that has two tags (green and blue), and the dataset has two features x1 and x2. We want a classifier that can classify the pair (x1, x2) of coordinates as either green or blue.
[Figure: the green and blue points plotted against x1 and x2]

Since this is a 2-d space, we can easily separate these two classes with a straight line. But there can be multiple lines that separate these classes.
[Figure: several candidate lines separating the two classes]
Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary is called a hyperplane. The SVM algorithm finds the points from both classes that lie closest to the boundary. These points are called support vectors. The distance between the support vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is called the optimal hyperplane.
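To make margin maximization concrete, here is a minimal sketch that trains a linear SVM on a toy green/blue dataset by sub-gradient descent on the regularized hinge loss. This is one simple way to approximate a maximum-margin line and is not necessarily the solver behind the results in these slides; the data points, the function name `train_linear_svm` and the hyperparameters `lam`, `lr` and `epochs` are all hypothetical.

```python
import math

def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=500):
    """Batch sub-gradient descent on lam/2 * ||w||^2 + mean hinge loss.

    labels must be +1 or -1. Returns the weight vector w and offset b.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:             # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data (hypothetical): green = +1, blue = -1
green = [(2.0, 2.0), (3.0, 4.0), (4.0, 3.0), (4.0, 4.0)]
blue = [(-2.0, -2.0), (-3.0, -4.0), (-4.0, -3.0), (-4.0, -4.0)]
points = green + blue
labels = [1] * len(green) + [-1] * len(blue)

w, b = train_linear_svm(points, labels)

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

norm = math.sqrt(sum(wi * wi for wi in w))
predictions = [1 if score(x) >= 0 else -1 for x in points]

# Margin: distance from the hyperplane to the closest training point(s)
margin = min(abs(score(x)) for x in points) / norm
# Support vectors: the training points sitting at that minimum distance
support_vectors = [x for x in points if abs(score(x)) / norm - margin < 1e-6]

print("w =", w, "b =", b)
print("margin =", margin)
print("support vectors =", support_vectors)
```

Only the points closest to the boundary determine the result here; moving the other points farther away would leave the learned line unchanged.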
Credit Card Fraud Detection Dataset

The dataset contains transactions made by credit cards in September 2013 by European cardholders: 284,807 transactions, of which 492 are frauds.
● The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
● It contains only numeric input variables, which are the result of a PCA transformation.
● Features V1, V2, … V28 are the principal components obtained with PCA. The only features that have not been transformed with PCA are 'Time' and 'Amount'.
● Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset.
● Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning.
● Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.
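Because frauds are so rare, plain accuracy can look high even for a model that never flags a fraud. The arithmetic below uses only the two counts given for this dataset; the variable names and the "always predict class 0" baseline are illustrative.

```python
TOTAL_TRANSACTIONS = 284_807   # from the dataset description
FRAUD_TRANSACTIONS = 492       # from the dataset description
LEGIT_TRANSACTIONS = TOTAL_TRANSACTIONS - FRAUD_TRANSACTIONS

fraud_rate = FRAUD_TRANSACTIONS / TOTAL_TRANSACTIONS

# Hypothetical baseline: label every transaction as non-fraudulent (class 0).
true_negatives = LEGIT_TRANSACTIONS    # every legitimate transaction is "correct"
false_negatives = FRAUD_TRANSACTIONS   # every fraud is missed
baseline_accuracy = true_negatives / TOTAL_TRANSACTIONS
baseline_recall = 0 / FRAUD_TRANSACTIONS  # fraction of frauds caught

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline recall:   {baseline_recall:.4%}")
```

With these counts the do-nothing baseline is more than 99.8% accurate while catching no fraud at all, which is why recall or precision on the fraud class is usually reported alongside accuracy for data this unbalanced.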
Features used for Classification
● Transaction Date
● Transaction Time
● Transaction Amount
● Frequency of card usage
● Transaction Location
● Average amount of transaction per month
Steps
Result
● Model trained on data processed with PCA: the accuracy rate obtained is 86.21%
● Model trained on data pre-processed with PCA and further subjected to dimension reduction: the accuracy rate obtained is 91.93%


Result (after re-balancing the weight of each class)
● Model trained on data processed with PCA: the accuracy rate obtained is 89.61%
● Model trained on data pre-processed with PCA and further subjected to dimension reduction: the accuracy rate obtained is 94.78%
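The slides do not say how the class weights were re-balanced. One common heuristic, shown here purely as an assumption, gives each class a weight inversely proportional to its frequency: total / (number of classes × class count). The helper name `balanced_class_weights` is hypothetical; the counts are the ones from the dataset description.

```python
def balanced_class_weights(class_counts):
    """Weight each class by total / (n_classes * count), so rare classes weigh more."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}

# Class 0 = non-fraud, class 1 = fraud (counts from the dataset description)
counts = {0: 284_807 - 492, 1: 492}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print(f"class {label}: weight {weight:.4f}")

# With these weights both classes contribute the same total weight to the loss.
total_weight_per_class = {label: weights[label] * counts[label] for label in counts}
```

In an SVM, such weights typically scale the misclassification penalty per class, so an error on a fraud costs far more than an error on a legitimate transaction. scikit-learn's `class_weight="balanced"` option uses this same formula, though whether these slides used it is not known.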


Thanks!
