Detection of Phishing Website

Title: Detection of Phishing Website
Project Guide: Dr Ankit Kumar Jain
Candidate: Raut Omprakash Jagannath (32213219)
Introduction:
• Phishing is a type of social engineering attack used to steal an
individual's personal details, such as passwords, bank account details,
and credit card details. It occurs when an attacker poses as a legitimate
provider. The victim is then fooled into clicking a link to a malicious
website, which can lead to the installation of malware, the freezing of
the system as part of a ransomware attack, or the disclosure of sensitive
information.
• Most phishing attacks are carried out through fake emails containing the
URL of a malicious website. Common types of phishing attack include:
1) Spear Phishing 2) Clone Phishing 3) Whaling.
Problem Statement:
• With the increase in online transactions, cybercrime has grown rapidly,
and attackers attempt to trap end users through various forms of attack
such as phishing, SQL injection, malware, man-in-the-middle, domain name
system tunnelling, ransomware, web trojans, and so on. Among all these
attacks, phishing is the most reported.
• Phishing attacks exploit human vulnerabilities, such as being deceived by
fake URLs, and most protection protocols fail to detect all phishing
attacks.
Motivation:
• At present, detecting phishing websites is difficult, so we are building
a system that helps users identify phishing websites without hesitation.
In this system, an admin can add the URL of a phishing or fake website; the
system then accesses and scans that website and, using an algorithm, adds
new suspicious keywords to its database.
• Many solutions are available, but their accuracy is low when new
phishing websites appear.
• In the third quarter of 2022, APWG observed 1,270,883 total phishing
attacks, a new record and the worst quarter for phishing that APWG has
ever observed.
[Figure: Most Targeted Industries, 3Q 2022. Categories: Social Media,
Logistics/Shipping, Payment, eCommerce/Retail, Telecom, Cryptocurrency,
Financial Institution, Webmail, Others]
[Figure: Ransomware Victim Industries, 3Q 2022. Categories: Manufacturing,
Business Services, Retail & Wholesale, Construction, Finance, Health Care,
Education, Legal Services, Government, Energy/Resources, Transportation,
Others]
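The Motivation section describes a database of suspicious keywords that grows as an admin reports phishing websites. The sketch below is one possible reading of that step, not the project's actual algorithm, which the text does not specify. It assumes the page text has already been fetched, and it uses a simple frequency threshold as a stand-in. The names `suspicious_keywords`, `COMMON_WORDS`, `update_keywords`, and `min_count` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical starting keyword store; the described system would use a database.
suspicious_keywords = {"login", "verify", "account"}

# Hypothetical stop list of everyday words that should never become keywords.
COMMON_WORDS = {"the", "and", "your", "for", "with", "this", "that", "you", "our"}


def tokenize(text):
    """Lower-case the text and return alphabetic tokens of three or more letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def update_keywords(page_text, keywords, min_count=2):
    """Add tokens that occur at least min_count times in a reported phishing page.

    The frequency threshold is an assumption standing in for the unspecified
    algorithm. Returns the set of newly added keywords.
    """
    counts = Counter(tokenize(page_text))
    new_words = {
        word
        for word, count in counts.items()
        if count >= min_count and word not in COMMON_WORDS and word not in keywords
    }
    keywords.update(new_words)
    return new_words


# Made-up text standing in for the content of a reported phishing page.
sample_page = (
    "Your account is suspended. Confirm your password now. "
    "Enter password and confirm billing details to restore the account."
)
added = update_keywords(sample_page, suspicious_keywords)
print(sorted(added))
```

A threshold this simple would also pick up harmless words on real pages, so a production system would likely need a larger stop list or a comparison against text from legitimate sites.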
Objective:
• The objective of this project is to provide a better solution for
detecting phishing websites. Overall, the project aims to improve the
security and privacy of users' personal information, such as passwords
and bank credentials.
• It also aims to protect users from websites that appear malicious and
could phish them, and to design a detection system based on the URL.
Related Work:
Algorithm / Technique                                     | Advantage                                                       | Disadvantage
----------------------------------------------------------+-----------------------------------------------------------------+-------------------------------------------------------------
Simple forward and linear classification technique [1][3] | Easy to use; used to classify fake or phishing websites.        | Generally fails when normal text is replaced by an image.
Random Forest model [1][3]                                | Can handle the overfitting problem.                             | Fails on phishing websites that use CAPTCHA.
Logistic Regression technique [1][3]                      | Effective detection.                                            | Limited by certain factors, such as repetition of features and incapability of features.
Support Vector Machine technique [1][3]                   | Better suited to many variables that are independent in nature. | Does not work well on large, noisy datasets.
Methodology: Deep Learning Method
• Deep learning is a machine learning technique in which many layers of
information-processing stages are exploited for pattern classification
and for feature or representation learning. In practice, deep learning is
implemented with neural networks.
• It has become popular recently because of three factors: first, a notable
increase in processing capabilities (e.g., video cards and graphics
processors); second, more affordable computer hardware; and third, recent
advances in deep learning research.
• Our work focuses on exploring surface-level features of URLs to train a
confidence-weighted learning algorithm. The idea is to restrict the source
of possible features to the character string of the URL, avoiding the
vulnerabilities that come with extracting host-based information.
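To make the last point concrete, the sketch below extracts features from the URL string alone and feeds them to an online linear learner that keeps a per-feature confidence value. The proposal does not say which confidence-weighted variant it will use; this example swaps in AROW (Adaptive Regularization of Weights) with a diagonal covariance, a related algorithm whose update has a simple closed form. The feature names, the `DiagonalAROW` class, the regularisation value `r`, and the toy URLs are all assumptions made for illustration.

```python
import re


def url_features(url):
    """Map a URL string to sparse surface-level features.

    Only the characters of the URL are used: no DNS, WHOIS, or page fetches.
    The feature names are illustrative assumptions.
    """
    feats = {
        "bias": 1.0,
        "length": len(url) / 100.0,                      # scaled URL length
        "dots": url.count(".") / 10.0,                   # scaled number of '.'
        "digits": sum(c.isdigit() for c in url) / 10.0,  # scaled digit count
        "has_at": 1.0 if "@" in url else 0.0,            # '@' inside the URL
    }
    # Bag of words over tokens split on non-alphanumeric characters.
    for token in re.split(r"[^a-z0-9]+", url.lower()):
        if token:
            feats["tok=" + token] = 1.0
    return feats


class DiagonalAROW:
    """AROW with a diagonal covariance: one weight and one variance per feature."""

    def __init__(self, r=1.0):
        self.r = r        # regularisation strength (assumed value)
        self.w = {}       # mean weight per feature, default 0.0
        self.sigma = {}   # variance per feature, default 1.0 (low confidence)

    def score(self, x):
        """Signed score: positive means phishing, negative means legitimate."""
        return sum(self.w.get(f, 0.0) * v for f, v in x.items())

    def update(self, x, y):
        """One online step for a feature dict x and a label y in {+1, -1}."""
        margin = y * self.score(x)
        if margin >= 1.0:
            return  # already classified with enough margin
        variance = sum(self.sigma.get(f, 1.0) * v * v for f, v in x.items())
        beta = 1.0 / (variance + self.r)
        alpha = (1.0 - margin) * beta
        for f, v in x.items():
            s = self.sigma.get(f, 1.0)
            self.w[f] = self.w.get(f, 0.0) + alpha * y * s * v
            self.sigma[f] = s - beta * s * s * v * v  # variance shrinks as evidence arrives


# Hypothetical toy data: +1 = phishing, -1 = legitimate.
training = [
    ("http://mybank.example.com/login", -1),
    ("http://shop.example.org/cart", -1),
    ("http://mybank.example.com.secure-update.example.net/login@verify", +1),
    ("http://192.0.2.10/mybank/confirm-account", +1),
]

model = DiagonalAROW()
for _ in range(5):
    for url, label in training:
        model.update(url_features(url), label)

for url, label in training:
    print(label, round(model.score(url_features(url)), 3), url)
```

Features that appear often end up with a small variance, so later updates change their weights less, while rarely seen tokens stay easy to adjust. Four URLs are far too few to say anything about accuracy; the example only shows how the pieces fit together.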
References:
• Miyamoto, D., Hazeyama, H., & Kadobayashi, Y. (2009). An evaluation of machine
learning-based methods for detection of phishing sites. In Advances in Neuro-Information
Processing: 15th International Conference, ICONIP 2008, Auckland, New Zealand,
November 25-28, 2008, Revised Selected Papers, Part I 15 (pp. 539-546). Springer Berlin
Heidelberg.
• Ali, Waleed. "Phishing website detection based on supervised machine learning with
• Odeh, A., Keshta, I., & Abdelfattah, E. (2021, January). Machine learning techniques for
detection of website phishing: A review for promises and challenges. In 2021 IEEE 11th
Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0813-0818).
IEEE.
• Marchal, Samuel, Kalle Saari, Nidhi Singh, and N. Asokan. "Know your phish: Novel
techniques for detecting phishing sites and their targets." In 2016 IEEE 36th
International Conference on Distributed Computing Systems (ICDCS), pp. 323-333.
IEEE, 2016.
• Blum, Aaron, et al. "Lexical feature based phishing URL detection using online
learning." Proceedings of the 3rd ACM Workshop on Artificial Intelligence and Security.
2010.
• https://apwg.org/trendsreports/
