PROJECT REPORT ON

MOVIE RECOMMENDATION USING MACHINE LEARNING


Developed By
Rohit Narendra Chaudhari (07)
Ali Ambar Shamim Sheikh (34)

2023-2024
CERTIFICATE
Exam Seat No:

This is to certify that Mr. Rohit Narendra Chaudhari (07) and
Mr. Ali Ambar Shamim Sheikh (34) have successfully completed the
Project Work entitled “Movie Recommendation Using Machine
Learning” for S.Y. MSc. (Data Science) Sem-III of Savitribai Phule
Pune University for the Academic Year 2023-2024.

Project Guide Head of Department

Internal Examiner External Examiner

DECLARATION

We hereby declare that the project entitled “Movie
Recommendation Using Machine Learning”, being submitted
as a Major Project of the 3rd semester in MSc. Data Science to
Savitribai Phule Pune University, is an authentic record of our
genuine work done under the guidance of Assistant Prof.
Yogesh Ingale, Department of Computer Science of Dr. D. Y.
Patil Arts, Commerce and Science College, Pimpri (Pune).

Date: Name: Rohit Narendra Chaudhari.


Place: Pimpri Ali Ambar Shamim Sheikh.

ACKNOWLEDGEMENT

Many people contributed, directly or indirectly, to the
completion of this project. We mention here a few of them,
who personally or professionally encouraged and assisted us
throughout the project and made it a very pleasant endeavor.

We are extremely thankful to our project guide Mrs. Tejal
Pegwar for her valuable suggestions while we developed the
project. We are thankful to our Head of Department Dr. Ranjit Patil
for his constant motivation and encouragement.

It was a fantastic and instructive experience for
both of us to work together on the project topic given to us.
We learned that team spirit makes any difficult task
easier and more joyful.

We would also like to thank all the Teaching and
Non-Teaching staff members of the Computer Science
Department who helped us with this project; without
their help it would have remained a dream for us.

Table of Contents

1. Introduction
2. System Requirements
3. Algorithms / Methodology Used
4. Output and Result Analysis / Dataset Analysis
5. Conclusion and Recommendations
6. Future Enhancement / Further Steps
References / Bibliography

ABSTRACT

A recommendation system is a system that provides
suggestions to users for certain resources like books,
movies, songs, etc., based on some data set. Movie
recommendation systems usually predict what movies
a user will like based on the attributes present in
previously liked movies. Such recommendation
systems are beneficial for organizations that collect
data from large numbers of customers and wish to
provide the best suggestions possible. Many factors
can be considered while designing a movie
recommendation system, such as the genre of the movie,
its cast, its director, and its crew. The system can
recommend movies based on one attribute or a
combination of two or more. In this paper, the
recommendation system is built on the movie id, title,
overview, genres, keywords, cast, and crew to suggest
movies that the user might prefer to watch. The approach
adopted is content-based filtering using cosine
similarity. The dataset used for the system is the
MovieLens dataset.

1. Introduction

In today’s world, we use many platforms for entertainment.
YouTube was among the earliest, and many others have since
emerged, such as Aha, Hotstar, Netflix, Amazon Prime Video,
Zee5, Sony Liv, and many more. At first, we find a video or
movie that interests us by searching for it. This is where the
recommendation system comes in. The system analyzes the video
or movie that we have watched; the analysis may be based on the
film's genre, cast, director, music director, crew, etc. Based on
this analysis, we receive recommendations for what to watch next.

---Movie recommendation with machine learning.


Movie recommender systems are intelligent
algorithms that suggest movies for users to watch based on
their previous viewing behavior and preferences.

These systems analyze data such as users' ratings,
reviews, and viewing histories to generate personalized
recommendations.

Movie recommender systems have revolutionized the
way people discover and consume movies, enabling users to
navigate vast catalogs of films more efficiently.
-Background /problem statement

This recommendation system recommends different movies to
users. Because it is based on content-based filtering, it gives
more specific results than systems based on the collaborative
approach. Content-based recommendation systems are, however,
limited to the individual user: they do not recommend items
outside what the user has already shown interest in. They work
on an individual user's ratings, which limits the opportunity
to explore more.

Our system is based on a content-based filtering
strategy for movie recommendation, which uses the
data provided about the items (movies). This data plays a crucial
role here and is drawn from only one user. The ML algorithm
used for this strategy recommends movies that are
similar to the user’s past preferences. The similarity in
content-based filtering is therefore derived from data
about the past film selections and likes of only one user.

1.1 : - Scope and Purpose / Working of the proposed project

Scope: -
A movie recommendation system’s scope includes
implementing diverse recommendation algorithms, data
collection, and user profiling. It encompasses user
interface design, privacy measures, scalability
considerations, and feedback mechanisms. The system
should provide accurate, diverse, and engaging movie
suggestions while aligning with business objectives,
ensuring continuous improvement, and adhering to
security and privacy standards.

Purpose: -
The purpose of a movie recommendation system
is to enhance the user experience by offering personalized
movie suggestions that match individual preferences,
fostering increased user engagement, content
consumption, and platform loyalty. It drives revenue
growth through content discovery, optimizes resource
allocation, and provides valuable data-driven insights.

1.2 : - Advantages of project

1. Personalized Content Discovery: - Movie recommendation
systems analyze user preferences, past viewing habits, and
ratings to provide personalized movie suggestions. This helps
users discover new movies they might enjoy, leading to a more
satisfying viewing experience.

2. Improved User Satisfaction: - When users consistently find
movies they like, they are more likely to have a positive
experience with a streaming service or platform. This can
enhance user satisfaction and loyalty.

3. Enhanced Content Monetization: - Recommender systems
can increase the consumption of content, leading to more paid
subscriptions, rentals, or purchases. This can boost revenue for
streaming platforms and movie distributors.

4. Competitive Advantage: - Streaming platforms and movie
providers that implement effective recommendation systems
gain a competitive advantage. Users are more likely to choose
platforms that provide personalized recommendations over
those that do not.

5. Increased Customer Satisfaction: - When users consistently
find the content they enjoy through recommendations, they are
more likely to be satisfied with the platform's service. Higher
customer satisfaction can lead to increased loyalty and
word-of-mouth recommendations.
1.3 : Limitations /Drawbacks of project

1. Cold Start Problem: - One of the significant challenges is the
"cold start" problem, which occurs when a recommendation
system has limited data about a new user or movie. It is difficult
to provide accurate recommendations without sufficient user
interactions or movie ratings.

2. Exploration: - Some recommendation systems focus on
maximizing user engagement by providing highly personalized
recommendations, but this can limit opportunities for users to
explore new genres or styles of movies.

3. Sparsity of Data: - Recommendation systems rely on user
interactions (ratings, views, etc.) with movies to make
recommendations. However, user interactions are typically
sparse, meaning many users have not rated or interacted with a
significant portion of the available movies. This can make it
challenging to provide accurate recommendations for all users.

2. System Requirements:

-Hardware requirements

Operating system: Windows 7, 8, 10, or 11
Processor: dual core, 2.4 GHz (Intel i3 to i7 series)
RAM: 4 GB or higher
Storage: 100 GB

-Software requirements

1. Python
2. Jupyter Notebook (Chrome)
3. PyCharm

3. Algorithms /Methodology Used:

• Content-based filtering.

Content-based filtering uses the attributes and metadata of
a movie to recommend other movies that share similar
properties. For instance, the genre, director, actors, and plot
of a movie in the dataset are analyzed in order to suggest
movies of the same genre or with similar actors or themes.

The primary advantage of content-based filtering is that it
can produce reliable recommendations even in the absence of
user data. However, the quality of the recommendations
suffers if a movie's metadata is incorrectly labelled,
misleading, or limited in scope.

• NLTK: -

NLTK (the Natural Language Toolkit) contains text-processing
libraries for tokenization, parsing, classification, stemming,
tagging, and semantic reasoning. It also includes graphical
demonstrations and sample data sets, and is accompanied by a
cookbook and a book that explain the principles behind the
language processing tasks that NLTK supports.

• Porter Stemmer: -

A stemmer reduces words to their root form, so that variants
such as "loved" and "loving" are treated as the same token.
Stemmers vary in their aggressiveness. The Porter stemmer is
one of the more aggressive stemmers for English, and
over-stemming can hurt more than it helps. A lighter alternative
is to use a lemmatizer or a lighter algorithmic stemmer.

• Count Vectorizer: -

Movie attributes, such as genre, director, actors, and
keywords, can be transformed into numerical representations.
A count vectorizer does this by counting how often each word of
a fixed vocabulary occurs in a movie's text, giving one vector
per movie. These vectors capture the relationships between
movies based on their attributes. Similarity measures like cosine
similarity or Euclidean distance can then be used to recommend
movies with similar attribute profiles.
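
The counting idea can be sketched by hand with only the Python standard library. This is a minimal illustration and not the project's actual code: it stands in for a library vectorizer, and the movie tags and the helper names `build_vocabulary` and `count_vector` are hypothetical.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map every distinct word across all documents to a column index."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}


def count_vector(document, vocabulary):
    """Return the word counts for one document, ordered by the vocabulary."""
    counts = Counter(document.lower().split())
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical tags for three movies (genre, director, and keyword tokens).
tags = [
    "action scifi jamescameron space",
    "action adventure space alien",
    "romance drama love",
]

vocabulary = build_vocabulary(tags)
vectors = [count_vector(tag, vocabulary) for tag in tags]

for tag, vector in zip(tags, vectors):
    print(vector, "<-", tag)
```

Each movie becomes one row of counts over the same vocabulary, so any two movies can be compared position by position. The first two movies share the words "action" and "space", while the third shares none with either.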

• Cosine Similarity: -
To recommend movies to a user, the system calculates
the cosine similarity between the user's profile vector and the
vectors representing the movies in the database.
The cosine similarity between two vectors, A and B, is
calculated using the following formula:

Cosine Similarity (A, B) = (A • B) / (||A|| * ||B||)
--- A • B represents the dot product of vectors A and B.
--- ||A|| and ||B|| represent the magnitudes (lengths)
of vectors A and B.

After calculating the cosine similarity between the user profile
and each movie vector, the system ranks the movies in
descending order of similarity. Movies with higher cosine
similarity scores are considered more similar to the user's
preferences and are recommended accordingly.
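
The formula and the ranking step can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions rather than the project's implementation: the function names, the four-feature vectors, and the movie titles are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(profile, movie_vectors):
    """Return (title, score) pairs sorted from most to least similar."""
    scores = [
        (title, cosine_similarity(profile, vector))
        for title, vector in movie_vectors.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical count vectors over the features [action, space, romance, drama].
movie_vectors = {
    "Movie A": [1, 1, 0, 0],
    "Movie B": [1, 0, 0, 1],
    "Movie C": [0, 0, 1, 1],
}
profile = [1, 1, 0, 0]  # the user liked an action movie set in space

ranking = rank_by_similarity(profile, movie_vectors)
for title, score in ranking:
    print(f"{title}: {score:.2f}")
```

With these vectors, Movie A matches the profile exactly (score 1), Movie B shares one of two features (A • B = 1, ||A|| * ||B|| = 2, score 0.5), and Movie C shares none (score 0), so the movies are ranked A, B, C.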

• Streamlit: -

Data science and analytics are growing in popularity day by
day. One of the most important steps in the data science
pipeline is model deployment. Python offers many options for
deploying a model; popular frameworks include Flask and
Django. The drawback of these frameworks is that they require
some knowledge of HTML, CSS, and JavaScript. With these
prerequisites in mind, Adrien Treuille, Thiago Teixeira, and
Amanda Kelly created Streamlit. Using Streamlit, a machine
learning model or any other Python project can be deployed with
ease and without worrying about the frontend. Streamlit is very
user-friendly.

• enumerate: -
In content-based recommendation systems, Python's built-in
enumerate function is used to pair each item with an index,
making it easier to retrieve item attributes or features for
similarity calculations. It allows efficient mapping between
item positions and their corresponding data: for example,
pairing each similarity score with the index of its movie keeps
track of which movie a score belongs to after the scores are
sorted. User profiles are often represented as vectors, and
enumerate helps build such vectors with a unique index for each
item, reflecting the user's interactions or preferences.
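
A small sketch of this use of enumerate is shown below. The titles and similarity scores are hypothetical values chosen for illustration, not output from the project.

```python
# Hypothetical titles and similarity scores of every movie to "Movie A"
# (position i in `scores` belongs to titles[i]).
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
scores = [1.0, 0.2, 0.7, 0.5]

# enumerate attaches the original index to each score, so the index
# survives sorting and can be used to look the title up again.
indexed_scores = list(enumerate(scores))
ranked = sorted(indexed_scores, key=lambda pair: pair[1], reverse=True)

# The top entry here is the movie compared with itself, so skip it
# and keep the next two.
top_matches = [titles[index] for index, _ in ranked[1:3]]
print(top_matches)
```

Sorting the bare scores would lose track of which movie each score came from; sorting the (index, score) pairs keeps that link intact.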

• pickle: -
The pickle module implements binary protocols
for serializing and de-serializing a Python object structure.
“Pickling” is the process whereby a Python object hierarchy
is converted into a byte stream, and “unpickling” is the
inverse operation, whereby a byte stream (from a binary file
or bytes-like object) is converted back into an object
hierarchy. Pickling (and unpickling) is alternatively known as
“serialization”, “marshalling”, or “flattening”; however, to
avoid confusion, the terms used here are “pickling” and
“unpickling”.
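
A minimal round trip illustrates the two operations. The dictionary below is a hypothetical stand-in for whatever objects a recommender saves between runs (for example, a movie list and a similarity matrix).

```python
import pickle

# Hypothetical objects a recommender might want to save between runs.
movies = {
    "titles": ["Movie A", "Movie B"],
    "similarity": [[1.0, 0.5], [0.5, 1.0]],
}

# Pickling: convert the object hierarchy into a byte stream.
payload = pickle.dumps(movies)

# Unpickling: rebuild an equal (but separate) object from the byte stream.
restored = pickle.loads(payload)

print(type(payload).__name__, restored == movies)
```

To store the byte stream on disk instead, `pickle.dump(obj, file)` and `pickle.load(file)` work the same way with a file opened in binary mode. Data from untrusted sources should not be unpickled, since unpickling can execute arbitrary code.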

• Requests: -

The Requests library is a widely used third-party Python
package for making HTTP requests to a specified URL.
Whether working with REST APIs or web scraping, Requests is
worth learning before going further with these
technologies. When a request is made to a URI, a response is
returned. Requests provides built-in functionality
for managing both the request and the response.

4. Output and Result analysis /Dataset analysis
Dataset Description: -
A movie recommendation dataset contains movie
details (ID, title, genre, etc.), user interactions (ratings,
timestamps), user profiles (demographics, preferences),
and optional metadata (keywords, reviews). It's used to
build recommendation systems, suggesting movies to users
based on their behavior and movie attributes like genre and
cast.

Features (Attributes):
The dataset includes details such as budget,
genres, homepage, id, keywords, original language, original
title, overview, popularity, runtime, spoken language,
status, crew, director, title, vote average, vote count,
tagline, and cast.

Target Variable:
The target variable is "title", the names of the
recommended movies.

Dataset Size:
It comprises a substantial number of movie
records, ranging from several hundred to several thousand
entries, sourced from online listings, government
databases, and OTT platforms.

Data Preparation:
Before building a model, the data must be pre-processed.
We extract the relevant features from the dataset and, as a
cleaning step, replace null values with empty strings. The
text data is then converted into numerical feature vectors.
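
The cleaning steps can be sketched as follows, using plain dictionaries in place of a data frame. This is an assumption-laden illustration, not the project's code: the records, the chosen fields, and the helper name `build_tags` are hypothetical.

```python
def build_tags(record, fields):
    """Join the chosen fields of one movie into a single lowercase tag string."""
    parts = []
    for field in fields:
        value = record.get(field) or ""  # treat missing/None values as empty strings
        if isinstance(value, list):
            # Remove spaces inside multi-word names so "Jane Doe"
            # becomes the single token "JaneDoe".
            value = " ".join(item.replace(" ", "") for item in value)
        parts.append(value)
    # Lowercase and collapse any repeated whitespace left by empty fields.
    return " ".join(" ".join(parts).lower().split())


# Hypothetical movie records; None marks a null value in the raw data.
movies = [
    {"title": "Movie A", "overview": "A pilot explores a distant moon",
     "genres": ["Science Fiction", "Action"], "cast": ["Jane Doe"]},
    {"title": "Movie B", "overview": None,
     "genres": ["Drama"], "cast": None},
]
fields = ["overview", "genres", "cast"]

for movie in movies:
    movie["tags"] = build_tags(movie, fields)
    print(movie["title"], "->", movie["tags"])
```

Joining multi-word names into one token keeps, say, two different actors who share a first name from being counted as the same word when the tags are later converted to count vectors.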

Purpose:
A movie recommendation system aims to enhance
the user experience by offering personalized movie
suggestions that match individual preferences, fostering
increased user engagement, content consumption, and
platform loyalty. It drives revenue growth through content
discovery, optimizes resource allocation, and provides
valuable data-driven insights.

-Result/Output Snapshots

[Screenshots for the steps below are not reproduced.]

1) Libraries used

2) Dataset used
• Data header
• Checking rows and columns in the dataset
• Selecting the relevant features
• Removing the null values
• Removing the space between words
• Adding a tags column
• Converting text data into numerical data

3) Getting the similarity score

4) Recommended list of movies (output)

5. Conclusion and Recommendations.

A content-based movie recommendation system
offers several advantages. It leverages the content and
characteristics of movies to provide personalized
recommendations, making it a valuable tool for users
seeking tailored movie suggestions.

By analyzing features such as genre, actors, director,
and user preferences, this system can help users discover
movies that align closely with their tastes.

Overall, a content-based movie recommendation
system is a valuable addition to the world of entertainment,
offering users a convenient way to discover movies they are
likely to enjoy based on their past viewing history and
preferences.

As technology continues to advance, we can expect
further improvements and innovations in recommendation
systems, ultimately enhancing the user experience in the
world of cinema.

6. Future Enhancement /Further steps

Future enhancements for a movie
recommendation system using content-based filtering can
help it stay competitive and effective in an ever-evolving
landscape.

Invest in state-of-the-art deep learning
techniques for feature extraction. Use Transformer-based
models or other advanced architectures to extract more
nuanced and context-aware features from movie content
like posters, trailers, and textual descriptions.

Extend the system to incorporate multiple data
modalities, including text, audio, and video. This will enable
the recommendation of movies based on not only content
but also audiovisual elements, such as soundtrack or
cinematography style.

Explore the possibility of generating personalized
movie content, such as customized movie trailers or posters,
based on individual user preferences and data. This can
enhance user engagement and provide a unique viewing
experience.

Integrate real-time contextual information, such
as location, weather, or time of day, to offer
recommendations that align with a user's current situation
and mood.

-----References/Bibliography:

1. Alyari F., Navimipour N.J. Recommender systems:
A systematic review of the state-of-the-art literature and
suggestions for future research. Kybernetes. 2018;47:985.

2. Caro-Martinez M., Jimenez-Diaz G., Recio-Garcia J.A. A
theoretical model of explanations in recommender systems;
Proceedings of the ICCBR; Stockholm, Sweden. 9–12 July
2018.

3. Gupta S. A Literature Review on Recommendation
Systems. Int. Res. J. Eng. Technol. 2020;7:3600–3605.

4. Abdulla G.M., Borar S. Size recommendation system for
fashion e-commerce; Proceedings of the KDD Workshop on
Machine Learning Meets Fashion; Halifax, NS, Canada. 14
August 2017.

5. Aggarwal C.C. Recommender Systems. Springer;
Berlin/Heidelberg, Germany: 2016. An Introduction to
Recommender Systems; pp. 1–28.

6. Ghazanfar M.A., Prugel-Bennett A. A scalable, accurate
hybrid recommender system; Proceedings of the 2010 Third
International Conference on Knowledge Discovery and Data
Mining; Washington, DC, USA. 9–10 January 2010.

7. Deldjoo Y., Elahi M., Cremonesi P., Garzotto F., Piazzolla P.,
Quadrana M. Content-Based Video Recommendation
System Based on Stylistic Visual Features. J. Data Semant.
2016;5:99–113. doi: 10.1007/s13740-016-0060-9.
