This project aims to develop a personalized recommendation system using a Knowledge Graph to model relationships between various entities such as users, products, and interactions. The recommendation system leverages graph-based algorithms and machine learning techniques to provide tailored suggestions to users based on their interactions and preferences.
- Project Overview
- Dataset
- Technologies and Tools
- Installation and Setup
- Usage
- Model Training and Evaluation
- Results
- Contributing
- License
The dataset consists of three CSV files containing information about products, customers, and sales interactions. The columns for each file are as follows:
Products file:
- Uniqe Id
- Product Name
- Brand Name
- Asin
- Category
- Upc Ean Code
- List Price
- Selling Price
- Quantity
- Model Number
- About Product
- Product Specification
- Technical Details
- Shipping Weight
- Product Dimensions
- Image
- Variants
- Sku
- Product Url
- Stock
- Product Details
- Dimensions
- Color
- Ingredients
- Direction To Use
- Is Amazon Seller
- Size Quantity Variant
- Product Description
Customers file:
- Customer ID
- Age
- Gender
- Item Purchased
- Category
- Purchase Amount (USD)
- Location
- Size
- Color
- Season
- Review Rating
- Subscription Status
- Shipping Type
- Discount Applied
- Promo Code Used
- Previous Purchases
- Payment Method
- Frequency
Interactions file:
- user id
- product id
- Interaction type
- Time stamp
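As a minimal sketch of how the interactions file could be read and checked against the columns listed above, the snippet below uses only the Python standard library. The project itself lists Pandas, and `pandas.read_csv` would serve the same purpose. The function name `read_interactions` and the sample rows are made up for illustration and are not taken from the repository.

```python
import csv
import io

# Expected header of the interactions file, as listed above.
INTERACTION_COLUMNS = ["user id", "product id", "Interaction type", "Time stamp"]


def read_interactions(handle):
    """Read interaction rows from an open CSV file and check the header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [column for column in INTERACTION_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [dict(row) for row in reader]


# Made-up sample rows standing in for the real file (assumption).
sample = io.StringIO(
    "user id,product id,Interaction type,Time stamp\n"
    "1,p-100,view,2024-01-05 10:00:00\n"
    "1,p-200,purchase,2024-01-06 12:30:00\n"
    "2,p-100,like,2024-01-07 09:15:00\n"
)
interactions = read_interactions(sample)
```

Checking the header up front makes a renamed or missing column fail loudly before any graph loading or training starts.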
- Programming Language: Python
- Graph Database: Neo4j
- Query Language: Cypher
- Libraries: Py2neo, Pandas, Scikit-learn, Pyspark, Faker
- Clone the repository:
git clone https://github.com/yourusername/knowledge-graph-recommender.git
cd knowledge-graph-recommender
- Install the dependencies:
pip install -r requirements.txt
- Set up Neo4j:
- Install Neo4j from the Neo4j Download Center.
- Start Neo4j and set up your database.
- Update the connection details in your code if necessary.
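Once Neo4j is running, interaction rows can be written into the graph with parameterized Cypher. The sketch below only builds the query text and its parameters, so it runs without a database. The node labels `User` and `Product`, the relationship type `INTERACTED`, and the property names are assumptions for illustration; the actual graph schema used by this project is not described here.

```python
def interaction_to_cypher(row):
    """Return a (query, parameters) pair that upserts one interaction.

    Labels, relationship type, and property names are hypothetical.
    """
    query = (
        "MERGE (u:User {id: $user_id}) "
        "MERGE (p:Product {id: $product_id}) "
        "MERGE (u)-[r:INTERACTED {type: $type}]->(p) "
        "SET r.timestamp = $timestamp"
    )
    params = {
        "user_id": row["user id"],
        "product_id": row["product id"],
        "type": row["Interaction type"],
        "timestamp": row["Time stamp"],
    }
    return query, params


# Made-up example row using the interaction columns from the dataset.
example_row = {
    "user id": "1",
    "product id": "p-100",
    "Interaction type": "view",
    "Time stamp": "2024-01-05 10:00:00",
}
query, params = interaction_to_cypher(example_row)
```

With Py2neo, a pair like this can typically be passed to `Graph.run(query, params)`. Using `MERGE` rather than `CREATE` keeps the load idempotent, so re-running it should not duplicate nodes or relationships.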
- Prepare the data: run the data_preparation.ipynb notebook to load and preprocess the dataset.
- Train the model: execute the model_training.ipynb notebook to train the recommendation model using the ALS algorithm.
- Generate recommendations: use the recommendations.ipynb notebook to get personalized recommendations for users.
The project employs the Alternating Least Squares (ALS) algorithm for training the recommendation model. The evaluation metrics used to assess the model's performance include Root Mean Square Error (RMSE), Precision, Recall, and F1 Score.
The training process involves the following steps:
- Load and preprocess the data.
- Encode user and product IDs.
- Split the data into training and test sets.
- Train the ALS model using the training set.
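The encoding and splitting steps above can be sketched in plain Python. Encoding matters because Spark's ALS implementation expects numeric user and item columns, so string IDs need to be mapped to integers first. The helper names, the 80/20 split, the seed, and the sample triples are assumptions for illustration; how the notebooks convert interaction types to numeric values is not specified here.

```python
import random


def encode_ids(values):
    """Map each distinct ID to a consecutive integer, in first-seen order."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up (user, product, interaction value) triples (assumption).
raw = [
    ("u1", "p-100", 1.0),
    ("u1", "p-200", 3.0),
    ("u2", "p-100", 2.0),
    ("u2", "p-300", 1.0),
    ("u3", "p-200", 3.0),
]
user_index = encode_ids(user for user, _, _ in raw)
product_index = encode_ids(product for _, product, _ in raw)
encoded = [(user_index[u], product_index[p], value) for u, p, value in raw]
train, test = train_test_split(encoded, test_fraction=0.2)
```

Keeping the two mappings around is useful later, since recommendations come back as integer indices and need to be translated to the original product IDs.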
The model's performance is evaluated using the test set. The metrics calculated are:
- RMSE: The square root of the mean squared difference between predicted and actual interaction values; lower is better.
- Precision: The fraction of recommended items that are relevant to the user.
- Recall: The fraction of a user's relevant items that appear in the recommendations.
- F1 Score: The harmonic mean of precision and recall.
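The four metrics can be computed as in the following sketch. The function names are hypothetical, and treating precision and recall as set overlap between one user's recommended and relevant items is an assumption; the notebooks may define relevance differently, for example with a rating threshold.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: 4 recommended items, 3 relevant items, 2 in common.
precision, recall, f1 = precision_recall_f1(
    ["a", "b", "c", "d"], ["a", "b", "e"]
)
```

In this example the precision is 2/4, the recall is 2/3, and the F1 score works out to 4/7.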
Results are reported as the evaluation metrics described above (RMSE, Precision, Recall, and F1 Score), computed on the test set. The recommendations generated by the model can be visualized using Neo4j Browser.
Contributions to this project are welcome. Please follow these steps to contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Commit your changes (git commit -am 'Add new feature').
- Push to the branch (git push origin feature-branch).
- Create a new Pull Request.
This project is licensed under the MIT License. See the LICENSE file for more details.