Machine Learning


ABSTRACT

This internship covered data analysis with an emphasis on the application of
machine learning (ML). It was conducted in a dynamic, fast-paced environment,
with a primary focus on practical implementation and real-world data scenarios.

The first part established a foundational understanding of data analysis
principles, including data preprocessing, exploratory data analysis, and
feature engineering, within the broader context of data science. Python
libraries such as Pandas and NumPy facilitated the efficient handling and
manipulation of large datasets. Basic statistical techniques were employed to
gain insight into data distributions, trends, and anomalies.

The second part was an immersive exploration of machine learning models for
predictive analysis. Supervised learning techniques, including regression and
classification algorithms, were used to build models that predict and classify
new data points. Unsupervised learning was covered through clustering, which
divides the data into clusters according to similar attributes.

The role emphasized the importance of data-driven decision making and the
critical role of ML in extracting meaningful insight from complex datasets.
List of Figures
No Name of Figure
Fig. 3.1 Bar Graph
Fig. 3.2 Line Graph
Fig. 3.3 Area Graph
Fig. 3.4 Histogram
Fig. 3.5 Scatter Plot
Fig. 3.6 HeatMap
Fig. 3.7 Random Forest Algorithm
Fig. 3.8 KNN Classifier
Fig. 3.9 Naive Bayes Classifier
Fig. 3.10 Decision Tree Classifier
Fig. 3.11 Simple Linear Regression
Fig. 3.12 Multiple Linear Regression
Fig. 3.13 KMeans Clustering
Fig. 3.14 KMeans Clustering
Table of Contents

CH. NO. CONTENT


Certificate
Industrial Letter Head
Declaration
Acknowledgement
Abstract
List of Figures
List of Tables
List of Abbreviation
Table of Content
Chapter-1 Overview of the Company
1.1 History
1.2 Different Services/Scope under the Institute
Chapter- 2 Introduction to the Internship
2.1 Internship Summary
2.2 Purpose
2.3 Objective
2.4 Scope
2.5 Internship Planning
2.5.1 Internship Effort & Time, Cost Estimation
2.5.2 Roles and Responsibilities
2.6 Internship Scheduling
Chapter-3 Detail Report
3.1 Details of Week - 1
3.2 Details of Week - 2
Chapter- 4 Conclusion & Discussion
4.1 Overall Analysis of Internship viabilities
4.2 Summary of Internship Work
CHAPTER 1 OVERVIEW OF THE COMPANY:

HISTORY:
The history of NIELIT dates back to 1974, when the Department of Electronics
(DoE), now the Ministry of Electronics and Information Technology (MeitY), Govt.
of India, and the University Grants Commission (UGC) set up the first CEDT within
the premises of the Indian Institute of Science (IISc), Bangalore, with assistance
from the Swiss Development Corporation. The objective was to bridge the gap between
academic institutions and industries. A decade after the successful running of
CEDT, Bangalore, the then Department of Electronics (DoE), initiated a
programme to set up similar centres in other parts of the country with a wider
objective to develop human resources at different levels and in different specialised
areas of Electronics Design.

DIFFERENT SERVICES/ SCOPE UNDER THE INSTITUTE:


Education and Training:
The institute provides education and training in fields such as computing,
bioinformatics, and IT-enabled services, combining theory with practical
implementation and awarding certification.

Recruitment:
The institute recruits people for various posts, both to manage and run the
institution and to teach and guide the students enrolled there.

R & D Projects:
The institute carries out research and development projects to strengthen India's
scientific community, under the control of the Ministry of Electronics and
Information Technology (MeitY), Government of India.
CHAPTER 2 INTRODUCTION TO THE INTERNSHIP

INTERNSHIP SUMMARY:
I can honestly say that my time spent interning with NIELIT resulted in one of
the best summers of my life. Not only did I gain practical skills but I also had
the opportunity to meet many fantastic faculty members. The internship in data
analysis using machine learning (ML) provided practical, hands-on experience
in leveraging ML algorithms for data interpretation. Through a focus on data
preprocessing, exploratory data analysis, and the application of various ML
models, the internship enhanced understanding of data-driven decision-making.
This experience equipped participants with the skills necessary to tackle
complex real-world challenges through data analysis and ML techniques.

PURPOSE:
An internship is a temporary job role that's often related to one's academic field of
study or career interests. It can offer a beginner in a career field practical experience
within a professional role. Internships are often useful to college students and recent
graduates, as many internship programs provide college credit rather than an hourly
pay rate.
Internships may also offer individuals insight on a particular industry's culture and
daily operations, assist a young professional with completing a degree or provide
an income while a student earns their degree.

Industry experience is often an important part of applying for full-time positions.
Employers often prefer applicants who have some experience working in positions
that may be similar to the one they are offering.

If you are new to a particular industry or are still in school, an internship may
promote professional growth and help you determine whether the career path you're
pursuing is the right fit for you.
OBJECTIVE:
• The Internship Project aims at widening the student's perspective by providing
exposure to a real-life organizational environment and its various functional
activities.
• This will enable the students to explore an industry/organization, build a
relationship with a prospective employer, or simply hone their skills in a familiar
field.
• The Internship Project also provides invaluable knowledge and networking
experience to the students during the internship.
• Some ideal projects for internships can be in the areas of strategy formulation.
• An additional benefit that organizations may derive is the unique opportunity to
evaluate the student from a long-term perspective. Thus, the Internship Project
can become a gateway for final placement of the student.
• The student should ensure that the data and other information used in the study
report are obtained with the permission of the institution concerned. The students
should also behave ethically and honestly with the organization.
• Explore career alternatives prior to graduation.
• Integrate theory and practice.
• Assess interests and abilities in the field of study.
• Learn to appreciate work and its function in the economy.
• Develop work habits and attitudes necessary for job success.
• Develop communication, interpersonal and other critical skills in the job interview
process.
• Build a record of work experience.

SCOPE:
Internship experience plays a vital role in helping every student apply their
theoretical knowledge and gain practical knowledge from an organization. A
student can apply this internship experience in their future work area.
A student can gain knowledge about the site area, supervision of work,
estimation of work, process of work, material testing, marking and mixing, and
can use these experiences and this knowledge in future work.
Having an internship lets you gain experience and increase your marketability:
it gives you experience in the career field you want to pursue and can lead to
personal and professional growth, which further supports the value of being an
intern.
CHAPTER 3 DETAIL REPORT

DETAILS OF WEEK – 1:
Task Name: Basic understanding of ML and data analysis; data science using
machine learning methods; using supervised and unsupervised learning for
prediction
Task Requirements:
• Python libraries such as pandas and NumPy
• Visualization with Matplotlib
• Supervised learning: classification and regression
• Unsupervised learning: clustering
• Model selection for ML with sklearn

First, we were introduced to machine learning concepts.

Machine Learning: Teaching machines to respond based on past experience or
learning. Machine learning is a branch of artificial intelligence (AI) that
focuses on the development of algorithms and statistical models that enable computers to
progressively improve their performance on a specific task through the use of data, without
being explicitly programmed. The core idea behind machine learning is to allow machines
to learn from data, identify patterns, and make decisions or predictions based on that data.

Types of Machine Learning: There are three types of machine learning –

1. Supervised Learning: Supervised learning is a type of machine learning where the
algorithm is trained on a labelled dataset, meaning the input data is accompanied by the
correct output. The goal is for the algorithm to learn the mapping between the input and the
output and to be able to make predictions or decisions when new input data is presented. In
supervised learning, the algorithm is provided with a set of input-output pairs during the
training phase. It learns from these pairs to make predictions or classify new unseen data.
Types of Supervised Learning-
• Classification: A predictive modelling problem where a class label is predicted for a
given example of input data.
• Regression: It is used when the output variable is a real or continuous value such as
salary or weight.

2. Reinforcement Learning: A type of machine learning, sometimes described as sitting
between supervised and unsupervised learning, where an agent learns to make decisions by
interacting with its environment. It learns to achieve a goal or maximize a cumulative
reward by taking actions and receiving feedback from the environment. Unlike supervised
learning, reinforcement learning does not require labelled input-output pairs, but rather
relies on a system of rewards and punishments to shape the learning process.

3. Unsupervised Learning: Unsupervised learning is a type of machine learning where the
algorithm is trained on an unlabelled dataset, without any specific output or target variable.
The primary goal of unsupervised learning is to uncover hidden patterns or structures within
the data. It aims to learn the underlying structure of the data, such as grouping or clustering
similar data points, without any explicit guidance. It mainly uses clustering to divide the
data into groups of a similar nature.

Intro to Visualization using Matplotlib:


Matplotlib is a plotting library for the Python programming language and its numerical
mathematics extension NumPy.
• It provides an object-oriented API for embedding plots into applications using
general-purpose GUI toolkits.
• It helps in developing publication-quality plots with just a few lines of code.
• It supports interactive figures that can zoom, pan, and update.
• It gives full control of line styles, font properties, and axes properties.
• It can export to a number of file formats and embed in interactive environments.
We generally used the pyplot interface of Matplotlib to plot graphs for data analysis.
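As an illustration of the pyplot interface described above, the following is a minimal sketch that draws a bar graph and a line graph side by side. The month names and sales figures are hypothetical values chosen only for this example, and the non-interactive "Agg" backend is an assumption made so that the script also runs on a machine without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt

# Hypothetical sample data, for illustration only.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [12, 17, 9, 21]

# One figure with two axes: a bar graph and a line graph.
fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

ax_bar.bar(months, sales, color="steelblue")
ax_bar.set_title("Bar graph")
ax_bar.set_ylabel("Units sold")

# Line style and marker are controlled through keyword arguments.
ax_line.plot(months, sales, marker="o", linestyle="--")
ax_line.set_title("Line graph")

fig.tight_layout()

# Export as PNG into an in-memory buffer; a file name could be passed instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive session the backend line can be dropped and `plt.show()` called instead of saving the figure.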
Fig 3.1 Bar Graph
Fig 3.2 Line Graph
Fig 3.3 Area Graph
Fig 3.4 Histogram
Fig 3.5 Scatter Plot
Fig 3.6 HeatMap
1. Supervised Learning:

• Classification:
1. Random Forest Algorithm:
Random Forest in classification uses multiple decision trees
trained on random subsets of data. It combines their predictions
through voting to determine the final class. This reduces
overfitting and enhances accuracy, making it a popular choice for
classification tasks.
Fig 3.7 Random Forest Algorithm
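The voting idea described above can be sketched in plain Python. This is a deliberately simplified stand-in and not the implementation used during the internship: each "tree" here is only a one-level decision stump, trained on a bootstrap sample and restricted to one randomly chosen feature. The function names and the toy dataset are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(rows, labels, feature_ids):
    """Find the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Random feature subset (here a single feature per stump).
        features = rng.sample(range(n_features), 1)
        forest.append(fit_stump(sample_rows, sample_labels, features))
    return forest

def predict_forest(forest, row):
    """Majority vote over the predictions of all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes in two dimensions.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
        (6.0, 6.5), (6.5, 5.8), (7.0, 7.2), (5.8, 6.1)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (0.5, 0.5)))
print(predict_forest(forest, (8.0, 8.0)))
```

A full random forest grows deeper trees and usually samples a feature subset at every split; the bootstrap-then-vote structure is the same.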

2. KNN Classifier:
K-Nearest Neighbors (KNN) is a simple and intuitive machine
learning algorithm used for classification tasks. It works by finding
the K closest training data points to a new data point and assigns
the class based on the majority class among these neighbors. It is
non-parametric and doesn't make strong assumptions about the
underlying data distribution. However, it can be computationally
expensive, especially with large datasets.
Fig 3.8 KNN Classifier
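The neighbour search and majority vote described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: Euclidean distance, K = 3, and a hypothetical toy dataset; the function name `knn_predict` is also only illustrative.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, sorted ascending.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes in two dimensions.
points = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
          (6.0, 6.5), (6.5, 5.8), (7.0, 7.2), (5.8, 6.1)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(knn_predict(points, labels, (1.4, 1.1)))
print(knn_predict(points, labels, (6.2, 6.0)))
```

Every prediction scans the whole training set, which is why the method becomes expensive on large datasets.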
DETAILS OF WEEK – 2:
3. Naive Bayes Classifier:
Naive Bayes is a simple yet effective machine learning algorithm
used for classification tasks. It is based on Bayes' theorem and
assumes that features are independent of each other given the
class. During training, it builds a probabilistic model for each
class. In the prediction phase, it calculates the probability of a
new data point belonging to each class and assigns the class with
the highest probability as the predicted class. Naive Bayes is easy
to implement, performs well with small training datasets, and is
particularly effective for text classification tasks.
Fig 3.9 Naive Bayes Classifier
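The training and prediction phases described above can be sketched as follows. This example assumes the Gaussian variant of Naive Bayes, which suits continuous features; text classification would normally use word counts instead. The function names and the toy dataset are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus a per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):  # iterate over features
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small constant avoids division by zero
        model[label] = (prior, stats)
    return model

def log_posterior(model, row):
    """Unnormalised log P(class | row) under the feature-independence assumption."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            # Log of the Gaussian density, added once per independent feature.
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        scores[label] = score
    return scores

def predict_nb(model, row):
    """Return the class with the highest posterior score."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)

# Hypothetical toy data: two well-separated classes in two dimensions.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
        (6.0, 6.5), (6.5, 5.8), (7.0, 7.2), (5.8, 6.1)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

model = fit_gaussian_nb(rows, labels)
print(predict_nb(model, (1.3, 1.1)))
print(predict_nb(model, (6.4, 6.3)))
```

Working with log probabilities turns the product over features into a sum and avoids numerical underflow.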
4. Decision Tree Classifier:
A decision tree classifier is a simple yet powerful machine learning
algorithm used for classification tasks. It works by partitioning the
data into subsets based on features that lead to the best split. It
creates a tree-like model where each internal node represents a
feature, each branch represents a decision rule, and each leaf node
represents the outcome or class label. During prediction, the
algorithm navigates through the tree based on the features of the
input data to determine the final class label. Decision trees are easy
to interpret and visualize, making them valuable for understanding
the underlying decision-making process. However, they can be
prone to overfitting, especially with complex datasets.
Regularization techniques and ensemble methods such as Random
Forests can help mitigate this issue.
Fig 3.10 Decision Tree Classifier
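The partitioning and tree traversal described above can be sketched in plain Python. This is a minimal illustration that assumes Gini impurity as the split criterion and a depth limit as the only regularization; the function names, the dictionary-based node layout, and the toy dataset are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split the data; leaves hold the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:  # pure node, no useful split, or depth limit reached
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }

def predict_tree(node, row):
    """Walk from the root to a leaf by following the decision rules."""
    while isinstance(node, dict):
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node

# Hypothetical toy data: the label is "yes" only when both features are large.
rows = [(1, 1), (2, 4), (1, 5), (4, 1), (5, 1), (4, 4), (5, 5), (6, 3)]
labels = ["no", "no", "no", "no", "no", "yes", "yes", "yes"]

tree = build_tree(rows, labels)
print(tree)
print(predict_tree(tree, (5, 4)))
```

Lowering `max_depth` is one simple way to limit the overfitting mentioned above, at the cost of a coarser model.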

• Regression:
1. Simple linear regression:
Simple Linear Regression is a basic machine learning algorithm used
for predicting a continuous target variable based on a single input
feature. It assumes a linear relationship between the input and the
output. During training, it finds the best-fitting line through the data
points by minimizing the sum of squared differences between the actual
and predicted values. In the prediction phase, it uses this line to make
predictions for new data points. Simple Linear Regression is
straightforward to implement and interpret, making it a fundamental
tool for understanding the relationship between two variables.
Fig 3.11 Simple Linear Regression
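The best-fitting line described above has a closed-form solution: the slope is the covariance of the input and output divided by the variance of the input, and the intercept follows from the two means. The sketch below applies that formula to hypothetical data; the function name is illustrative.

```python
def fit_simple_linear(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear(xs, ys)
prediction = slope * 6 + intercept  # predict the output for a new input x = 6
print(slope, intercept, prediction)
```

With noisy real-world data the fitted line would no longer pass through every point, but the same two formulas still minimize the sum of squared differences.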

2. Multiple Linear Regression:
Multiple Linear Regression is an extension of Simple Linear
Regression that can accommodate multiple input features for predicting
a continuous target variable. It assumes a linear relationship between
the inputs and the output. During training, it finds the best-fitting
hyperplane through the data points by minimizing the sum of squared
differences between the actual and predicted values. In the prediction
phase, it uses this hyperplane to make predictions for new data points
with multiple features. Multiple Linear Regression is widely used in
various fields for understanding complex relationships between
multiple variables.
Fig 3.12 Multiple Linear Regression
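The best-fitting hyperplane described above can be found by solving the normal equations (XᵀX)w = Xᵀy. The following is a minimal sketch in plain Python, using a small hand-written Gaussian elimination in place of a linear algebra library; the function names and the data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple_linear(rows, ys):
    """Least-squares coefficients [intercept, w1, w2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 models the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, row):
    """Evaluate the fitted hyperplane at a new data point."""
    return coeffs[0] + sum(w * v for w, v in zip(coeffs[1:], row))

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (2, 4)]
ys = [6, 8, 13, 18, 26, 17]

coeffs = fit_multiple_linear(rows, ys)
print(coeffs)
print(predict(coeffs, (3, 3)))
```

This sketch assumes the input features are not perfectly collinear; otherwise the system has no unique solution and the elimination step divides by zero.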
2. Unsupervised Learning:
• Clustering:
Clustering is an unsupervised machine learning technique used to group similar data
points together based on inherent patterns or similarities within the data. It doesn't
require predefined classes or labels, and the goal is to identify natural groupings or
clusters in the dataset. The algorithm assigns data points to clusters to maximize
intra-cluster similarity and minimize inter-cluster similarity. K-means and
hierarchical clustering are common clustering algorithms used to analyze and
discover structures within data. Clustering is valuable for various applications,
including customer segmentation, image segmentation, and anomaly detection.
Fig 3.13 K Means Clustering
Fig 3.14 K Means Clustering
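The K-means algorithm mentioned above alternates two steps until the clusters stop changing: assign each point to its nearest centroid, then move each centroid to the mean of its points. The following is a minimal sketch in plain Python, assuming Euclidean distance and random initialisation from the data points; the function name and the toy dataset are hypothetical.

```python
import math
import random

def kmeans(points, k, n_iters=100, seed=0):
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    assignments = []
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Hypothetical toy data: two well-separated groups, with no labels supplied.
points = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
          (6.0, 6.5), (6.5, 5.8), (7.0, 7.2), (5.8, 6.1)]

centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

The cluster indices themselves are arbitrary; only the grouping of points matters, and K has to be chosen in advance.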
CHAPTER 4 CONCLUSION AND DISCUSSION

OVERALL ANALYSIS OF INTERNSHIP VIABILITIES:


This internship has really helped my career as an undergraduate student pursuing
a degree in computer engineering. During my internship, I had the opportunity to
work on real-world problems and was guided directly by excellent mentors who
have several years of field experience.
• It helped me enhance and develop my skills, abilities, and knowledge.
• It was a good experience that left me with good memories: I gained not only
experience but also new friends and domain knowledge.
• It gave me the chance not only to gain experience in technical practices but
also to observe management practices and to interact with employees.
Also, I learnt how work is done in an organization, the importance of being punctual,
the importance of maximum commitment, and the importance of team spirit.
Overall, during the internship program I gained considerable knowledge and
improved my practical, management, teamwork, and interpersonal communication
skills.
SUMMARY OF INTERNSHIP WORK:
During the two-week internship, I acquired practical knowledge and hands-on experience
in applying data science principles and machine learning techniques to solve real-world
problems. I developed a comprehensive understanding of the end-to-end data science
process, from data collection and preprocessing to model building, evaluation, and
deployment.

Throughout the internship, I gained proficiency in various data manipulation and analysis
tools, including Python libraries such as NumPy, Pandas, and Matplotlib. I became adept
at handling large datasets, cleaning data, and performing exploratory data analysis to
extract valuable insights and identify patterns and trends.

Moreover, I learned to implement popular machine learning algorithms, including
regression, classification, and clustering techniques. I became familiar with advanced
topics such as ensemble methods, dimensionality reduction, and model tuning for optimal
performance.

The internship provided me with practical experience in building and evaluating predictive
models, understanding the importance of feature engineering, and implementing techniques
to address overfitting and underfitting issues. I also gained exposure to data visualization
techniques, effectively communicating complex findings and results to diverse audiences.

Overall, the internship enhanced my understanding of data science and machine learning
concepts, equipping me with valuable skills and knowledge to tackle complex data-driven
challenges in various industries.