Data Minds - Data Science Curriculum 2023 V2
CLASSROOM
ONLINE 3 MONTHS - Certificate Course
HYBRID MODEL 6 MONTHS - Diploma in Data Science
dataminds.in dataminds 100% PLACEMENT ASSISTANCE dataminds_in dataminds.in
PYTHON PROGRAMMING
PYTHON ANALYTICS
NUMPY
Introduction to NumPy
What is NumPy?
History of NumPy

PANDAS
Introduction to Pandas
What is Pandas?
History and evolution of Pandas
DATA VISUALIZATION
MATPLOTLIB
Overview of Matplotlib
Matplotlib Basics
Installing Matplotlib
Basic plotting with Matplotlib
Line plots, scatter plots, and bar plots

SEABORN
Overview of Matplotlib
Seaborn Introduction
Installing Seaborn
Overview of Seaborn's capabilities
DATA SCIENCE
Handling Duplicates
Identifying and removing duplicate records
Strategies for handling duplicate values
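As a minimal sketch of what identifying and removing duplicate records can look like in pandas (the DataFrame, its column names, and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical customer records; the second and third rows are exact duplicates.
records = pd.DataFrame(
    {
        "customer_id": [101, 102, 102, 103, 103],
        "city": ["Pune", "Delhi", "Delhi", "Chennai", "Mumbai"],
    }
)

# Identify: flag rows that repeat an earlier row across all columns.
duplicate_mask = records.duplicated()

# Remove: keep the first occurrence of each fully identical row.
deduplicated = records.drop_duplicates()

# Alternative strategy: treat customer_id as the key and keep the last record per key.
latest_per_customer = records.drop_duplicates(subset="customer_id", keep="last")

print(duplicate_mask.tolist())
print(deduplicated)
print(latest_per_customer)
```

Which strategy is appropriate depends on the data: dropping exact duplicates is usually safe, while deduplicating on a key discards rows that differ in other columns.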
STATISTICS
Descriptive Statistics
Inferential Statistics:
Hypothesis Testing
Formulating a Hypothesis
Choosing Null and Alternative Hypotheses
Type I or Alpha Error and Type II or Beta Error
Confidence Level, Significance Level, Power of Test
Confidence Intervals
Confidence Interval - Concept
P-value:
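To make the hypothesis testing topics above concrete, here is a small sketch of a two-sided one-sample z-test using only the Python standard library. The sample values, the hypothesized mean, and the assumption that the population standard deviation is known are all illustrative choices, not part of the curriculum itself.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, assuming sigma is known."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error

    std_normal = NormalDist()
    # P-value: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Confidence interval at confidence level 1 - alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    interval = (sample_mean - z_crit * standard_error,
                sample_mean + z_crit * standard_error)

    return {
        "z": z,
        "p_value": p_value,
        "confidence_interval": interval,
        # Rejecting a true H0 is a Type I error; alpha is its accepted rate.
        "reject_h0": p_value < alpha,
    }


# Made-up sample with mean 52; H0 says the population mean is 50.
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56]
result = one_sample_z_test(sample, mu0=50, sigma=3)
print(result)
```

With these numbers the test statistic is z = 2.0 and the p-value is about 0.0455, so H0 is rejected at the 5% significance level, and the 95% confidence interval excludes 50.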
MACHINE LEARNING
Regression
Correlation
Scatter Diagram
Correlation coefficient
Correlation analysis
Regression

Classification
Definition of classification.
Understanding the concept of class labels.
Binary and multiclass classification.
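The correlation coefficient and regression topics listed in this section can be illustrated with a short sketch. It computes the Pearson correlation coefficient and a least-squares line by hand, using only the standard library; the data points are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data, e.g. hours studied versus a score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
slope, intercept = fit_line(hours, score)
print(round(r, 4), round(slope, 4), round(intercept, 4))
```

For this data the correlation is about 0.775 and the fitted line is roughly y = 0.6x + 2.2.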
Clustering
Distance Metrics
k-Means clustering
Hierarchical Clustering
Non-Hierarchical Clustering DBSCAN
Clustering Evaluation metrics

K-Means Clustering:
In-depth coverage of the K-means algorithm, its initialization methods, and convergence
Practical implementation and examples.

Association Rules
Association rules mining
Market Basket Analysis
Apriori Algorithm, FP-Growth
Metrics - Support, Confidence, Lift

Recommender Systems
User Based Collaborative Filtering
Similarity Metrics
Item Based Collaborative Filtering
Search Based Methods
SVD Method

Natural Language Processing (NLP)
Tokenization and text processing
Introduction to language models

Text Mining and Natural Language Processing (NLP)
Sources of data
Bag of words
Pre-processing, corpus, Document Term Matrix (DTM) & TDM
Word Clouds
Corpus-level word clouds
Sentiment Analysis
Positive Word clouds
Negative word clouds
Unigram, Bigram, Trigram
Semantic network
Extract user reviews of the product/services from Amazon and tweets from Twitter
Install Libraries from Shell
Extraction and text analytics in Python
LDA / Latent Dirichlet Allocation
Topic Modelling
Sentiment Extraction
Lexicons & Emotion Mining
Applications and Use Cases
Live Projects

Dimensionality Reduction:
Principal Component Analysis (PCA):
In-depth coverage of PCA, including eigenvalue decomposition and feature extraction.
Applications in reducing dimensionality.
t-Distributed Stochastic Neighbor Embedding (t-SNE):
Explanation of t-SNE and its use for visualizing high-dimensional data.
Comparison with other dimensionality reduction techniques.
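The eigenvalue-decomposition view of PCA mentioned above can be sketched with NumPy as follows. This is an illustrative implementation on synthetic data, not the course's own material; the function name `pca` and the generated dataset are assumptions made for the example.

```python
import numpy as np


def pca(data, n_components):
    """Project data onto its top principal components via eigendecomposition."""
    # Center each feature so the covariance matrix describes spread around the mean.
    centered = data - data.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)

    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]  # largest variance first
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order[:n_components]]

    projected = centered @ components
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return projected, explained_ratio


# Synthetic 3-feature data in which almost all variance lies along one direction.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.1, size=200),
])

projected, explained_ratio = pca(data, n_components=1)
print(projected.shape, explained_ratio)
```

Because the second feature is nearly a multiple of the first, a single component captures most of the variance, which is the sense in which PCA reduces dimensionality.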
Artificial Intelligence
Deep Learning

Perceptron/Multilayer Perceptron
Neurons of a Biological Brain
Artificial Neuron
Perceptron
Perceptron Algorithm
Use case: classifying linearly separable data
Multilayer Perceptron to handle

Building Blocks of Neural Network - ANN
Integration functions
Activation functions
Weights
Bias
Learning Rate (eta) - Shrinking Learning Rate, Decay Parameters
Error functions - Entropy, Binary Cross Entropy, Categorical Cross Entropy, KL Divergence, etc.

Back Propagation
Neural Network Algorithm
Introduction to Perceptron
Introduction to Multi-Layered Perceptron (MLP)
Activation functions – Identity Function, Step Function, Ramp Function, Sigmoid Function, Tanh Function, ReLU, ELU, Leaky ReLU & Maxout
Back Propagation Visual Demonstration
Network Topology – Key characteristics and Number of layers
Weights Calculation in Back Propagation

TensorFlow / Keras
Sequential API / Functional API

Artificial Neural Networks
ANN Structure
Error Surface
Gradient Descent Algorithm
Backward Propagation
Network Topology
Principles of Gradient Descent (Manual Calculation)
Learning Rate (eta)
Batch Gradient Descent
Stochastic Gradient Descent
Minibatch Stochastic Gradient Descent
Optimization Methods: Adagrad, Adadelta, RMSprop, Adam

Convolution Neural Network (CNN)
ImageNet Challenge – Winning Architectures
Parameter Explosion with MLPs
Convolution Networks

Recurrent Neural Network
Language Models
Traditional Language Model
Disadvantages of MLP
Back Propagation Through Time
Long Short-Term Memory (LSTM)
Gated Recurrent Network (GRU)
Convolution Neural Networks - CNN
ImageNet Challenge – Winning Architectures, Difficult Vision Problems & Hierarchical Approach
Parameter Explosion with MLPs
Convolution Networks - 1D ConvNet, 2D ConvNet, Transposed Convolution
Convolution Layers with Filters and Visualizing Convolution Layers
Pooling Layer, Padding, Stride
Transfer Learning - VGG16, VGG19, ResNet, GoogLeNet, LeNet, etc.
Practical Issues – Weight decay, Drop Connect, Data Manipulation Techniques & Batch Normalization
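For the pooling, padding, and stride item in the list above, the following sketch shows how those settings determine the output size of a convolution or pooling layer along one spatial dimension. It uses the standard formula floor((n + 2p - k) / s) + 1; the function name and the example layer sizes are chosen here purely for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension for a convolution or pooling layer."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if kernel_size > input_size + 2 * padding:
        raise ValueError("kernel does not fit inside the padded input")
    return (input_size + 2 * padding - kernel_size) // stride + 1


# A 5x5 filter with no padding shrinks a 32-pixel side to 28.
print(conv_output_size(32, kernel_size=5))
# 2x2 pooling with stride 2 halves a 28-pixel side to 14.
print(conv_output_size(28, kernel_size=2, stride=2))
# A 3x3 filter with padding 1 keeps a 224-pixel side at 224.
print(conv_output_size(224, kernel_size=3, padding=1))
# A 7x7 filter with stride 2 and padding 3 reduces 224 to 112.
print(conv_output_size(224, kernel_size=7, stride=2, padding=3))
```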
Computer Vision
Introduction to Vision
Importance of Image Processing
Image Processing Challenges – Interclass Variation, Viewpoint Variation, Illumination, Background Clutter, Occlusion & Large Number of Categories
LSTM & GRUs - Bidirectional & Deep Bidirectional
Transformers, BERT, GPT-3 & Transformer Variants
Speech Recognition
DALL-E
DALL-E is a generative model developed by OpenAI that creates images from text descriptions.
Its name combines the artist Salvador Dalí with WALL-E, the robot character from the Pixar
film of the same name.
DATABASE
3.History of MongoDB
Evolution and development of MongoDB
4.Features of NoSQL Databases
Flexibility, scalability, and other key features of NoSQL databases

4.Update
Modifying existing records in a table using the UPDATE statement
5.Delete
Removing records from a table using the DELETE statement
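A short sketch of the UPDATE and DELETE statements in action, run against an in-memory SQLite database through Python's built-in sqlite3 module. The `students` table and its rows are hypothetical, and SQLite is used here only because it needs no setup; the brochure does not say which SQL database the course uses.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO students (id, name, course) VALUES (?, ?, ?)",
    [(1, "Asha", "Python"), (2, "Ravi", "Statistics"), (3, "Meera", "Python")],
)

# UPDATE: modify an existing record; the WHERE clause limits which rows change.
conn.execute("UPDATE students SET course = ? WHERE id = ?", ("Machine Learning", 2))

# DELETE: remove a record; without WHERE, every row in the table would be removed.
conn.execute("DELETE FROM students WHERE id = ?", (3,))

remaining = conn.execute("SELECT id, name, course FROM students ORDER BY id").fetchall()
conn.close()
print(remaining)
```

The `?` placeholders pass values separately from the SQL text, which avoids building statements by string concatenation.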
DATA SCIENCE
BIG DATA
Section 1:
1.Hadoop Introduction
Definition and Purpose
Historical Context
2.Hadoop Architecture
Components: NameNode, DataNode, ResourceManager, NodeManager
High-Level Architecture Overview
3.Hadoop Ecosystem
Overview of various tools in the Hadoop ecosystem (e.g., MapReduce, Hive, Pig)
4.Hadoop Distributed File System (HDFS)
Basics of HDFS
File Storage and Replication
5.Hadoop Coursera
Supplementary learning resources for Hadoop
6.PySpark
Introduction to PySpark
PySpark for distributed data processing
7.Hive
Overview of Hive and its role in Hadoop
HiveQL and Data Warehousing
8.Flask
Flask Introduction
Overview and purpose of Flask
Flask Application
Building a basic Flask application
Flask URL
Handling URLs in Flask
Templates
Using templates in Flask
Merging the ML Model
Integrating Flask with a Machine Learning model
Section 2:
Amazon Web Services (AWS)
1.Cloud Computing
Definition and Characteristics
Cloud Service Models (IaaS, PaaS, SaaS)
2.AWS Introduction
Overview of Amazon Web Services
3.Creating AWS Account
Step-by-step guide to creating an AWS account
Understanding AWS Free Tier
4.EC2 Details
Introduction to Elastic Compute Cloud (EC2)
EC2 Instance Types and Configurations
Section 3:
Agile Scrum Methodology
1.Agile Introduction
Principles and Values of Agile
Agile Manifesto
2.Advantages of Agile
Benefits of Agile over traditional methodologies
3.Scrum Introduction
Framework Overview
Roles in Scrum: Scrum Master, Product Owner, Development Team
4.Scrum Process
Sprint Planning, Daily Standups, Sprint Review, Sprint Retrospective
5.Scrum Terminology
User Stories, Backlog, Burndown Charts, Sprint Backlog
Section 4:
Kafka
1.What is Message Service
Introduction to message-oriented middleware
2.Kafka Introduction
Overview of Apache Kafka
Messaging System for Distributed Streaming
3.Kafka Architecture
Components: Producers, Brokers, Consumers
Topics and Partitions
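To give a feel for how producers, topics, partitions, and consumers relate, here is a toy in-memory model written in plain Python. It is not the Kafka client API: the `ToyTopic` class is hypothetical, and CRC32 is a stand-in hash chosen for this sketch rather than Kafka's actual partitioner. It only illustrates two ideas: records with the same key land in the same partition, and each partition is an append-only log read by offset.

```python
import zlib


class ToyTopic:
    """Illustrative stand-in for a topic split into partitions (not real Kafka)."""

    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an ordered, append-only list of (key, value) records.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Hash the key so that equal keys always map to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key, value):
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def consume(self, partition, offset=0):
        """Read records from one partition, starting at the given offset."""
        return self.partitions[partition][offset:]


topic = ToyTopic("orders", num_partitions=3)
first = topic.produce("user-1", "created")
second = topic.produce("user-1", "paid")

# Both records share a key, so they share a partition and keep their order.
print(first, second)
print(topic.consume(first[0]))
```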