BDM Curriculum


Professional Certificate Program in Data Science for Business Decision Making
(Program Curriculum)

Duration: 8 months
Note: The curriculum is subject to change based on inputs from the institute or industry experts.

PRE-PROGRAM PREPARATORY CONTENT

Module

• Intro to Excel (1 week)

1. Introduction to Excel - understanding the interface, slicing & dicing, formatting & report making
2. Data analysis in Excel - formulae, complex functions, cell referencing & text functions,
logical formulae, creating & formatting charts, pivoting, VLOOKUP, etc.

• Intro to Python; Python for DS; Data Visualization in Python (Optional)

COURSE-I: BUSINESS PROBLEM SOLVING, INSIGHTS AND STORYTELLING (7 WEEKS)

Module

• Understanding the Business Problem and Formulating Hypotheses

1. Understanding business problems through frameworks - Intro
2. 5W & 5 Whys frameworks + industry demos
3. SPIN framework + industry demos
4. Business model canvas and Issue tree framework + industry demos
5. Specialised frameworks - 7Ps, 5Cs, etc.

• Exploratory Data Analysis

1. Data sourcing - public and private data
2. Data cleaning - handling missing values, handling invalid data, filtering data, standardisation, etc.
3. Univariate analysis - unordered vs ordered categorical variables, quantitative variables &
measures of central tendency
4. Segmented univariate analysis - basis of segmentation, comparison of averages and other metrics
5. Bivariate analysis - bivariate on quantitative and categorical variables, correlation
6. Derived metrics - introduction, type-driven metrics, business-driven metrics, data-driven metrics

• Visualisation using Tableau

1. Introduction, installation, and UI walkthrough
2. Visualising and analysing data in Tableau - I: Bar charts, line charts, area charts, box plots,
hierarchies, pie charts, grouping and treemaps, histograms, scatterplots
3. Visualising and analysing data in Tableau - II: Joins & splits, dual-axis charts, top N parameters,
stacked bar charts
4. Dashboarding & storytelling using Tableau

• Data Storytelling: Narrate Stories in a Memorable Way

1. Introduction to data storytelling - importance and characteristics of a good story with data
2. Components of a good story with data - objective and agenda, narrative, patterns of insights,
structure and flow, pyramid principle, visualisations
3. Golden rules for effective data storytelling - visual design principles and storyboarding, best
practices, industry demo

• Project on deriving business insights and storytelling

Break

COURSE-II: STATISTICS & MACHINE LEARNING (16 WEEKS)

Module

• Descriptive & Inferential Statistics

1. Introduction to probability - permutations & combinations (PnC), probability, events, additive & multiplicative rules
2. Basics of probability - random variables, probability distribution, expected value
3. Probability distributions - discrete vs continuous distributions, uniform distribution,
cumulative probabilities, normal & standard normal distribution
4. Central limit theorem - sampling, sampling distribution, properties of sampling distribution,
central limit theorem, estimating mean using CLT (+Excel demo)

• Hypothesis Testing

1. Concepts of hypothesis testing - business relevance, framing hypotheses, hypothesis
testing process and p-value
2. Types of hypothesis tests - left- and right-tailed tests, two-tailed tests, types of errors,
hypothesis testing using the t-distribution
3. Industry demos on hypothesis testing (Excel) - two-sample mean test, two-sample
proportion test, A/B testing

• Linear Regression

1. Introduction to regression - ideation, equation, limitations
2. Simple linear regression - best-fit line, OLS, goodness of fit, assumptions, model building,
model evaluation (regression parameters), residual analysis and prediction, model interpretation
3. Multiple linear regression - moving from SLR to MLR, adjusted R-squared, multicollinearity,
feature selection, model building, evaluation & prediction
4. The mathematics of regression - parameter estimation using OLS, the gradient descent
algorithm, ANOVA
5. Transformation of variables
6. Polynomial regression

• Forecasting

1. Introduction to forecasting - purpose, process, components of a time series, overfitting and
data partitioning
2. Model building - level, linear trend, quadratic trend, exponential trend, additive seasonality,
multiplicative seasonality, combining trend & seasonality, forecasting future data
3. ARIMA model - lag analysis, model building, implementation

• Project: Regression

Break

• Classification

1. Introduction - regression vs classification, types of classification, evaluating classification models
2. Logistic regression - best-fit sigmoid curve, odds & log odds, multivariate logistic regression,
confusion matrix and accuracy, sensitivity & specificity, precision & recall, trade-offs, ROC,
predictions, logistic regression implementation
3. Decision trees - descriptive vs discriminative classification, the decision tree algorithm, measuring
purity (accuracy, Gini index, entropy), decision trees implementation
4. Ensembles & random forests - introduction and types of ensembles, introduction to random forests,
OOB, feature importance in random forests, implementation

• Clustering & Market Basket Analysis

1. Introduction to clustering, types of clustering, Euclidean distance & centroid, k-means clustering
algorithm, k-means clustering implementation, scaling and standardisation
2. Other forms of clustering - hierarchical clustering, DBSCAN
3. Introduction to market basket analysis, cross-selling & upselling, bag vs basket of products, the
Apriori algorithm, implementation of market basket analysis

• Text Analytics, Processing, and Predictive Modelling

1. Introduction to text analytics (text encoding, regular expressions*, word frequencies & stop
words, tokenisation, bag-of-words representation, stemming & lemmatisation, TF-IDF)
2. The Naive Bayes algorithm (Bayes' theorem and its building blocks, Naive Bayes for
text classification)

• Model Selection

1. Principles of model selection - model & learning algorithm, simplicity, complexity & overfitting,
bias-variance tradeoff, regularisation, hyperparameters and cross validation
2. Model building & evaluation - model building, feature selection using CV, hyperparameter
tuning using grid-search and randomised-search CV
3. Feature engineering - introduction, handling numeric features, handling categorical features,
handling time-based features, implementation
4. Handling class imbalance

• Project: Classification

Break

COURSE-III: DS STRATEGY - BECOMING A DATA EVANGELIST (3 WEEKS)

Module

• Framework to DS Strategy

1. Identifying the business problems (Revenue/Cost Perspective, Value Chain)
2. Using the DS project assessment framework - Measuring Impact/Reach vs Feasibility/Complexity
3. Ethics and Bias Considerations
4. Creating the Roadmap

• Mapping DS with Data Architecture Strategy

1. Given a broad business challenge, describe how you would approach the development of a
Machine Learning Architecture strategy using the Structured Problem Solving Method
2. Given a particular business context, detail how a Machine Learning Architecture strategy fits
into its Data and Data Architecture strategy
3. Identify the strengths and weaknesses of a given business’s Data and Data Architecture strategy

• Executing DS Strategy

1. Product management -
- Steps involved from production to deployment
- How do you manage the solution creation and solution deployment phases simultaneously?
- What are the security considerations?
- What are the hiccups one faces while deploying a solution?
- Developing the data culture
- How do you stay up to date with the latest technology?
- Should you establish a COE or outsource the project?
- How do you identify when the AI strategy is failing?

2. DS Lifecycle Management - Pilot, MVP, ML ops, agile methodology & typical timelines

• Capstones (6 Weeks)

Choose one of two capstones:

1. Finance / Stocks - Dashboarding for Stocks through Time Series Analysis
2. Retail - Dashboarding for Products through Market Basket Analysis
