PySpark 30 Days Practice Guide


Learn PySpark in Just 30 Days
*Disclaimer*

Everyone has their own way of learning. The key is focusing on the core elements of PySpark to build a strong understanding.

This guide is designed to assist you in that journey.

www.bosscoderacademy.com 1
Introduction:

Week 1: Introduction to PySpark and Environment Setup

Day 1

Introduction to Big Data and Spark Ecosystem

Objective:

Topics:

Practice:

Day 2

Setting Up PySpark Environment

Objective:

Topics:

Practice:

Day 3

SparkContext and SparkSession

Objective:

Topics:

Practice:

Day 4

RDDs (Resilient Distributed Datasets) Basics
Objective:

Topics:

Practice:

Day 5

RDD Transformations and Actions


Objective:

Topics:

Practice:

Day 6

Key-Value Pair RDDs

Objective:

Topics:

Practice:

Day 7

Data Persistence and Partitioning

Objective:

Topics:

Practice:

Week 2: DataFrames and Spark SQL
Day 8
Introduction to DataFrames
Objective:

Topics:

Practice:

Day 9

DataFrame Operations
Objective:

Topics:

Practice:

Day 10

Aggregations in DataFrames
Objective:

Topics:

Practice:

Day 11

DataFrame Joins

Objective:

Topics:

Practice:

Day 12

DataFrame Functions and Expressions

Objective:

Topics:

Practice:

Day 13

Working with Dates and Timestamps


Objective:

Topics:

Practice:

Day 14

Spark SQL Introduction


Objective:

Topics:

Practice:

Week 3: Advanced Data Processing and Optimization

Day 15

Working with Complex Data Types

Objective:

Topics:

Practice:

Day 16

User-Defined Functions (UDFs)

Objective:

Topics:

Practice:

Day 17

Broadcasting Variables and Accumulators

Objective:

Topics:

Practice:

Day 18

Performance Tuning and Optimization
Objective:

Topics:

Practice:

Day 19

Working with Parquet and ORC Formats
Objective:

Topics:

Practice:

Day 20

Window Functions
Objective:

Topics:

Practice:

Day 21

Handling Missing Data

Objective:

Topics:

Practice:

Week 4: Machine Learning and PySpark MLlib

Day 22

Introduction to PySpark MLlib

Objective:

Topics:

Practice:

Day 23

Feature Engineering
Objective:

Topics:

Practice:

Day 24

Classification Models in PySpark


Objective:

Topics:

Practice:

Day 25

Regression Models in PySpark

Objective:

Topics:

Practice:

Day 26

Clustering Models in PySpark

Objective:

Topics:

Practice:

Day 27

Dimensionality Reduction Techniques

Objective:

Topics:

Practice:

Day 28

Model Persistence and Deployment


Objective:

Topics:

Practice:

Day 29

Spark Streaming Basics


Objective:

Topics:

Practice:

Day 30

Spark Structured Streaming


Objective:

Topics:

Practice:

Conclusion:

Why Bosscoder?
1000+ alumni placed at top product-based companies.

More than 136% hike for every 2 out of 3 working professionals.

Average Package of 24LPA.

