How These Books Were Found: Get Updates in Your Inbox
Author: Brendan Martin
Founder of LearnDataSci
Due to the amount of time it takes to wade through degree requirements, course
codes, and catalogs, this article will continue to evolve as I gather more data.
In each book below, I’ve given an example of how the author(s) decided to
introduce Linear Regression, one of the most basic machine learning algorithms.
If you’re a beginner in data science, I think this will give you some insight into
what sort of math background each book requires.
Without further ado, here are the most assigned and recommended books from top
universities.
The Elements of Statistical Learning (ESL) was either the assigned textbook or recommended reading in every
Masters program I researched. Due to its advanced nature, you’ll find that book
#5 in this list — An Introduction to Statistical Learning with Applications in R
(ISLR) — was written as a more accessible version, and even includes exercises
in R.
It’s usually recommended for beginners in data science to master the content in
ISLR before moving to ESL, where you’ll get a more theoretical background. Just
mastering ISLR is often enough for data analyst roles.
To get an idea of the math required, Linear Regression is introduced like so:
Although Pattern Recognition and Machine Learning (PRML) is very clear and rich
in diagrams, you'll need advanced calculus, linear algebra, and optimization
knowledge to get its full benefit. Many of the derivations do not show the
intermediate steps, so it'll be important for you to work through each step on
your own for a good understanding.
Unlike the applied approach of ESL, PRML is more theoretical. Here's how Linear
Regression is introduced by Bishop:
Luckily, Bishop has also authored solutions to the exercises labeled “www” in the
book, making this book a possibility for self-study. You can find those solutions
as a PDF here.
Machine Learning: A Probabilistic Perspective (MLAPP) is a great resource for
graduate courses, but since it's not freely available and the solutions manual
can only be purchased by professors, it's a little more closed off than others
in this list and is not recommended for self-study. Also, if you're a beginner
in machine learning, this textbook isn't an ideal starting point.
Within the next couple of lines, Murphy redefines this in probabilistic terms like
so:
Without a more advanced math foundation, it's easy to get caught in the notation
when reading this book on your own.
#4 Deep Learning
Unlike the previous two books listed, this textbook offers a nice general
survey of math and machine learning methods. There are many concrete examples
and the math is much simpler than in MLAPP and PRML.
Let $\hat{y}$ be the value that our model predicts $y$ should take on. We
define the output to be $\hat{y} = \boldsymbol{w}^\top \boldsymbol{x}$
This notation is much more straightforward for beginners, and very similar to
how both the next book, ISLR, and Andrew Ng’s famous Machine Learning course
on Coursera present it.
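To make the quoted definition concrete, here is a minimal Python sketch. It assumes the definition is the usual linear model, where the prediction is the dot product of a parameter vector w and an input vector x. The function name `predict` and all numbers are made up for illustration and do not come from the book.

```python
def predict(w, x):
    """Return the linear model's prediction: the dot product of w and x."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


# Hypothetical parameters and input, chosen only for illustration.
w = [0.5, -1.0, 2.0]
x = [2.0, 1.0, 3.0]

y_hat = predict(w, x)
print(y_hat)  # 0.5*2.0 - 1.0*1.0 + 2.0*3.0 = 6.0
```

Each parameter simply scales one input feature, which is why this form is a gentle entry point compared with a probabilistic treatment.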
Overall, this book serves as a good reference and starting point for digging
deeper elsewhere, but isn’t comprehensive by any means. There’s not much
direct application, so you won’t gain any insight into how to actually implement
neural networks, but it is a good high-level complement to deep learning
courses, which Andrew Ng has also created.
Available on Amazon or for free. Authors: Gareth James, Daniela Witten, Trevor
Hastie, and Robert Tibshirani
I’ll start out by saying that this is a fantastic book. ISLR is usually recommended in
the first course of programs specifically built for data science, which makes a lot
of sense from how this book is structured.
Although not a thick book by any means, it’s derived from the #1 book, The
Elements of Statistical Learning, and comprehensively covers the fundamentals
every data scientist should know.
Not only is it extremely clear and accessible to those with a basic undergrad
math background, but it has a very applied approach. Every chapter comes with
exercises in R that let you practice applying the concepts you’re learning
directly on some data.
You might read "$\approx$" as "is approximately modeled as". We will
sometimes describe [this equation] by saying that we are
regressing $Y$ on $X$ (or $Y$ onto $X$).
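As a rough illustration of what regressing Y on X means in practice, the sketch below fits a straight line by ordinary least squares. It assumes the equation being described is the simple linear model Y ≈ β0 + β1·X. ISLR's own exercises are in R, so Python is used here only for illustration, and the function name and data are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates (beta0, beta1) for the line y = beta0 + beta1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    beta1 = s_xy / s_xx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


# Made-up data lying exactly on y = 1 + 2x, so the fit is easy to check by hand.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

beta0, beta1 = fit_simple_linear_regression(xs, ys)
print(beta0, beta1)  # 1.0 2.0
```

With real data the points would not sit exactly on a line, and the estimates would be the intercept and slope that minimize the sum of squared residuals.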
Have you read any of the books listed? Did you use any of these in a course?
What did you think?
I'm going to continue compiling books I find in course syllabi from top
universities and frequently update this article, but I would also love to know what
you all think about each of these.
If there's a book you've found particularly helpful that wasn't mentioned, leave a
comment and let me know!