
Statistical Learning, Fall 2020

Homework exercise 3
Due date: 22 December in class

1. ESL 4.2: Similarity of LDA and linear regression for two classes
In this problem you will show that for two classes, linear regression leads to the same discriminating
direction as LDA, but not to the exact same classification rule in general.
The derivations for this problem are rather lengthy. Consider part (b) (finding the linear regression
direction) to be extra credit. If you cannot prove one of the steps, comment on its geometric interpretation
instead and move on to the next step.
2. Short intuition problems
Choose and explain briefly. If you need additional assumptions to reach your conclusion, specify them.

(a) What is not an advantage of using logistic loss over using squared error loss with 0-1 coding for
2-class classification?
i. That the expected prediction error is minimized by correctly predicting P(Y | X).
ii. That it has a natural probabilistic generalization to K > 2 classes.
iii. That its predictions are always legal probabilities in the range (0, 1).
(b) In the generative 2-class classification models LDA and QDA, what type of distribution does
P(Y | X = x) have?
i. Unknown
ii. Gaussian
iii. Bernoulli
(c) We mentioned in class that Naive Bayes assumes $P(x \mid Y = g) = \prod_{j=1}^{p} P_j(x_j \mid Y = g)$. In what
situation would you expect this simplifying assumption to be most useful?
i. Small number of predictors, not highly correlated.
ii. Small number of predictors, highly correlated between them.
iii. Large number of predictors, not highly correlated.
iv. Large number of predictors, many highly correlated between them.

3. Equivalence of selecting “reference class” in multinomial logistic regression


In class we defined the logistic model as:
\begin{align*}
\log \frac{P(G = 1 \mid X)}{P(G = K \mid X)} &= X^T \beta_1 \\
&\;\;\vdots \\
\log \frac{P(G = K - 1 \mid X)}{P(G = K \mid X)} &= X^T \beta_{K-1},
\end{align*}

with resulting probabilities:

\begin{align*}
P(G = k \mid X) &= \frac{\exp\{X^T \beta_k\}}{1 + \sum_{l < K} \exp\{X^T \beta_l\}}, \qquad k < K, \\
P(G = K \mid X) &= \frac{1}{1 + \sum_{l < K} \exp\{X^T \beta_l\}}.
\end{align*}

Show that if we choose a different class in the denominator, we can obtain the same set of probabilities
by a different set of linear models (i.e., values of β). Hence the two representations are equivalent in
the probabilities they yield.
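As a purely numerical sanity check (not a substitute for the proof), the short R sketch below evaluates the probabilities above for one arbitrary x and set of coefficients, once with class K as the reference and once after shifting every coefficient vector so that class 1 is the reference; the two sets of probabilities agree. All object names here are made up for the example.

# Numerical sanity check only: changing the reference class changes the
# coefficients but not the implied probabilities.
set.seed(1)
K <- 3; p <- 4
x <- rnorm(p)                                # one feature vector X
B <- matrix(rnorm(p * (K - 1)), nrow = p)    # columns beta_1, ..., beta_{K-1} (reference class K)

class_probs <- function(x, B_full) {
  # B_full: p x K matrix, one column per class; the reference class column is all zeros
  u <- exp(drop(t(B_full) %*% x))
  u / sum(u)
}

B_refK <- cbind(B, 0)              # reference class K: beta_K = 0, as in the model above
B_ref1 <- B_refK - B_refK[, 1]     # subtract beta_1 from every column: class 1 becomes the reference
all.equal(class_probs(x, B_refK), class_probs(x, B_ref1))   # TRUE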
4. Separability and optimal separators
ESL 4.5: Show that the solution of logistic regression is undefined if the data are separable.

5. (* A real challenge¹)
In the separable case, consider adding a small amount of ridge-type regularization to the likelihood:
\[
\hat{\beta}(\lambda) = \arg\min_{\beta} \left[ -l(\beta; X, y) + \lambda \sum_j \beta_j^2 \right],
\]
where $l(\beta; X, y)$ is the standard logistic log-likelihood.


Show that $\hat{\beta}(\lambda) / \|\hat{\beta}(\lambda)\|_2$ converges to the support vector machine solution (the margin-maximizing hyperplane) as $\lambda \to 0$.
Hint: You may find the equivalent formulation of the SVM in equation (4.44) of ESL useful (equation
(4.48) in the book's second edition).
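If it helps to build intuition, one can explore the claim empirically before proving it. The R sketch below is only an illustration, not a proof: it assumes the glmnet and e1071 packages, uses a toy separable data set, and all object names are made up. It fits ridge-penalized logistic regression for a decreasing sequence of lambdas and compares the normalized coefficient direction to a (large-cost, approximately hard-margin) linear SVM direction.

# Empirical exploration only. Assumes the glmnet and e1071 packages.
library(glmnet)   # ridge-penalized logistic regression
library(e1071)    # linear SVM

set.seed(1)
n <- 40; p <- 2
x <- rbind(matrix(rnorm(n * p, mean =  2), n, p),
           matrix(rnorm(n * p, mean = -2), n, p))   # two well-separated clouds
y <- rep(c(1, 0), each = n)

# Large cost approximates the hard-margin (separable) SVM
svm_fit <- svm(x, factor(y), kernel = "linear", cost = 1e5, scale = FALSE)
w_svm <- drop(t(svm_fit$coefs) %*% svm_fit$SV)
w_svm <- w_svm / sqrt(sum(w_svm^2))

# Ridge-penalized logistic regression for decreasing lambda
for (lam in c(1, 1e-2, 1e-4, 1e-6)) {
  fit <- glmnet(x, y, family = "binomial", alpha = 0, lambda = lam,
                standardize = FALSE)
  b <- as.numeric(coef(fit))[-1]          # drop the intercept
  b <- b / sqrt(sum(b^2))
  cat(sprintf("lambda = %g, angle to SVM direction = %.4f rad\n",
              lam, acos(abs(sum(b * w_svm)))))
}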
6. Playing around with trees
Run a variety of tree-based algorithms on our competition data and show their performance. Compare:
• Small tree without pruning
• Large tree without pruning
• Large tree after pruning with 1-SE rule
• Bagging/RF on small trees (100 iterations)
• Bagging/RF large trees (100 iterations)
Do this under five-fold cross-validation on our competition training set, and use the results of the five
different folds to calculate confidence intervals for performance. Plot all the results in a reasonable
way (e.g., using boxplot()) and comment on them. Explain your choices of "small" and "large".
Hints: (a) Start early, since bagging may take a while to run. (b) Use the code from class, which
implements much of this, as a starting point; a rough outline of the pieces is also sketched below.
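The outline below is only a minimal sketch of the building blocks, assuming the rpart and randomForest packages and a placeholder data frame train with response column y; the actual competition data and the class code should replace these.

# Rough outline only; the course code covers this more completely.
library(rpart)
library(randomForest)

# "Small" vs "large" trees, controlled through rpart's complexity/depth settings
small_tree <- rpart(y ~ ., data = train, control = rpart.control(maxdepth = 3))
large_tree <- rpart(y ~ ., data = train,
                    control = rpart.control(cp = 0, minsplit = 5))

# Prune the large tree with the 1-SE rule, using rpart's internal CV table
cptab  <- large_tree$cptable
best   <- which.min(cptab[, "xerror"])
thresh <- cptab[best, "xerror"] + cptab[best, "xstd"]
cp_1se <- cptab[min(which(cptab[, "xerror"] <= thresh)), "CP"]
pruned_tree <- prune(large_tree, cp = cp_1se)

# Bagging is random forest with mtry = number of predictors;
# maxnodes gives rough control of tree size
p <- ncol(train) - 1
bag_small <- randomForest(y ~ ., data = train, ntree = 100, mtry = p, maxnodes = 8)
rf_large  <- randomForest(y ~ ., data = train, ntree = 100)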

¹ +50 points extra credit for an original solution; +20 points for finding a solution in the literature and explaining it clearly; +5 points for finding and citing it only.
