4 Numerical Approaches to OLS
Readings
Intro to Statistical Learning: 3.1, 3.2
If you have not yet taken at least the first week of linear algebra:
Intro to Statistical Learning: pp. 20-23 (“Notation and Simple Matrix Algebra”)
If you are new to defining functions in R/Python:
R for Data Science: 25
Python for Data Analysis: 3.2
Lecture
Slides from Lecture 4
Lab
Objective: in this lab, you will simulate data-generating processes and then recover OLS coefficients through numerical minimization of the residual sum of squares (RSS).
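For reference, with candidate intercept b0 and slope b1, the quantity being minimized over the n observations is RSS(b0, b1) = Σ_i (y_i − b0 − b1·x_i)^2; the notation used in lecture may differ slightly.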
The lab has eight steps (rough code sketches of these steps appear after the list):
- Generate a simple data set where X is a standard normal and Y is 2X plus a standard normal error term.
- Save the OLS coefficients from a regression of Y on X.
- Calculate the OLS coefficients “by hand” using the formula given in lecture.
- Set up a grid search by defining a range of candidate values of beta.
- Define a function that calculates the RSS for a given candidate value of beta.
- Find the candidate beta that minimizes the RSS.
- Find the beta that minimizes the RSS using R’s or Python’s numerical minimization functions.
- Repeat the process for a multivariate regression.
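A rough sketch of the first three steps, assuming Python with numpy and statsmodels (the seed, sample size, and variable names are arbitrary choices, and the closed-form expression below is the usual matrix formula (X'X)^{-1}X'y, which may be written differently in lecture):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)   # seed chosen arbitrarily, for reproducibility
n = 1000                          # sample size is an arbitrary choice

# Step 1: X is standard normal, Y = 2X plus a standard normal error
x = rng.standard_normal(n)
y = 2 * x + rng.standard_normal(n)

# Step 2: save the OLS coefficients from a regression of Y on X
X = sm.add_constant(x)            # design matrix with an intercept column
fit = sm.OLS(y, X).fit()
print(fit.params)                 # [intercept, slope]; slope should be near 2

# Step 3: the same coefficients "by hand" via (X'X)^{-1} X'y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta_hat)                   # should match fit.params
```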
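Continuing the same sketch, steps 4 through 6 might look like this, with the grid range and spacing chosen arbitrarily:

```python
# Step 4: a grid of candidate slopes (range and spacing chosen arbitrarily)
betas = np.linspace(0, 4, 401)

# Step 5: RSS as a function of a candidate slope
# (ignoring the intercept, which is zero in the data-generating process above)
def rss(b, x, y):
    residuals = y - b * x
    return np.sum(residuals ** 2)

# Step 6: the candidate slope with the smallest RSS
rss_values = np.array([rss(b, x, y) for b in betas])
best_beta = betas[np.argmin(rss_values)]
print(best_beta)                  # should be close to 2 and to the fitted slope
```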
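For step 7, one option among several is scipy.optimize.minimize_scalar (R users might reach for optim() or optimize() instead); this continues the sketch above:

```python
from scipy.optimize import minimize_scalar

# Step 7: let a numerical optimizer search for the RSS-minimizing slope
result = minimize_scalar(lambda b: rss(b, x, y))
print(result.x)                   # again should be close to 2
```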
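Finally, a possible version of step 8, with two predictors and true coefficients of 2 and -1 invented purely for illustration:

```python
from scipy.optimize import minimize

# Step 8: two predictors with (arbitrarily chosen) true coefficients 2 and -1
X_multi = rng.standard_normal((n, 2))
y_multi = X_multi @ np.array([2.0, -1.0]) + rng.standard_normal(n)

# RSS as a function of a candidate coefficient vector
def rss_multi(beta, X, y):
    residuals = y - X @ beta
    return np.sum(residuals ** 2)

# Closed-form coefficients for comparison: (X'X)^{-1} X'y
print(np.linalg.inv(X_multi.T @ X_multi) @ X_multi.T @ y_multi)

# Numerical minimization over the whole coefficient vector
result = minimize(rss_multi, x0=np.zeros(2), args=(X_multi, y_multi))
print(result.x)                   # should be close to [2, -1]
```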
A text file outlining the steps in the lab is available here. When you are finished with the lab,
you can upload it here.