Muddy Points
Lesson 3: Intro to SLR
Muddy Points from 2026
Lots of overlap with muddy points from 2025 and 2024! See below for more questions and answers.
1. Linear regression model vs linear regression line
See question 2 in 2024 muddy points.
2. Are residuals actually values observed in the data?
Residuals are the difference between the observed value (\(Y\)) and the fitted value from our estimated model (\(\widehat{Y}\)): \(\widehat\epsilon = Y - \widehat{Y}\). So residuals themselves are not observed values in the data, but they are calculated from observed values.
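As a quick worked example with made-up numbers (the fitted line and data point here are hypothetical, not from the lecture): suppose the estimated line is \(\widehat{Y} = 2 + 3X\) and one observation in the data is \((X, Y) = (4, 15)\). Then

\[\widehat{Y} = 2 + 3(4) = 14, \qquad \widehat\epsilon = Y - \widehat{Y} = 15 - 14 = 1\]

The values 4 and 15 are in the data. The residual 1 is not: it only exists once we have fit a line.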
3. Are we essentially finding the line of best fit for our data or is that too simplified of a way of thinking about it?
Yes, we are finding the line of best fit for our data! The line of best fit is the line that minimizes the sum of squared errors (SSE).
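To see "minimizes the SSE" in action, here is a minimal R sketch. The data are made up for illustration, and `sse()` is a hypothetical helper written just for this example (it is not a built-in function). It compares the SSE of the line from `lm()` with the SSE of a line whose intercept has been nudged a little.

```r
# Hypothetical toy data (made up for illustration)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Hypothetical helper: SSE for any candidate line with intercept b0 and slope b1
sse <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# Fit the best fit line and pull out its intercept and slope
fit <- lm(y ~ x)
b <- coef(fit)

sse_best  <- sse(b[1], b[2])        # SSE of the line of best fit
sse_other <- sse(b[1] + 0.5, b[2])  # same slope, intercept shifted up by 0.5

sse_best < sse_other  # TRUE: the shifted line has a larger SSE
```

Any other change to the intercept or slope would also give a larger SSE than the fitted line.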
Muddy Points from 2025
1. In a homework problem, or in a real-life scenario: I am a bit confused on when to use population vs estimate SLR. How do we know when to use the population parameters and estimated coefficients?
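The population model (written with the parameters \(\beta_0\) and \(\beta_1\)) is only used to describe what we will fit. Before we use the data to estimate anything, we need a reference model for what we will fit. Once we fit the population model using our data, we report the estimated model (written with the estimated coefficients \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\)) with the best fit line.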
Muddy Points from 2024
If I saw overlap between questions from this year and 2024, I am leaving my old answer below:
1. What does the epsilon mean and how does it relate to the line in the linear model?
\(\epsilon\) is our error term. (Its estimated counterpart, \(\widehat\epsilon\), is the residual.) It is the difference between our observed value \(Y\) and the expected value of \(Y\) given \(X\). It’s a mathematical way to represent the fact that not every observed \(Y\) value falls directly on our line. In other words, \(\epsilon\) is the distance between our line and our observed value for \(Y\).
2. Different betas and stuff: make the table for the class!! and epsilon
Below is a table that I started to construct with a student after class. We often use the model or the line to represent linear regression. When we refer to the model, most people think of the row named model. The line is just another way to represent the model. Remember that \(\epsilon = Y - E(Y|X)\) and \(\widehat\epsilon = Y - \widehat{E}(Y|X)\). Try substituting \(\epsilon = Y - E(Y|X)\) into the population model \(Y = \beta_0 + \beta_1X + \epsilon\). Does it simplify to the population line?
I think it can help a lot with this confusion.
| | Population | Estimated |
|---|---|---|
| Model | \[Y = \beta_0 + \beta_1X + \epsilon\] | \[Y = \widehat{\beta}_0 + \widehat{\beta}_1 X + \widehat\epsilon \] |
| Line | \[E(Y \mid X) = \beta_0 + \beta_1 X \] OR \[\mu_Y = \beta_0 + \beta_1 X \] | \[\widehat{Y} = \widehat{\beta}_0 + \widehat{\beta}_1 X \] OR \[ \widehat{E}[Y \mid X] = \widehat{\beta}_0 + \widehat{\beta}_1X \] OR \[ \widehat{E[Y \mid X]} = \widehat{\beta}_0 + \widehat{\beta}_1X \] |
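If you want to check your work on the substitution, here is one way it can be written out, using only the definitions above:

\[\begin{aligned}
Y &= \beta_0 + \beta_1 X + \epsilon \\
Y &= \beta_0 + \beta_1 X + \big(Y - E(Y \mid X)\big) \\
E(Y \mid X) &= \beta_0 + \beta_1 X
\end{aligned}\]

So yes, it simplifies to the population line. The same steps with \(\widehat\epsilon = Y - \widehat{E}(Y \mid X)\) take the estimated model to the estimated line.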
2.1 Someone else asked: Why does the population model have an error term epsilon in the equation but the estimated line does not?
I think this is referring to this slide. This was because I wanted to put the population model next to the estimated line. I realize this is very confusing. Both the population and the estimated versions can be written either as a model (with an error term) or as a line (without one), as in the table above.
2.2 Someone else asked: Why does the population equation even matter?
Huh, I’m scratching my head with this one. Why does it matter? We basically mirror all the mathematical manipulations with the estimated model anyway…
But then I thought: What would our world or class lectures look like without the population model? The answer might be more philosophical than mathematical. The representation of the true, underlying model that we are aspiring for with our sample data reminds us that our estimated model is not perfect. That we are just trying our best to uncover some fraction of the truth. And at the end of the day, when we perform hypothesis tests, we’re working to provide evidence for the value of the population parameters from the population model. We know what the estimated values are, but can they help us get an idea of what the parameter values are?
3. Math for minimizing SSE (aka OLS process)
I am very sorry that this math was intimidating! Most of us don’t need to see the math, but there are a handful of students that should see it, and get a sense of the underlying math. Just wanted to make sure they saw it!
The important thing for us to know is the information on the slide for Steps 1 and 2, where we talk about the process itself. If I asked you why we minimize the SSE with respect to our coefficients, would you be able to answer?
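For anyone who wants the gist without the full algebra, here is a compact sketch of the standard OLS steps for SLR. The indexing and notation are an assumption for this sketch and may not match the slides exactly. First write the SSE as a function of the two coefficients, then set both partial derivatives to zero:

\[\begin{aligned}
SSE &= \sum_{i=1}^n \big(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1 X_i\big)^2 \\
\frac{\partial SSE}{\partial \widehat{\beta}_0} &= -2 \sum_{i=1}^n \big(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1 X_i\big) = 0 \\
\frac{\partial SSE}{\partial \widehat{\beta}_1} &= -2 \sum_{i=1}^n X_i \big(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1 X_i\big) = 0
\end{aligned}\]

Solving these two equations together gives

\[\widehat{\beta}_1 = \frac{\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n (X_i - \overline{X})^2}, \qquad \widehat{\beta}_0 = \overline{Y} - \widehat{\beta}_1 \overline{X}\]

The data \((X_i, Y_i)\) are fixed, so the only things we can change to make the SSE smaller are the intercept and slope. That is why the derivatives are taken with respect to the coefficients.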
4. The lecture materials don’t always feel like they apply to the homework. If asked to “state the linear regression models,” are we just running lm()?
Stating the linear regression model is asking us to show the population model that we are fitting. This is just to make sure we are aware of the model that we plan to fit. So the generic form of this is: \(Y = \beta_0 + \beta_1 X + \epsilon\).
Running lm() is the equivalent of fitting the model.
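As a small illustration of the difference between stating and fitting, here is a minimal R sketch. The data frame `dat` and its values are made up for this example. The comment at the top is the "stating" step, and the `lm()` call is the "fitting" step.

```r
# Stated (population) model: Y = beta_0 + beta_1 * X + epsilon

# Hypothetical toy data (made up for illustration)
dat <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Fitting the model: lm() estimates beta_0 and beta_1 from the data
fit <- lm(y ~ x, data = dat)

est <- coef(fit)     # estimated intercept and slope
y_hat <- fitted(fit) # fitted values on the estimated line
e_hat <- resid(fit)  # residuals: observed y minus fitted values

est
```

The estimated coefficients in `est` are what we plug in to report the estimated line \(\widehat{Y} = \widehat{\beta}_0 + \widehat{\beta}_1 X\).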
Keep letting me know what feels disconnected in the class! Sometimes I purposefully say things in different ways to build our understanding, but sometimes that fails!