Week 3

Introduction to Logistic Regression

Published

April 15, 2024

Modified

April 18, 2024

Resources

Lesson	Topic	Slides	Annotated Slides	Recording
5	Simple Logistic Regression
6	Tests for GLMs using Likelihood function

If you ever have trouble with the above links to the videos, this Echo360 link should take you to our class page.

On the Horizon

Lab 1 due this Thursday (4/18)
Quiz 1 opens on Monday, 4/22, at 2pm and will close on Wednesday, 4/24, at 1pm

Class Exit Tickets

Monday (4/15)

Wednesday (4/17)

Announcements

Monday 4/15

I am trying to stay on track of the Exit tickets this quarter
- That may mean you have a 0 in your gradebook
- As long as you complete the exit ticket within the 7 days, I will change the 0 to a 1
- I plan to have a scheduled block on Fridays to check them

Wednesday 4/17

Quiz 1 info

Muddiest Points

1. Not entirely sure I understand what IRLS is about

Fair enough. It’s a little confusing. IWLS is an iterative solving technique that let’s us solve the coefficient estimates ( \(\beta_0\) , \(\beta_1\)) without solving the equations theoretically.

We start with an educated guess of the estimates, put them into the likelihood, and calculate the likelihood. Then we update the estimates using some complicated math, put them into the likelihood, and calculate the likelihood again. We compare the two likelihoods, and if the likelihood increases, then we keep going. We stop when the increase in likelihood between iterations is small. This means we are at or very close to the maximum likelihood.

2. Link functions

Yes! Link functions are the important transformations we need to make to our outcome in order to connect them to our perdictors/covariates. Specifically, it’s the transformation we make to our mean/expected value.

The same link function can be used different types of outcomes. And here’s a few examples:

Continuous data: identity
Binary: logit, log
Count/Poisson: log

Our goal with link functions is to put our outcome on a flexible range so that any range of covariates can be mapped to it with coefficients. So think about trying to map age onto a 0 or 1… We can’t come up with an equation like \(\beta_0 + \beta_1 Age\) that perfectly maps to only 0’s and 1’s.

3. Is GLM the umbrella over the other functions? The 4 functions all use different distributions, yes?

GLM is the umbrella term for different types of regression! Not all types of regression have different outcome distributions. For example, a binary outcome can be used in logistic regression with the logit link or log-binomial regression with the log link.

4. What would you need to change in your model to reduce a high IRLS number? As I understand it from the lecture, a high number suggests convergence but it appeared like something unfavorable even though a model that converges might be closer to maximum likelihood or maybe the distance to maximum likelihood

A high number suggests that the model did NOT converge! Thus, we did not land on an estimate close to our maximum likelihood. You can think of the IRLS number as the number of iterations it is taking to find the maximum likelihood estimate (MLE). If it takes too many iterations, then it just stops without finding the MLE.

5. We’re using linear vs logistic, but which are we focusing on? Regarding linear, how does linear used in categorical differ from continuous?

We are focusing on logistic! We cannot use linear regression on our binary outcomes anymore. When I say “linear” mapping I mean the mapping between our covariates and the transformed mean outcome using the link function.

6. By the end of class (Lesson 6) my understanding is that the saturated model likelihood is the same between the two models being compared, right?

Yep!!

7. The differences between each test and when to use them.

In terms of what each test is measuring:

The Wald test measures the distance between two potential values of \(\beta\). One under the null and one under the alternative. The further they are from each other, the more evidence we have that they are different.
- The Wald test approximates the differences in the likelihood function, but we do not actually compare the likelihoods under the null vs. alternative. We are only comparing the difference in the \(\beta\) value, that is a reasonable approximation of the difference in the likelihood.
The Score test measures how close the tangent line of the likelihood function is to 0 (under the null). If it is close to 0 under the null, this indicates that our MLE of \(\beta\) is not far from 0. Again, this is no a direct comparison of the likelihoods, but only an approximation of the difference.
The likelihood ratio test measures the difference in the log-likelihoods. This is a direct comparison of likelihoods, and is not an approximation!
- Thus, we compare the likelihoods (horizontally, as someone asked) because we are making direct comparisons between the likelihood under the null and under the alternative.