ggplot(df, aes(x = age, y = y)) + geom_point() + geom_smooth()
Lesson
March 4, 2026
Mmm good question! VIFs work for continuous and binary variables. So if your model only has continuous or binary covariates, the VIFs and GVIFs are the same, and you can use either. GVIFs are needed for multi-level covariates (categorical variables with more than two levels), since those enter the model as several indicator columns.
You can always center a value! There are two scenarios where centering is really helpful:
When we have an interaction. Centering makes the coefficients more interpretable.
When we have a transformation of the variable (like a squared term). Centering helps avoid issues with multicollinearity.
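To see why centering helps in the second scenario, here is a minimal sketch using simulated ages (not the lesson's data; the sample size and age range are made up for illustration). It compares the correlation between a variable and its square before and after centering. Centering helps most when the variable is roughly symmetric around its mean.

```r
# Simulated ages for illustration only; not the lesson's dataset.
set.seed(1)
age <- runif(200, min = 20, max = 80)

# Raw age and raw age-squared move together almost perfectly.
cor_raw <- cor(age, age^2)

# Center first, then square: the two terms are now close to uncorrelated.
age_c <- age - mean(age)
cor_centered <- cor(age_c, age_c^2)

c(raw = cor_raw, centered = cor_centered)
```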
Here’s a pretty good video about the differences! About 8 minutes long, but easily played at 1.25/1.5 speed.
We would only have age and age-squared if we noticed the relationship between age and our outcome was not linear. For example, our plot could look like this:
And let’s say we see the following plot for age-squared:
Then we would make the transformation of age for our model. When we include age-squared in the model, we still need to include age. We can run the model with both:
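A minimal sketch of fitting that model, assuming a data frame `df` with columns `age` and `y`. The data here are simulated just so the example runs, so the numbers will not match the table in this lesson. The table's column names look like `broom::tidy(model, conf.int = TRUE)` output, but that is a guess, so the sketch sticks to base R.

```r
# Simulated data for illustration only; not the lesson's dataset.
set.seed(1)
n   <- 200
age <- runif(n, min = 20, max = 80)
y   <- 5 + 2 * age + 0.5 * age^2 + rnorm(n, sd = 50)
df  <- data.frame(y = y, age = age)

# Make the squared term its own column, then keep BOTH age and age_sq.
df$age_sq <- df$age^2
model <- lm(y ~ age + age_sq, data = df)

# Regression table: estimate, standard error, t statistic, p-value.
summary(model)$coefficients

# 95% confidence intervals for each coefficient.
confint(model)
```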
And we can look at the regression table. Notice that the standard errors of the age and age-squared coefficients are okay, but the intercept's standard error is really big.
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | −87.51 | 36.05 | −2.43 | 0.02 | −158.35 | −16.67 |
| age | 3.67 | 1.60 | 2.29 | 0.02 | 0.53 | 6.82 |
| age_sq | 0.57 | 0.02 | 35.32 | 0.00 | 0.54 | 0.60 |
We can also look at the VIF:
age age_sq
38.0167 38.0167
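The output above has the shape of what `car::vif()` prints, though the call itself is not shown here, so that is an assumption. As a sketch of what a VIF measures, it can also be computed by hand: regress each predictor on the other predictors and take 1 / (1 - R²). The helper `vif_by_hand` is a made-up name, and the ages are simulated, so the values will differ from the lesson's.

```r
# Simulated ages for illustration only; not the lesson's dataset.
set.seed(1)
age <- runif(200, min = 20, max = 80)
predictors <- data.frame(age = age, age_sq = age^2)

# Hypothetical helper: VIF for one predictor is 1 / (1 - R^2), where R^2
# comes from regressing that predictor on all of the other predictors.
vif_by_hand <- function(predictor, data) {
  others <- setdiff(names(data), predictor)
  fit <- lm(reformulate(others, response = predictor), data = data)
  1 / (1 - summary(fit)$r.squared)
}

vifs <- sapply(names(predictors), vif_by_hand, data = predictors)
vifs
```

With only two predictors, the two VIFs are always equal, because each one is based on the same pairwise correlation. That is why age and age_sq show the same value.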
The VIFs are really big, so centering age will help with the multicollinearity in the model.
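A minimal sketch of the centered model, again on simulated data (not the lesson's dataset), using the same variable names as the table below. The key detail is the order: subtract the mean first, then square. The sketch also refits the uncentered model to show that centering only re-expresses the same curve.

```r
# Simulated data for illustration only; not the lesson's dataset.
set.seed(1)
n   <- 200
age <- runif(n, min = 20, max = 80)
y   <- 5 + 2 * age + 0.5 * age^2 + rnorm(n, sd = 50)
df  <- data.frame(y = y, age = age, age_sq = age^2)

# Center first, then square the centered variable.
df$age_c    <- df$age - mean(df$age)
df$age_c_sq <- df$age_c^2

model_raw <- lm(y ~ age + age_sq, data = df)
model_c   <- lm(y ~ age_c + age_c_sq, data = df)

# Compare standard errors: the intercept and linear term shrink after centering.
summary(model_raw)$coefficients[, "Std. Error"]
summary(model_c)$coefficients[, "Std. Error"]

# Same fitted curve, and the same coefficient on the squared term.
same_fit  <- all.equal(unname(fitted(model_raw)), unname(fitted(model_c)),
                       tolerance = 1e-6)
same_quad <- all.equal(unname(coef(model_raw)["age_sq"]),
                       unname(coef(model_c)["age_c_sq"]),
                       tolerance = 1e-6)
```

Because the fitted values are identical, centering does not change the model's predictions. It changes what the intercept and the linear coefficient mean: they now describe the outcome and slope at the mean age instead of at age 0.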
In the centered model, age_c is age centered at its mean, and age_c_sq is the square of centered age (center first, then square).
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 1,459.01 | 6.96 | 209.68 | 0.00 | 1,445.34 | 1,472.68 |
| age_c | 59.52 | 0.26 | 229.32 | 0.00 | 59.01 | 60.03 |
| age_c_sq | 0.57 | 0.02 | 35.32 | 0.00 | 0.54 | 0.60 |
age_c age_c_sq
1.000054 1.000054
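Yay! The VIFs are much better now! And the intercept and age coefficient estimates have smaller standard errors! The intercept's went from 36.05 to 6.96 and age's went from 1.60 to 0.26, while the age-squared estimate and its standard error did not change.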