Homework 4 Answers
BSTA 512/612
Answers are not necessarily complete! This is just meant to serve as a check if you are stuck.
Questions
Question 1
Part a
Fit the regression model with all the covariates (Fatalism, Optimism, Spirituality), display the regression table, and write out the fitted regression line.
| Characteristic | Beta | 95% CI | p-value |
|---|---|---|---|
| (Intercept) | 6.4 | 2.4, 10 | 0.002 |
| Fatalism | 0.15 | 0.06, 0.24 | <0.001 |
| Optimism | -0.32 | -0.46, -0.18 | <0.001 |
| Spirituality | 0.36 | 0.11, 0.61 | 0.006 |
CI = Confidence Interval
\[\begin{aligned} \widehat{\text{Depression}} &= 6.4 + 0.15 \cdot \text{Fatalism} -0.32 \cdot \text{Optimism} + 0.36 \cdot \text{Spirituality} \end{aligned}\]
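As a quick check of how the fitted line is used, the sketch below evaluates it for one set of covariate values. The function name `predict_depression` and the input scores are hypothetical and chosen only for illustration; the coefficients are the rounded values from the regression table.

```python
# Rounded coefficients from the regression table for Question 1.
INTERCEPT = 6.4
B_FATALISM = 0.15
B_OPTIMISM = -0.32
B_SPIRITUALITY = 0.36


def predict_depression(fatalism, optimism, spirituality):
    """Return the fitted Depression score for the given covariate values."""
    return (INTERCEPT
            + B_FATALISM * fatalism
            + B_OPTIMISM * optimism
            + B_SPIRITUALITY * spirituality)


# Hypothetical covariate values, for illustration only.
example = predict_depression(fatalism=10, optimism=10, spirituality=10)
print(round(example, 2))
```

Because the coefficients are rounded, predictions from this sketch will differ slightly from those of the fitted model object.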
Part b
Does at least one of the covariates contribute significantly to the prediction of Depression? (Note: this is an overall test. Please follow the hypothesis test steps. To complete steps 4-6, simply output your ANOVA table.)
| term | df.residual | rss | df | sumsq | statistic | p.value |
|---|---|---|---|---|---|---|
| Depression ~ 1 | 611.0000 | 17,167.8366 | NA | NA | NA | NA |
| Depression ~ Fatalism + Optimism + Spirituality | 608.0000 | 15,514.0044 | 3.0000 | 1,653.8322 | 21.6048 | <0.0001 |
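The F statistic in the ANOVA table can be reproduced by hand from the two residual sums of squares. The sketch below is a minimal check, assuming the usual F statistic for comparing nested models; the helper name `nested_f` is hypothetical, and the inputs are copied from the table.

```python
def nested_f(rss_reduced, rss_full, df_resid_reduced, df_resid_full):
    """F statistic for comparing a reduced model with a larger nested model."""
    df_num = df_resid_reduced - df_resid_full   # number of added terms
    ss_extra = rss_reduced - rss_full           # extra sum of squares explained
    mse_full = rss_full / df_resid_full         # residual mean square of the full model
    return (ss_extra / df_num) / mse_full


# Intercept-only model vs. Fatalism + Optimism + Spirituality.
f_overall = nested_f(17167.8366, 15514.0044, 611, 608)
print(round(f_overall, 2))
```

Under the null hypothesis that all three slopes are zero, this statistic is compared with an F distribution with 3 and 608 degrees of freedom.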
Part c
Does the addition of Spirituality add significantly to the prediction of Depression achieved by Fatalism and Optimism?
| term | df.residual | rss | df | sumsq | statistic | p.value |
|---|---|---|---|---|---|---|
| Depression ~ Fatalism + Optimism | 609.00 | 15,710.94 | NA | NA | NA | NA |
| Depression ~ Fatalism + Optimism + Spirituality | 608.00 | 15,514.00 | 1.00 | 196.94 | 7.72 | 0.006 |
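As a check on the table, the partial F statistic can be worked out from the reported sums of squares, using the residual mean square of the full model in the denominator:
\[F = \frac{(15{,}710.94 - 15{,}514.00)/1}{15{,}514.00/608} = \frac{196.94}{25.52} \approx 7.72\]
Because only one covariate is added, this test is equivalent to the t-test for Spirituality in the full model, so its p-value matches the Spirituality p-value (0.006) in the regression table up to rounding.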
Part d
| term | df.residual | rss | df | sumsq | statistic | p.value |
|---|---|---|---|---|---|---|
| Depression ~ Fatalism | 610.00 | 16,141.67 | NA | NA | NA | NA |
| Depression ~ Fatalism + Optimism + Spirituality | 608.00 | 15,514.00 | 2.00 | 627.67 | 12.30 | <0.01 |
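The table compares the Fatalism-only model with the full model, so the test here is a joint one for Optimism and Spirituality, with two numerator degrees of freedom. Worked from the table values:
\[F = \frac{(16{,}141.67 - 15{,}514.00)/2}{15{,}514.00/608} = \frac{313.84}{25.52} \approx 12.30\]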
Question 2
Part a
Using R, make a variable that is a factor for Diet. Make sure to check what values the original variable for Diet can take. How many indicator functions do you need to represent the categorical variable Diet (protein-rich vs. protein-poor)?
2 levels, 1 indicator
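In general, a categorical variable with k levels needs k - 1 indicators, with the remaining level acting as the reference group. The sketch below illustrates the idea in Python rather than R; the level labels and the choice of reference level are assumptions for illustration, not values taken from the dataset.

```python
def indicator_columns(values, reference):
    """Build one 0/1 indicator per non-reference level of a categorical variable."""
    levels = sorted(set(values))
    non_reference = [lev for lev in levels if lev != reference]
    return {lev: [1 if v == lev else 0 for v in values] for lev in non_reference}


# Hypothetical diet labels; check the actual coding in the data.
diet = ["protein-rich", "protein-poor", "protein-poor", "protein-rich"]
indicators = indicator_columns(diet, reference="protein-poor")
print(len(indicators))             # number of indicators needed
print(indicators["protein-rich"])  # 1 where the diet is protein-rich, else 0
```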
Part b
At a level of significance \(\alpha = 0.10\), test whether protein diet modifies the effect of age on height. Justify your answer (e.g., perform a hypothesis test for the interaction between diet and age).
| term | df.residual | rss | df | sumsq | statistic | p.value |
|---|---|---|---|---|---|---|
| HT ~ AGE + DIET | 24.0000 | 399.8259 | NA | NA | NA | NA |
| HT ~ AGE + DIET + AGE * DIET | 23.0000 | 119.4200 | 1.0000 | 280.4059 | 54.0055 | <0.0001 |
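Worked from the table, the F statistic for the single interaction term is:
\[F = \frac{(399.8259 - 119.4200)/1}{119.4200/23} = \frac{280.4059}{5.1922} \approx 54.01\]
Since the p-value is far below \(\alpha = 0.10\), the null hypothesis of no interaction is rejected: the data suggest that the effect of age on height differs between the diet groups.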
Part c
Is it possible that diet is a confounder? Note: this will depend on your results from Part b.
Part d
Write the fitted regression equation for our model in Part b. Write the respective regression lines for each specific diet group: protein rich and protein poor. Interpret the slope of each regression line (no need for a 95% CI here).
Fitted regression equation:
\[\widehat{\text{Height}}|\text{Age}, \text{Diet} = 51.23+8.69\cdot \text{Age}-0.90\cdot I(\text{Protein-rich})+7.32\cdot \text{Age}\cdot I(\text{Protein-rich})\]
Protein-rich group \(I(\text{Protein-rich})=1\):
\[\widehat{\text{Height}}|\text{Age}, \text{Diet} = ??\]
Protein-poor group \(I(\text{Protein-rich})=0\):
\[\widehat{\text{Height}}|\text{Age}, \text{Diet} = ??\]
Interpretation: You need to work through this!
Question 3
Part a
Using \(\alpha = 0.05\), test whether there is a crude association between HDL measurement and total cholesterol. Note: testing for a crude association means we fit a simple linear regression model and see if the association is significant.
| term | df | sumsq | meansq | statistic | p.value |
|---|---|---|---|---|---|
| X1 | 1.000 | 46.236 | 46.236 | 0.405 | 0.528 |
| Residuals | 40.000 | 4,567.383 | 114.185 | NA | NA |
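In a simple linear regression, the F statistic is the ratio of the regression mean square to the residual mean square, which can be checked directly from the table:
\[F = \frac{46.236}{114.185} \approx 0.405\]
With a p-value of 0.528, which is greater than \(\alpha = 0.05\), there is insufficient evidence of a crude association between HDL and total cholesterol.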
Part b
Sometimes simple linear regression leads us to believe that there is no association between two variables, but a missing interaction might be obscuring the association. Use \(\alpha = 0.1\) to test whether total triglyceride is an effect modifier of the association between HDL and total cholesterol.
| term | df.residual | rss | df | sumsq | statistic | p.value |
|---|---|---|---|---|---|---|
| Y ~ X1 + X2 | 39.000 | 4,478.237 | NA | NA | NA | NA |
| Y ~ X1 + X2 + X1 * X2 | 38.000 | 4,195.314 | 1.000 | 282.923 | 2.563 | 0.118 |
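Worked from the table, the F statistic for the interaction term is:
\[F = \frac{(4{,}478.237 - 4{,}195.314)/1}{4{,}195.314/38} = \frac{282.923}{110.403} \approx 2.563\]
Because the p-value of 0.118 is greater than \(\alpha = 0.1\), we fail to reject the null hypothesis of no interaction: there is insufficient evidence at this level that total triglyceride modifies the association between HDL and total cholesterol.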
Part c
Is it possible that total triglyceride is a confounder? No need to test this explicitly.