Homework 2 Answers
BSTA 513/613
Questions
Question 1
Part a
Write down the equation for the logistic regression model of STA on AGE. What characteristic of the outcome variable, STA, leads us to consider the logistic regression model as opposed to the usual linear regression model to describe the relationship between STA and AGE?
Answer: Not given
Part b
Write down an expression for the log-likelihood for the logistic regression model in Part a. This will be a mathematical expression. Please do not use generic expressions like \(\pi(X)\); instead, replace \(X\) with the specific variables in this question.
Answer: \[L(\beta_0,\beta_1) = \ln(l(\beta_0,\beta_1))= \ldots\]
Part c
Using the glm() function, obtain the maximum likelihood estimates of the coefficient parameters of the logistic regression model in Part a. Using these estimates, write down the fitted logistic regression model.
Coefficient estimates:
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | −3.059 | 0.696 | −4.394 | 0.000 | −4.546 | −1.797 |
| AGE | 0.028 | 0.011 | 2.607 | 0.009 | 0.008 | 0.050 |
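For reference, here is a minimal sketch of the kind of glm() call Part c asks for. The ICU data frame itself is not shown in this key, so the sketch simulates a stand-in: `icu_sim`, its sample size, and the coefficients used to simulate it are all hypothetical, and the output will not reproduce the estimates reported for the real data.

```r
# Simulated stand-in for the ICU data (hypothetical values, not the real dataset).
set.seed(513)
n <- 200
icu_sim <- data.frame(AGE = round(runif(n, 16, 92)))
icu_sim$STA <- rbinom(n, size = 1, prob = plogis(-3 + 0.03 * icu_sim$AGE))

# Logistic regression of STA on AGE, fit by maximum likelihood.
fit <- glm(STA ~ AGE, family = binomial, data = icu_sim)

# Estimates, standard errors, Wald z statistics, and p-values on the logit scale.
est <- coef(summary(fit))

# Wald 95% confidence limits (estimate +/- 1.96 * SE). Profile-likelihood
# limits from confint(fit) can differ slightly from these.
ci <- confint.default(fit)

cbind(est, ci)
```

With the real data, only the `data` argument would change; the formula and `family = binomial` stay the same.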
Part d
Use the Wald test to test whether or not the intercept (\(\beta_0\)) of the logistic regression model is significantly different from 0. Make sure to include your: hypothesis test, test statistic, code/work leading to the computed test statistic, distribution of the test statistic, interpretation of the intercept, and conclusion. Please refer to the Additional Tips to guide you on what a complete/correct answer contains.
Answer:
Estimated odds and odds ratio:
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.047 | 0.696 | −4.394 | 0.000 | 0.011 | 0.166 |
| AGE | 1.028 | 0.011 | 2.607 | 0.009 | 1.008 | 1.051 |
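As a rough check on the statistic column, the Wald statistic for the intercept can be recomputed from the rounded logit-scale estimate reported for this model (−3.059) and its standard error (0.696). Under \(H_0: \beta_0 = 0\) it is compared with a standard normal distribution. Because the inputs are rounded to three decimals, the result differs slightly from the reported −4.394. The estimated odds in the table are the exponentiated intercept.

\[
z = \frac{\widehat{\beta}_0}{\widehat{SE}(\widehat{\beta}_0)} = \frac{-3.059}{0.696} \approx -4.40,
\qquad
\exp(\widehat{\beta}_0) = \exp(-3.059) \approx 0.047
\]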
Part e
Use the Likelihood Ratio test to test whether or not the coefficient for age (\(\beta_1\)) of the logistic regression model is significantly different from 0. Make sure to include your: hypothesis test, test statistic, code/work leading to the computed test statistic, distribution of the test statistic, and conclusion. Please refer to the Additional Tips to guide you on what a complete/correct answer contains. You do not need to include an interpretation of the coefficient since we have not covered this.
Answer:
Likelihood ratio test
Model 1: STA ~ AGE
Model 2: STA ~ 1
#Df LogLik Df Chisq Pr(>Chisq)
1 2 -96.153
2 1 -100.080 -1 7.8546 0.005069 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
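The Chisq value in this output can be reconstructed from the two log-likelihoods. In the sketch below, \(L_{\text{full}}\) is the log-likelihood of Model 1 (STA ~ AGE) and \(L_{\text{reduced}}\) is that of Model 2 (STA ~ 1); the subscripted symbols are introduced here for illustration. Since the log-likelihoods are printed to three decimals, the result agrees with the reported 7.8546 only to about two decimals.

\[
G = -2\left(L_{\text{reduced}} - L_{\text{full}}\right) = -2\left(-100.080 - (-96.153)\right) = 2(3.927) = 7.854
\]

Under \(H_0: \beta_1 = 0\), \(G\) is compared with a chi-squared distribution with 1 degree of freedom (the difference of 2 and 1 estimated parameters), which gives the reported p-value of 0.005069.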
Part f
Write a sentence interpreting the estimated odds ratio for the coefficient in Part e. Please include the 95% confidence interval.
Estimated odds and odds ratio for the fitted model:
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.047 | 0.696 | −4.394 | 0.000 | 0.011 | 0.166 |
| AGE | 1.028 | 0.011 | 2.607 | 0.009 | 1.008 | 1.051 |
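The odds-scale estimates are the exponentiated logit-scale estimates for the same model. Using the rounded AGE coefficient (0.028) and its logit-scale 95% confidence limits (0.008 and 0.050):

\[
\widehat{OR} = \exp(\widehat{\beta}_1) = \exp(0.028) \approx 1.028,
\qquad
95\%\ \text{CI: } \left(\exp(0.008),\ \exp(0.050)\right) \approx (1.008,\ 1.051)
\]

Only the estimate and confidence limits are exponentiated. The std.error and statistic columns (0.011 and 2.607) are carried over unchanged from the logit scale.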
Question 2
We will continue to work with the ICU data in Question 1. Please refer back to the information above. In this question, we will use the ICU data to fit a multivariable logistic regression model.
Loading dataset:
Part a
From the above list (AGE, CAN, CPR, INF, and LOC) of independent variables, identify if each is a continuous, binary, or multi-level (>2) categorical variable.
Answer:
Continuous: ??
Binary: CAN, CPR, ???
Multi-level (>2) categorical: ??
Part b
For the binary and multi-level categorical variables, please identify a reference group for each. Include justification for the reference group.
Answer: Not given; this just needs some thought.
Part c
Compute the predicted probability of hospital discharge for a subject who is 63 years old. Compute the 95% confidence interval for the predicted probability and interpret the predicted probability.
Answer:
| predicted probability | 95% CI lower | 95% CI upper |
|---|---|---|
| 0.210 | 0.152 | 0.269 |
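This prediction appears to come from the Question 1 model of STA on AGE. That is an assumption, based on the fact that the rounded coefficients reported for that model (−3.059 and 0.028) give nearly the same value at AGE = 63:

\[
\text{logit}\left(\widehat{\pi}(AGE = 63)\right) = -3.059 + 0.028 \times 63 = -1.295,
\qquad
\widehat{\pi}(AGE = 63) = \frac{1}{1 + \exp(1.295)} \approx 0.215
\]

The small gap from the reported 0.210 is consistent with the coefficients having been rounded to three decimals before this hand calculation.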
Part d
For the categorical variables (binary and multi-group), please mutate the variables within the ICU dataset to set your chosen reference groups.
Answer: Not given
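One way this could be done in base R is sketched below on a toy data frame. `icu_toy` and its level labels (including "None" for LOC) are hypothetical, so substitute the labels found in your own copy of the ICU data. The same relevel() or factor() calls could also be wrapped in dplyr's mutate().

```r
# Toy stand-in for two ICU variables; the level labels are hypothetical.
icu_toy <- data.frame(
  CPR = c("Yes", "No", "No", "Yes", "No"),
  LOC = c("Coma", "None", "Deep Stupor", "None", "None")
)

# Convert to factors and put the chosen reference level first.
icu_toy$CPR <- relevel(factor(icu_toy$CPR), ref = "No")
icu_toy$LOC <- factor(icu_toy$LOC, levels = c("None", "Deep Stupor", "Coma"))

# The first level of each factor is the reference group used by glm().
levels(icu_toy$CPR)
levels(icu_toy$LOC)
```

Setting the reference level before fitting keeps the coefficient names (for example, `CPRYes`) aligned with the comparison you intend to interpret.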
Part e
Write down the equation for the logistic regression model of STA on CPR.
Answer: \[\text{logit} \left(\pi(CPR)\right)= \ldots\]
Part f
Using the glm() function, obtain the maximum likelihood estimates of the coefficient parameters of the logistic regression model in Part e. Using these estimates, write down the fitted logistic regression model.
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | −1.540 | 0.192 | −8.031 | 0.000 | −1.933 | −1.179 |
| CPRYes | 1.695 | 0.588 | 2.880 | 0.004 | 0.533 | 2.887 |
\[\begin{aligned} \text{logit} \left(\widehat{\pi}(CPR)\right) &= -1.54 + \ldots \end{aligned}\]
Part g
Write a sentence interpreting the odds ratio for the coefficients in Part f’s model. Please include the 95% confidence interval.
Answer:
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.214 | 0.192 | −8.031 | 0.000 | 0.145 | 0.308 |
| CPRYes | 5.444 | 0.588 | 2.880 | 0.004 | 1.705 | 17.939 |
OR PRESENT THIS WAY:
| Characteristic | OR¹ | 95% CI¹ | p-value |
|---|---|---|---|
| CPR | | | |
| No | — | — | |
| Yes | 5.44 | 1.70, 17.9 | 0.004 |
¹ OR = Odds Ratio, CI = Confidence Interval
Part h
Write down the equation for the logistic regression model of STA on LOC.
Answer: \[\text{logit} \left(\pi(LOC)\right)= \ldots\]
Part i
Using the glm() function, obtain the maximum likelihood estimates of the coefficient parameters of the logistic regression model in Part h. Present the coefficient estimates. No need to write out the fitted regression equation.
Please take note of any warnings that you receive when fitting the glm() model and of any large coefficient estimates with wide confidence intervals. In this case, we have a category within LOC that has very few observations.
Check the number of observations with a deep stupor who died by discharge and the number with a deep stupor who were alive at discharge. You can do this using the table() function to create a contingency table.
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | −1.767 | 0.208 | −8.484 | 0.000 | −2.196 | −1.377 |
| LOCComa | 3.153 | 0.818 | 3.857 | 0.000 | 1.708 | 5.080 |
| LOCDeep Stupor | 18.333 | 1,073.109 | 0.017 | 0.986 | −93.019 | NA |
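A very large estimate paired with an enormous standard error, as in the Deep Stupor row, is the typical signature of a category in which every observation has the same outcome. The sketch below shows the kind of table() check described above on a toy data frame. `icu_toy`, its counts, and its labels are hypothetical and only illustrate the pattern.

```r
# Toy stand-in for the ICU data; counts and labels are hypothetical.
icu_toy <- data.frame(
  LOC = c(rep("None", 8), rep("Coma", 4), rep("Deep Stupor", 3)),
  STA = c(rep("Lived", 6), rep("Died", 2),   # None
          rep("Lived", 1), rep("Died", 3),   # Coma
          rep("Died", 3))                    # Deep Stupor: no one lived
)

# Cross-tabulate level of consciousness against vital status.
counts <- table(icu_toy$LOC, icu_toy$STA)
counts

# Flag any LOC category that has an empty cell.
rownames(counts)[apply(counts == 0, 1, any)]
```

An empty cell means the maximum likelihood estimate for that category is not finite, so glm() reports a large value with an unstable standard error instead.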
Part j
Write a sentence interpreting the odds ratio of death for the indicator of coma. Please include the 95% confidence interval.
Answer:
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 0.171 | 0.208 | −8.484 | 0.000 | 0.111 | 0.252 |
| LOCComa | 23.407 | 0.818 | 3.857 | 0.000 | 5.515 | 160.803 |
| LOCDeep Stupor | 91,589,444.626 | 1,073.109 | 0.017 | 0.986 | 0.000 | NA |