Homework 4

BSTA 513/613

Due: Thursday May 23, 2024 at 11pm
Author

Your name here - update this!!!!

Published

May 23, 2024

Modified

May 20, 2024

Directions

  • Download the .qmd file here.

  • You will need to download the datasets. Use this link to download the homework datasets needed in this assignment. If you do not want to make changes to the paths set in this document, then make sure the files are stored in a folder named “data” that is housed in the same location as this homework .qmd file.

  • Please upload your homework to Sakai. Upload both your .qmd code file and the rendered .html file

    • Please rename you homework as Lastname_Firstinitial_HW0.qmd. This will help organize the homeworks when the TAs grade them.

    • Please also add the following line under subtitle: "BSTA 513/613": author: First-name Last-name with your first and last name so it is attached to the viewable document.

  • For each question, make sure to include all code and resulting output in the html file to support your answers.

  • Show the work of your calculations using R code within a code chunk. Make sure that both your code and output are visible in the rendered html file. This is the default setting.

  • If you are computing something by hand, you may take a picture of your work and insert the image in this file. You may also use LaTeX to write it inline.

  • Write all answers in complete sentences as if communicating the results to a collaborator. This means including a sentence summarizing results in the context of the research study.

Tip

It is a good idea to try rendering your document from time to time as you go along! Note that rendering automatically saves your qmd file and rendering frequently helps you catch your errors more quickly.

Questions

Question 1

This question is taken from the Hosmer and Lemeshow textbook. The ICU study data set consists of a sample of 200 subjects who were part of a much larger study on survival of patients following admission to an adult intensive care unit (ICU). The dataset should be available within Course Materials. The major goal of this study was to develop a logistic regression model to predict the probability of survival to hospital discharge of these patients. In this question, the primary outcome variable is vital status at hospital discharge, STA. Clinicians associated with the study felt that a key determinant of survival was the patient’s age at admission, AGE. We will be building to a multivariable logistic regression model while adjusting for cancer part of the present problem (CAN), CPR prior to ICU admission (CPR), infection probable at ICU admission (INF), and level of consciousness at ICU admission (LOC).

A code sheet for the variables to be considered is displayed in Table 1.5 below (from the Hosmer and Lemeshow textbook, pg. 23). We refer to this data set as the ICU data.

You will need to use some of the transformations of variables from Homework 3, Question 1, Part e.

Part a

Write down the equation for the logistic regression model of STA on AGE, CAN, CPR, and INF. How many parameters does this model contain?

Part b

Using glm(), obtain the maximum likelihood estimates of the parameters of the logistic regression model in Part a. Using these estimates, write down the equation with the fitted values.

Part c

Assess the significance of the group of coefficients for all variables in the model using the likelihood ratio test. (Hint: part of the ratio in the LRT will be an intercept only model)

Part d

Fit a new model using only CAN and INF as the predictors, including an interaction between CAN and INF. Write a sentence interpreting the odds ratio for the main effects in the model. Please include the 95% confidence interval.

Part e

From the above model, fill out the following table for the odds ratios. Note, you will only need to report two odds ratios and you already have one from Part d.

Cancer Infection Odds ratio
Cancer part of present problem Infection probable at ICU intake
      No
      Yes FILL HERE
Cancer not part of present problem Infection probable at ICU intake
      No
      Yes FILL HERE

This is a really good way to report odds ratios for interactions between two categorical predictors! Might want to keep this in mind for your project!!

Part f

Compute the predicted probability for a subject who does not have a present issue with cancer nor an infection upon admittance to the ICU. Compute the 95% confidence interval for the predicted probability. Can you use the Normal approximation? Interpret the predicted probability including the confidence interval.