library(palmerpenguins)
Attaching package: 'palmerpenguins'
The following objects are masked from 'package:datasets':
penguins, penguins_raw
data(penguins)
Your name here - update this!!!!
Please turn in this homework on Sakai. If you are using a Quarto doc to render your full homework, turn in both the html file and the qmd file. If you are not rendering a Quarto document, please submit your homework in pdf format.
You can download the .qmd file for this assignment from GitHub if you want to work in a Quarto doc. You do not need to work in a Quarto doc for this homework!!
It is a good idea to try rendering your document from time to time as you go along! Note that rendering automatically saves your qmd file, and rendering frequently helps you catch your errors more quickly.
Note from Nicky: I’m sorry. This problem was not supposed to be in HW 7… If you already did it, you can paste it here. Try to reflect and see if this problem is clearer now that we’ve covered power and sample size.
Skip parts c and d.
Skip part a!
Test whether the proportion of US residents who think marijuana should be made legal is different from 0.586.
Are the results from CI and hypothesis test consistent? Why or why not?
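As a rough sketch of how the test and the confidence interval might be compared in R, the snippet below uses `prop.test()` from base R. The counts `x_yes` and `n_total` are made-up placeholders, not the survey data from the exercise, so swap in the real values before drawing any conclusions.

```r
# Hypothetical placeholder counts (NOT the exercise data): replace with the real survey values
x_yes   <- 610    # respondents in favor (assumed)
n_total <- 1000   # total respondents (assumed)
p0      <- 0.586  # null value from the question

# One-sample test of a proportion with a 95% confidence interval
res <- prop.test(x = x_yes, n = n_total, p = p0, conf.level = 0.95)
res

# A two-sided test at alpha = 0.05 and a 95% CI agree when
# "p0 lies inside the CI" goes together with "fail to reject H0"
p0_in_ci <- res$conf.int[1] <= p0 && p0 <= res$conf.int[2]
reject   <- res$p.value < 0.05
c(p0_in_ci = p0_in_ci, reject = reject)
```

If `p0_in_ci` and `reject` are both `TRUE` or both `FALSE`, the two approaches disagree and it is worth checking which method each one used.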
In exercise 5.12 in Homework 6, we tested whether police officers appear to have been exposed to a higher concentration of lead than 35. Calculate the power for the hypothesis test and include an interpretation of the power in the context of the research question. Was it sufficiently powered?
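One way to compute the power in R is `power.t.test()` from the base stats package (your class may have used a different function, for example one from the pwr package). This is only a sketch: the null value 35 comes from the question, but `n_obs`, `xbar`, and `s` below are made-up placeholders that should be replaced with the summary statistics from Homework 6.

```r
mu0   <- 35   # null value from the question
n_obs <- 50   # HYPOTHETICAL sample size: replace with the value from HW 6
xbar  <- 40   # HYPOTHETICAL sample mean: replace with the value from HW 6
s     <- 12   # HYPOTHETICAL sample SD: replace with the value from HW 6

# Power of a one-sided, one-sample t-test to detect a mean of xbar vs. mu0
pw <- power.t.test(n = n_obs,
                   delta = xbar - mu0,
                   sd = s,
                   sig.level = 0.05,
                   type = "one.sample",
                   alternative = "one.sided")
pw$power
```

The result is the probability of rejecting the null hypothesis if the true mean really were `xbar`; a common benchmark for "sufficiently powered" is 80% or more.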
For the same test, what sample size would be needed for 80% power? Would it be reasonable to conduct the study with these sample sizes? Why or why not? (Hint: think about the assumptions of our distributions when using a t-test)
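A possible way to get the sample size is to leave `n` out of `power.t.test()` and supply the target power instead. As before, this is a sketch with made-up placeholder values for `xbar` and `s`; only the null value 35 and the 80% target come from the question.

```r
mu0  <- 35   # null value from the question
xbar <- 40   # HYPOTHETICAL sample mean: replace with the value from HW 6
s    <- 12   # HYPOTHETICAL sample SD: replace with the value from HW 6

# Solve for n by leaving it unspecified and giving the desired power
ss <- power.t.test(power = 0.80,
                   delta = xbar - mu0,
                   sd = s,
                   sig.level = 0.05,
                   type = "one.sample",
                   alternative = "one.sided")

# Round up, since we cannot recruit a fraction of a person
n_needed <- ceiling(ss$n)
n_needed
```

When the real effect is large relative to the standard deviation, the required `n` can come out very small, which is where the hint about t-test assumptions becomes relevant.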
Suppose the study has resources to include 30 people. What minimum effect size would they be able to detect with 85% power, assuming the same sample mean and standard deviation? Use \(\alpha = 0.05\).
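The same base R function can be used to solve for the detectable difference by leaving `delta` unspecified. The sample size (30), power (85%), and significance level come from the question; the standard deviation `s` is a made-up placeholder, and treating the test as one-sided is an assumption carried over from the original hypothesis.

```r
s <- 12   # HYPOTHETICAL sample SD: replace with the value from HW 6

# Solve for delta (the smallest detectable difference from the null mean)
md <- power.t.test(n = 30,
                   power = 0.85,
                   sd = s,
                   sig.level = 0.05,
                   type = "one.sample",
                   alternative = "one.sided")

min_diff <- md$delta        # in the original measurement units
min_d    <- md$delta / s    # standardized effect size (difference / SD)
c(min_diff = min_diff, min_d = min_d)
```

Reporting both the raw difference and the standardized version makes it easier to match whichever definition of "effect size" was used in class.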
The following questions use the penguins data from the palmerpenguins package.
Make a plot of the penguins’ bill depths stratified by species type. Include points for the mean of each species type as well as a horizontal dashed line for the overall mean. See example from class for the plot I’m describing.
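A minimal sketch of such a plot is below, assuming ggplot2 is installed; the class example may use different geoms or styling. Jittered points show the individual penguins, the large black points mark each species mean, and the dashed line marks the overall mean.

```r
library(palmerpenguins)
library(ggplot2)

# Drop penguins with a missing bill depth so the means are well defined
peng <- subset(penguins, !is.na(bill_depth_mm))
overall_mean <- mean(peng$bill_depth_mm)

p <- ggplot(peng, aes(x = species, y = bill_depth_mm, color = species)) +
  geom_jitter(width = 0.15, alpha = 0.4) +                             # individual penguins
  stat_summary(fun = mean, geom = "point", color = "black", size = 3) + # species means
  geom_hline(yintercept = overall_mean, linetype = "dashed") +          # overall mean
  labs(x = "Species", y = "Bill depth (mm)") +
  theme(legend.position = "none")
p
```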
Investigate whether the assumptions for using an ANOVA have been satisfied.
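One way to start checking the assumptions numerically is to compare group sizes and standard deviations. The sketch below uses a common rule of thumb (largest SD no more than about twice the smallest) for the equal-variance condition; the cutoff used in class may differ, and normality within each group is usually judged from plots as well.

```r
library(palmerpenguins)

# Keep only penguins with a recorded bill depth
peng <- subset(penguins, !is.na(bill_depth_mm))

# Sample size and standard deviation within each species
group_n  <- tapply(peng$bill_depth_mm, peng$species, length)
group_sd <- tapply(peng$bill_depth_mm, peng$species, sd)
group_n
group_sd

# Ratio of the largest to the smallest group SD (rule of thumb: below about 2)
sd_ratio <- max(group_sd) / min(group_sd)
sd_ratio
```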
Based on the figure, which pairs of species look like they have significantly different mean bill depths?
Write out in symbols or words the null and alternative hypotheses.
Using R, run the hypothesis test and display the output.
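A minimal sketch of running the test with base R's `aov()` is below; `lm()` followed by `anova()` gives the same table. Rows with a missing bill depth are dropped automatically under R's default `na.action`.

```r
library(palmerpenguins)

# One-way ANOVA of bill depth by species
fit <- aov(bill_depth_mm ~ species, data = penguins)

# ANOVA table: Df, Sum Sq, Mean Sq, F value, Pr(>F)
summary(fit)
```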
Using the values from the ANOVA table, verify (calculate) the value of the F statistic.
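The F statistic is the mean square between groups divided by the mean square error, where each mean square is a sum of squares divided by its degrees of freedom. The sketch below pulls those pieces out of the ANOVA table and recomputes F; the object names are just illustrative.

```r
library(palmerpenguins)

# ANOVA table as a data frame so individual cells can be extracted
tab <- anova(lm(bill_depth_mm ~ species, data = penguins))

# Mean squares = sum of squares / degrees of freedom
msg <- tab["species", "Sum Sq"] / tab["species", "Df"]       # between groups
mse <- tab["Residuals", "Sum Sq"] / tab["Residuals", "Df"]   # within groups (error)

# F statistic and its p-value from the F distribution
f_manual <- msg / mse
p_manual <- pf(f_manual, tab["species", "Df"], tab["Residuals", "Df"],
               lower.tail = FALSE)

c(f_manual = f_manual, f_table = tab["species", "F value"])
```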
Based on the p-value, will we reject or fail to reject the null hypothesis? Why?
Write a conclusion to the hypothesis test in the context of the problem.
The Strong Heart Study is an ongoing study of American Indians residing in 13 tribal communities in three geographic areas (AZ, OK, and SD/ND) to study prevalence and incidence of cardiovascular disease and to identify risk factors. We will be examining the 4-year cumulative incidence of diabetes with one risk factor, glucose tolerance. We are curious if the proportion of individuals diagnosed with diabetes is different between the glucose tolerance groups.
Impaired glucose: normal or impaired glucose tolerance at baseline visit (between 1988 and 1991)
Diabetes: Indicator of diabetes at follow-up visit (roughly four years after baseline) according to two-hour oral glucose tolerance test
The data are in SHS_data.csv located in the Data folder of the shared OneDrive folder. The following table summarizes the data:
| Glucose | Diabetes: Not diabetic | Diabetes: Diabetic | Total |
|---|---|---|---|
| Impaired | 334 | 198 | 532 |
| Normal | 1004 | 128 | 1132 |
| Total | 1338 | 326 | 1664 |
Complete the hypothesis test to see if the proportion of individuals diagnosed with diabetes is different between the glucose tolerance groups. (Reminder: Follow all steps and put your conclusion in context of the Strong Heart Study)
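As a sketch of the R step only, the counts from the summary table can be passed straight to `prop.test()` instead of reading SHS_data.csv. The expected-count check uses `chisq.test()`; the exact assumption check expected in class may be phrased differently.

```r
# Counts taken from the summary table
diabetic <- c(Impaired = 198, Normal = 128)    # diagnosed with diabetes
n_group  <- c(Impaired = 532, Normal = 1132)   # group totals

# Two-sample test of equal proportions (continuity correction is on by default)
res <- prop.test(x = diabetic, n = n_group)
res

# Expected counts under H0, to check they are large enough for the normal approximation
counts <- cbind(Diabetic = diabetic, NotDiabetic = n_group - diabetic)
expected <- chisq.test(counts)$expected
expected
```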
Calculate and interpret the 95% confidence interval for the difference in proportions using the formula. Is it consistent with the CI from the R output of the hypothesis test? (Reminder: Make sure to check the assumptions!)
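The formula for the interval is \(\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}\). A sketch of the calculation with the table counts is below; it assumes the uncorrected formula, so it is compared against `prop.test()` with `correct = FALSE`. R's default output applies a continuity correction and will be slightly wider.

```r
# Counts taken from the summary table
x1 <- 198; n1 <- 532    # impaired glucose tolerance
x2 <- 128; n2 <- 1132   # normal glucose tolerance

p1 <- x1 / n1
p2 <- x2 / n2
diff_hat <- p1 - p2

# Standard error of the difference (unpooled, as used for confidence intervals)
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 95% CI from the formula
z_star <- qnorm(0.975)
ci_formula <- c(diff_hat - z_star * se, diff_hat + z_star * se)
ci_formula

# CI from R without the continuity correction, for comparison
ci_r <- as.numeric(prop.test(x = c(x1, x2), n = c(n1, n2), correct = FALSE)$conf.int)
ci_r
```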