library(palmerpenguins)
data(penguins)
Homework 9
Homework is ready to be worked on!! (12/5/24)
Due 12/13/24
Directions
Please turn in this homework on Sakai. This homework must be submitted using a Quarto document. Please render it to HTML and turn in the HTML document! I know past homeworks said PDF, but all Quarto docs should be rendered as HTML for this class!
You can download the .qmd file for this assignment from GitHub.
It is a good idea to try rendering your document from time to time as you go along! Note that rendering automatically saves your .qmd file, and rendering frequently helps you catch errors more quickly.
Book exercises
8.28 True or false, Part II
8.34 Coffee and Depression
0.0.1 (a)
0.0.2 (b)-(f)
Instead of doing parts (b)-(f), please run a hypothesis test using the chi-squared test.
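As a rough sketch of what a chi-squared test of independence looks like in base R: the counts below are made up for illustration and are NOT the data from exercise 8.34, and the row and column labels are hypothetical. Swap in the two-way table from the book.

```r
# Hypothetical counts (NOT the book's data): rows = outcome, columns = exposure group
tab <- matrix(
  c(30, 45, 25,
    70, 55, 75),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    outcome = c("yes", "no"),
    group   = c("low", "medium", "high")
  )
)

# Chi-squared test of independence between the row and column variables
res <- chisq.test(tab)

res$statistic  # chi-squared test statistic
res$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
res$p.value    # p-value
```

The hypotheses and the conclusion in context still need to be written out in words; the code only supplies the test statistic, degrees of freedom, and p-value.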
0.0.3 (g)
5.46 Child care hours
5.48 True/False: ANOVA, Part II
1 R exercise
1.1 Load all the packages you need below here.
1.2 R1: Palmer Penguins ANOVA
- Use the penguins data from the palmerpenguins package.
  - Don’t forget to first install the palmerpenguins package.
- You can learn more about the Palmer penguins data at https://allisonhorst.github.io/palmerpenguins/
- We will test whether there are differences in penguins’ mean bill depths when comparing different species.
1.2.1 Dotplots
Make a dotplot of the penguins’ bill depths stratified by species type. Include points for the mean of each species type as well as a horizontal dashed line for the overall mean. See example from class for the plot I’m describing.
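One possible way to build this kind of plot, sketched with base R graphics and the built-in iris data as a stand-in (three species, one numeric measurement). This is an assumption-laden sketch, not the example from class: the variable names come from iris, and a ggplot2 version would look different.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

# Mean of the response within each group, and the overall mean
group_means  <- tapply(dat$Sepal.Width, dat$Species, mean, na.rm = TRUE)
overall_mean <- mean(dat$Sepal.Width, na.rm = TRUE)

# One column of jittered dots per group
stripchart(Sepal.Width ~ Species, data = dat,
           vertical = TRUE, method = "jitter", pch = 16, col = "grey50",
           xlab = "Species", ylab = "Sepal width (cm)")

# Group means as larger points; the groups sit at x = 1, 2, 3
points(seq_along(group_means), group_means, pch = 18, cex = 2, col = "red")

# Overall mean as a horizontal dashed line
abline(h = overall_mean, lty = 2)
```

For the penguins data, keep `na.rm = TRUE`: a few penguins have missing bill measurements.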
1.2.2 Technical conditions
Investigate whether the technical conditions for using an ANOVA have been satisfied.
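A minimal sketch of some numerical and graphical checks, again using the built-in iris data as a stand-in. The "ratio below about 2" rule of thumb for the standard deviations is an assumption here; use whatever threshold the class notes specify. Independence has to be judged from how the data were collected, not from code.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

# Number of non-missing observations and SD of the response within each group
n_by_group  <- tapply(dat$Sepal.Width, dat$Species, function(x) sum(!is.na(x)))
sd_by_group <- tapply(dat$Sepal.Width, dat$Species, sd, na.rm = TRUE)

n_by_group
sd_by_group

# Roughly equal variability: ratio of the largest to the smallest group SD
sd_ratio <- max(sd_by_group) / min(sd_by_group)
sd_ratio

# Shape of each group's distribution (look for strong skew or outliers)
boxplot(Sepal.Width ~ Species, data = dat)
```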
1.2.3 Which groups are significantly different?
Based on the figure, which pairs of species look like they have significantly different mean bill depths?
1.2.4 Hypotheses in symbols or words
Write out in symbols or words the null and alternative hypotheses.
1.2.5 Run ANOVA in R
Using R, run the hypothesis test and display the output.
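One common way to run a one-way ANOVA in base R is `aov()` followed by `summary()`. The sketch below uses the built-in iris data as a stand-in; the class notes may use a different function or a tidier output format.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

# One-way ANOVA: numeric response ~ grouping factor
fit <- aov(Sepal.Width ~ Species, data = dat)

# ANOVA table with columns Df, Sum Sq, Mean Sq, F value, Pr(>F)
summary(fit)
```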
1.2.6 F statistic
Using the values from the ANOVA table, verify (calculate) the value of the F statistic.
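The F statistic is the mean square between groups divided by the mean square error, where each mean square is a sum of squares divided by its degrees of freedom. A sketch of that calculation on the built-in iris data (a stand-in, not the penguins data):

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris
tab <- summary(aov(Sepal.Width ~ Species, data = dat))[[1]]

# Row 1 is the grouping variable, row 2 is the residuals
msg <- tab[["Sum Sq"]][1] / tab[["Df"]][1]  # mean square between groups
mse <- tab[["Sum Sq"]][2] / tab[["Df"]][2]  # mean square error (within groups)

f_by_hand <- msg / mse
f_by_hand
tab[["F value"]][1]  # value reported in the ANOVA table, for comparison

# Upper-tail p-value from the F distribution with (df groups, df residuals)
p_by_hand <- pf(f_by_hand, df1 = tab[["Df"]][1], df2 = tab[["Df"]][2],
                lower.tail = FALSE)
p_by_hand
```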
1.2.7 Decision?
Based on the p-value, will we reject or fail to reject the null hypothesis? Why?
1.2.8 Conclusion
Write a conclusion to the hypothesis test in the context of the problem.
2 Nonparametric Tests
2.1 NPT 1: (Wilcoxon) Signed-rank test
Vegetarian diet and cholesterol levels
When covering paired t-tests on Day 10 Part 2, the class notes used the example of testing whether a vegetarian diet changed cholesterol levels. The data are in the file chol213.csv at https://niederhausen.github.io/BSTA_511_F23/data/chol213.csv. In this exercise we will use non-parametric tests to test for a change and compare the results to the paired t-test.
2.1.1 Hypotheses
What are the hypotheses for the signed-rank test (2-sided) in the context of the problem?
2.1.2 Test in R
Run the (Wilcoxon) Signed-rank test in R. What is the p-value and how does it compare to the p-value of the sign test and the paired t-test (check the class notes for this)?
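A sketch of how the three tests can be run side by side in base R. The paired measurements below are made up and are NOT the chol213.csv data; the vector names `before` and `after` are hypothetical, so check the actual column names after reading in the file.

```r
# Made-up paired measurements (NOT the chol213.csv data)
before <- c(210, 182, 197, 225, 168, 240, 203, 191)
after  <- c(198, 185, 176, 207, 172, 215, 194, 181)
d <- before - after  # paired differences

# Wilcoxon signed-rank test (2-sided) on the paired differences
sr <- wilcox.test(before, after, paired = TRUE)

# Paired t-test on the same data
tt <- t.test(before, after, paired = TRUE)

# Sign test: is the proportion of positive differences 0.5? (zero differences dropped)
st <- binom.test(sum(d > 0), sum(d != 0), p = 0.5)

# p-values side by side
c(signed_rank = sr$p.value, paired_t = tt$p.value, sign_test = st$p.value)
```

With ties or zero differences in the data, `wilcox.test()` may warn that it cannot compute an exact p-value and fall back to a normal approximation.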
8.38 (a) & (extra) Salt intake and CVD
Do not do parts (b)-(c) in the book
(a)
- You can use the expected cell counts from expected() in R (you do not need to compute them using the formula).
- Comment on whether the sample size condition is met or not for these data.
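If expected() is not available (it is not a base R function, so it presumably comes from a package loaded in class), base R's `chisq.test()` also returns the expected counts. A sketch on a made-up 2x2 table, NOT the data from exercise 8.38; the cutoff of 5 is an assumption, so use the condition stated in the class notes.

```r
# Made-up 2x2 counts (NOT the book's data); labels are hypothetical
tab <- matrix(c(2, 8,
                7, 3),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("high", "low"),
                              outcome  = c("event", "no event")))

# Expected count for each cell = (row total * column total) / grand total.
# suppressWarnings() hides the small-expected-count warning chisq.test() may raise.
expected_counts <- suppressWarnings(chisq.test(tab)$expected)
expected_counts

# Are all expected counts at least 5?
all(expected_counts >= 5)
```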
(extra)
Run a Fisher’s Exact test. Include the hypotheses and a conclusion in the context of the problem.
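Fisher's exact test is available in base R as `fisher.test()`. A sketch on a made-up 2x2 table (NOT the data from exercise 8.38; the labels are hypothetical):

```r
# Made-up 2x2 counts (NOT the book's data)
tab <- matrix(c(2, 8,
                7, 3),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("high", "low"),
                              outcome  = c("event", "no event")))

# Null hypothesis: no association between rows and columns (odds ratio = 1)
ft <- fisher.test(tab)

ft$p.value   # 2-sided p-value
ft$estimate  # estimated odds ratio (conditional maximum likelihood estimate)
ft$conf.int  # confidence interval for the odds ratio
```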
3 Extra R exercises (optional)
3.1 R2: Palmer Penguins SLR
Below I frequently use the terminology variable1 vs. variable2. When we write this, the first variable is \(y\) (vertical axis) and the second is \(x\) (horizontal axis). Thus it’s always \(y\) vs. \(x\) (NOT \(x\) vs. \(y\)).
3.1.1 Scatterplots
- For each of the following pairs of variables, make a scatterplot showing the best fit line and describe the relationship between the variables.
  - In particular address
    - whether the association is linear,
    - how strong it is (based purely on the plot), and
    - what direction (positive, negative, or neither).
- body mass vs. flipper length
- bill depth vs. flipper length
- bill depth vs. bill length
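A base R sketch of a scatterplot with a best fit line, using the built-in iris data as a stand-in for the penguins variables. Note the formula is written `y ~ x`, matching the "\(y\) vs. \(x\)" convention above. The class notes may use ggplot2 instead.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

# Petal width vs. petal length: petal width is y, petal length is x
plot(Petal.Width ~ Petal.Length, data = dat,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)")

# Least-squares line added on top of the points
fit <- lm(Petal.Width ~ Petal.Length, data = dat)
abline(fit, col = "blue", lwd = 2)
```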
3.1.2 Correlations
- For each of the following pairs of variables, find the correlation coefficient \(r\).
- body mass vs. flipper length
- bill depth vs. flipper length
- bill depth vs. bill length
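A sketch of computing several correlations with base R's `cor()`, using the built-in iris data as a stand-in. The `use = "complete.obs"` argument drops rows with missing values, which matters for the penguins data because a few measurements are missing. The names in `r_values` are hypothetical labels.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

r_values <- c(
  petal_width_vs_petal_length = cor(dat$Petal.Width, dat$Petal.Length,
                                    use = "complete.obs"),
  sepal_width_vs_petal_length = cor(dat$Sepal.Width, dat$Petal.Length,
                                    use = "complete.obs"),
  sepal_width_vs_sepal_length = cor(dat$Sepal.Width, dat$Sepal.Length,
                                    use = "complete.obs")
)
r_values

# The sign of r gives the direction; the absolute value gives the strength
abs(r_values)
```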
3.1.3 Compare associations
Which pair of variables has the strongest association? Which has the weakest? Explain how you determined this.
3.1.4 Body mass vs. flipper length SLR
Run the simple linear regression model for body mass vs. flipper length, and display the regression table output.
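In base R, a simple linear regression is fit with `lm(y ~ x, data = ...)`. A sketch on the built-in iris data (a stand-in, not the penguins data); the class notes may display the regression table with a different function.

```r
# Stand-in data: built-in iris, NOT the penguins data
dat <- iris

# Response (y) on the left of ~, explanatory variable (x) on the right
fit <- lm(Petal.Width ~ Petal.Length, data = dat)

# Regression table: estimate, standard error, t value, and p-value
# for the intercept (row 1) and the slope (row 2)
reg_table <- summary(fit)$coefficients
reg_table
```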
3.1.5 Regression equation
Write out the regression equation for this model, using the variable names instead of the generic \(x\) and \(y\), and inserting the regression coefficient values.
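As a template for the form of the answer (a sketch, with \(b_0\) and \(b_1\) standing for the intercept and slope estimates from the regression table, which are not filled in here):

\[
\widehat{\text{body mass}} = b_0 + b_1 \cdot \text{flipper length}
\]

The hat on the left-hand side indicates a predicted (fitted) value rather than an observed one.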
3.1.6 Interpret intercept
Write a sentence interpreting the intercept for this example. Is it meaningful in this example?
3.1.7 Interpret slope
Write a sentence interpreting the slope for this example.