Homework 7
Homework is ready to be worked on!! (11/13/24)
Directions
Please turn in this homework on Sakai. This homework must be submitted as a Quarto document rendered to HTML. I know past homeworks said PDF, but all Quarto docs should be rendered as HTML for this class!
You can download the .qmd file for this assignment from GitHub.
It is a good idea to render your document from time to time as you go along! Rendering automatically saves your .qmd file, and rendering frequently helps you catch errors more quickly.
0.1 Complete the group evaluation
Please complete the group evaluation form by the homework assignment due date.
1 Book exercises
5.16 Paired or not, Part II
5.22 DDT exposure
5.34 Placebos without deception
2 Non-book exercises
2.1 R1: Swim times
- In these exercises you will use R to work through the swim times example from Section 5.2 in the textbook.
- The data are in the oibiostats package and are called swim.
2.1.1 Mean & SD of differences
Calculate the mean and standard deviation of the differences in swim times, and compare them to the ones in the book. In which order were the differences calculated: wet suit - swim suit, or the opposite? Were all the differences positive?
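If you are unsure how to set this up, here is a minimal sketch of the general pattern for paired data. It uses a small made-up data frame, not the swim data: the names toy, condition_a, and condition_b are hypothetical, so check the actual column names in the dataset before adapting it.

```r
# Hypothetical toy data (NOT the swim data): two paired measurements per subject
toy <- data.frame(
  condition_a = c(1.5, 1.4, 1.6, 1.3, 1.7),
  condition_b = c(1.4, 1.4, 1.5, 1.1, 1.6)
)

# The order of subtraction determines the sign of every difference
toy$diff <- toy$condition_a - toy$condition_b

mean_diff <- mean(toy$diff)   # sample mean of the differences
sd_diff   <- sd(toy$diff)     # sample standard deviation of the differences

# TRUE only if every single difference is strictly greater than 0
all_positive <- all(toy$diff > 0)

mean_diff
sd_diff
all_positive
```

Reversing the subtraction flips the sign of the mean but leaves the standard deviation unchanged, which is one way to figure out the order used in the book.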
2.1.2 Histogram of differences
Create a histogram of the differences in swim times and comment on the distribution shape.
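As a rough sketch, a histogram of a single numeric vector can be drawn with base R's hist(). The vector toy_diff below is simulated and purely hypothetical; replace it with your own differences. You are welcome to use ggplot2 instead.

```r
# Simulated, hypothetical differences (NOT the swim data)
set.seed(511)
toy_diff <- rnorm(40, mean = 0.1, sd = 0.05)

# breaks is only a suggestion to hist(); R picks "pretty" bin edges near it
h <- hist(
  toy_diff,
  breaks = 10,
  main   = "Histogram of toy differences",
  xlab   = "Difference"
)

# hist() invisibly returns the bin edges and counts, handy for checking the plot
h$breaks
h$counts
```

When commenting on shape, look at symmetry, skew, number of modes, and any outliers.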
2.1.3 Hypothesis test
Run the appropriate statistical test in R to verify the test statistic, p-value, and CI in the textbook.
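For reference, here is a hedged sketch of how a paired t-test can be run with t.test() from base R. The vectors toy_a and toy_b are made up for illustration and are not the swim data. The sketch also shows that a paired test gives the same result as a one-sample test on the differences.

```r
# Hypothetical paired measurements (NOT the swim data)
toy_a <- c(1.5, 1.4, 1.6, 1.3, 1.7, 1.5)
toy_b <- c(1.4, 1.4, 1.5, 1.1, 1.6, 1.3)

# Paired t-test: differences are computed as toy_a - toy_b
res_paired <- t.test(toy_a, toy_b, paired = TRUE)

# Equivalent one-sample t-test on the differences, testing mean difference = 0
res_diff <- t.test(toy_a - toy_b, mu = 0)

res_paired$statistic   # test statistic
res_paired$p.value     # two-sided p-value
res_paired$conf.int    # 95% confidence interval for the mean difference
```

The default confidence level is 95%; it can be changed with the conf.level argument.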
2.2 R2: 2-sample independent t-test
This problem was adapted from Dr. Maria Tackett’s Intro to Data Science homework.
The dataset is adapted from Little et al. (2007), and contains voice measurements from individuals both with and without Parkinson’s Disease (PD), a progressive neurological disorder that affects the motor system. The aim of Little et al.’s study was to examine whether Parkinson’s Disease could be diagnosed by examining the spectral (sound-wave) properties of patients’ voices.
147 measurements were taken from patients with PD, and 48 measurements were taken from healthy patients who served as controls. For the purposes of this assignment, you may assume that measurements are representative of the underlying populations (PD vs. healthy).
The variables in the dataset are as follows:
- clip: ID of the recording number
- jitter: a measure of variation in fundamental frequency
- shimmer: a measure of variation in amplitude
- hnr: a ratio of total components vs. noise in the voice recording
- status: PD vs. Healthy
- avg.f.q: 1, 2, or 3, corresponding to average vocal fundamental frequency
  - 1 = low
  - 2 = mid
  - 3 = high
The data are in parkinsons.csv, located in the Data folder of the shared OneDrive folder.
We will be focusing on the variable HNR. We will see if there is evidence that the mean HNR is different for people with PD and people without PD.
2.2.1 Histogram of HNR
Use histograms to visualize the distribution for HNR, comparing people with and without PD.
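One possible approach, sketched here with base R and simulated data, is to stack one histogram per group and force both to use the same bin edges so the panels are directly comparable. The data frame toy and its columns group and score are hypothetical stand-ins, not the Parkinson's data.

```r
# Simulated, hypothetical two-group data (NOT the Parkinson's data)
set.seed(2)
toy <- data.frame(
  group = rep(c("A", "B"), times = c(30, 20)),
  score = c(rnorm(30, mean = 20, sd = 4), rnorm(20, mean = 23, sd = 4))
)

# Shared bin edges covering the full range of both groups
shared_breaks <- seq(floor(min(toy$score)), ceiling(max(toy$score)),
                     length.out = 11)

# Two stacked panels; remember the old settings so they can be restored
old_par <- par(mfrow = c(2, 1))
h_a <- hist(toy$score[toy$group == "A"], breaks = shared_breaks,
            main = "Group A", xlab = "Score")
h_b <- hist(toy$score[toy$group == "B"], breaks = shared_breaks,
            main = "Group B", xlab = "Score")
par(old_par)
```

Using the same x-axis and bins for both panels makes differences in center and spread easier to see than two histograms with independently chosen bins.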
2.2.2 Mean & SD of HNR
Calculate the mean and standard deviation for HNR in the voice recordings of adults with and without Parkinson’s disease.
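Group-wise summaries can be computed in several ways. The sketch below uses base R's tapply() on a tiny made-up data frame; toy, group, and score are hypothetical names, not the columns of the Parkinson's data. A dplyr group_by() and summarize() pipeline would work equally well.

```r
# Hypothetical toy data (NOT the Parkinson's data)
toy <- data.frame(
  group = rep(c("A", "B"), times = c(4, 3)),
  score = c(10, 12, 11, 13, 20, 22, 21)
)

# Apply mean() and sd() to score separately within each level of group
group_means <- tapply(toy$score, toy$group, mean)
group_sds   <- tapply(toy$score, toy$group, sd)
group_ns    <- table(toy$group)   # sample size per group

group_means
group_sds
group_ns
```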
2.2.3 Hypothesis test
Run the appropriate statistical test in R. Please include all steps in the hypothesis test!
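As a hedged illustration of the syntax only, here is a two-sample independent t-test on simulated data using the formula interface of t.test(). The data frame toy and its columns group and score are hypothetical; this is not the Parkinson's data. Note that t.test() does not assume equal variances by default, so it runs Welch's version of the test.

```r
# Simulated, hypothetical two-group data (NOT the Parkinson's data)
set.seed(1)
toy <- data.frame(
  group = rep(c("A", "B"), times = c(30, 20)),
  score = c(rnorm(30, mean = 20, sd = 4), rnorm(20, mean = 23, sd = 4))
)

# outcome ~ grouping variable; the grouping variable must have exactly 2 levels
res <- t.test(score ~ group, data = toy)

res$method      # which test was run
res$estimate    # sample mean in each group
res$statistic   # test statistic
res$parameter   # degrees of freedom (Welch approximation)
res$p.value     # two-sided p-value
res$conf.int    # 95% CI for the difference in means (first level minus second)
```

The R output covers only the computational steps. The hypotheses, significance level, check of assumptions, and conclusion in context still need to be written out.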