Homework 7
Lessons 13 and 14
Directions
Please turn in this homework on Sakai. If you are using a Quarto doc to render your full homework, turn in both the html file and the qmd file; you can keep the rendered output as an html! If you are not rendering a Quarto document, please submit your homework in pdf format.
You can download the .qmd file for this assignment from GitHub if you want to work in a Quarto doc. You do not need to work in a Quarto doc for this homework!
It is a good idea to try rendering your document from time to time as you go along! Note that rendering automatically saves your qmd file, and rendering frequently helps you catch your errors more quickly.
1 Book exercises
5.16 Paired or not, Part II
5.22 DDT exposure
5.34 Placebos without deception
2 Non-book exercises
2.1 R1: Swim times
- In these exercises you will use R to work through the swim times example from Section 5.2 in the textbook.
- The data are in the oibiostats package and called swim.
2.1.1 Mean & SD of differences
Calculate the mean and standard deviation of the differences in swim times, and compare them to the ones in the book. In which order were the differences calculated: wet suit - swim suit, or the opposite? Were all the differences positive?
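As a rough illustration of the mechanics only, here is a minimal base R sketch of computing paired differences and their summary statistics. The vectors `before` and `after` are made-up values, not the swim data; with the real data you would substitute the relevant columns of swim.

```r
# Hypothetical paired measurements (illustrative values only, NOT the swim data)
before <- c(1.2, 1.5, 1.1, 1.4, 1.3)
after  <- c(1.3, 1.5, 1.3, 1.6, 1.4)

# Differences computed as after - before; reversing the order flips every sign
d <- after - before

mean(d)      # mean of the differences
sd(d)        # standard deviation of the differences
all(d > 0)   # TRUE only if every difference is strictly positive
```

Checking the sign of each difference is a quick way to confirm which order a precomputed difference column used.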
2.1.2 Histogram of differences
Create a histogram of the differences in swim times and comment on the distribution shape.
2.1.3 Hypothesis test
Run the appropriate statistical test in R to verify the test statistic, p-value, and CI in the textbook.
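One way such a check might look in R is sketched below with `t.test()` from the built-in stats package. The data are made-up values rather than the swim data, and the object names are placeholders. The sketch also shows that a paired test on two vectors matches a one-sample test on their differences.

```r
# Hypothetical paired measurements (illustrative values only, NOT the swim data)
before <- c(1.2, 1.5, 1.1, 1.4, 1.3)
after  <- c(1.3, 1.5, 1.3, 1.6, 1.4)

# Paired t-test on the two vectors (differences taken as after - before)
paired_result <- t.test(after, before, paired = TRUE)

# Equivalent one-sample t-test on the differences, testing a mean of 0
one_sample_result <- t.test(after - before, mu = 0)

paired_result$statistic   # t statistic
paired_result$parameter   # degrees of freedom (n - 1)
paired_result$p.value     # two-sided p-value
paired_result$conf.int    # 95% confidence interval for the mean difference
```

Printing `paired_result` on its own shows all of these pieces at once, which makes it easy to compare against values reported elsewhere.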
2.2 R2: 2-sample independent t-test
This problem was adapted from Dr. Maria Tackett’s Intro to Data Science homework.
The dataset is adapted from Little et al. (2007), and contains voice measurements from individuals both with and without Parkinson’s Disease (PD), a progressive neurological disorder that affects the motor system. The aim of Little et al.’s study was to examine whether Parkinson’s Disease could be diagnosed by examining the spectral (sound-wave) properties of patients’ voices.
147 measurements were taken from patients with PD, and 48 measurements were taken from healthy patients who served as controls. For the purposes of this assignment, you may assume that measurements are representative of the underlying populations (PD vs. healthy).
The variables in the dataset are as follows:
- clip: ID of the recording number
- jitter: a measure of variation in fundamental frequency
- shimmer: a measure of variation in amplitude
- hnr: a ratio of total components vs. noise in the voice recording
- status: PD vs. Healthy
- avg.f.q: 1, 2, or 3, corresponding to average vocal fundamental frequency
  - 1 = low
  - 2 = mid
  - 3 = high
The data are in parkinsons.csv located in the Data folder of the shared OneDrive folder.
We will be focusing on the variable HNR. We will see if there is evidence that the mean HNR is different for people with PD and people without PD.
2.2.1 Histogram of HNR
Use histograms to visualize the distribution for HNR, comparing people with and without PD.
2.2.2 Mean & SD of HNR
Calculate the mean and standard deviation for HNR in the voice recordings of adults with and without Parkinson’s disease.
2.2.3 Hypothesis test
Run the appropriate statistical test in R. Please include all steps in the hypothesis test!
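For orientation, here is a minimal sketch of the R mechanics for comparing a numeric variable between two independent groups. The data frame `voice` and its values are made up for illustration and are not the contents of parkinsons.csv; only the column names status and hnr follow the variable list above. Note that `t.test()` runs the Welch (unequal variances) version by default; whether that or the pooled version is the right choice for your analysis is an assumption you should justify yourself.

```r
# Hypothetical data (illustrative values only, NOT the Parkinson's dataset)
voice <- data.frame(
  status = rep(c("PD", "Healthy"), times = c(6, 5)),
  hnr    = c(20.1, 22.4, 19.8, 21.0, 23.5, 18.9,
             24.2, 25.1, 23.8, 26.0, 24.7)
)

# Group means and standard deviations
aggregate(hnr ~ status, data = voice,
          FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Two-sample t-test for independent groups via the formula interface
# (Welch version by default, i.e. var.equal = FALSE)
result <- t.test(hnr ~ status, data = voice)

result$statistic   # t statistic
result$parameter   # approximate degrees of freedom
result$p.value     # two-sided p-value
result$conf.int    # 95% CI for the difference in group means
```

The code only produces the numbers; stating the hypotheses, checking assumptions, and writing the conclusion in context are still separate steps of the test.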