Homework 7

Author

Your name here - update this!!!!

Published

July 11, 2025

Important

Homework is ready to be worked on!! (11/13/24)

Directions

Please turn in this homework on Sakai. This homework must be submitted using a Quarto document. Please keep it rendered as an html! I know past homeworks said pdf, but all Quarto docs should be rendered as html for this class!

You can download the .qmd file for this assignment from Github

Tip

It is a good idea to try rendering your document from time to time as you go along! Note that rendering automatically saves your Qmd file and rendering frequently helps you catch your errors more quickly.

0.1 Complete the group evaluation

Please complete the group evaluation form by the homework assignment due date.

https://forms.office.com/Pages/ResponsePage.aspx?id=V3lz4rj6fk2U9pvWr59xWFMopmPUjRtDiHLswhgacNhUQ0JOQjRaQkpPNFpPSlBKVjJUMDdMTkFRNi4u

1 Book exercises

5.16 Paired or not, Part II

5.22 DDT exposure

5.34 Placebos without deception

2 Non-book exercises

2.1 R1: Swim times

  • In these exercises you will use R to work through the swim times example from Section 5.2 in the textbook.
  • The data are in the oibiostats package and called swim.

2.1.1 Mean & SD of differences

Calculate the mean and standard deviation for the differences in swim times, and compare them to the ones in the book. Which order were the differences calculated, wet suit - swim suit or the opposite? Were all the differences positive?

2.1.2 Histogram of differences

Create a histogram of the differences in swim times and comment on the distribution shape.

2.1.3 Hypothesis test

Run the appropriate statistical test in R to verify the test statistic, p-value, and CI in the textbook.

2.2 R2: 2-sample independent t-test

This problem was adapted from Dr. Maria Tackett’s Intro to Data Science homework.

The dataset is adapted from Little et al. (2007), and contains voice measurements from individuals both with and without Parkinson’s Disease (PD), a progressive neurological disorder that affects the motor system. The aim of Little et al.’s study was to examine whether Parkinson’s Disease could be diagnosed by examining the spectral (sound-wave) properties of patients’ voices.

147 measurements were taken from patients with PD, and 48 measurements were taken from healthy patients who served as controls. For the purposes of this assignment, you may assume that measurements are representative of the underlying populations (PD vs. healthy).

The variables in the dataset are as follows:

  • clip: ID of the recording number
  • jitter: a measure of variation in fundamental frequency
  • shimmer: a measure of variation in amplitude
  • hnr: a ratio of total components vs. noise in the voice recording
  • status: PD vs. Healthy
  • avg.f.q: 1, 2, or 3, corresponding to average vocal fundamental frequency
    • 1 = low,
    • 2 = mid
    • 3 = high

The data are in parkinsons.csv located in the Data folder of the shared OneDrive folder.

We will be focusing on the variable HNR. We will see if there is evidence that the mean HNR is different for people with PD and people without PD.

2.2.1 Histogram of HNR

Use histograms to visualize the distribution for HNR, comparing people with and without PD.

2.2.2 Mean & SD of HNR

Calculate the mean and standard deviation for HNR in the voice recordings of adults with and without Parkinson’s disease.

2.2.3 Hypothesis test

Run the appropriate statistical test in R. Please include all steps in the hypothesis test!