2024-10-23
Visualize relationships between two numeric variables using scatterplots and determine their correlation
Visualize relationships between two categorical variables using contingency tables and segmented barplots
Visualize relationships between a categorical variable and a numeric variable using side-by-side boxplots, density plots, and ridgeline plots
Example questions about relationships:
Response Variable
A response variable is defined by the particular research question a study seeks to address
Explanatory Variable
A study will typically examine whether the values of a response variable differ as values of an explanatory variable change, and if so, how the two variables are related.
Sometimes we’re interested in viewing the relationship between our response variable and explanatory variable(s)
Sometimes we’re just interested in viewing the relationship between explanatory variables
Two variables \(x\) and \(y\) are
Positively associated if \(y\) tends to increase as \(x\) increases
Negatively associated if \(y\) tends to decrease as \(x\) increases
“Association” is a very general term
Female body size and clutch volume are positively associated with each other
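The (Pearson) correlation coefficient of variables \(x\) and \(y\) can be computed using the formula \[r = \frac{1}{n-1}\sum_{i=1}^{n}\Big(\frac{x_i - \bar{x}}{s_x}\Big)\Big(\frac{y_i - \bar{y}}{s_y}\Big)\] where \(\bar{x}\) and \(\bar{y}\) are the sample means, \(s_x\) and \(s_y\) are the sample standard deviations, and \(n\) is the number of observations.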
Use cor() in R to calculate this!
Rossman & Chance’s applet
Tracks performance of guess vs. actual, error vs. actual, and error vs. trial
Or, for the Atari-like experience
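To make the correlation formula concrete, here is a minimal Python sketch that applies it term by term. The lesson itself uses cor() in R; the function name pearson_r and the toy data below are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation via the standardized-product formula (n - 1 divisor)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1 divisor)
    products = (
        ((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)
    )
    return sum(products) / (n - 1)

# Hypothetical toy data, not taken from the lesson
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]      # tends to increase with x
y_down = [10, 8, 6, 4, 2]   # decreases exactly linearly with x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(round(r_up, 3), round(r_down, 3))
```

Here r_up comes out at roughly 0.77 (a positive association) and r_down is -1 up to floating-point rounding (a perfect negative linear association).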
Age Group | Hypertension | No Hypertension | Total |
---|---|---|---|
18-39 years | 8836 | 112206 | 121042 |
40-59 years | 42109 | 88663 | 130772 |
60+ years | 39917 | 21589 | 61506 |
Total | 90862 | 222458 | 313320 |
The same table as proportions of the grand total (each count divided by 313320):
Age Group | Hypertension | No Hypertension | Total |
---|---|---|---|
18-39 years | 0.0282 | 0.3581 | 0.3863 |
40-59 years | 0.1344 | 0.2830 | 0.4174 |
60+ years | 0.1274 | 0.0689 | 0.1963 |
Total | 0.2900 | 0.7100 | 1.0000 |
Joint probability: intersection of row and column
Marginal probability: row or column total
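As a rough illustration of joint and marginal probabilities, the Python sketch below recomputes them from the counts in the contingency table. The variable names are assumptions made for this example, and the lesson's own workflow is in R.

```python
# Counts copied from the contingency table (Total row and column omitted)
counts = {
    "18-39 years": {"Hypertension": 8836, "No Hypertension": 112206},
    "40-59 years": {"Hypertension": 42109, "No Hypertension": 88663},
    "60+ years": {"Hypertension": 39917, "No Hypertension": 21589},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Joint probability: one cell (row and column intersection) over the grand total
joint = {
    age: {status: n / grand_total for status, n in row.items()}
    for age, row in counts.items()
}

# Marginal probability of each age group: sum across that row
row_marginal = {age: sum(row.values()) for age, row in joint.items()}

# Marginal probability of each hypertension status: sum down that column
col_marginal = {}
for row in joint.values():
    for status, p in row.items():
        col_marginal[status] = col_marginal.get(status, 0.0) + p

print(round(joint["18-39 years"]["Hypertension"], 4))  # joint
print(round(row_marginal["60+ years"], 4))             # row marginal
print(round(col_marginal["Hypertension"], 4))          # column marginal
```

Rounded to four decimals, these values should match the corresponding cells of the proportions table: 0.0282 for the joint probability, 0.1963 for the 60+ row total, and 0.2900 for the hypertension column total.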
We can work towards visualizing the data in contingency and probability tables
Counts vs. percentages
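One common way to get the percentages for a segmented barplot is to divide each count by its row total, so that every age group's bar sums to 100%. The Python sketch below assumes that within-age-group convention, which the slides do not spell out, and reuses the counts from the contingency table.

```python
# Counts copied from the contingency table (Total row and column omitted)
counts = {
    "18-39 years": {"Hypertension": 8836, "No Hypertension": 112206},
    "40-59 years": {"Hypertension": 42109, "No Hypertension": 88663},
    "60+ years": {"Hypertension": 39917, "No Hypertension": 21589},
}

# Percentage of each status within an age group (assumed convention: row percentages)
row_pct = {}
for age, row in counts.items():
    row_total = sum(row.values())
    row_pct[age] = {status: 100 * n / row_total for status, n in row.items()}

for age, pcts in row_pct.items():
    print(age, {status: round(p, 1) for status, p in pcts.items()})
```

Under this convention the hypertension share rises from about 7.3% in the youngest group to about 64.9% in the oldest, a pattern that is harder to see from raw counts because the age groups differ in size.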
Useful visualizations for directly comparing how the distribution of a numerical variable differs by category: side-by-side boxplots, density plots, and ridgeline plots
Lesson 8 Slides