Some words on Lab 3

Nicky Wakim

2024-03-11

Overall

Great work! I know there was a lot of plotting, but you made some great plots!
I also appreciate your thoughtful approaches to the relationships between variables
Just a reminder: when visualizing data, making tables, or displaying any information from the data, always keep these questions in the back of your mind:
- What can a reader get from this if they have never seen the data?
- Is it easy for the reader to understand the plot?
- Does everything in the plot have a purpose?
- Is the main thing I’m trying to communicate also the thing that stands out?

A few catches

I said glimpse, but that was a bad choice of words
- head() might be better: it shows the first few rows, with each observation as a row and each variable as a column
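To see the difference, here is a minimal sketch on a made-up tibble (iat_toy and its columns are hypothetical, not the real IAT data). It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data standing in for the IAT data set
iat_toy <- tibble(
  id = 1:8,
  age = c(23, 35, 41, 19, 52, 30, 27, 46),
  iat_score = c(0.42, -0.10, 0.75, 0.31, 0.58, 0.05, 0.66, -0.22)
)

# glimpse(): one line per variable (name, type, first few values)
glimpse(iat_toy)

# head(): the first rows, one observation per row, variables as columns
first_rows <- head(iat_toy, 5)
first_rows
```

glimpse() is handy for checking variable types, while head() reads more like the data table itself.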
When we were looking into observations that might be suspicious…
- We should look at combinations of responses that are suspicious together, not just one variable at a time
- A few combinations that people came up with that I thought were good ideas:
  - Someone who reported feeling extremely similar to both fat and thin people
  - Someone who reported they were 11-14 years old with education above high school (or other age/education combos)
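One way to check combinations like these is to build a flag for each one and then filter on the flags. This is only a sketch: the column names, category labels, and the 1-5 coding below are all hypothetical, so swap in the real variable names and codings from the data.

```r
library(dplyr)

# Hypothetical column names and codings, not the real IAT variables
iat_toy <- tibble(
  id = 1:5,
  age_group = c("11-14", "25-34", "11-14", "45-54", "18-24"),
  education = c("Bachelor's degree", "High school", "Middle school",
                "Graduate degree", "Some college"),
  similar_fat = c(5, 2, 3, 5, 1),   # assumed coding: 5 = "extremely similar"
  similar_thin = c(1, 4, 3, 5, 2)
)

# Assumed set of education levels above high school
above_hs <- c("Some college", "Bachelor's degree", "Graduate degree")

iat_flagged <- iat_toy %>%
  mutate(
    # 11-14 years old but reporting education above high school
    flag_age_edu = age_group == "11-14" & education %in% above_hs,
    # extremely similar to both fat and thin people
    flag_similar = similar_fat == 5 & similar_thin == 5
  )

# Rows that trip at least one flag
suspicious <- iat_flagged %>% filter(flag_age_edu | flag_similar)
suspicious
```

Keeping the flags as columns (instead of filtering right away) lets you look at the flagged rows before deciding what to do with them.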
If a variable is an explanatory variable and originally categorical, it’s good to keep it categorical
- I saw lots of lm() calls where a categorical variable was treated as continuous
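Here is a small sketch of the difference, using simulated data (the education codes, labels, and IAT scores are made up). When the variable is numeric, lm() fits one slope; when it is a factor, lm() fits a separate coefficient for each non-reference category.

```r
# Simulated toy data: education stored as numeric codes 1-4 (hypothetical)
set.seed(1)
toy <- data.frame(
  education = rep(1:4, each = 10),
  iat_score = rnorm(40, mean = 0.4, sd = 0.4)
)

# Treated as continuous: one intercept and one slope
fit_cont <- lm(iat_score ~ education, data = toy)

# Treated as categorical: convert to a factor first (labels are hypothetical)
toy$education_f <- factor(
  toy$education,
  labels = c("Less than HS", "HS", "Some college", "Bachelor's or more")
)
fit_cat <- lm(iat_score ~ education_f, data = toy)

length(coef(fit_cont))  # intercept + 1 slope
length(coef(fit_cat))   # intercept + 3 category coefficients
```

The continuous version assumes every one-step increase in education has the same effect on mean IAT, which is a strong assumption for a categorical variable.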

Multi-selection variables 1

Gender identity selections are NOT mutually exclusive
- A person can identify as all genders!
If we are trying to pinpoint one group and make the groups mutually exclusive, we need to take extra steps
Someone wanted to identify three groups:
- Identifies as trans man, trans woman, genderqueer/non-conforming, and/or different identity
- Identifies as man only
- Identifies as woman only
Point on data equity:
- Always write out all identities within a grouped category
- I did not call the group something like non-normative genders, instead I said “Identifies as trans man, trans woman, genderqueer/non-conforming, and/or different identity”
- Your coding names in R can be different, but when you write it out, make sure you define the group
  - I would maybe call the first group “Trans and non-binary genders” and state the exact makeup of the group.

Multi-selection variables 2

Create indicators for each identity: don’t lose any information on the individual

library(dplyr)

# One TRUE/FALSE indicator per identity: TRUE if that code appears in the response
iat_new <- iat_old %>%
  mutate(ind_m = grepl("1", genderIdentity),
         ind_f = grepl("2", genderIdentity),
         ind_tmm = grepl("3", genderIdentity),
         ind_twf = grepl("4", genderIdentity),
         ind_gqnc = grepl("5", genderIdentity),
         ind_diff = grepl("6", genderIdentity))

Create mutually exclusive groups: lose some information on the individual

iat_new <- iat_old %>% # iat_old should not have any NA's in genderIdentity
  mutate(genderID = case_match(genderIdentity,
                               "[1]" ~ "Male/Man only",
                               "[2]" ~ "Female/Woman only",
                               # everyone else, including anyone who selected more than one identity
                               .default = paste("Trans male/man, trans female/woman,",
                                                "genderqueer/non-conforming, or different identity")))

Example: if someone identifies as a man and a trans man, their indicators for the mutually exclusive groups are:
- \(I(Gender = \text{Male/Man only}) = 0\)
- \(I(Gender = \text{Female/Woman only}) = 0\)
- \(I(Gender = \text{Trans male/man, trans female/woman, genderqueer/non-conforming, or different identity}) = 1\)
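To see both approaches side by side, here is a sketch on a tiny made-up data set. It assumes genderIdentity is stored as strings like "[1]" or "[1,3]" (as the case_match() code suggests) and that dplyr 1.1.0 or later is installed; iat_toy is hypothetical.

```r
library(dplyr)

# Hypothetical toy responses; "[1,3]" = selected both man and trans man
iat_toy <- tibble(genderIdentity = c("[1]", "[2]", "[1,3]", "[5]"))

iat_toy_new <- iat_toy %>%
  mutate(
    # Indicators: keep every selection the person made
    ind_m = grepl("1", genderIdentity),
    ind_tmm = grepl("3", genderIdentity),
    # Mutually exclusive groups: each person lands in exactly one group
    genderID = case_match(
      genderIdentity,
      "[1]" ~ "Male/Man only",
      "[2]" ~ "Female/Woman only",
      .default = paste("Trans male/man, trans female/woman,",
                       "genderqueer/non-conforming, or different identity")
    )
  )

# The person who selected both man and trans man
iat_toy_new %>% filter(genderIdentity == "[1,3]")
```

For that row, both ind_m and ind_tmm are TRUE, but genderID only records the third group. That is the information we lose with mutually exclusive groups.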

About confounders and effect modifiers

When hypothesizing whether a variable is a confounder or effect modifier
- Make sure to back any claims in your final report with sources
- We can speculate what’s at play, but we can’t actually know
- Our own identity may bias how we perceive specific dynamics
Education: I saw many of us speculate that higher education may be associated with lower IAT (as a potential confounder)
- Data showed the opposite: higher education, higher mean IAT score
- We can think about people’s perception of controllability of weight: do people assume certain behaviors about fat people?
  - Does that align with or go against people’s assumptions about behavior needed for higher education?
  - We also need to think about how education might be linked to socio-economic status, and how that might change what food is affordable
- Heavily discussed in the Maintenance Phase podcast, but I don’t have direct sources

Potential effect modifier from plots

Notes on plotting

scale_x_discrete(labels = function(x) str_wrap(x, width = 10)): use this to wrap the text on the x-axis
- Student used this and I loved it!
Keep your explanatory variable on the x-axis when you are plotting three variables at once
hjust and vjust will move your text on the x-axis so it does not cover your plot
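Here is a sketch that puts these pieces together on simulated data (the category labels, the group variable, and the scores are all made up). I am assuming hjust and vjust are set inside theme(axis.text.x = element_text(...)), which is one common place to use them; the angle of 45 is just an example.

```r
library(ggplot2)
library(stringr)

# Simulated toy data with long category labels (hypothetical)
set.seed(2)
toy <- data.frame(
  education = rep(c("High school graduate or less",
                    "Some college or associate's degree",
                    "Bachelor's degree or higher"), each = 20),
  group = rep(c("Group A", "Group B"), times = 30),
  iat_score = rnorm(60, mean = 0.4, sd = 0.4)
)

# Three variables at once: explanatory variable on the x-axis, third variable as fill
p_base <- ggplot(toy, aes(x = education, y = iat_score, fill = group)) +
  geom_boxplot() +
  labs(x = "Education", y = "IAT score", fill = "Group")

# Option 1: wrap long x-axis labels onto several lines
p_wrap <- p_base +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 10))

# Option 2: rotate the labels; hjust/vjust shift the text so it stays clear of the plot
p_rotate <- p_base +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))
```

Print p_wrap or p_rotate to compare the two; wrapping usually keeps the labels easier to read than rotating.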
Plotting age vs. IAT
- Use geom_smooth() to show a moving mean
- Boxplots and plotting each individual mean are not quite right for a continuous variable like age
- See Lab 4 for how I plot this!
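In geom_smooth(), when should you use method = lm?
- Do not use it if you are trying to see how the data look: it forces a straight line, while the default smoother follows the shape of the data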
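A quick sketch of the difference, using simulated age and IAT values (the curved relationship here is invented for illustration and is not what the real data show):

```r
library(ggplot2)

# Simulated toy data with a deliberately non-linear pattern (hypothetical)
set.seed(3)
toy <- data.frame(age = runif(200, min = 18, max = 80))
toy$iat_score <- 0.3 + 0.4 * sin(toy$age / 12) + rnorm(200, sd = 0.3)

p_points <- ggplot(toy, aes(x = age, y = iat_score)) +
  geom_point(alpha = 0.3) +
  labs(x = "Age (years)", y = "IAT score")

# Exploring: the default smoother (loess for fewer than 1,000 points) follows the data
p_explore <- p_points + geom_smooth()

# Straight line: only shows the linear trend, even if the data curve
p_line <- p_points + geom_smooth(method = "lm")
```

Comparing p_explore and p_line side by side is a quick visual check of whether a straight line is a reasonable summary.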

As we go into Lab 4 and Project Report

If IAT score ranges from -2 to 2, how big a change in mean IAT is a lot?
- 0.05: about 1.25% of the 4-point range
- 0.5: about 12.5% of the 4-point range
A lot of the coefficients may be significant, but are they clinically meaningful?
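The percentages above come from dividing the change by the width of the IAT range. A quick check in R (the object names are just for illustration):

```r
# IAT scores range from -2 to 2, so the full range is 4 points wide
iat_range <- 2 - (-2)

# Candidate changes in mean IAT
changes <- c(0.05, 0.5)

# Each change as a percentage of the full range
pct_of_range <- changes / iat_range * 100
pct_of_range  # 1.25 and 12.5
```

You can plug a coefficient estimate into changes to get a rough sense of its size relative to the full scale.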