
2025-10-01
Definition: Outcome
An outcome is a possible result of a random phenomenon.
Definition: Sample Space
The sample space \(S\) is the set of all possible outcomes.
Definition: Event
An event is a subset of the sample space: a collection of outcomes that may contain many outcomes, a single outcome, or no outcomes at all.
When thinking about events, think about outcomes that you might be asking the probability of. For example, what is the probability that you get a heads or a tails in one flip? (Answer: 1)
Definition: Random Variable
For a given sample space \(S\), a random variable (r.v.) is a function whose domain is \(S\) and whose range is the set of real numbers \(\mathbb{R}\). A random variable assigns a real number to each outcome in the sample space.
A random variable’s value is completely determined by the outcome \(\omega\), where \(\omega \in S\).
Thus, we can take our sample space (all outcomes) and apply functional transformations to it.
Do you remember our coin example from Lesson 1? We tossed one or two coins.
We make the random variable a function of the sample space.
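To make "a random variable is a function of the sample space" concrete, here is a minimal sketch in R for two coin tosses. The names `S`, `flip1`, `flip2`, and `X` are illustrative assumptions, and `X` is taken to be the number of heads; the Lesson 1 example may have defined its random variable differently.

```r
# Sample space for two coin tosses: every (first toss, second toss) pair
S <- expand.grid(flip1 = c("H", "T"), flip2 = c("H", "T"),
                 stringsAsFactors = FALSE)

# A random variable assigns one real number to each outcome.
# Here X(omega) = number of heads in the outcome omega (an assumed choice).
S$X <- (S$flip1 == "H") + (S$flip2 == "H")

print(S)
```

Each row is one outcome \(\omega\), and the `X` column is the value the random variable assigns to it.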
There are two types of random variables:
Discrete random variables (RVs): the set of possible values is either finite or can be put into a countably infinite list
You could, in principle, list every specific value that the variable can take
If you sum the rolls of three dice, you must get a whole number. For example, you can’t get any number between 3 and 4.
Continuous random variables (RVs): take on values from continuous intervals, or unions of continuous intervals
The variable takes values in a range, and there are infinitely many possible values within that range
If you keep track of the time you sleep, you can sleep for 8 hours or 7.9 hours or 7.99 hours or 7.999 hours …
A probability model for a random phenomenon includes a sample space, events, random variables, and a probability measure.
Simulation
Simulation involves using a probability model to artificially recreate a random phenomenon, many times, usually using a computer.
We simulate outcomes and values of random variables according to the model’s assumptions.
In Lesson 1, we flipped a coin 100 times and recorded the proportion of heads.


Example: Simulating Two Rolls of a Fair Four-Sided Die
We’re going to roll two four-sided dice. Let \(X\) be the sum of two rolls, and \(Y\) be the larger of the two rolls. How would we simulate \(X\) and \(Y\) separately?
Let’s build up some coding tools to do this!
We can also use R to sample from a box or spinner.
The sample() function is a powerful tool for simulating draws from a box model.
For example, we can simulate a coin flip.
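A minimal sketch of a single coin flip with `sample()` follows. The ticket labels `"H"` and `"T"` and the variable names `coin` and `flip` are assumptions chosen for illustration.

```r
# The "box" holds two tickets, one per face of the coin (assumed labels)
coin <- c("H", "T")

# Draw one ticket at random; each ticket is equally likely by default
flip <- sample(x = coin, size = 1)
flip
```

The result changes from run to run, because each call makes a fresh random draw.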
What is x? What is size?
We can set size to be larger than 1 to simulate multiple draws at once.
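As a sketch of drawing many tickets in one call, the code below flips a coin 100 times and computes the proportion of heads, as in Lesson 1. The seed, the labels, and the names `flips` and `prop_heads` are assumptions.

```r
set.seed(2025)  # arbitrary seed, used only to make the run reproducible

# 100 draws from the two-ticket box; replace = TRUE puts the ticket
# back in the box after every draw
flips <- sample(x = c("H", "T"), size = 100, replace = TRUE)

# Relative frequency of heads among the 100 flips
prop_heads <- mean(flips == "H")
prop_heads
```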
What happens if replace = FALSE?
Example: Simulating Two Rolls of a Fair Four-Sided Die
We can use the replicate() function to repeat the two rolls many times (we’ll do 10). Each column of the output is one repetition:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]    4    1    2    2    4    2    1    2    4     1     4     2     2     3
[2,]    4    4    1    3    1    1    2    3    2     4     1     2     2     2
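A call along the following lines produces a matrix of this shape. It is a sketch rather than the code used on the slide: the seed and the name `rolls` are assumptions, and the values will differ from the output shown above.

```r
set.seed(2)  # arbitrary seed, used only to make the run reproducible

# One repetition = two rolls of a fair four-sided die.
# replicate() runs that expression 10 times and binds the results
# together as the columns of a matrix.
rolls <- replicate(10, sample(1:4, size = 2, replace = TRUE))

dim(rolls)  # rows = roll number within a repetition, columns = repetitions
rolls
```

Row 1 holds the first roll of each repetition and row 2 holds the second.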
Probabilities are approximated by relative frequencies: the number of repetitions in which the event occurs, divided by the total number of repetitions.
A simulation involves four steps:
Define the probability space and related random variables and events, including assumptions.
Run the simulation to generate outcomes according to the assumptions.
Analyze the output using plots and summary statistics like relative frequencies and averages.
Investigate how results change when assumptions or parameters of the model are altered.
Example: Simulating Two Rolls of a Fair Four-Sided Die
Define the probability space and related random variables and events, including assumptions.
\[\begin{aligned} S = \{ &(1,1), (1,2), (1,3), (1,4), (2,1), (2,2), (2,3), (2,4),\\ & (3,1), (3,2), (3,3), (3,4), (4,1), (4,2), (4,3), (4,4)\} \end{aligned}\]
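Because the die is fair, the 16 outcomes in \(S\) are equally likely, so the exact distributions of \(X\) and \(Y\) can be tabulated by enumeration. The sketch below does this in R; the names `S`, `roll1`, and `roll2` are illustrative assumptions.

```r
# Enumerate the sample space: all 16 ordered pairs (roll1, roll2)
S <- expand.grid(roll1 = 1:4, roll2 = 1:4)

# Evaluate each random variable on every outcome
S$X <- S$roll1 + S$roll2        # sum of the two rolls
S$Y <- pmax(S$roll1, S$roll2)   # larger of the two rolls

# Each outcome has probability 1/16, so counts / 16 are exact probabilities
table(S$X) / nrow(S)
table(S$Y) / nrow(S)
```

For instance, 7 of the 16 outcomes contain at least one 4, so \(P(Y = 4) = 7/16\). These exact values give a benchmark for the simulated relative frequencies.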
Run the simulation to generate outcomes according to the assumptions.
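One way to carry out this step is sketched below. `Y_simulated` matches the name used in the plotting code for the next step; `X_simulated`, `rolls`, `n_reps`, the choice of 10,000 repetitions, and the seed are all assumptions.

```r
set.seed(2025)   # arbitrary seed, used only to make the run reproducible
n_reps <- 10000  # assumed number of repetitions

# Each column is one repetition: two independent rolls of a fair four-sided die
rolls <- replicate(n_reps, sample(1:4, size = 2, replace = TRUE))

# Evaluate the random variables on every simulated outcome
X_simulated <- colSums(rolls)        # X = sum of the two rolls
Y_simulated <- apply(rolls, 2, max)  # Y = larger of the two rolls

head(X_simulated)
head(Y_simulated)
```

Computing both variables from the same `rolls` matrix keeps each simulated \(X\) and \(Y\) tied to the same underlying outcome.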
Analyze the output using plots and summary statistics like relative frequencies and averages.
library(dplyr)    # provides %>% and rename()
library(ggplot2)  # provides ggplot() and the plotting layers

# Put the simulated values of Y in a data frame with a single column, Y
Y_df <- as.data.frame(Y_simulated) %>%
  rename(Y = Y_simulated)

# Plot how often each value of Y occurred in the simulation
ggplot(Y_df, aes(x = Y)) +
  geom_histogram(binwidth = 1, color = "black", fill = "#B3C8BF") +
  scale_x_continuous(breaks = seq(1, 4, by = 1)) +
  labs(title = "Simulated Distribution of Y (Larger of Two Rolls)",
       x = "Value of Y",
       y = "Frequency") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 20),
    axis.text.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    axis.title.y = element_text(size = 20),
    plot.title = element_text(size = 20)
  )
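The relative frequencies behind the histogram can also be tabulated directly. The sketch below re-creates a simulated `Y_simulated` so that it runs on its own; the seed, the 10,000 repetitions, and the name `rel_freq` are assumptions.

```r
set.seed(2025)  # arbitrary seed, used only to make the run reproducible

# Larger of two rolls of a fair four-sided die, repeated 10,000 times
Y_simulated <- replicate(10000, max(sample(1:4, size = 2, replace = TRUE)))

# Relative frequency of each value: count / total number of repetitions
rel_freq <- table(Y_simulated) / length(Y_simulated)
rel_freq

# Relative frequency of one event, here {Y = 4}
mean(Y_simulated == 4)

# Long-run average of Y
mean(Y_simulated)
```

With this many repetitions, the relative frequency of \(\{Y = 4\}\) should land close to the exact value \(7/16 \approx 0.44\).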
Investigate how results change when assumptions or parameters of the model are altered.
What if we rolled three dice instead of two?
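Changing the model only requires changing `size` in the call to `sample()`. The sketch below re-simulates \(Y\) as the largest of three rolls; the seed, `n_reps`, and the name `rolls3` are assumptions.

```r
set.seed(2025)   # arbitrary seed, used only to make the run reproducible
n_reps <- 10000  # assumed number of repetitions

# Each column is now three rolls of a fair four-sided die
rolls3 <- replicate(n_reps, sample(1:4, size = 3, replace = TRUE))

# Y = largest of the three rolls
Y_simulated <- apply(rolls3, 2, max)

# Relative frequencies of each value of Y under the altered model
table(Y_simulated) / n_reps
```

With three rolls the distribution shifts toward larger values: the exact \(P(Y = 4)\) is \(1 - (3/4)^3 = 37/64 \approx 0.58\), compared with \(7/16 \approx 0.44\) for two rolls.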
# Assumes Y_simulated has been re-simulated as the largest of three rolls,
# and that dplyr and ggplot2 are already loaded
Y_df <- as.data.frame(Y_simulated) %>%
  rename(Y = Y_simulated)

ggplot(Y_df, aes(x = Y)) +
  geom_histogram(binwidth = 1, color = "black", fill = "#B3C8BF") +
  scale_x_continuous(breaks = seq(1, 4, by = 1)) +
  labs(title = "Simulated Distribution of Y (Largest of Three Rolls)",
       x = "Value of Y",
       y = "Frequency") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 20),
    axis.text.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    axis.title.y = element_text(size = 20),
    plot.title = element_text(size = 20)
  )
Lesson 2 Slides