Lesson 3: Introduction to Simulations

Nicky Wakim

2025-10-06

Learning Objectives

  1. intro to simulations
  2. discrete RVs simmies
  3. continuous RVs simmies

Recall: Outcomes, events, sample spaces

Definition: Outcome

An outcome is one possible result of a random phenomenon.

Definition: Sample Space

The sample space \(S\) is the set of all possible outcomes.

Definition: Event

An event is a subset of the sample space: a collection of outcomes that can contain several outcomes, a single outcome, or no outcomes at all.

When thinking about events, think about outcomes that you might be asking the probability of. For example, what is the probability that you get a heads or a tails in one flip? (Answer: 1)

What is a simulation?

A probability model for a random phenomenon includes a sample space, events, random variables, and a probability measure.

Simulation

Simulation involves using a probability model to artificially recreate a random phenomenon, many times, usually using a computer.

We simulate outcomes and values of random variables according to the model’s assumptions.

The Foundation: Relative Frequencies

  • Probabilities can be interpreted as long-run relative frequencies
  • By simulating a random phenomenon a large number of times, we can approximate the probability of an event by calculating the relative frequency of its occurrence.
  • Simulation is a powerful tool for approximating probabilities, distributions of random variables, long-run averages, and other characteristics.
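
To make the long-run idea concrete, here is a minimal sketch that flips a simulated fair coin many times and tracks the relative frequency of heads as the number of flips grows. The seed, the number of flips, and the variable names are illustrative choices, not part of the lesson.

```r
set.seed(3)  # arbitrary seed so the run is reproducible

# Simulate 10,000 flips of a fair coin
n_flips <- 10000
flips <- sample(x = c("H", "T"), size = n_flips, replace = TRUE)

# Running relative frequency of heads after each flip
running_rel_freq <- cumsum(flips == "H") / seq_len(n_flips)

# Relative frequency of heads after 10, 100, 1,000, and 10,000 flips
running_rel_freq[c(10, 100, 1000, 10000)]
```

The early values can be far from 0.5, while the later ones should settle close to it.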

Still confused about long-run relative frequencies?

Seeing Theory, Chapter 1: Basic Probability, Chance Events

4 (S)teps of a Simulation

  1. Set up

Define the probability space and related random variables and events, including assumptions.

  2. Simulate

Run the simulation to generate outcomes according to the assumptions.

  3. Summarize

Analyze the output using plots and summary statistics like relative frequencies and averages.

  4. Sensitivity analysis

Investigate how results change when assumptions or parameters of the model are altered.
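
The four steps can be sketched in R as follows. This is only an illustration under assumed values: the shooter's make probability, the five attempts, the event "makes every attempt", and the helper name simulate_all_made() are all hypothetical choices.

```r
set.seed(3)  # arbitrary seed so the run is reproducible

# 1. Set up: a shooter makes each free throw independently with probability
#    p_make; the event of interest is "makes all n_shots attempts".
#    (p_make, n_shots, and the event are illustrative assumptions.)
simulate_all_made <- function(p_make, n_shots = 5, reps = 10000) {
  # 2. Simulate: each column is one repetition of n_shots attempts
  shots <- replicate(
    reps,
    sample(x = c("make", "miss"), size = n_shots, replace = TRUE,
           prob = c(p_make, 1 - p_make))
  )
  # 3. Summarize: relative frequency of repetitions with no misses
  all_made <- apply(shots, 2, function(attempts) all(attempts == "make"))
  mean(all_made)
}

est_90 <- simulate_all_made(p_make = 0.9)

# 4. Sensitivity analysis: change the assumed make probability and rerun
est_80 <- simulate_all_made(p_make = 0.8)

c(p90 = est_90, p80 = est_80)
```

Under the independence assumption, the exact probabilities are 0.9^5 and 0.8^5, so the two estimates can be checked against those values.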

Types of simulations

  • We can simulate discrete and continuous random variables

  • Discrete random variables (RVs) are a little easier to understand
    • We can use “tactile” simulations to give us a sense of what the computer is doing

  • Continuous random variables (RVs) are more complex
    • Using the computer is the best way to simulate these

Learning Objectives

discrete RVs simmies

Tactile simulations

  • We’ve already seen coin flips!
  • We can also use cards, dice, and other objects to simulate discrete random variables

  • Another common method: A box model uses a box/hat/bucket of “tickets” with labels to represent possible outcomes
    • We can change how many tickets carry each label so that the box matches the outcome probabilities
    • Coin flip as box model: A box with two tickets (H and T).
    • 90% free throw shooter: A box with 10 tickets (9 “make” and 1 “miss”).
    • Draws can be with replacement (e.g., coin flips) or without replacement (e.g., dealing a poker hand).
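
A box model translates fairly directly into R. The sketch below builds the free throw box as a vector of tickets and draws from it both ways; the variable names and the number of draws are illustrative.

```r
set.seed(3)  # arbitrary seed so the run is reproducible

# Box for a 90% free throw shooter: 9 "make" tickets and 1 "miss" ticket
box <- c(rep("make", 9), "miss")

# With replacement: each ticket goes back in the box, so every draw
# has the same 9-in-10 chance of "make"
with_replacement <- sample(x = box, size = 20, replace = TRUE)

# Without replacement: tickets are used up, so 10 draws empty the box
without_replacement <- sample(x = box, size = 10, replace = FALSE)

table(with_replacement)
table(without_replacement)
```

Drawing all 10 tickets without replacement always gives exactly 9 makes and 1 miss, in a random order; drawing with replacement does not have that guarantee.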

Tactile simulations: Spinners

  • If draws are with replacement, we can use a single circular spinner instead of a box

  • The area of each sector corresponds to the probability of an outcome

  • Let’s Google “spinner” and see what we find!
    • We can try to make a spinner for a coin flip
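
One way to mimic a spinner with unequal sectors in R is the prob argument of sample(), which plays the role of the sector areas. The two sectors and their sizes below are an assumed example (the 90% free throw shooter), not a spinner from the lesson.

```r
set.seed(3)  # arbitrary seed so the run is reproducible

# Spinner with two sectors: "make" covers 90% of the circle, "miss" covers 10%
spins <- sample(x = c("make", "miss"), size = 10000, replace = TRUE,
                prob = c(0.9, 0.1))

# Relative frequency of landing in each sector
prop.table(table(spins))
```

The relative frequencies should land near the sector areas, 0.9 and 0.1.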

Example: Dice Rolls

Example: Simulating Two Rolls of a Fair Four-Sided Die

Let \(X\) be the sum of two rolls of a fair four-sided die, and \(Y\) be the larger of the two rolls. How would we simulate \(X\) and \(Y\) separately?

How do we extend a single simulation to useful simulations?

  • We can also use R to sample from the box or spinner

  • The sample() function in R allows us to simulate equally likely draws

  • For example, we can simulate a coin flip

sample(x = c("H", "T"), size = 1, replace = TRUE)
[1] "T"

  • We can simulate our example of the four-sided die

sample(x = c(1, 2, 3, 4), size = 2, replace = TRUE)
[1] 1 2

Learning Objectives

continuous RVs simmies

The Problem

  • Regina and Cady’s arrival times are between noon (0 minutes) and 1 PM (60 minutes).
  • Each arrival time is a continuous random variable on the interval \([0, 60]\).
  • We cannot use a box model because there are uncountably many possible outcomes.
  • A spinner with a continuous axis is the tactile analog.

Uniform Model

  • We can construct a spinner where the needle is equally likely to land on any value between 0 and 60.
  • We would spin it twice to get Regina’s and Cady’s arrival times.
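
In R, the uniform spinner corresponds to runif(). The sketch below spins it once per person in each repetition, assuming the two arrival times are independent; the seed, the number of repetitions, and the variable names are illustrative.

```r
set.seed(3)  # arbitrary seed so the run is reproducible
reps <- 10000

# One spin per person per repetition: minutes after noon, uniform on [0, 60]
regina <- runif(n = reps, min = 0, max = 60)
cady <- runif(n = reps, min = 0, max = 60)

# First few simulated pairs of arrival times
head(data.frame(regina, cady))

# Every simulated value stays inside the interval
range(regina)
```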

Normal Model

  • What if Regina is more likely to arrive around 12:30?
  • We can create a spinner where values near 30 are “stretched” to occupy more area.
  • This is an example of a Normal distribution, where probabilities are concentrated around a central value.
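
The "stretched" spinner can be approximated with rnorm(). The lesson only says arrivals cluster around 12:30, so the mean of 30 follows from that, while the standard deviation of 10 minutes is purely an assumption for illustration.

```r
set.seed(3)  # arbitrary seed so the run is reproducible
reps <- 10000

# Normal model: arrival times centered at 30 minutes after noon.
# sd = 10 is an assumed spread, not a value from the lesson.
regina_normal <- rnorm(n = reps, mean = 30, sd = 10)

# Uniform model for comparison
regina_uniform <- runif(n = reps, min = 0, max = 60)

# Share of arrivals between 12:20 and 12:40 under each model
near_normal <- mean(abs(regina_normal - 30) <= 10)
near_uniform <- mean(abs(regina_uniform - 30) <= 10)
c(normal = near_normal, uniform = near_uniform)

# The Normal model is unbounded: share of simulated values outside [0, 60]
mean(regina_normal < 0 | regina_normal > 60)
```

A larger share of the Normal arrivals falls near 12:30 than under the Uniform model. A small fraction of Normal values can fall outside the noon to 1 PM window, which is one reason to treat this model as an approximation.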

Learning Objectives

doing it in R

Computer Simulation in R

  • We can use R to perform simulations without a dedicated package like Symbulate.
  • R has built-in functions for generating random numbers from different distributions.

Simulating Outcomes

  • The sample() function is a powerful tool for simulating draws from a box model.

# A box model for two rolls of a fair four-sided die
rolls <- sample(x = c(1, 2, 3, 4), size = 2, replace = TRUE)

# Simulate 10 repetitions, with each column representing a pair of rolls
reps <- 10
replicate(reps, sample(x = 1:4, size = 2, replace = TRUE))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    4    1    1    4    2    4    4    1     3
[2,]    3    2    2    3    1    1    1    1    1     3

  • sample() can draw either with replacement (replace = TRUE) or without replacement (replace = FALSE).

Simulating Random Variables

We can apply functions to the simulated outcomes to get the values of our random variables.

Let’s simulate X (sum) and Y (max).

reps <- 1000
simulations <- replicate(reps, sample(x = 1:4, size = 2, replace = TRUE))

# Calculate X (the sum) for each repetition
# (MARGIN = 2 applies the function to each column, i.e., each pair of rolls)
X_simulated <- apply(simulations, 2, sum)

# Calculate Y (the max) for each repetition
Y_simulated <- apply(simulations, 2, max)

# Display the first 10 values for X and Y
head(X_simulated, 10)
 [1] 3 6 8 4 5 8 4 5 4 6
head(Y_simulated, 10)
 [1] 2 3 4 2 3 4 3 4 2 3
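
As a possible "Summarize" step, the simulated values can be turned into relative frequencies and compared with the exact distribution. The sketch below reruns the simulation so that it is self-contained; the seed, the larger number of repetitions, and the names ending in _rel_freq and _exact are illustrative choices.

```r
set.seed(3)  # arbitrary seed so the run is reproducible
reps <- 10000

# Each column is one repetition: two rolls of a fair four-sided die
simulations <- replicate(reps, sample(x = 1:4, size = 2, replace = TRUE))
X_simulated <- apply(simulations, 2, sum)  # X = sum of the two rolls
Y_simulated <- apply(simulations, 2, max)  # Y = larger of the two rolls

# Approximate distributions: relative frequency of each simulated value
x_rel_freq <- table(X_simulated) / reps
y_rel_freq <- table(Y_simulated) / reps

# Exact distributions, found by listing the 16 equally likely pairs
pairs <- expand.grid(first = 1:4, second = 1:4)
x_exact <- table(pairs$first + pairs$second) / nrow(pairs)
y_exact <- table(pmax(pairs$first, pairs$second)) / nrow(pairs)

x_rel_freq
x_exact

# Long-run average of X; the exact expected value is 5
mean(X_simulated)
```

With this many repetitions, the relative frequencies should sit close to the exact probabilities, and the gap should shrink as reps increases.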