Monte Carlo Examples

Lecture 26

March 27, 2026

Review

Monte Carlo

Stochastic simulation: propagate random variables $X$ through model $f$ and want to estimate $\mathbb{E}[f(X)] \approx \sum_{i=1}^n f(X_i)$ using samples $X_1, \ldots, X_n$.
Example: probability of a standard Gaussian between $[0.8, 0.9]$: draw $X_1, \ldots, X_n \sim N(0, 1)$, \[f(x) = \mathbb{I}_{[0.8, 0.9]}(x).\]

Statistics of Monte Carlo

Denote $Y = f(X)$, $\mathbb{E}[Y] = \mu$, and $\sum_{i=1}^n Y_i = \tilde{\mu}_n$.
MC is unbiased: $\mathbb{E}[\tilde{\mu}_n] = \mu$;
Monte Carlo standard error (MCSE): $Y_i$ i.i.d., then $\text{SE}(\tilde{\mu}_n) = \sigma_Y / \sqrt{n}$.

Monte Carlo Confidence Intervals

The $\alpha$-confidence interval is: \[\tilde{\mu}_n \pm \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \frac{\sigma_Y}{\sqrt{n}}\]

For example, the 95% confidence interval is \[\tilde{\mu}_n \pm 1.96 \frac{\sigma_Y}{\sqrt{n}}.\]

How Many Samples Do I Need?

Try:

Run a “pilot” MC with a relatively small number of samples to estimate $\mathbb{V}\left[Y_i\right]$.
Pick a target MCSE $\varepsilon$, then we need $N \approx \mathbb{V}\left[Y_i\right] / \varepsilon^2$ samples.
Run the “real” MC with sample size $N$ and check if the MCSE is right ($N$ might be off due to error in the variance estimate).
If not, run more.

Airshed Modeling

Airshed Model

Figure 1: Illustration of the airshed, including notation.

Goal: Find the probability of exceeding the 1-hour SO₂ average exposure concentration standard, which is 0.14 ppm.

\[\mathbb{P}[\text{SO}_2(\theta) > 0.14] = \int \mathbb{I}(\text{SO}_2(\theta) > 0.14) p(\theta) d\theta\]

Airshed Model

Figure 2: Illustration of the airshed, including notation.

\[\frac{dC}{dt} = \frac{u}{L} C_\text{in} + \frac{S-D}{WHL} - \left(\frac{u}{L} + k\right)C\]

Forward Euler Discretization

\[ \frac{dC}{dt} = \frac{u}{L} C_\text{in}(t) + \frac{S-D}{WHL} - \left(\frac{u}{L} + k\right)C\]

\[\Rightarrow \frac{C(t+1) - C(t)}{\Delta t} = \frac{u}{L} C_\text{in}(t) + \frac{R}{WHL} - \left(\frac{u}{L} + k\right)C(t)\]

\[\bbox[yellow, 10px, border:5px solid red]{C(t+1) = \left(1 - \Delta t\left(\frac{u}{L} + k\right)\right)C(t) + \Delta t \left(\frac{u}{L} C_\text{in}(t) + \frac{R}{WHL}\right)} \]

Monte Carlo Samples

Code

nsamp = 1000
u = rand(LogNormal(log(2), 1), nsamp)
Cin = rand(LogNormal(log(0.16), 0.12), nsamp)
R = rand(Normal(0.5, 0.5), nsamp)

p1 = histogram(u, ylabel="count", xlabel=L"$u$ (m/s)", label=false,  size=(400, 450))
p2 = histogram(Cin, ylabel="count", xlabel=L"$C_{in}$ (ppm)", label=false, size=(400, 450))
p3 = histogram(R, ylabel="count", xlabel=L"$R$ (ppm/hr)", label=false,  size=(400, 450))
display(p1)
display(p2)
display(p3)

(a) Monte Carlo samples for the airshed model.

Simulation Results: Pilot Run

Code

# other parameters
C₀ = 0.07
T = 60
k = 0.3
W = 4
H = 5
L = 4
# conduct simulation
P = u / L .* Cin
l = u / L .+ k
C2 = zeros(T*100 + 1, nsamp)
S = 0:0.01:T
for (i, t) in pairs(S)
    if i == 1
        C2[i, :] .= C₀
    else
        C2[i, :] = (1 .- 0.01*l) .* C2[i-1, :] .+ 0.01 * P .+ 0.01 * R / (H * W * L)
    end
end
mean_SO2 = map(mean, eachcol(C2)) # calculate means
# plot histogram
p1 = histogram(mean_SO2, xlabel="1-Hour Average Exposure (ppm)", ylabel="Count", legend=false, tickfontsize=16, guidefontsize=18)
vline!(p1, [0.14], color=:red, linestyle=:dash, linewidth=3)
xticks!(p1, 0:0.04:0.3)
xaxis!(p1, xminorticks=2)
plot!(p1, size=(600, 450))
# plot cdf
p2 = plot(sort(mean_SO2), (1:nsamp) ./ nsamp, xlabel="1-Hour Average Exposure (ppm)", ylabel="Cumulative Probability", legend=false, linewidth=3)
vline!(p2, [0.14], linestyle=:dash, color=:red, linewidth=3, minorgrid=true)
xticks!(p2, 0:0.04:0.3)
xaxis!(p2, xminorticks=2)
yaxis!(p2, yminorticks=5)
plot!(p2, size=(600, 450))

display(p1)
display(p2)

What Sample Size?

Code

# show Monte Carlo estimate stabilization
avg_mc_out = zeros(nsamp)
avg_mc_out[1] = mean_SO2[1] > 0.14
std_mc_out = zeros(nsamp)
std_mc_out[1] = 0
for i = 2:nsamp
    avg_mc_out[i] = (avg_mc_out[i-1] * (i-1) + (mean_SO2[i] > 0.14)) / i
    std_mc_out[i] = 1/sqrt(i) * std(mean_SO2[1:i] .> 0.14)
end
p = plot(avg_mc_out, xlabel="Monte Carlo Iteration", ylabel="Probability", left_margin=3mm, legend=:false, ribbon=1.96*std_mc_out, fillalpha=0.3, linewidth=2, fillcolor=:red, right_margin=5mm, minorgrid=true)
ylims!(p, (0, 0.3))
yaxis!(p, yminorticks=5)
plot!(p, size=(600, 450))
display(p)

Figure 5: Monte Carlo estimation for the airshed model.

$\hat{\sigma}_n^2 = \mathbb{V}\left[\mathbb{I}[x_{1:n} > 0.14]\right]$

$\hat{\sigma}_n^2 =$ 0.13

If we want to get the MCSE less than 0.01, then we need $N \approx \tilde{\sigma}_n * 10,000$ samples.

Monte Carlo Estimation: Full Run

Let’s round up and use 20,000 samples.

Code

# show Monte Carlo estimate stabilization
# get new samples
nsamp = 20_000
u = rand(LogNormal(log(2), 1), nsamp)
Cin = rand(LogNormal(log(0.16), 0.12), nsamp)
R = rand(Normal(0.5, 0.5), nsamp)
# conduct simulation
P = u / L .* Cin
l = u / L .+ k
C2 = zeros(T*100 + 1, nsamp)
S = 0:0.01:T
for (i, t) in pairs(S)
    if i == 1
        C2[i, :] .= C₀
    else
        C2[i, :] = (1 .- 0.01*l) .* C2[i-1, :] .+ 0.01 * P .+ 0.01 * R / (H * W * L)
    end
end
mean_SO2 = map(mean, eachcol(C2)) # calculate means

# compute MC Estimate
avg_mc_out = zeros(nsamp)
avg_mc_out[1] = mean_SO2[1] > 0.14
std_mc_out = zeros(nsamp)
std_mc_out[1] = 0
for i = 2:nsamp
    avg_mc_out[i] = (avg_mc_out[i-1] * (i-1) + (mean_SO2[i] > 0.14)) / i
    std_mc_out[i] = 1/sqrt(i) * std(mean_SO2[1:i] .> 0.14)
end
p = plot(avg_mc_out, xlabel="Monte Carlo Iteration", ylabel="Probability", left_margin=3mm, legend=:false, ribbon=1.96*std_mc_out, fillalpha=0.3, linewidth=2, fillcolor=:red, right_margin=5mm, minorgrid=true)
ylims!(p, (0.1, 0.2))
yaxis!(p, yminorticks=5)
plot!(p, size=(600, 450))
display(p)

Figure 6: Monte Carlo estimation for the airshed model.

MCSE: 0.002

Estimate: 0.13 $\pm$ 0.005

Flood Modeling

Annual Expected Damages

Suppose we have a client who wants us to advise them on their expected flood losses from a nearby stream. The house is worth $400,000 and floods when stream water levels exceed 1.5m.

Code

# load data
true_dist = GeneralizedExtremeValue(0.9, 0.5, 0.16)
depths = rand(true_dist, 40)
# fit GEV distribution
gev_loglik(y, θ) = sum(logpdf.(GeneralizedExtremeValue(θ[1], θ[2], θ[3]), y))
p0 = [0.1, 2.0, 0.0]
lb = [0.0, 0.01, -1.0]
ub = [2.0, 5.0, 1.0]
mle_out = optimize(p -> -gev_loglik(depths, p), lb, ub, p0)
θ_mle = mle_out.minimizer
wl_dist = GeneralizedExtremeValue(θ_mle[1], θ_mle[2], θ_mle[3])

p1 = histogram(depths, legend=false, xlabel="Stream Depth (m)", ylabel="Density", normalize=:pdf, color=:gray)
plot!(p1, wl_dist, color=:red, linewidth=3, normalize=:pdf)
plot!(p1, xlims=(0, 5))
plot!(p1, size=(500, 450))

Depth-Damage Functions

Translating flood depths to damages requires a depth-damage function. For this problem:

\[d(h) = \mathbb{I}[h > 0] \frac{1}{1+\exp(-k(x-x_0))},\]

where $k=1.25$ and $x_0=2$.

Code

h = 0:0.01:5
dd(h, k, x₀) = (h .> 0) * 1 ./ (1 .+ exp.(-k * (h .- x₀)))
plot(h, dd(h, 1.25, 2), xlabel="Flood Depth (m)", ylabel="Fraction of Value Damaged", legend=false)
ylims!(0, 1)
plot!(size=(500, 450))

Estimating Expected Annual Damages

We want to estimate $\mathbb{E}[d(H)]$, $H + 1.5 \sim \text{GEV}(\mu, \sigma, \xi)$.

How many MC samples do we need? Pilot with 1,000 samples:

Figure 9: Histogram of initial depth-damage functions

$\tilde{\sigma}_n^2 =$ 6.7e8

If we want to get the MCSE less than $100, then we need $N \approx \tilde{\sigma}_n / 10,000$ samples.

Monte Carlo Estimate

Let’s round up and use 100,000 samples:

MCSE: 82.0

Estimate: 11157.0 $\pm$ 160.0

Extensions to Optimization

Stochastic Optimization

Problems typically have the form \[\min_{a \in \mathcal{A}} \mathbb{E}_X[f(X; a]\] where

$\mathcal{A}$ is a set of decision alternatives;
$f(x; a)$ is the objective function with random realization $x$ and action $a$.

There may also be some constraints on what actions are allowable.

Example: Home Elevation

Let’s extend the previous example. Suppose our client now wants us to recommend how high they should elevate their house to minimize flooding damages.

$a \in \mathcal{A}$ are elevation heights;
$f(x; a)$ is the net cost of elevating with height $a$ and water level realizations $x$: \[f(x; a) = \underbrace{D(x; a)}_{\text{flood damages}} + \underbrace{C(a)}_{\text{investment cost}}\]

Some Notes

Infrastructure investment problems typically have a design horizon: a period of time over which they are designed to be effective. Let’s use 30 years here.
$X$ is no longer water levels from a single year, as before, but stochastic sequences of 30-year water levels.

Some Notes

Due to the time value of money, we need to discount future benefits with some discount rate $\gamma$ to calculate the net present value (NPV) of benefits: \[NPV = \sum_{t=0}^T x_t (1-\gamma)^t.\]
Cost function: $C(\Delta h) = \mathbb{I}[h > 0](C_0 + C_1\Delta h)$.

Monte Carlo Examples

Review

Monte Carlo

Statistics of Monte Carlo

Monte Carlo Confidence Intervals

How Many Samples Do I Need?

Airshed Modeling

Airshed Model

Airshed Model

Forward Euler Discretization

Monte Carlo Samples

Simulation Results: Pilot Run

What Sample Size?

Monte Carlo Estimation: Full Run

Flood Modeling

Annual Expected Damages

Depth-Damage Functions

Estimating Expected Annual Damages

Monte Carlo Estimate

Extensions to Optimization

Stochastic Optimization

Example: Home Elevation

Some Notes

Some Notes

Upcoming Schedule

Next Classes

References

References (Scroll for Full List)