Lecture 12
February 20, 2026
Dependence: History or sequencing of the data matters
\[p(y_t \mid y_1, \ldots, y_{t-1}) = f(y_1, \ldots, y_{t-1})\]
Serial dependence captured by autocorrelation:
\[\varsigma(h) = \rho(y_t, y_{t+h}) = \frac{\text{Cov}[y_t, y_{t+h}]}{\mathbb{V}[y_t]} \]
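For intuition, we can compare this definition to a sample estimate. A quick sketch (`y` is a placeholder series; `autocor` is from StatsBase.jl, which is used again below for the ACF/PACF plots):

using StatsBase, Statistics
y = randn(100) # placeholder series for illustration
h = 1          # lag
ς_hat = cov(y[1:end-h], y[1+h:end]) / var(y) # sample Cov[y_t, y_{t+h}] / V[y_t]
@show ς_hat
@show autocor(y, [h]) # StatsBase version (edge/normalization conventions differ slightly)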
AR(1) (autoregressive of order 1):
\[ \begin{align*} y_t &= \alpha + \rho y_{t-1} + \varepsilon_t \\ \varepsilon_t &\sim \mathcal{D}(\theta) \\ y_0 &\sim \mathcal{G}(\psi) \end{align*} \]
\(\mathbb{E}[\varepsilon_t] = 0\), \(\text{cor}(\varepsilon_i, \varepsilon_j) = 0\) for \(i \neq j\), \(\text{cor}(\varepsilon_i, y_0) = 0\)
Assume a stationary, zero-mean process (\(\alpha = 0\)).
Need to know \(\text{Cov}[y_t, y_{t+h}]\) for arbitrary \(h\).
\[ \begin{align*} \text{Cov}[y_t, y_{t-h}] &= \text{Cov}\Big[\rho^h y_{t-h} + \sum_{j=0}^{h-1} \rho^j \varepsilon_{t-j},\; y_{t-h}\Big] \\ &= \rho^h \text{Cov}[y_{t-h}, y_{t-h}] \\ &= \rho^h \frac{\sigma^2}{1-\rho^2}, \end{align*} \]
since the \(\varepsilon\) terms are uncorrelated with \(y_{t-h}\) and \(\mathbb{V}[y_{t-h}] = \sigma^2 / (1 - \rho^2)\) is the stationary variance.
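We can sanity-check the lag-\(h\) formula by Monte Carlo. A minimal sketch, with illustrative parameter values:

using Distributions, Statistics
ρ, σ, h, N = 0.6, 0.25, 3, 100_000
y_start = zeros(N) # draws of y_{t-h}
y_end = zeros(N)   # corresponding draws of y_t
for n in 1:N
    y = rand(Normal(0, sqrt(σ^2 / (1 - ρ^2)))) # stationary draw for y_{t-h}
    y_start[n] = y
    for _ in 1:h # iterate the AR(1) recursion h times
        y = ρ * y + rand(Normal(0, σ))
    end
    y_end[n] = y
end
@show cov(y_end, y_start)   # Monte Carlo estimate
@show ρ^h * σ^2 / (1 - ρ^2) # analytical value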
\[ \begin{align*} \mathbf{y} &\sim \mathcal{N}(\mathbf{0}, \Sigma) \\ \Sigma &= \frac{\sigma^2}{1 - \rho^2} \begin{pmatrix}1 & \rho & \ldots & \rho^{T-1} \\ \rho & 1 & \ldots & \rho^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \ldots & 1\end{pmatrix} \end{align*} \]
An “easier” approach (often more numerically stable) is to whiten the series, sampling or computing likelihoods sequentially:
\[ \begin{align*} y_0 & \sim \mathcal{N}\left(0, \frac{\sigma^2}{1 - \rho^2}\right) \\ y_t &\sim \mathcal{N}(\rho y_{t-1}, \sigma^2) \end{align*} \]
using Distributions # for Normal, MvNormal, logpdf

function ar1_loglik_whitened(θ, dat)
    # might need to include mean or trend parameters as well;
    # subtract the trend from the data to make this mean-zero in that case
    ρ, σ = θ
    T = length(dat)
    ll = 0.0 # initialize log-likelihood accumulator
    for i in 1:T
        if i == 1
            # first observation: stationary distribution
            ll += logpdf(Normal(0, sqrt(σ^2 / (1 - ρ^2))), dat[i])
        else
            # later observations: conditional on the previous value
            ll += logpdf(Normal(ρ * dat[i-1], σ), dat[i])
        end
    end
    return ll
end
function ar1_loglik_joint(θ, dat)
    # might need to include mean or trend parameters as well;
    # subtract the trend from the data to make this mean-zero in that case
    ρ, σ = θ
    T = length(dat)
    # compute all of the pairwise lags |i - j|;
    # this is an "outer product"; syntax will differ wildly by language
    H = abs.((1:T) .- (1:T)')
    P = ρ.^H # exponentiate ρ by each lag
    Σ = σ^2 / (1 - ρ^2) * P
    ll = logpdf(MvNormal(zeros(T), Σ), dat)
    return ll
end

ρ = 0.6
σ = 0.25
T = 25
ts_sim = zeros(T)
# simulate synthetic AR(1) series
for t = 1:T
    if t == 1
        ts_sim[t] = rand(Normal(0, sqrt(σ^2 / (1 - ρ^2))))
    else
        ts_sim[t] = rand(Normal(ρ * ts_sim[t-1], σ))
    end
end
plot(1:T, ts_sim, linewidth=3, xlabel="Time", ylabel="Value", title=L"$ρ = 0.6, σ = 0.25$")
plot!(size=(600, 500))

lb = [-0.99, 0.01]
ub = [0.99, 5]
init = [0.6, 0.3]
optim_whitened = Optim.optimize(θ -> -ar1_loglik_whitened(θ, ts_sim), lb, ub, init)
θ_wn_mle = round.(optim_whitened.minimizer; digits=2)
@show θ_wn_mle;
optim_joint = Optim.optimize(θ -> -ar1_loglik_joint(θ, ts_sim), lb, ub, init)
θ_joint_mle = round.(optim_joint.minimizer; digits=2)
@show θ_joint_mle;

Both formulations recover the same estimates:

θ_wn_mle = [0.62, 0.23]
θ_joint_mle = [0.62, 0.23]
\[y_t = \underbrace{x_t}_{\text{fluctuations}} + \underbrace{z_t}_{\text{trend}}\]
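As a concrete illustration of this decomposition, a sketch with made-up values (any explanatory model could play the role of the trend):

using Distributions
T = 100
z = 0.05 .* (1:T) # trend component z_t (a stand-in for an explanatory model)
x = zeros(T)      # AR(1) fluctuations x_t
x[1] = rand(Normal(0, sqrt(0.25^2 / (1 - 0.6^2))))
for t in 2:T
    x[t] = 0.6 * x[t-1] + rand(Normal(0, 0.25))
end
y = z .+ x # observed series: trend + autocorrelated fluctuations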
Dragons: Extrapolating trends identified using “curve-fitting” is highly fraught, complicating projections.
Better to have an explanatory model and then fit autoregressive models to the residuals…

Source: Reddit (original source unclear…)
Figure 3: Global mean sea level and temperature data for Problem 1.
pacf_slr = pacf(sl_dat[!, :GMSLR], 0:5)
p1 = plot(0:5, autocor(sl_dat[!, :GMSLR], 0:5), marker=:circle, line=:stem, linewidth=3, markersize=8, legend=false, ylabel="Autocorrelation", xlabel="Time Lag", size=(550, 500))
p2 = plot(0:5, pacf_slr, marker=:circle, line=:stem, markersize=8, linewidth=3, legend=false, ylabel="Partial Autocorrelation", xlabel="Time Lag", size=(550, 500))
display(p1)
display(p2)

Simple model of sea-level rise from Grinsted et al. (2010):
\[\begin{aligned} \frac{dS}{dt} &= \frac{S_\text{eq} - S}{\tau} \\ S_\text{eq} &= aT + b, \end{aligned} \]
Note: Hypothetically, this type of model can capture all of the autocorrelation (or at least the non-stationarity) in the data, since the state \(S(t)\) adjusts gradually toward its equilibrium and therefore carries serial dependence itself.
So a good strategy is to try to calibrate this model without assuming AR(1) residuals, then see if they’re necessary (HW3).
There are many ways to discretize the model; we use the simplest (forward Euler), based on \(\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{y(x + \Delta x) - y(x)}{\Delta x}\):
\[ \begin{aligned} \frac{S(t+1) - S(t)}{\Delta t} &= \frac{S_\text{eq}(t) - S(t)}{\tau}, \qquad S_\text{eq}(t) = a T(t) + b \\[1em] \Rightarrow S(t+1) &= S(t) + \Delta t \, \frac{S_\text{eq}(t) - S(t)}{\tau} \end{aligned} \]
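A minimal sketch of this forward-Euler scheme (the function and argument names here are illustrative assumptions; `temps` would hold the temperature series \(T(t)\)):

function grinsted_sim(a, b, τ, S0, temps; Δt=1.0)
    S = zeros(length(temps))
    S[1] = S0 # initial sea level
    for t in 1:length(temps)-1
        S_eq = a * temps[t] + b # equilibrium sea level S_eq(t)
        S[t+1] = S[t] + Δt * (S_eq - S[t]) / τ # Euler step
    end
    return S
end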
We can treat this like any other regression, but use the model output \(S(t)\) in place of, e.g., a linear predictor:
Assuming i.i.d. Gaussian residuals: \[y_t = S(t) + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2)\]
If residuals are autocorrelated: \[y_t = S(t) + \omega_t, \quad \omega_t = \rho \omega_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma^2)\]
For a given set of parameters: simulate \(S(t)\), compute the residuals \(y_t - S(t)\), and evaluate their likelihood under the chosen residual model (a sketch follows).
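A sketch of that evaluation under the AR(1) residual model, reusing the whitened formulation from earlier (`S_pred` is a hypothetical vector of model output, e.g. from a simulation like the Euler sketch above):

function model_ar1_loglik(ρ, σ, S_pred, dat)
    ω = dat .- S_pred # residuals y_t - S(t)
    # first residual: stationary distribution
    ll = logpdf(Normal(0, sqrt(σ^2 / (1 - ρ^2))), ω[1])
    for t in 2:length(ω)
        ll += logpdf(Normal(ρ * ω[t-1], σ), ω[t])
    end
    return ll
end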
Can go further and separate out observation errors (typically white noise) from model-data discrepancies (systematic differences in state prediction, often autocorrelated):
Computer Model: \[\eta(\underbrace{\theta}_{\substack{\text{calibration}\\\text{variables}}}; \underbrace{x}_{\substack{\text{control}\\\text{variables}}})\]
Observations: \[\mathbf{y} \sim p(\underbrace{\zeta(\mathbf{x})}_{\substack{\text{expected}\\\text{state}}})\]
Write \[\zeta(\mathbf{x}) = \delta(\eta(\theta; x))\] where \(\delta\) represents the discrepancy between the model output and the expected state.
Then the probability model is: \[\mathbf{y} \sim p(\delta(\eta(\theta; x))).\]
Most common setting (e.g. Brynjarsdóttir & O’Hagan (2014)):
\[\mathbf{y} = \underbrace{\eta(\theta; \mathbf{x})}_{\text{model}} + \underbrace{\delta(\mathbf{x})}_{\text{discrepancy}} + \underbrace{\varepsilon}_{\text{error}}\]
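If we further assume an AR(1) discrepancy \(\delta\) and white-noise observation error \(\varepsilon\), the implied covariance is just the AR(1) covariance plus a diagonal term. A sketch under those assumptions (`η_pred` is a hypothetical vector of model output):

using LinearAlgebra # for I

function discrepancy_loglik(ρ, σ_δ, σ_ε, η_pred, dat)
    T = length(dat)
    H = abs.((1:T) .- (1:T)') # pairwise lags, as in ar1_loglik_joint
    Σ = σ_δ^2 / (1 - ρ^2) .* ρ.^H + σ_ε^2 * I # AR(1) discrepancy + white-noise error
    return logpdf(MvNormal(η_pred, Σ), dat)
end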
AR(p) (autoregressive of order \(p\)) generalizes the AR(1) model to \(p\) lags:
\[y_t = \alpha + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \ldots + \rho_p y_{t-p} + \varepsilon_t\]
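A quick simulation sketch for an AR(p) process (zero-intercept case; burn-in is used in place of an exact stationary initialization, and the coefficient values are illustrative):

using Distributions
ρs = [0.5, 0.2] # ρ_1, ..., ρ_p
p, T_sim, burn = length(ρs), 100, 500
y_arp = zeros(T_sim + burn + p)
for t in (p+1):length(y_arp)
    y_arp[t] = sum(ρs[i] * y_arp[t-i] for i in 1:p) + rand(Normal(0, 0.25))
end
y_arp = y_arp[end-T_sim+1:end] # discard burn-in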
Next Week: Extreme Values (Theory and Models)
Next Unit: Hypothesis Testing, Model Evaluation, and Comparison
HW2: Due Friday at 9pm.
Exercises: Available after class (will include GLMs + time series).
HW3: Assigned on 3/2, due 3/13.