Lecture 11
February 18, 2026
Many environmental datasets involve time series, i.e., repeated observations over time:
\[X = \{X_t, X_{t+h}, X_{t+2h}, \ldots, X_{t+nh}\}\]
More often, the sampling time is dropped from the notation:
\[X = \{X_1, X_2, X_3, \ldots, X_n\}\]
Dependence: History or sequencing of the data matters
\[p(y_t) = f(y_1, \ldots, y_{t-1})\]
Serial dependence captured by autocorrelation:
\[\varsigma(i) = \rho(y_t, y_{t+i}) = \frac{\text{Cov}[y_t, y_{t+i}]}{\mathbb{V}[y_t]} \]
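As a concrete sketch of this estimator (the toy series is purely illustrative, and StatsBase's autocor is shown only as a cross-check):

using Statistics, StatsBase

y_ex = [2.1, 2.3, 2.0, 2.6, 2.8, 2.5, 3.0, 2.9, 3.2, 3.1] # illustrative series
i = 1
cov(y_ex[1:end-i], y_ex[(1+i):end]) / var(y_ex) # sample estimate of ς(i)
autocor(y_ex, [i]) # StatsBase's estimator (slightly different normalization)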
I.I.D. Data: All data drawn independently from the same distribution, \(y_i \sim \mathcal{D}\).
Equivalent for time series \(\mathbf{y} = \{y_t\}\) is stationarity.
using CSV, DataFrames, Distributions, GLM, LaTeXStrings, Optim, Plots # packages used below

tds = let
    fname = "data/tds/cuyaTDS.csv" # CHANGE THIS!
    tds = DataFrame(CSV.File(fname))
    tds[!, [:date, :discharge_cms, :tds_mgL]]
end

# tds_riverflow_loglik: compute the log-likelihood for the TDS regression model
# θ: vector of model parameters (coefficients β₀ and β₁ and stdev σ)
# tds, flow: vectors of data
function tds_riverflow_loglik(θ, tds, flow)
    β₀, β₁, σ = θ # unpack parameter vector
    μ = β₀ .+ β₁ * log.(flow) # mean TDS as a linear function of log-flow
    ll = sum(logpdf.(Normal.(μ, σ), tds)) # log-likelihood of the observations
    return ll
end
lb = [0.0, -1000.0, 1.0]
ub = [1000.0, 1000.0, 100.0]
θ₀ = [500.0, 0.0, 50.0]
optim_out = Optim.optimize(θ -> -tds_riverflow_loglik(θ, tds.tds_mgL, tds.discharge_cms), lb, ub, θ₀)
θ_mle = optim_out.minimizer
round.(θ_mle; digits=0) # report the fitted parameters to the nearest integer
x = 1:0.1:60
μ = θ_mle[1] .+ θ_mle[2] * log.(x)
p1 = scatter(tds.discharge_cms, tds.tds_mgL, markersize=3, xlabel=L"Discharge (m$^3$/s)", ylabel="Total dissolved solids (mg/L)", label="Data", size=(500, 450))
plot!(p1, x, μ, linewidth=3, color=:red, label="Regression")
pred = θ_mle[1] .+ θ_mle[2] * log.(tds.discharge_cms) # fitted values
resids = tds.tds_mgL .- pred # residuals: observed minus fitted
p2 = scatter(log.(tds.discharge_cms), resids, markersize=3, ylabel=L"$r$ (mg/L)", xlabel=L"Log-Discharge (log-m$^3$/s) ", size=(500, 450))
p3 = scatter(resids[1:end-1], resids[2:end], markersize=3, ylabel=L"$r_{t+1}$ (mg/L)", xlabel=L"$r_t$ (mg/L)", size=(500, 450))
T = nrow(tds)
rlr = lm([ones(T-1) resids[1:end-1]], resids[2:end])
rpred = predict(rlr, [ones(T-1) resids[1:end-1]])
plot!(p3, resids[1:end-1], rpred, linewidth=3, color=:red)
display(p1)
display(p2)
display(p3)

The residuals have a lag-1 autocorrelation of 0.13.
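That number can be reproduced directly from the residual series; a one-line sketch using the resids vector defined above (Statistics is a standard-library import added here for cor):

using Statistics

cor(resids[1:end-1], resids[2:end]) # lag-1 sample autocorrelation of the residuals, ≈ 0.13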
AR(\(p\)) models (autoregressive of order \(p\)); the simplest is the AR(1):
\[ \begin{align*} y_t &= \alpha + \rho y_{t-1} + \varepsilon_t \\ \varepsilon &\sim \mathcal{D}(\theta) \\ y_0 &\sim \mathcal{G}(\psi) \end{align*} \]
\(\mathbb{E}[\varepsilon] = 0\), \(\text{cor}(\varepsilon_i, \varepsilon_j) = 0\) for \(i \neq j\), \(\text{cor}(\varepsilon_i, y_0) = 0\)
AR models are commonly used for prediction: bond yields, prices, electricity demand, short-run weather.
But may have little explanatory power: what causes the autocorrelation?
Plot \(\varsigma(i)\) over a series of lags.
Data generated by an AR(1) with \(\rho = 0.7\).
Notice: the model has no explicit dependence beyond lag 1, but \(\varsigma(2) > 0\).
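A sketch of how such a plot can be generated (the simulated series, seed, and use of StatsBase's autocor are illustrative, not the lecture's code):

using Distributions, LaTeXStrings, Plots, Random, StatsBase

Random.seed!(1)
ρ = 0.7
y_sim = zeros(1_000)
for t in 2:length(y_sim)
    y_sim[t] = ρ * y_sim[t-1] + rand(Normal()) # AR(1) with standard normal innovations
end
bar(0:10, autocor(y_sim, 0:10), xlabel="Lag", ylabel=L"$\varsigma(i)$", label=false)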
\[\begin{aligned} y_t &= \alpha + \rho y_{t-1} + \varepsilon_t \\ &= \alpha + \rho \left(\alpha + \rho y_{t-2} + \varepsilon_{t-1}\right) + \varepsilon_t \\ &= \alpha + \rho \alpha + \rho^2 y_{t-2} + \rho \varepsilon_{t-1} + \varepsilon_t \\ &= \ldots \\ &= \alpha \sum_{k=0}^{t-1} \rho^k + \rho^t y_0 + \sum_{k=0}^{t-1} \varepsilon_{t-k} \rho^k \end{aligned}\]
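A quick sketch checking the closed form against the recursion (all values here are illustrative):

using Distributions, Random

Random.seed!(7)
α, ρ, y₀, t = 0.5, 0.7, 1.5, 20
ε = rand(Normal(0, 0.2), t) # innovations ε₁, …, ε_t
y_rec = foldl((y, k) -> α + ρ * y + ε[k], 1:t; init=y₀) # iterate the recursion t times
y_closed = α * sum(ρ^k for k in 0:t-1) + ρ^t * y₀ + sum(ρ^k * ε[t-k] for k in 0:t-1)
y_rec ≈ y_closed # true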
Can isolate the dependence at lag \(i\), net of the lower-order lags, through the partial autocorrelation. Typically estimated through regression, \[y_t = \sum_{k=1}^i \phi_k y_{t-k} + \varepsilon_t,\] where the partial autocorrelation at lag \(i\) is the coefficient \(\phi_i\).
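A sketch of that regression in Julia (it reuses the resids series computed above; the lag-2 choice and the cross-check against StatsBase's pacf are illustrative):

using GLM, StatsBase

i = 2 # lag of interest
n = length(resids)
# design matrix: an intercept plus lags 1 through i of the residual series
X_lag = hcat(ones(n - i), [resids[(i+1-k):(n-k)] for k in 1:i]...)
lagfit = lm(X_lag, resids[(i+1):n])
coef(lagfit)[end] # coefficient on the lag-i column: the partial autocorrelation at lag i
# cross-check: pacf(resids, [i])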
The conditional variance \(\mathbb{V}[y_t | y_{t-1}] = \sigma^2\).
Unconditional variance \(\mathbb{V}[y_t]\) for a stationary series:
\[ \begin{align*} \mathbb{V}[y_t] &= \rho^2 \mathbb{V}[y_{t-1}] + \mathbb{V}[\varepsilon] \\ &= \rho^2 \mathbb{V}[y_t] + \sigma^2 \\ \Rightarrow \mathbb{V}[y_t] &= \frac{\sigma^2}{1 - \rho^2}. \end{align*} \]
Need \(\mathbb{E}[Y_t] = \mathbb{E}[Y_{t-1}]\) and \(\mathbb{V}[Y_t] = \mathbb{V}[Y_{t-1}]\). If we want this to hold:
\[\begin{aligned} \mathbb{E}[Y_t] &= \mathbb{E}[\alpha + \rho Y_{t-1} + \varepsilon_t] \\ &= \alpha + \rho \mathbb{E}[Y_{t-1}] + \cancel{\mathbb{E}[\varepsilon_t]} \\[0.75em] \Rightarrow \mathbb{E}[Y_t] &= \alpha + \rho \mathbb{E}[Y_t] \\ \Rightarrow \mathbb{E}[Y_t] &= \frac{\alpha}{1-\rho}. \end{aligned}\]
Now for the variance:
\[\begin{aligned} \mathbb{V}[Y_t] &= \mathbb{V}[\alpha + \rho Y_{t-1} + \varepsilon_t] \\ &= \rho^2 \mathbb{V}[Y_{t-1}] + \cancel{2\rho\,\text{Cov}(Y_{t-1}, \varepsilon_t)} + \mathbb{V}[\varepsilon_t] \\[0.75em] \Rightarrow \mathbb{V}[Y_t] &= \rho^2 \mathbb{V}[Y_t] + \sigma^2 \\ \Rightarrow \mathbb{V}[Y_t] &= \frac{\sigma^2}{1 - \rho^2}. \end{aligned}\]
Thus the AR(1) is stationary if and only if \(|\rho| < 1\).
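A quick numerical check of these formulas (a sketch; the parameter values and simulation length are illustrative):

using Distributions, Random, Statistics

Random.seed!(42)
α, ρ, σ = 0.5, 0.7, 0.2
N = 100_000
y = zeros(N)
y[1] = α / (1 - ρ) # start at the stationary mean
for t in 2:N
    y[t] = α + ρ * y[t-1] + rand(Normal(0, σ)) # AR(1) recursion
end
mean(y), var(y) # ≈ (α / (1 - ρ), σ² / (1 - ρ²)) = (1.67, 0.078)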
Think about deterministic version (set \(\alpha = 0\) to simplify):
\[ y_t = \rho y_{t-1} = \rho^t y_0 \]
Stationarity: \[|\rho| < 1 \Rightarrow \lim_{t \to \infty} \rho^t = 0.\]
Stationary AR(1)s will decay to zero on average (with \(\alpha = 0\)), while non-stationary ones will diverge to \(\pm \infty\).
Innovations/noise continually perturb away from this long-term deterministic path.
ρ = 0.7 # autoregressive coefficient
σ = 0.2 # innovation standard deviation
ar_sd = sqrt(σ^2 / (1 - ρ^2)) # stationary standard deviation
T = 100 # number of time steps
ts_det = zeros(T) # deterministic trajectory
n = 3 # number of stochastic realizations
ts_stoch = zeros(T, n) # stochastic trajectories
t0 = 1.5 # initial value y₀
for i = 1:T
    if i == 1
        ts_det[i] = t0
        ts_stoch[i, :] .= t0
    else
        ts_det[i] = ρ * ts_det[i-1]
        ts_stoch[i, :] = ρ * ts_stoch[i-1, :] .+ rand(Normal(0, σ), n)
    end
end
p = plot(1:T, ts_det, label="Deterministic", xlabel="Time", linewidth=3, color=:black, size=(600, 550))
cols = ["#DDAA33", "#BB5566", "#004488"] # colorblind-friendly palette
for i = 1:n
    label = i == 1 ? "Stochastic" : false
    plot!(p, 1:T, ts_stoch[:, i], linewidth=1, color=cols[i], label=label)
end
p

Conditional expectation \(h\) steps ahead, given the observed value \(y_t\):

\[\begin{aligned} \mathbb{E}[Y_{t+h} | Y_t = y_t] &= \mathbb{E}\left[\alpha \sum_{k=0}^{h-1} \rho^k + \sum_{k=0}^{h-1} \varepsilon_{t+h-k} \rho^k + \rho^h Y_t \,\middle|\, Y_t = y_t\right] \\ &= \alpha \sum_{k=0}^{h-1} \rho^k + \rho^h y_t \end{aligned}\]
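A small sketch of this forecast rule (the function name ar1_forecast is introduced here for illustration; the example reuses \(\rho = 0.7\) and the starting value 1.5 from the simulation above, with \(\alpha = 0\)):

# h-step-ahead conditional mean of an AR(1): E[Y_{t+h} | Y_t = y_t]
ar1_forecast(y_t, α, ρ, h) = α * sum(ρ^k for k in 0:(h-1)) + ρ^h * y_t

ar1_forecast(1.5, 0.0, 0.7, 10) # ≈ 0.042, decaying toward the long-run mean α / (1 - ρ) = 0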
Friday: AR(1) Inference
Next Week: Extreme Values (Theory and Models)
Next Unit: Hypothesis Testing, Model Evaluation, and Comparison
HW2: Due Friday at 9pm.
Exercises: Available after class (will include GLMs + time series).
HW3: Assigned on 3/2, due 3/13.