Lecture 11
February 18, 2026
Many environmental datasets involve time series, i.e., repeated observations over time:
\[X = \{X_t, X_{t+h}, X_{t+2h}, \ldots, X_{t+nh}\}\]
More often, the sampling time is dropped from the notation:
\[X = \{X_1, X_2, X_3, \ldots, X_n\}\]
Dependence: History or sequencing of the data matters
\[p(y_t) = f(y_1, \ldots, y_{t-1})\]
Serial dependence captured by autocorrelation:
\[\varsigma(i) = \rho(y_t, y_{t+i}) = \frac{\text{Cov}[y_t, y_{t+i}]}{\mathbb{V}[y_t]} \]
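As a concrete sketch of this estimator (the toy series is purely illustrative, and StatsBase's autocor is shown only as a cross-check):

using Statistics, StatsBase

y_ex = [2.1, 2.3, 2.0, 2.6, 2.8, 2.5, 3.0, 2.9, 3.2, 3.1] # illustrative series
i = 1
cov(y_ex[1:end-i], y_ex[(1+i):end]) / var(y_ex) # sample estimate of ς(i)
autocor(y_ex, [i]) # StatsBase's estimator (slightly different normalization)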
I.I.D. Data: All data drawn independently from the same distribution, \(y_i \sim \mathcal{D}\).
Equivalent for time series \(\mathbf{y} = \{y_t\}\) is stationarity.
using CSV, DataFrames, Distributions, GLM, LaTeXStrings, Optim, Plots # packages used below

tds = let
    fname = "data/tds/cuyaTDS.csv" # CHANGE THIS!
    tds = DataFrame(CSV.File(fname))
    tds[!, [:date, :discharge_cms, :tds_mgL]]
end

# tds_riverflow_loglik: compute the log-likelihood for the TDS regression model
# θ: vector of model parameters (coefficients β₀ and β₁ and stdev σ)
# tds, flow: vectors of data
function tds_riverflow_loglik(θ, tds, flow)
    β₀, β₁, σ = θ # unpack parameter vector
    μ = β₀ .+ β₁ * log.(flow) # mean TDS as a linear function of log-flow
    ll = sum(logpdf.(Normal.(μ, σ), tds)) # log-likelihood of the observations
    return ll
end
lb = [0.0, -1000.0, 1.0]
ub = [1000.0, 1000.0, 100.0]
θ₀ = [500.0, 0.0, 50.0]
optim_out = Optim.optimize(θ -> -tds_riverflow_loglik(θ, tds.tds_mgL, tds.discharge_cms), lb, ub, θ₀)
θ_mle = optim_out.minimizer
round.(θ_mle; digits=0) # report the fitted parameters to the nearest integer
x = 1:0.1:60
μ = θ_mle[1] .+ θ_mle[2] * log.(x)
p1 = scatter(tds.discharge_cms, tds.tds_mgL, markersize=3, xlabel=L"Discharge (m$^3$/s)", ylabel="Total dissolved solids (mg/L)", label="Data", size=(500, 450))
plot!(p1, x, μ, linewidth=3, color=:red, label="Regression")
pred = θ_mle[1] .+ θ_mle[2] * log.(tds.discharge_cms) # fitted values
resids = tds.tds_mgL .- pred # residuals: observed minus fitted
p2 = scatter(log.(tds.discharge_cms), resids, markersize=3, ylabel=L"$r$ (mg/L)", xlabel=L"Log-Discharge (log-m$^3$/s) ", size=(500, 450))
p3 = scatter(resids[1:end-1], resids[2:end], markersize=3, ylabel=L"$r_{t+1}$ (mg/L)", xlabel=L"$r_t$ (mg/L)", size=(500, 450))
T = nrow(tds)
rlr = lm([ones(T-1) resids[1:end-1]], resids[2:end])
rpred = predict(rlr, [ones(T-1) resids[1:end-1]])
plot!(p3, resids[1:end-1], rpred, linewidth=3, color=:red)
display(p1)
display(p2)
display(p3)

The residuals have a lag-1 autocorrelation of 0.13.
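That number can be reproduced directly from the residual series; a one-line sketch using the resids vector defined above (Statistics is a standard-library import added here for cor):

using Statistics

cor(resids[1:end-1], resids[2:end]) # lag-1 sample autocorrelation of the residuals, ≈ 0.13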
AR(\(p\)) models (autoregressive of order \(p\)); the simplest is the AR(1):
\[ \begin{align*} y_t &= \alpha + \rho y_{t-1} + \varepsilon_t \\ \varepsilon &\sim \mathcal{D}(\theta) \\ y_0 &\sim \mathcal{G}(\psi) \end{align*} \]
\(\mathbb{E}[\varepsilon] = 0\), \(\text{cor}(\varepsilon_i, \varepsilon_j) = 0\) for \(i \neq j\), \(\text{cor}(\varepsilon_i, y_0) = 0\)
AR models are commonly used for prediction: bond yields, prices, electricity demand, short-run weather.
But may have little explanatory power: what causes the autocorrelation?
Plot \(\varsigma(i)\) over a series of lags.
Data generated by an AR(1) with \(\rho = 0.7\).
Notice: the model has no explicit dependence beyond lag 1, but \(\varsigma(2) > 0\).
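A sketch of how such a plot can be generated (the simulated series, seed, and use of StatsBase's autocor are illustrative, not the lecture's code):

using Distributions, LaTeXStrings, Plots, Random, StatsBase

Random.seed!(1)
ρ = 0.7
y_sim = zeros(1_000)
for t in 2:length(y_sim)
    y_sim[t] = ρ * y_sim[t-1] + rand(Normal()) # AR(1) with standard normal innovations
end
bar(0:10, autocor(y_sim, 0:10), xlabel="Lag", ylabel=L"$\varsigma(i)$", label=false)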
\[\begin{aligned} y_t &= \alpha + \rho y_{t-1} + \varepsilon_t \\ &= \alpha + \rho \left(\alpha + \rho y_{t-2} + \varepsilon_{t-1}\right) + \varepsilon_t \\ &= \alpha + \rho \alpha + \rho^2 y_{t-2} + \rho \varepsilon_{t-1} + \varepsilon_t \\ &= \ldots \\ &= \alpha \sum_{k=0}^{t-1} \rho^k + \rho^t y_0 + \sum_{k=0}^{t-1} \varepsilon_{t-k} \rho^k \end{aligned}\]
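A quick sketch checking the closed form against the recursion (all values here are illustrative):

using Distributions, Random

Random.seed!(7)
α, ρ, y₀, t = 0.5, 0.7, 1.5, 20
ε = rand(Normal(0, 0.2), t) # innovations ε₁, …, ε_t
y_rec = foldl((y, k) -> α + ρ * y + ε[k], 1:t; init=y₀) # iterate the recursion t times
y_closed = α * sum(ρ^k for k in 0:t-1) + ρ^t * y₀ + sum(ρ^k * ε[t-k] for k in 0:t-1)
y_rec ≈ y_closed # true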
Can isolate the dependence at lag \(i\), net of the lower-order lags, through the partial autocorrelation. Typically estimated through regression, \[y_t = \sum_{k=1}^i \phi_k y_{t-k} + \varepsilon_t,\] where the partial autocorrelation at lag \(i\) is the coefficient \(\phi_i\).
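A sketch of that regression in Julia (it reuses the resids series computed above; the lag-2 choice and the cross-check against StatsBase's pacf are illustrative):

using GLM, StatsBase

i = 2 # lag of interest
n = length(resids)
# design matrix: an intercept plus lags 1 through i of the residual series
X_lag = hcat(ones(n - i), [resids[(i+1-k):(n-k)] for k in 1:i]...)
lagfit = lm(X_lag, resids[(i+1):n])
coef(lagfit)[end] # coefficient on the lag-i column: the partial autocorrelation at lag i
# cross-check: pacf(resids, [i])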
The conditional variance \(\mathbb{V}[y_t | y_{t-1}] = \sigma^2\).
Unconditional variance \(\mathbb{V}[y_t]\) for a stationary series:
\[ \begin{align*} \mathbb{V}[y_t] &= \rho^2 \mathbb{V}[y_{t-1}] + \mathbb{V}[\varepsilon] \\ &= \rho^2 \mathbb{V}[y_t] + \sigma^2 \\ \Rightarrow \mathbb{V}[y_t] &= \frac{\sigma^2}{1 - \rho^2}. \end{align*} \]
Need \(\mathbb{E}[Y_t] = \mathbb{E}[Y_{t-1}]\) and \(\mathbb{V}[Y_t] = \mathbb{V}[Y_{t-1}]\). If we want this to hold:
\[\begin{aligned} \mathbb{E}[Y_t] &= \mathbb{E}[\alpha + \rho Y_{t-1} + \varepsilon_t] \\ &= \alpha + \rho \mathbb{E}[Y_{t-1}] + \cancel{\mathbb{E}[\varepsilon_t]} \\[0.75em] \Rightarrow \mathbb{E}[Y_t] &= \alpha + \rho \mathbb{E}[Y_t] \\ \Rightarrow \mathbb{E}[Y_t] &= \frac{\alpha}{1-\rho}. \end{aligned}\]
Now for the variance:
\[\begin{aligned} \mathbb{V}[Y_t] &= \mathbb{V}[\alpha + \rho Y_{t-1} + \varepsilon_t] \\ &= \rho^2 \mathbb{V}[Y_{t-1}] + \cancel{2\rho\,\text{Cov}(Y_{t-1}, \varepsilon_t)} + \mathbb{V}[\varepsilon_t] \\[0.75em] \Rightarrow \mathbb{V}[Y_t] &= \rho^2 \mathbb{V}[Y_t] + \sigma^2 \\ \Rightarrow \mathbb{V}[Y_t] &= \frac{\sigma^2}{1 - \rho^2}. \end{aligned}\]
Thus the AR(1) is stationary if and only if \(|\rho| < 1\).
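A quick numerical check of these formulas (a sketch; the parameter values and simulation length are illustrative):

using Distributions, Random, Statistics

Random.seed!(42)
α, ρ, σ = 0.5, 0.7, 0.2
N = 100_000
y = zeros(N)
y[1] = α / (1 - ρ) # start at the stationary mean
for t in 2:N
    y[t] = α + ρ * y[t-1] + rand(Normal(0, σ)) # AR(1) recursion
end
mean(y), var(y) # ≈ (α / (1 - ρ), σ² / (1 - ρ²)) = (1.67, 0.078)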
Think about deterministic version (set \(\alpha = 0\) to simplify):
\[ y_t = \rho y_{t-1} = \rho^t y_0 \]
Stationarity: \[|\rho| < 1 \Rightarrow \lim_{t \to \infty} \rho^t = 0.\]
Stationary AR(1)s will decay to zero on average (with \(\alpha = 0\)), while non-stationary ones will diverge to \(\pm \infty\).
Innovations/noise continually perturb away from this long-term deterministic path.
ρ = 0.7 # autoregressive coefficient
σ = 0.2 # innovation standard deviation
ar_sd = sqrt(σ^2 / (1 - ρ^2)) # stationary standard deviation
T = 100 # number of time steps
ts_det = zeros(T) # deterministic trajectory
n = 3 # number of stochastic realizations
ts_stoch = zeros(T, n) # stochastic trajectories
t0 = 1.5 # initial value y₀
for i = 1:T
    if i == 1
        ts_det[i] = t0
        ts_stoch[i, :] .= t0
    else
        ts_det[i] = ρ * ts_det[i-1]
        ts_stoch[i, :] = ρ * ts_stoch[i-1, :] .+ rand(Normal(0, σ), n)
    end
end
p = plot(1:T, ts_det, label="Deterministic", xlabel="Time", linewidth=3, color=:black, size=(600, 550))
cols = ["#DDAA33", "#BB5566", "#004488"] # colorblind-friendly palette
for i = 1:n
    label = i == 1 ? "Stochastic" : false
    plot!(p, 1:T, ts_stoch[:, i], linewidth=1, color=cols[i], label=label)
end
p

Conditional expectation \(h\) steps ahead, given the observed value \(y_t\):

\[\begin{aligned} \mathbb{E}[Y_{t+h} | Y_t = y_t] &= \mathbb{E}\left[\alpha \sum_{k=0}^{h-1} \rho^k + \sum_{k=0}^{h-1} \varepsilon_{t+h-k} \rho^k + \rho^h Y_t \,\middle|\, Y_t = y_t\right] \\ &= \alpha \sum_{k=0}^{h-1} \rho^k + \rho^h y_t \end{aligned}\]
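A small sketch of this forecast rule (the function name ar1_forecast is introduced here for illustration; the example reuses \(\rho = 0.7\) and the starting value 1.5 from the simulation above, with \(\alpha = 0\)):

# h-step-ahead conditional mean of an AR(1): E[Y_{t+h} | Y_t = y_t]
ar1_forecast(y_t, α, ρ, h) = α * sum(ρ^k for k in 0:(h-1)) + ρ^h * y_t

ar1_forecast(1.5, 0.0, 0.7, 10) # ≈ 0.042, decaying toward the long-run mean α / (1 - ρ) = 0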
Friday: AR(1) Inference
Next Week: Extreme Values (Theory and Models)
Next Unit: Hypothesis Testing, Model Evaluation, and Comparison
HW2: Due Friday at 9pm.
Exercises: Available after class (will include GLMs + time series).
HW3: Assigned on 3/2, due 3/13.