CIR Term Structure via Kalman Filter


  • Description: Cross-domain Kalman-filter application — estimating the Cox-Ingersoll-Ross (CIR) interest-rate model via Kalman filter for term structure. Affine term structure, CIR dynamics, state-space form, quasi-maximum-likelihood estimation
  • Paper: Chen & Scott (2003), Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure: Estimates and Tests from a Kalman Filter Model; see also the related Kalman-filter term-structure estimation literature
  • K2E-B ID: [K2E-B-Z-1]
  • Max3 PDF: [K2E] SLAM/[K2E-B-Z] Cross-Domain KF Applications/[K2E-B-Z-1][2003] Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure From a Kalman Filter Model.pdf (sibling Z-2: Application of the Kalman Filter for UK/Germany term structure)
  • Notion ID: (to be created)
  • Created: 2021-07-01
  • Updated: 2026-06-02
  • License: Reuse welcome — please credit Yu Zhang and link back to yuzhang.io

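Cross-domain note: this is an application of the Kalman filter to finance (the term structure of interest rates) rather than to SLAM, so it is filed under K2E-B-Z (Cross-Domain KF Applications). What it shares with SLAM is the Kalman filter framework itself (see the Gaussian Filters series); the estimated "state" is the set of latent interest-rate factors rather than the robot pose.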



1. Why Kalman Filter for Term Structure

The term structure of interest rates is driven by a few unobservable latent factors (the short rate, etc.). Bond yields at different maturities are noisy observations of these factors. This is exactly a state-space / hidden-state problem → Kalman filter.

Parallel to SLAM: latent factors ↔ robot state; observed yields ↔ sensor measurements; transition density ↔ motion model; yield equation ↔ observation model. (For Bayes/Gaussian filtering theory, see the Gaussian Filters series.)

2. Affine Term Structure Model

The instantaneous short rate $r$ follows a stochastic differential equation:

$$ dr = \mu(r, t) \, dt + \sigma(r, t) \, dW $$

  • $\mu(r,t)$ — deterministic drift
  • $\sigma(r,t) dW$ — diffusion (random part), $W$ standard Brownian motion (Wiener process)

A pure discount bond (zero-coupon: issued below face value and repaid at face value) has the following price in an affine model:

$$ P(t, T) = A(\tau) \exp(-B(\tau) X), \quad \tau = T - t $$

  • $X$ — state vector (latent factors)
  • $A(\tau), B(\tau)$ — functions of time-to-maturity $\tau$

Zero-coupon yield curve:

$$ R(t, T) = -\frac{1}{\tau} \ln P(t, T) = \frac{B(\tau) X - \ln A(\tau)}{\tau} $$

3. CIR Model

Cox-Ingersoll-Ross (1985) — a square-root affine diffusion:

$$ dr = k(\theta - r) \, dt + \sigma \sqrt{r} \, dW $$

  • $r$ — instantaneous interest rate
  • $\theta$ — long-run mean rate
  • $k$ — speed of mean reversion (mean reversion parameter)
  • $\sigma \sqrt{r}$ — square-root process (volatility scales with $\sqrt{r}$; keeps $r \geq 0$)

Square-root processes are among the most widely used affine diffusions.

Risk-Neutral / Arbitrage-Free

Adjusting the drift by the market price of risk $\lambda$ (subtracting $\lambda r$ for arbitrage-free pricing):

$$ dr = (k(\theta - r) - \lambda r) \, dt + \sigma \sqrt{r} \, dW $$

Pure discount bond:

$$ P(t, T) = A(t, T) e^{-B(t, T) r} $$

with $\gamma = \sqrt{(k + \lambda)^2 + 2\sigma^2}$ and closed-form $A(t,T)$, $B(t,T)$ (functions of $\gamma, k, \lambda, \theta, \sigma, \tau$).

Continuously compounded yield:

$$ R(t, T) = -\frac{\log P(t, T)}{T - t} = \frac{-\log A(t,T) + B(t,T) r}{T - t} $$
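The closed forms left implicit above are the standard CIR (1985) expressions. Written with the shorthand $\kappa = k + \lambda$ (a symbol introduced only for this sketch) and $\tau = T - t$:

$$ B(\tau) = \frac{2(e^{\gamma\tau} - 1)}{(\gamma + \kappa)(e^{\gamma\tau} - 1) + 2\gamma}, \quad A(\tau) = \left[ \frac{2\gamma e^{(\gamma + \kappa)\tau/2}}{(\gamma + \kappa)(e^{\gamma\tau} - 1) + 2\gamma} \right]^{2k\theta/\sigma^2} $$

A minimal Python sketch of these formulas and the resulting yield curve follows. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cir_bond_coefficients(tau, k, theta, sigma, lam):
    """Return (A, B) in the CIR discount-bond price P = A * exp(-B * r)."""
    tau = np.asarray(tau, dtype=float)
    gamma = np.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)
    growth = np.expm1(gamma * tau)                      # e^{gamma * tau} - 1
    denom = (gamma + k + lam) * growth + 2.0 * gamma
    B = 2.0 * growth / denom
    ratio = 2.0 * gamma * np.exp(0.5 * (gamma + k + lam) * tau) / denom
    A = ratio ** (2.0 * k * theta / sigma ** 2)
    return A, B

def cir_yield(r, tau, k, theta, sigma, lam):
    """Continuously compounded yield R = (-log A + B r) / tau."""
    A, B = cir_bond_coefficients(tau, k, theta, sigma, lam)
    return (-np.log(A) + B * r) / np.asarray(tau, dtype=float)

# Illustrative (assumed) parameter values, not estimates from the paper.
params = dict(k=0.5, theta=0.05, sigma=0.1, lam=-0.1)
maturities = np.array([0.25, 1.0, 5.0, 10.0])
A, B = cir_bond_coefficients(maturities, **params)
curve = cir_yield(0.03, maturities, **params)
short_end = cir_yield(0.03, 1e-4, **params)   # approaches r as tau -> 0
print(curve)
```

As a sanity check, the yield at a very short maturity should collapse to the short rate $r$, since $B(\tau) \approx \tau$ and $A(\tau) \to 1$ as $\tau \to 0$.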

4. Estimating CIR — Two Approaches

Cross-section approach

Uses only the yields of bonds with different maturities at a single point in time. The state $r_t$ is treated as an extra unknown parameter. Disadvantage: the risk-premium parameters cannot be identified separately, because they are absorbed into the drift.

Time-series approach

Uses a time series of rates. But with more observed rates than factors, the model is under-identified (the parameters cannot be estimated consistently).

Solution — allow measurement error + Kalman filter

Allow discrepancies between observed and theoretical rates, treated as Gaussian error:

$$ R_t(\tau) = \frac{B(\tau) X_t}{\tau} - \frac{\ln A(\tau)}{\tau} + \epsilon_t $$

Direct MLE is infeasible (yield density has no closed form). Standard technique: quasi-maximum-likelihood estimator based on the Kalman filter (when the measurement-error covariance is full rank). MCMC is an alternative.

5. State-Space Representation

State (transition) — latent factors as Markov process

The exact CIR transition density is non-central chi-square: $2cX_t \mid X_{t-1} \sim \chi^2(2q+2,\ 2u)$ (degrees of freedom $2q+2 = 4k\theta/\sigma^2$, non-centrality $2u$), but the Kalman filter uses a Gaussian approximation $X_t | X_{t-1} \sim N(\mu_t, Q_t)$:

$$ \mu_{t,j} = \theta_j [1 - e^{-k_j \Delta t}] + X_{t-1,j} e^{-k_j \Delta t} $$

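$Q_t$ is diagonal, with one entry per factor. Each entry $\xi_j$ is the conditional variance of factor $j$; it is a lengthy expression that depends on the previous state $X_{t-1,j}$, so $Q_t$ changes over time.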
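For a single factor, the conditional variance of a CIR process has a well-known closed form:

$$ \mathrm{Var}(X_t \mid X_{t-1}) = X_{t-1} \frac{\sigma^2}{k} \left( e^{-k \Delta t} - e^{-2k \Delta t} \right) + \theta \frac{\sigma^2}{2k} \left( 1 - e^{-k \Delta t} \right)^2 $$

The sketch below computes the two moments used by the Gaussian approximation and checks them against draws from the exact non-central chi-square transition, using the standard scaling $c = 2k / (\sigma^2 (1 - e^{-k \Delta t}))$ and non-centrality $2u = 2c X_{t-1} e^{-k \Delta t}$. Function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def cir_gaussian_moments(x_prev, k, theta, sigma, dt):
    """Conditional mean and variance of X_t given X_{t-1} for one CIR factor."""
    decay = np.exp(-k * dt)
    mean = theta * (1.0 - decay) + x_prev * decay
    var = (x_prev * sigma ** 2 / k * (decay - decay ** 2)
           + theta * sigma ** 2 / (2.0 * k) * (1.0 - decay) ** 2)
    return mean, var

def cir_exact_samples(x_prev, k, theta, sigma, dt, size, rng):
    """Draw X_t | X_{t-1} from the exact scaled non-central chi-square law."""
    decay = np.exp(-k * dt)
    c = 2.0 * k / (sigma ** 2 * (1.0 - decay))
    df = 4.0 * k * theta / sigma ** 2          # degrees of freedom, 2q + 2
    nonc = 2.0 * c * x_prev * decay            # non-centrality, 2u
    return rng.noncentral_chisquare(df, nonc, size=size) / (2.0 * c)

# Illustrative (assumed) values: monthly step, short rate currently at 3%.
rng = np.random.default_rng(0)
mean, var = cir_gaussian_moments(0.03, k=0.5, theta=0.05, sigma=0.1, dt=1.0 / 12.0)
draws = cir_exact_samples(0.03, 0.5, 0.05, 0.1, 1.0 / 12.0, 200_000, rng)
print(mean, draws.mean())
print(var, draws.var())
```

The Gaussian approximation matches the first two moments of the exact transition; what it gives up is the skewness and the non-negativity of the true distribution.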

Discrete-time transition:

$$ X_t = \Phi(\Psi) X_{t-1} + c(\Psi) + \eta_t $$

  • $\Phi = \mathrm{diag}(e^{-k_j \Delta t})$, $c_j = \theta_j (1 - e^{-k_j \Delta t})$
  • $\eta_t$ — zero-mean disturbance, variance $Q_t$

Measurement — yields observe the factors

$$ R_t = Z(\Psi) X_t + d(\Psi) + \epsilon_t, \quad \epsilon_t \sim N(0, H) $$

  • $R_t$ — $n \times 1$ observed yields (e.g. 8 maturities → $H$ is $8 \times 8$ diagonal)
  • $X_t$ — $j \times 1$ latent state, $Z$ is $n \times j$, $d$ is $n \times 1$
  • $\Psi = (\theta, k, \sigma, \lambda, h_{1..N})$ — hyperparameters

For a one-factor model, $Z = B(t,T)/(T-t)$, $d = -\log A(t,T)/(T-t)$.

6. Kalman Filter Recursion

Standard linear KF (see the Gaussian Filters series, ch03 §2):

Prediction (conditioning on information up to $s = t-1$):

$$ \hat{X}_{t|t-1} = \Phi(\Psi) \hat{X}_{t-1|t-1} + c(\Psi) $$

$$ P_{t|t-1} = \Phi(\Psi) P_{t-1|t-1} \Phi(\Psi)^T + Q_t \quad \text{(covariance prediction; injects CIR process noise } Q_t \text{)} $$

Measurement update:

$$ R_{t|t-1} = Z \hat{X}_{t|t-1} + d $$

$$ v_t = R_t - R_{t|t-1} \quad \text{(innovation)} $$

$$ F_t = Z P_{t|t-1} Z^T + H \quad \text{(innovation covariance)} $$

$$ K_t = P_{t|t-1} Z^T F_t^{-1} \quad \text{(Kalman gain)} $$

$$ \hat{X}_{t|t} = \hat{X}_{t|t-1} + K_t v_t, \quad P_{t|t} = P_{t|t-1} - K_t Z P_{t|t-1} $$

Same five equations as SLAM's KF — only the model matrices ($\Phi, Z, c, d$) come from the CIR term-structure model instead of robot kinematics.
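The recursion above can be sketched as one generic step in Python. Nothing here is specific to CIR: the model enters only through the matrices passed in. The function name `kalman_step` and the scalar numbers in the usage lines are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, Phi, c, Q, Z, d, H):
    """One prediction + measurement update of the linear Kalman filter."""
    # Prediction
    x_pred = Phi @ x_prev + c
    P_pred = Phi @ P_prev @ Phi.T + Q
    # Measurement update
    v = y - (Z @ x_pred + d)                    # innovation
    F = Z @ P_pred @ Z.T + H                    # innovation covariance
    # K = P_pred Z^T F^{-1}, computed with a solve (F and P_pred are symmetric)
    K = np.linalg.solve(F, Z @ P_pred).T
    x_filt = x_pred + K @ v
    P_filt = P_pred - K @ Z @ P_pred
    return x_filt, P_filt, v, F

# Hand-checkable scalar example with arbitrary numbers.
x_filt, P_filt, v, F = kalman_step(
    x_prev=np.array([1.0]), P_prev=np.array([[0.0]]), y=np.array([1.2]),
    Phi=np.array([[0.9]]), c=np.array([0.1]), Q=np.array([[0.01]]),
    Z=np.array([[1.0]]), d=np.array([0.0]), H=np.array([[0.01]]),
)
print(x_filt, P_filt)
```

In this example the predicted state is 1.0 with variance 0.01, equal to the measurement variance, so the gain is 0.5 and the filtered state lands halfway between prediction and observation.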

7. Quasi-Maximum-Likelihood Estimation

The yield density has no closed form, so use the KF-implied Gaussian: $R_t$ is normal with mean $R_{t|t-1}$ and covariance $F_t$. The log-likelihood:

$$ \log L(R_1, \dots, R_T; \Psi) = \sum_{t=1}^{T} \log p(R_t \mid \xi_{t-1}) $$

Each term is a Gaussian density evaluated at the innovation, so the log-likelihood depends only on the number of yields $n$, the innovation covariance $F_t$, and the innovation $v_t$. Maximizing it over the hyperparameters $\Psi = (\theta, k, \sigma, \lambda, h)$ gives the quasi-MLE. It is "quasi" because the true CIR transition density is non-central chi-square, not Gaussian; the KF Gaussian is only an approximation.
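Written out, the Gaussian innovation likelihood takes the usual prediction-error form, with $T$ observation dates and $n$ yields per date:

$$ \log L = -\frac{nT}{2} \log 2\pi - \frac{1}{2} \sum_{t=1}^{T} \log |F_t| - \frac{1}{2} \sum_{t=1}^{T} v_t^T F_t^{-1} v_t $$

Below is a minimal one-factor sketch that evaluates this quantity by running the filter over a panel of yields. Several choices are assumptions of the sketch rather than details confirmed by the source: the filter starts from the stationary mean $\theta$ and variance $\sigma^2 \theta / (2k)$, $Q_t$ is evaluated at the previous filtered state, the filtered state is floored at zero, and all yields share one measurement-error standard deviation `h`. The data are simulated purely for illustration.

```python
import numpy as np

def cir_loadings(tau, k, theta, sigma, lam):
    """One-factor measurement loadings: Z = B/tau and d = -log(A)/tau."""
    gamma = np.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)
    growth = np.expm1(gamma * tau)
    denom = (gamma + k + lam) * growth + 2.0 * gamma
    B = 2.0 * growth / denom
    log_A = (2.0 * k * theta / sigma ** 2) * (
        np.log(2.0 * gamma) + 0.5 * (gamma + k + lam) * tau - np.log(denom))
    return B / tau, -log_A / tau

def cir_quasi_loglik(yields, tau, dt, k, theta, sigma, lam, h):
    """Quasi log-likelihood of a (T, n) yield panel under one-factor CIR."""
    n_yields = yields.shape[1]
    z, d = cir_loadings(tau, k, theta, sigma, lam)
    H = np.diag(np.ones(n_yields) * np.asarray(h, dtype=float) ** 2)
    decay = np.exp(-k * dt)
    x, P = theta, sigma ** 2 * theta / (2.0 * k)   # assumed stationary start
    loglik = 0.0
    for y in yields:
        # State-dependent process variance Q_t, then prediction
        Q = (x * sigma ** 2 / k * (decay - decay ** 2)
             + theta * sigma ** 2 / (2.0 * k) * (1.0 - decay) ** 2)
        x_pred = theta * (1.0 - decay) + decay * x
        P_pred = decay ** 2 * P + Q
        # Innovation and its covariance
        v = y - (z * x_pred + d)
        F = P_pred * np.outer(z, z) + H
        F_inv_v = np.linalg.solve(F, v)
        F_inv_z = np.linalg.solve(F, z)
        _, log_det_F = np.linalg.slogdet(F)
        loglik -= 0.5 * (n_yields * np.log(2.0 * np.pi) + log_det_F + v @ F_inv_v)
        # Update; the floor at zero is an assumption of this sketch
        x = max(x_pred + P_pred * (z @ F_inv_v), 0.0)
        P = P_pred * (1.0 - P_pred * (z @ F_inv_z))
    return loglik

# Simulated monthly panel with assumed parameter values.
rng = np.random.default_rng(1)
k, theta, sigma, lam, h, dt = 0.5, 0.05, 0.1, -0.1, 0.001, 1.0 / 12.0
tau = np.array([0.25, 1.0, 5.0, 10.0])
decay = np.exp(-k * dt)
c_scale = 2.0 * k / (sigma ** 2 * (1.0 - decay))
x_path, x_now = np.empty(120), theta
for t in range(120):
    nonc = 2.0 * c_scale * x_now * decay
    x_now = rng.noncentral_chisquare(4.0 * k * theta / sigma ** 2, nonc) / (2.0 * c_scale)
    x_path[t] = x_now
z_true, d_true = cir_loadings(tau, k, theta, sigma, lam)
yields = np.outer(x_path, z_true) + d_true + h * rng.standard_normal((120, 4))

ll_true = cir_quasi_loglik(yields, tau, dt, k, theta, sigma, lam, h)
ll_wrong = cir_quasi_loglik(yields, tau, dt, k, 0.10, sigma, lam, h)
print(ll_true, ll_wrong)
```

For estimation, the negative of `cir_quasi_loglik` would be handed to a numerical optimizer such as `scipy.optimize.minimize`, typically with the parameters transformed to keep $k$, $\theta$, $\sigma$ and $h$ positive. Note that the transition uses the physical parameters $(k, \theta, \sigma)$ while $\lambda$ enters only through the loadings, which is what lets the time-series dimension identify the risk premium.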

References

  • Cox, J. C., Ingersoll, J. E., & Ross, S. A. (1985). A Theory of the Term Structure of Interest Rates. Econometrica, 53(2). — CIR model
  • Chen, R.-R., & Scott, L. (2003). Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure: Estimates and Tests from a Kalman Filter Model. Journal of Real Estate Finance and Economics, 27(2), 143-172. — the paper this note covers (K2E-B-Z-1)
  • Geyer, A. L. J., & Pichler, S. (1999). A State-Space Approach to Estimate and Test Multifactor CIR Models of the Term Structure. Journal of Financial Research. — KF estimation of multifactor CIR
  • Duffie, D., & Kan, R. (1996). A Yield-Factor Model of Interest Rates. Mathematical Finance. — affine term structure
  • Kalman filter framework itself: see the Gaussian Filters series (KF prediction/update, innovation, inversion lemma)