CIR Term Structure via Kalman Filter
- Description: Cross-domain Kalman-filter application: estimating the Cox-Ingersoll-Ross (CIR) term-structure model of interest rates with a Kalman filter. Covers affine term structure, CIR dynamics, state-space form, and quasi-maximum-likelihood estimation.
- Paper: Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure: Estimates and Tests from a Kalman Filter Model (Chen & Scott, 2003), plus related Kalman-filter term-structure estimation literature
- K2E-B ID: [K2E-B-Z-1]
- Max3 PDF:
[K2E] SLAM/[K2E-B-Z] Cross-Domain KF Applications/[K2E-B-Z-1][2003] Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure From a Kalman Filter Model.pdf (sibling Z-2: Application of the Kalman Filter for UK/Germany term structure)
- Notion ID: (to be created)
- Created: 2021-07-01
- Updated: 2026-06-02
- License: Reuse welcome — please credit Yu Zhang and link back to yuzhang.io
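Cross-domain note: this is an application of the Kalman filter to finance (the term structure of interest rates) rather than to SLAM, so it is filed under K2E-B-Z (Cross-Domain KF Applications). What it shares with SLAM is the Kalman filter framework itself (see the Gaussian Filters series); the estimated "state" is the set of latent interest-rate factors rather than the robot pose.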
Table of Contents
- 1. Why Kalman Filter for Term Structure
- 2. Affine Term Structure Model
- 3. CIR Model
- 4. Estimating CIR — Two Approaches
- 5. State-Space Representation
- 6. Kalman Filter Recursion
- 7. Quasi-Maximum-Likelihood Estimation
- References
1. Why Kalman Filter for Term Structure
The term structure of interest rates is driven by a few unobservable latent factors (the short rate, etc.). Bond yields at different maturities are noisy observations of these factors. This is exactly a state-space / hidden-state problem → Kalman filter.
Parallel to SLAM: latent factors ↔ robot state; observed yields ↔ sensor measurements; transition density ↔ motion model; yield equation ↔ observation model. (Bayes/Gaussian filtering theory: see the Gaussian Filters series.)
2. Affine Term Structure Model
The instantaneous short rate $r$ follows a stochastic differential equation:
$$ dr = \mu(r, t) \, dt + \sigma(r, t) \, dW $$
- $\mu(r,t)$ — deterministic drift
- $\sigma(r,t) dW$ — diffusion (random part), $W$ standard Brownian motion (Wiener process)
The price of a pure discount bond (zero-coupon: issued below face value and repaid at face value) in an affine model:
$$ P(t, T) = A(\tau) \exp(-B(\tau) X), \quad \tau = T - t $$
- $X$ — state vector (latent factors)
- $A(\tau), B(\tau)$ — functions of time-to-maturity $\tau$
Zero-coupon yield curve:
$$ R(t, T) = -\frac{1}{\tau} \ln P(t, T) = \frac{B(\tau) X - \ln A(\tau)}{\tau} $$
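The price and yield formulas above can be checked against each other numerically. The following is a minimal sketch, assuming the loadings $A(\tau)$ and $B(\tau)$ are already known for one maturity; the function names and all numbers are hypothetical and chosen only for illustration.

```python
import math

def affine_bond_price(A, B, X):
    """P = A * exp(-B . X) for one maturity; B and X are per-factor sequences."""
    return A * math.exp(-sum(b * x for b, x in zip(B, X)))

def affine_yield(A, B, X, tau):
    """R = (B . X - ln A) / tau, the zero-coupon yield for time-to-maturity tau."""
    return (sum(b * x for b, x in zip(B, X)) - math.log(A)) / tau

# Hypothetical two-factor loadings for a 5-year bond (illustrative numbers only).
A, B, X, tau = 0.95, [2.0, 3.5], [0.02, 0.01], 5.0
price = affine_bond_price(A, B, X)
yld = affine_yield(A, B, X, tau)
```

Here `yld` equals `-math.log(price) / tau` up to rounding, which is the identity between the two equations above.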
3. CIR Model
Cox-Ingersoll-Ross (1985) — a square-root affine diffusion:
$$ dr = k(\theta - r) \, dt + \sigma \sqrt{r} \, dW $$
- $r$ — instantaneous interest rate
- $\theta$ — long-run mean rate
- $k$ — speed of mean reversion (mean reversion parameter)
- $\sigma \sqrt{r}$ — square-root process (volatility scales with $\sqrt{r}$; keeps $r \geq 0$)
Square-root processes are among the most widely used affine diffusions.
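To get a feel for the CIR dynamics, the SDE can be simulated. This is a rough sketch using an Euler-Maruyama discretization with full truncation, which is a simulation convenience and not part of the estimation method in this note. The continuous-time process stays non-negative, but a discrete Euler step can dip below zero, so `max(r, 0)` is used inside the drift and the square root. The function name and parameter values are hypothetical.

```python
import math
import random

def simulate_cir(r0, k, theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dr = k(theta - r) dt + sigma sqrt(r) dW.

    Full truncation: max(r, 0) replaces r in the drift and diffusion so the
    square root is always defined, even if a discrete step goes below zero.
    """
    rng = random.Random(seed)
    path = [r0]
    r = r0
    for _ in range(n_steps):
        r_pos = max(r, 0.0)
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        r = r + k * (theta - r_pos) * dt + sigma * math.sqrt(r_pos) * dw
        path.append(r)
    return path

# Ten years of daily steps with illustrative parameters.
path = simulate_cir(r0=0.03, k=0.5, theta=0.05, sigma=0.1, dt=1 / 252, n_steps=2520)
```

With `sigma=0` the path decays deterministically toward `theta`, which is a quick way to see the mean-reversion term in isolation.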
Risk-Neutral / Arbitrage-Free
Adjusting the drift by the market price of risk $\lambda$ (subtracting $\lambda r$ for arbitrage-free pricing):
$$ dr = (k(\theta - r) - \lambda r) \, dt + \sigma \sqrt{r} \, dW $$
Pure discount bond:
$$ P(t, T) = A(t, T) e^{-B(t, T) r} $$
with $\gamma = \sqrt{(k + \lambda)^2 + 2\sigma^2}$ and closed-form $A(t,T)$, $B(t,T)$ (functions of $\gamma, k, \lambda, \theta, \sigma, \tau$).
Continuously compounded yield:
$$ R(t, T) = -\frac{\log P(t, T)}{T - t} = \frac{-\log A(t,T) + B(t,T) r}{T - t} $$
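The note does not spell out $A$ and $B$. The sketch below uses the standard CIR (1985) closed forms, written in terms of $\tau = T - t$ and the $\gamma$ defined above; the function names and parameter values are hypothetical.

```python
import math

def cir_A_B(tau, k, theta, sigma, lam):
    """Standard CIR bond-price coefficients A(tau), B(tau) under risk adjustment lam."""
    gamma = math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)
    e = math.expm1(gamma * tau)                      # e^{gamma * tau} - 1
    denom = (gamma + k + lam) * e + 2.0 * gamma
    B = 2.0 * e / denom
    base = 2.0 * gamma * math.exp((k + lam + gamma) * tau / 2.0) / denom
    A = base ** (2.0 * k * theta / sigma ** 2)
    return A, B

def cir_yield(r, tau, k, theta, sigma, lam):
    """Continuously compounded yield R = (-log A + B r) / tau."""
    A, B = cir_A_B(tau, k, theta, sigma, lam)
    return (-math.log(A) + B * r) / tau

# Illustrative parameters; a negative lam corresponds to a positive term premium.
params = dict(k=0.5, theta=0.05, sigma=0.1, lam=-0.1)
A5, B5 = cir_A_B(5.0, **params)
price5 = A5 * math.exp(-B5 * 0.03)
short_end = cir_yield(0.03, 1e-6, **params)
```

Two sanity checks follow from the formulas: $A(0) = 1$ and $B(0) = 0$, so the price of a bond maturing now is 1, and the yield converges to the short rate $r$ as $\tau \to 0$.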
4. Estimating CIR — Two Approaches
Cross-section approach
Uses only the yields of bonds with different maturities at a single point in time. The state $r_t$ is treated as an extra unknown parameter. Disadvantage: the risk-premium parameters cannot be identified, because they are absorbed into the drift.
Time-series approach
Uses a time series of rates. Without measurement error, however, using more rates than factors leaves the model under-identified (the parameters cannot be estimated consistently).
Solution — allow measurement error + Kalman filter
Allow discrepancies between observed and theoretical rates, treated as Gaussian error:
$$ R(\tau) = \frac{B(\tau) X}{\tau} - \frac{\ln A(\tau)}{\tau} + \epsilon_t $$
Direct MLE is infeasible (yield density has no closed form). Standard technique: quasi-maximum-likelihood estimator based on the Kalman filter (when the measurement-error covariance is full rank). MCMC is an alternative.
5. State-Space Representation
State (transition) — latent factors as Markov process
The exact CIR transition density is non-central chi-square: $2cX_t \mid X_{t-1} \sim \chi^2(2q+2,\ 2u)$ (degrees of freedom $2q+2 = 4k\theta/\sigma^2$, non-centrality $2u$), but the Kalman filter uses a Gaussian approximation $X_t | X_{t-1} \sim N(\mu_t, Q_t)$:
$$ \mu_{t,j} = \theta_j [1 - e^{-k_j \Delta t}] + X_{t-1,j} e^{-k_j \Delta t} $$
$Q_t$ is diagonal (per factor), with a complex variance expression $\xi_j$.
Discrete-time transition:
$$ X_t = \Phi(\Psi) X_{t-1} + c(\Psi) + \eta_t $$
- $\Phi = \mathrm{diag}(e^{-k_j \Delta t})$, $c_j = \theta_j (1 - e^{-k_j \Delta t})$
- $\eta_t$ — zero-mean disturbance, variance $Q_t$
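The per-factor Gaussian approximation can be sketched as below. The mean is the one given above; the variance is the standard conditional variance of a CIR process, which is presumably the "complex variance expression" the note refers to (that identification is an assumption). The function name is hypothetical.

```python
import math

def cir_transition_moments(x_prev, k, theta, sigma, dt):
    """Conditional mean and variance of one CIR factor over a step of length dt."""
    phi = math.exp(-k * dt)                          # diagonal entry of Phi
    mean = theta * (1.0 - phi) + x_prev * phi        # c_j + Phi_jj * x_prev
    var = (x_prev * sigma ** 2 / k * (phi - phi ** 2)
           + theta * sigma ** 2 / (2.0 * k) * (1.0 - phi) ** 2)
    return mean, var

# Monthly step with illustrative parameters.
mean, var = cir_transition_moments(x_prev=0.03, k=0.5, theta=0.05, sigma=0.1, dt=1 / 12)
```

The variance depends on the previous state, so $Q_t$ changes from step to step; in a filter it is typically evaluated at the previous filtered estimate. As `dt` grows, the moments approach the stationary values $\theta$ and $\theta \sigma^2 / (2k)$.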
Measurement — yields observe the factors
$$ R_t = Z(\Psi) X_t + d(\Psi) + \epsilon_t, \quad \epsilon_t \sim N(0, H) $$
- $R_t$: $n \times 1$ vector of observed yields (e.g. 8 maturities → $H$ is $8 \times 8$ diagonal)
- $X_t$: $j \times 1$ latent state; $Z$ is $n \times j$, $d$ is $n \times 1$
- $\Psi = (\theta, k, \sigma, \lambda, h_{1..n})$: hyperparameters, where $h_1, \dots, h_n$ are the diagonal elements of $H$
For a one-factor model, $Z = B(t,T)/(T-t)$, $d = -\log A(t,T)/(T-t)$.
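For the one-factor case, $Z$ and $d$ can be assembled maturity by maturity. This is a sketch assuming the standard CIR closed forms for $A$ and $B$ and NumPy for the arrays; the function names, the maturity grid, and the parameter values are hypothetical.

```python
import math
import numpy as np

def cir_A_B(tau, k, theta, sigma, lam):
    """Standard CIR bond-price coefficients A(tau), B(tau)."""
    gamma = math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)
    e = math.expm1(gamma * tau)
    denom = (gamma + k + lam) * e + 2.0 * gamma
    B = 2.0 * e / denom
    A = (2.0 * gamma * math.exp((k + lam + gamma) * tau / 2.0) / denom) ** (
        2.0 * k * theta / sigma ** 2)
    return A, B

def measurement_system(taus, k, theta, sigma, lam):
    """Build Z (n x 1) and d (n,) so that model yields are Z @ [r] + d."""
    Z = np.empty((len(taus), 1))
    d = np.empty(len(taus))
    for i, tau in enumerate(taus):
        A, B = cir_A_B(tau, k, theta, sigma, lam)
        Z[i, 0] = B / tau              # loading of the yield on the short rate
        d[i] = -math.log(A) / tau      # maturity-specific intercept
    return Z, d

taus = [0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0]   # eight maturities, in years
Z, d = measurement_system(taus, k=0.5, theta=0.05, sigma=0.1, lam=-0.1)
model_yields = Z @ np.array([0.03]) + d              # noise-free yields at r = 3%
```

In a multi-factor model each factor would contribute one column to `Z`, and `d` would sum the per-factor intercepts.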
6. Kalman Filter Recursion
Standard linear KF (see the Gaussian Filters series, ch03 §2):
Prediction (conditioning on information up to $t-1$):
$$ \hat{X}_{t|t-1} = \Phi(\Psi) \hat{X}_{t-1|t-1} + c(\Psi) $$
$$ P_{t|t-1} = \Phi(\Psi) P_{t-1|t-1} \Phi(\Psi)^T + Q_t \quad \text{(covariance prediction; injects CIR process noise } Q_t \text{)} $$
Measurement update:
$$ R_{t|t-1} = Z \hat{X}_{t|t-1} + d $$
$$ v_t = R_t - R_{t|t-1} \quad \text{(innovation)} $$
$$ F_t = Z P_{t|t-1} Z^T + H \quad \text{(innovation covariance)} $$
$$ K_t = P_{t|t-1} Z^T F_t^{-1} \quad \text{(Kalman gain)} $$
$$ \hat{X}_{t|t} = \hat{X}_{t|t-1} + K_t v_t, \quad P_{t|t} = P_{t|t-1} - K_t Z P_{t|t-1} $$
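One pass of the recursion above translates almost line by line into NumPy. This is a minimal sketch: `kf_step` and its argument names are hypothetical, and the numbers in the example are illustrative, not CIR estimates.

```python
import numpy as np

def kf_step(x, P, R_obs, Phi, c, Q, Z, d, H):
    """One Kalman filter step: predict the state, then update with observed yields."""
    # Prediction
    x_pred = Phi @ x + c
    P_pred = Phi @ P @ Phi.T + Q
    # Measurement update
    v = R_obs - (Z @ x_pred + d)                  # innovation
    F = Z @ P_pred @ Z.T + H                      # innovation covariance
    # K = P_pred Z^T F^{-1}, computed with a linear solve (F and P_pred are symmetric)
    K = np.linalg.solve(F, Z @ P_pred).T
    x_new = x_pred + K @ v
    P_new = P_pred - K @ Z @ P_pred
    return x_new, P_new, v, F

# One factor observed through two yields with very small measurement noise.
Z = np.array([[1.0], [0.9]])
d = np.array([0.0, 0.002])
R_obs = Z @ np.array([0.04]) + d                  # yields generated by a state of 0.04
x_new, P_new, v, F = kf_step(
    x=np.array([0.03]), P=np.array([[1e-4]]), R_obs=R_obs,
    Phi=np.array([[0.99]]), c=np.array([0.0005]), Q=np.array([[1e-6]]),
    Z=Z, d=d, H=1e-8 * np.eye(2),
)
```

Because the measurement noise is tiny relative to the prior variance, the updated state lands very close to 0.04 and the posterior variance collapses. The innovation `v` and its covariance `F` are returned because the likelihood in the next section is built from them.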
Same five equations as SLAM's KF; only the model matrices ($\Phi, Z, c, d, Q_t, H$) come from the CIR term-structure model instead of robot kinematics and sensor models.
7. Quasi-Maximum-Likelihood Estimation
The yield density has no closed form, so use the KF-implied Gaussian: $R_t$ is normal with mean $R_{t|t-1}$ and covariance $F_t$. The log-likelihood:
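$$ \log L(R_1, \dots, R_{N_{\text{obs}}}; \Psi) = \sum_{t=1}^{N_{\text{obs}}} \log p(R_t | \xi_{t-1}) $$
where $N_{\text{obs}}$ is the number of observation dates and $\xi_{t-1}$ denotes the information available at $t-1$. Each term is the Gaussian innovation log-density:
$$ \log p(R_t | \xi_{t-1}) = -\frac{n}{2} \log 2\pi - \frac{1}{2} \log |F_t| - \frac{1}{2} v_t^T F_t^{-1} v_t $$
so the log-likelihood is a function of $n$ (the number of yields per date), $F_t$, and $v_t$. Maximize over the hyperparameters $\Psi = (\theta, k, \sigma, \lambda, h)$. This is quasi-MLE: the true CIR transition density is non-central chi-square, not Gaussian, so the KF Gaussian is only an approximation.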
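Putting the pieces together, the quasi log-likelihood is accumulated inside the filter loop. The sketch below makes several assumptions that are not taken from the paper: the filter starts at each factor's stationary mean and variance, $Q_t$ is evaluated at the previous filtered state truncated at zero, and `Z`, `d`, `H` are passed in ready-made. The function name is hypothetical.

```python
import numpy as np

def cir_quasi_loglik(yields, Z, d, H, k, theta, sigma, dt):
    """Gaussian quasi log-likelihood of a multi-factor CIR state-space model.

    yields: (T, n) array; Z: (n, m); d: (n,); H: (n, n);
    k, theta, sigma: (m,) NumPy arrays of per-factor parameters.
    """
    T, n = yields.shape
    phi = np.exp(-k * dt)
    Phi, c = np.diag(phi), theta * (1.0 - phi)
    x = theta.copy()                                  # assumed start: stationary mean
    P = np.diag(theta * sigma ** 2 / (2.0 * k))       # assumed start: stationary variance
    loglik = 0.0
    for t in range(T):
        x_pos = np.maximum(x, 0.0)                    # keep the variance non-negative
        Q = np.diag(x_pos * sigma ** 2 / k * (phi - phi ** 2)
                    + theta * sigma ** 2 / (2.0 * k) * (1.0 - phi) ** 2)
        x_pred = Phi @ x + c
        P_pred = Phi @ P @ Phi.T + Q
        v = yields[t] - (Z @ x_pred + d)
        F = Z @ P_pred @ Z.T + H
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (n * np.log(2.0 * np.pi) + logdet + v @ np.linalg.solve(F, v))
        K = np.linalg.solve(F, Z @ P_pred).T
        x = x_pred + K @ v
        P = P_pred - K @ Z @ P_pred
    return loglik

# Single date, one factor, one yield (illustrative numbers).
ll = cir_quasi_loglik(
    yields=np.array([[0.05]]), Z=np.array([[0.9]]), d=np.array([0.004]),
    H=np.array([[1e-6]]), k=np.array([0.5]), theta=np.array([0.05]),
    sigma=np.array([0.1]), dt=1 / 12,
)
```

In practice `Z` and `d` would be rebuilt from $\Psi$ at every evaluation, and the negative of this function would be handed to a numerical optimizer such as `scipy.optimize.minimize`, with constraints keeping $k$, $\theta$, $\sigma$, and the diagonal of $H$ positive.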
References
- Cox, J. C., Ingersoll, J. E., & Ross, S. A. (1985). A Theory of the Term Structure of Interest Rates. Econometrica, 53(2). — CIR model
- Chen, R.-R., & Scott, L. (2003). Multi-Factor Cox-Ingersoll-Ross Models of the Term Structure: Estimates and Tests from a Kalman Filter Model. Journal of Real Estate Finance and Economics, 27(2), 143-172. — the paper this note covers (K2E-B-Z-1)
- Geyer, A. L. J., & Pichler, S. (1999). A State-Space Approach to Estimate and Test Multifactor CIR Models of the Term Structure. Journal of Financial Research. — KF estimation of multifactor CIR
- Duffie, D., & Kan, R. (1996). A Yield-Factor Model of Interest Rates. Mathematical Finance. — affine term structure
- Kalman filter framework itself: see the Gaussian Filters series (KF prediction/update, innovation, inversion lemma)