Bayesian Dynamic Factor Models
This page contains weekly updates on my work developing Bayesian Dynamic Factor Models for the PyMC library as part of Google Summer of Code 2025.
State-space models (SSMs) provide a flexible framework for modeling dynamic systems in which latent states evolve over time. In econometrics, Dynamic Factor Models (DFMs) are widely used to capture the co-movement of multiple time series by assuming that a small number of latent factors drive the observed variables. The PyMC library already includes implementations of SARIMAX, VARMAX, and structural state-space models, along with example notebooks for their usage. This project aims to extend the existing PyMC state-space module by implementing Dynamic Factor Models, aligning with the functionality available in Statsmodels.
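To make the idea of "a few latent factors driving many series" concrete, here is a minimal simulation sketch. It assumes a single factor that follows an AR(1) process and three observed series that load on it linearly with independent Gaussian noise. The function name simulate_dfm and all parameter values are made up for illustration and are not part of PyMC or Statsmodels.

```python
import numpy as np


def simulate_dfm(n_obs=200, loadings=(1.0, 0.8, -0.5), phi=0.7, sigma_obs=0.3, seed=0):
    """Simulate a one-factor DFM (illustrative helper, not a library function).

    factor_t = phi * factor_{t-1} + eta_t,      eta_t ~ N(0, 1)
    y_t      = loadings * factor_t + eps_t,     eps_t ~ N(0, sigma_obs^2 * I)
    """
    rng = np.random.default_rng(seed)
    lam = np.asarray(loadings, dtype=float)

    # Latent factor: AR(1) started at zero.
    factor = np.zeros(n_obs)
    for t in range(1, n_obs):
        factor[t] = phi * factor[t - 1] + rng.normal()

    # Observed series: each one is a scaled copy of the factor plus its own noise.
    noise = rng.normal(scale=sigma_obs, size=(n_obs, lam.size))
    y = factor[:, None] * lam[None, :] + noise
    return factor, y


factor, y = simulate_dfm()

# Series with same-sign loadings co-move; opposite-sign loadings move inversely.
corr = np.corrcoef(y, rowvar=False)
print(np.round(corr, 2))
```

The correlation matrix shows the co-movement a DFM is meant to capture: the three series are strongly related even though each has its own noise, because they share one latent driver.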
In Statsmodels, our reference for this project, two DFM implementations exist: DynamicFactor, which represents the model in state-space form and estimates parameters by maximum likelihood using the Kalman filter, and DynamicFactorMQ, which is fitted with the Expectation-Maximization (EM) algorithm. Our implementation follows a Bayesian approach, leveraging PyMC’s probabilistic programming framework and PyTensor for computation. An accompanying example notebook will demonstrate estimation, forecasting, and causal analysis with the new model, so that it is accessible to users with varying levels of experience in Bayesian modeling.
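As a rough illustration of how the Kalman filter yields a likelihood for a model in state-space form, the sketch below evaluates the Gaussian log-likelihood of a small one-factor DFM with plain NumPy. It is not the PyMC or Statsmodels implementation. The function kalman_loglik, the matrix names (T_mat, Z, Q, H) and the parameter values are assumptions chosen for this example, and the convention used is that x0 and P0 describe the state one step before the first observation.

```python
import numpy as np


def kalman_loglik(y, T_mat, Z, Q, H, x0, P0):
    """Log-likelihood of y under (illustrative helper, names are assumptions):

    x_t = T_mat @ x_{t-1} + eta_t,  eta_t ~ N(0, Q)
    y_t = Z @ x_t + eps_t,          eps_t ~ N(0, H)
    """
    x, P = x0, P0
    n_series = y.shape[1]
    loglik = 0.0
    for y_t in y:
        # Prediction step: propagate the state mean and covariance.
        x = T_mat @ x
        P = T_mat @ P @ T_mat.T + Q

        # Innovation (one-step-ahead forecast error) and its covariance.
        v = y_t - Z @ x
        F = Z @ P @ Z.T + H

        # Add the Gaussian log-density of the innovation.
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (n_series * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(F, v))

        # Update step: K = P Z' F^{-1}, using the symmetry of F and P.
        K = np.linalg.solve(F, Z @ P).T
        x = x + K @ v
        P = P - K @ Z @ P
    return loglik


# One AR(1) factor observed through two series (all values are illustrative).
phi = 0.7
T_mat = np.array([[phi]])
Z = np.array([[1.0], [0.5]])               # factor loadings
Q = np.array([[1.0]])                      # factor innovation variance
H = 0.1 * np.eye(2)                        # observation noise covariance
x0 = np.zeros(1)
P0 = np.array([[1.0 / (1.0 - phi**2)]])    # stationary variance of the factor

rng = np.random.default_rng(1)
y_demo = rng.normal(size=(5, 2))
ll = kalman_loglik(y_demo, T_mat, Z, Q, H, x0, P0)
print(ll)
```

A maximum-likelihood routine would maximize this quantity over the parameters, whereas a Bayesian approach combines the same likelihood with priors and samples from the posterior.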
Code references:
pymc-extras
Statsmodels state space DFM notebook
and making a version in the PyMC framework using a custom DFM model (code available here).
pytensor, and compared the results with PyMC’s built-in version.
The notebook is available here.
Statsmodels implementations of the coincident index, focusing on model outputs, inference behavior, and model structure.
In particular, I added a lagged dependence of the factor on one observed variable to better align with the Statsmodels extended model version.
The updated notebook is available here.
DFM.py module implementing the Dynamic Factor Model, now available in this pull request. The implementation is functional and ready for review, with further testing and matrix-vectorization optimizations planned.
DFM.py implementation through vectorization and block diagonal construction.
Added support for measurement errors and heterogeneous autoregressive orders across factors.
pytest tests for the new BayesianDynamicFactor class. Focused on validating the correct construction of model matrices by comparing them against Statsmodels, and initiated a test for log-likelihood computation.
Statsmodels implementation (also in preparation for the tests).
Statsmodels.
pymc_extras/statespace/models/structural/components/regression.py.
This approach differs from Statsmodels but provides greater flexibility.
Added corresponding tests, including comparisons with Statsmodels and internal validations.
Statsmodels on the construction of the coincident index, a standard benchmark for DFMs.
pymc-extras repository.
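The block diagonal construction mentioned in the updates above can be illustrated with a small NumPy sketch. It assumes each autoregressive process is written in companion form and that processes with different orders are stacked along the diagonal of one transition matrix. The helpers companion and block_diag are hypothetical, written for this example. They are not taken from DFM.py, whose actual layout may differ.

```python
import numpy as np


def companion(ar_coefs):
    """Companion matrix of an AR(p) process (hypothetical helper for this sketch)."""
    p = len(ar_coefs)
    C = np.zeros((p, p))
    C[0, :] = ar_coefs            # first row holds the AR coefficients
    C[1:, :-1] = np.eye(p - 1)    # sub-diagonal shifts the lagged states down
    return C


def block_diag(blocks):
    """Stack square blocks along the diagonal of a zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    start = 0
    for b in blocks:
        k = b.shape[0]
        out[start:start + k, start:start + k] = b
        start += k
    return out


# An AR(2) process and an AR(1) process share one 3x3 transition matrix.
transition = block_diag([companion([0.5, 0.2]), companion([0.9])])
print(transition)
```

Because each block is filled with a single slice assignment, processes of different orders need no special casing: only the block sizes change.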
Theory Resources: