From 74b1fe4139378f369871b6371a76c293721b6410 Mon Sep 17 00:00:00 2001
From: Sebastian Krantz
Date: Mon, 12 May 2025 01:12:57 +0200
Subject: [PATCH 1/6] Direct links for website.
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 49545f4..078acda 100644
--- a/README.md
+++ b/README.md
@@ -139,7 +139,7 @@ plot(mod)
 ```
-plot of chunk unnamed-chunk-1
+plot of chunk unnamed-chunk-1
 ```r
@@ -165,7 +165,7 @@ plot(fc)
 ```
-plot of chunk unnamed-chunk-1
+plot of chunk unnamed-chunk-1
 ```r
From 5eee934033558648e60d9fde5271a38fe78ed31f Mon Sep 17 00:00:00 2001
From: Sebastian Krantz
Date: Mon, 12 May 2025 21:25:39 +0200
Subject: [PATCH 2/6] Better package page.
---
 R/dfms.R            | 45 +++++++++++++++++++++++++--------------------
 man/dfms-package.Rd | 45 +++++++++++++++++++++++----------------------
 2 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/R/dfms.R b/R/dfms.R
index 5cb4aeb..db5dfc6 100644
--- a/R/dfms.R
+++ b/R/dfms.R
@@ -1,27 +1,19 @@
 #' Dynamic Factor Models
 #'
-#' *dfms* provides efficient estimation of Dynamic Factor Models via the EM Algorithm.
+#' @description
 #'
-#' Estimation can be done in 3 different ways following:
+#' *dfms* provides efficient estimation of Dynamic Factor Models via the EM Algorithm --- following Doz, Giannone & Reichlin (2011, 2012) and Banbura & Modugno (2014). The package has the following contents:
 #'
-#' - Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. *Journal of Econometrics, 164*(1), 188-205.
+#' **Information Criteria**
 #'
-#' - Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. *Review of Economics and Statistics, 94*(4), 1014-1024.
-#'
-#' - Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. *Journal of Applied Econometrics, 29*(1), 133-160.
-#'
-#' The default is `em.method = "auto"`, which chooses `"BM"` following Banbura & Modugno (2014) with missing data or mixed frequency, and `"DGR"` following Doz, Giannone & Reichlin (2012) otherwise. Using `em.method = "none"` generates Two-Step estimates following Doz, Giannone & Reichlin (2011). This is extremely efficient on bigger datasets. PCA and Two-Step estimates are also reported in EM-estimation. All methods support missing data, but `em.method = "DGR"` does not model them in EM iterations.
-#'
-#' @section Package Contents:
-#'
-#' **Functions to Specify/Estimate Model and Key Methods**
-#'
-#' \code{\link[=ICr]{ICr()}} --- Information Criteria\cr
+#' \code{\link[=ICr]{ICr()}}\cr
 #'
 #' - \code{\link[=plot.ICr]{plot()}}\cr
 #' - \code{\link[=screeplot.ICr]{screeplot()}}\cr
 #'
-#' \code{\link[=DFM]{DFM()}} --- Estimate the Model\cr
+#' **Fit a Dynamic Factor Model**
+#'
+#' \code{\link[=DFM]{DFM()}}\cr
 #'
 #' - \code{\link[=summary.dfm]{summary()}}\cr
 #' - \code{\link[=plot.dfm]{plot()}}\cr
@@ -29,20 +21,26 @@
 #' - \code{\link[=residuals.dfm]{residuals()}}\cr
 #' - \code{\link[=fitted.dfm]{fitted()}}
 #'
-#' \code{\link[=predict.dfm]{predict()}} --- Generate Forecasts\cr
+#' **Generate Forecasts**
+#'
+#' \code{\link[=predict.dfm]{predict()}}\cr
 #'
 #' - \code{\link[=plot.dfm_forecast]{plot()}}\cr
 #' - \code{\link[=as.data.frame.dfm_forecast]{as.data.frame()}}\cr
 #'
-#' **Auxiliary Functions**
+#' **Fast Stationary Kalman Filtering and Smoothing**
 #'
-#' \code{\link[=.VAR]{.VAR()}} --- Estimate Vector Autoregression\cr
 #' \code{\link[=SKF]{SKF()}} --- Stationary Kalman Filter\cr
 #' \code{\link[=FIS]{FIS()}} --- Fixed Interval Smoother\cr
 #' \code{\link[=SKFS]{SKFS()}} --- Stationary Kalman Filter + Smoother\cr
+#'
+#' **Helper Functions**
+#'
+#' \code{\link[=.VAR]{.VAR()}} --- (Fast) Barebones Vector-Autoregression\cr
+#' \code{\link[=ainv]{ainv()}} --- Armadillo's Inverse Function\cr
+#' \code{\link[=apinv]{apinv()}} --- Armadillo's Pseudo-Inverse Function\cr
 #' \code{\link[=tsnarmimp]{tsnarmimp()}} --- Remove and Impute Missing Values in a Multivariate Time Series\cr
-#' \code{\link[=ainv]{ainv()}} --- Rcpp Armadillo's Inverse Function\cr
-#' \code{\link[=apinv]{apinv()}} --- Rcpp Armadillo's Pseudo-Inverse Function\cr
+#' \code{\link[=em_converged]{em_converged()}} --- Convergence Test for EM-Algorithm\cr
 #'
 #' **Data**
 #'
@@ -50,6 +48,13 @@
 #' \code{\link{BM14_Q}} --- Quarterly Series by Banbura and Modugno (2014)\cr
 #' \code{\link{BM14_Models}} --- Series Metadata + Small/Medium/Large Model Specifications\cr
 #'
+#' @references
+#' Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. *Journal of Econometrics, 164*(1), 188-205.
+#'
+#' Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. *Review of Economics and Statistics, 94*(4), 1014-1024.
+#'
+#' Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. *Journal of Applied Econometrics, 29*(1), 133-160.
+#'
 #' @docType package
 #' @name dfms-package
 #' @aliases dfms
diff --git a/man/dfms-package.Rd b/man/dfms-package.Rd
index 08cb52e..fafe4ff 100644
--- a/man/dfms-package.Rd
+++ b/man/dfms-package.Rd
@@ -6,30 +6,19 @@
 \alias{dfms}
 \title{Dynamic Factor Models}
 \description{
-\emph{dfms} provides efficient estimation of Dynamic Factor Models via the EM Algorithm.
-}
-\details{
-Estimation can be done in 3 different ways following:
-\itemize{
-\item Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. \emph{Journal of Econometrics, 164}(1), 188-205. \url{doi:10.1016/j.jeconom.2011.02.012}
-\item Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. \emph{Review of Economics and Statistics, 94}(4), 1014-1024. \url{doi:10.1162/REST_a_00225}
-\item Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. \emph{Journal of Applied Econometrics, 29}(1), 133-160. \url{doi:10.1002/jae.2306}
-}
-
-The default is \code{em.method = "auto"}, which chooses \code{"BM"} following Banbura & Modugno (2014) with missing data or mixed frequency, and \code{"DGR"} following Doz, Giannone & Reichlin (2012) otherwise. Using \code{em.method = "none"} generates Two-Step estimates following Doz, Giannone & Reichlin (2011). This is extremely efficient on bigger datasets. PCA and Two-Step estimates are also reported in EM-estimation. All methods support missing data, but \code{em.method = "DGR"} does not model them in EM iterations.
-}
-\section{Package Contents}{
-
+\emph{dfms} provides efficient estimation of Dynamic Factor Models via the EM Algorithm --- following Doz, Giannone & Reichlin (2011, 2012) and Banbura & Modugno (2014). The package has the following contents:
 
-\strong{Functions to Specify/Estimate Model and Key Methods}
+\strong{Information Criteria}
 
-\code{\link[=ICr]{ICr()}} --- Information Criteria\cr
+\code{\link[=ICr]{ICr()}}\cr
 \itemize{
 \item \code{\link[=plot.ICr]{plot()}}\cr
 \item \code{\link[=screeplot.ICr]{screeplot()}}\cr
 }
 
-\code{\link[=DFM]{DFM()}} --- Estimate the Model\cr
+\strong{Fit a Dynamic Factor Model}
+
+\code{\link[=DFM]{DFM()}}\cr
 \itemize{
 \item \code{\link[=summary.dfm]{summary()}}\cr
 \item \code{\link[=plot.dfm]{plot()}}\cr
@@ -38,21 +27,27 @@ The default is \code{em.method = "auto"}, which chooses \code{"BM"} following Ba
 \item \code{\link[=fitted.dfm]{fitted()}}
 }
 
-\code{\link[=predict.dfm]{predict()}} --- Generate Forecasts\cr
+\strong{Generate Forecasts}
+
+\code{\link[=predict.dfm]{predict()}}\cr
 \itemize{
 \item \code{\link[=plot.dfm_forecast]{plot()}}\cr
 \item \code{\link[=as.data.frame.dfm_forecast]{as.data.frame()}}\cr
 }
 
-\strong{Auxiliary Functions}
+\strong{Fast Stationary Kalman Filtering and Smoothing}
 
-\code{\link[=.VAR]{.VAR()}} --- Estimate Vector Autoregression\cr
 \code{\link[=SKF]{SKF()}} --- Stationary Kalman Filter\cr
 \code{\link[=FIS]{FIS()}} --- Fixed Interval Smoother\cr
 \code{\link[=SKFS]{SKFS()}} --- Stationary Kalman Filter + Smoother\cr
+
+\strong{Helper Functions}
+
+\code{\link[=.VAR]{.VAR()}} --- (Fast) Barebones Vector-Autoregression\cr
+\code{\link[=ainv]{ainv()}} --- Armadillo's Inverse Function\cr
+\code{\link[=apinv]{apinv()}} --- Armadillo's Pseudo-Inverse Function\cr
 \code{\link[=tsnarmimp]{tsnarmimp()}} --- Remove and Impute Missing Values in a Multivariate Time Series\cr
-\code{\link[=ainv]{ainv()}} --- Rcpp Armadillo's Inverse Function\cr
-\code{\link[=apinv]{apinv()}} --- Rcpp Armadillo's Pseudo-Inverse Function\cr
+\code{\link[=em_converged]{em_converged()}} --- Convergence Test for EM-Algorithm\cr
 
 \strong{Data}
 
@@ -60,4 +55,10 @@ The default is \code{em.method = "auto"}, which chooses \code{"BM"} following Ba
 \code{\link{BM14_Q}} --- Quarterly Series by Banbura and Modugno (2014)\cr
 \code{\link{BM14_Models}} --- Series Metadata + Small/Medium/Large Model Specifications\cr
 }
+\references{
+Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. \emph{Journal of Econometrics, 164}(1), 188-205. \url{doi:10.1016/j.jeconom.2011.02.012}
+
+Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate dynamic factor models. \emph{Review of Economics and Statistics, 94}(4), 1014-1024. \url{doi:10.1162/REST_a_00225}
 
+Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. \emph{Journal of Applied Econometrics, 29}(1), 133-160. \url{doi:10.1002/jae.2306}
+}
From afb006174ed6f1a4195aecbce69f0f5e039c82c8 Mon Sep 17 00:00:00 2001
From: Sebastian Krantz
Date: Mon, 12 May 2025 21:37:43 +0200
Subject: [PATCH 3/6] Spelling.
---
 R/methods.R        | 2 +-
 man/predict.dfm.Rd | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/R/methods.R b/R/methods.R
index c9b825d..78a1722 100644
--- a/R/methods.R
+++ b/R/methods.R
@@ -457,7 +457,7 @@ fitted.dfm <- function(object,
 #' @param method character. The factor estimates to use: one of \code{"qml"}, \code{"2s"} or \code{"pca"}.
 #' @param standardized logical. \code{FALSE} will return data forecasts on the original scale.
 #' @param resFUN an (optional) function to compute a univariate forecast of the residuals.
-#' The function needs to have a second argument providing the forecast horizon (\code{h}) and return a vector or forecasts. See Examples.
+#' The function needs to have a second argument providing the forecast horizon (\code{h}) and return a vector of forecasts. See Examples.
 #' @param resAC numeric. Threshold for residual autocorrelation to apply \code{resFUN}: only residual series where AC1 > resAC will be forecasted.
 #' @param \dots further arguments to \code{resFUN}.
 #'
diff --git a/man/predict.dfm.Rd b/man/predict.dfm.Rd
index 9cfde00..f4aecb1 100644
--- a/man/predict.dfm.Rd
+++ b/man/predict.dfm.Rd
@@ -58,7 +58,7 @@
 \item{standardized}{logical. \code{FALSE} will return data forecasts on the original scale.}
 
 \item{resFUN}{an (optional) function to compute a univariate forecast of the residuals.
-The function needs to have a second argument providing the forecast horizon (\code{h}) and return a vector or forecasts. See Examples.}
+The function needs to have a second argument providing the forecast horizon (\code{h}) and return a vector of forecasts. See Examples.}
 
 \item{resAC}{numeric. Threshold for residual autocorrelation to apply \code{resFUN}: only residual series where AC1 > resAC will be forecasted.}
 
From 866809d30346dbd4eb303ad13ab508ee39b1ef8d Mon Sep 17 00:00:00 2001
From: Sebastian Krantz
Date: Mon, 12 May 2025 22:53:07 +0200
Subject: [PATCH 4/6] Using <- instead of = and mixed frequency example.
---
 vignettes/introduction.Rmd | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/vignettes/introduction.Rmd b/vignettes/introduction.Rmd
index 8064b8f..a94d777 100644
--- a/vignettes/introduction.Rmd
+++ b/vignettes/introduction.Rmd
@@ -53,7 +53,7 @@ Prior to estimation, all data is differenced by BM14, and some series are log, d
 library(magrittr)
 # log-transforming and first-differencing the data
 BM14_M[, BM14_Models_M$log_trans] %<>% log()
-BM14_M_diff = diff(BM14_M)
+BM14_M_diff <- diff(BM14_M)
 plot(scale(BM14_M_diff), lwd = 1)
 ```
 
@@ -62,7 +62,7 @@ plot(scale(BM14_M_diff), lwd = 1)
 Before estimating a model, the `ICr()` function can be applied to determine the number of factors. It computes 3 information criteria proposed in Bai and NG (2002)^[Bai, J., Ng, S. (2002). Determining the Number of Factors in Approximate Factor Models. *Econometrica, 70*(1), 191-221. ], whereby the second criteria generally suggests the most parsimonious model.
 
 ```{r}
-ic = ICr(BM14_M_diff)
+ic <- ICr(BM14_M_diff)
 print(ic)
 plot(ic)
 ```
@@ -88,7 +88,7 @@ Estimation can then simply be done using the `DFM()` function with parameters `r
 
 ```{r}
 # Estimating the model with 4 factors and 3 lags using BM14's EM algorithm
-model1 = DFM(BM14_M_diff, r = 4, p = 3)
+model1 <- DFM(BM14_M_diff, r = 4, p = 3)
 print(model1)
 plot(model1)
 ```
@@ -134,7 +134,7 @@ DFM forecasts can be obtained with the `predict()` method, which dynamically for
 
 ```{r}
 # 12-period ahead DFM forecast
-fc = predict(model1, h = 12)
+fc <- predict(model1, h = 12)
 print(fc)
 ```
 
@@ -156,7 +156,7 @@ head(as.data.frame(fc, pivot = "wide"))
 ## Estimation with Mixed Frequency
-*dfms* currently provides no specific adjustments for data at different frequencies. An algorithm that accommodates monthly and quarterly series is planned for summer 2023. In the meantime, users may choose to block the data (creating multiple quarterly series from a monthly series, and duplicating quarterly series to maintain equal representation).
+Since v0.3.0 *dfms* allows monthly and quarterly mixed frequency estimation following Mariano & Murasawa (2003) and Banbura & Modugno (2014). Quarterly variables should be to the right of the monthly variables in the data matrix and need to be indicated using the `quarterly.vars` argument. Quarterly observations should be provided every 3rd period.
-*dfms* provides efficient estimation of Dynamic Factor Models via the EM Algorithm. Estimation can be done in 3 different ways following:
+*dfms* provides efficient estimation of Dynamic Factor Models via the EM Algorithm. Factors are assumed to follow a stationary VAR
+ process of order `p`. Estimation can be done in 3 different ways following:
 - Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic factor models based on Kalman filtering. *Journal of Econometrics, 164*(1), 188-205.
@@ -27,7 +28,7 @@ The package is fully functional though, and you are very welcome to install it u
 
 The default is `em.method = "auto"`, which chooses `"BM"` following Banbura & Modugno (2014) with missing data or mixed frequency, and `"DGR"` following Doz, Giannone & Reichlin (2012) otherwise. Using `em.method = "none"` generates Two-Step estimates following Doz, Giannone & Reichlin (2011). This is extremely efficient on bigger datasets. PCA and Two-Step estimates are also reported in EM-estimation. All methods support missing data, but `em.method = "DGR"` does not model them in EM iterations.
 
-The package is stable, but functionality may expand in the future. In particular, mixed-frequency estimation with autoregressive errors is planned for the near future, and generation of the 'news' may be added in the further future.
+The package is currently stable, but functionality may expand in the future. In particular, mixed-frequency estimation with autoregressive errors is planned for the near future, and generation of the 'news' may be added in the further future.
 
 ### Comparison with Other R Packages
 
@@ -56,7 +57,7 @@ install.packages('dfms', repos = c('https://sebkrantz.r-universe.dev', 'https://
 library(dfms)
 
 # Fit DFM with 6 factors and 3 lags in the transition equation
-mod = DFM(diff(BM14_M), r = 6, p = 3)
+mod <- DFM(diff(BM14_M), r = 6, p = 3)
 ```
 
 ```
@@ -158,7 +159,7 @@ as.data.frame(mod) |> head()
 
 ```r
 # Forecasting 20 periods ahead
-fc = predict(mod, h = 20)
+fc <- predict(mod, h = 20)
 
 # 'dfm_forecast' methods
 plot(fc)