Description
Submitting Author Name: Sebastian Krantz
Submitting Author Github Handle: @SebKrantz
Other Package Authors Github handles: @rbagd
Repository: https://github.com/SebKrantz/dfms
Version submitted: 0.1.2
Submission type: Stats
Badge grade: bronze
Editor: @noamross
Reviewers: @eeholmes, @santikka
Due date for @santikka: 2023-01-04
Archive: TBD
Version accepted: TBD
Language: en
- Paste the full DESCRIPTION file inside a code block below:
```
Package: dfms
Version: 0.1.2
Title: Dynamic Factor Models
Authors@R: c(person("Sebastian", "Krantz", role = c("aut", "cre"), email = "[email protected]"),
             person("Rytis", "Bagdziunas", role = "aut"))
Description: Efficient estimation of Dynamic Factor Models using the Expectation Maximization (EM) algorithm
    or Two-Step (2S) estimation, on datasets with missing data. The implementation follows advances in the econometric
    literature: estimation can be done either by running the Kalman Filter and Smoother once with initial values
    from PCA - following Doz, Giannone and Reichlin (2011) (2S) - or via iterated Kalman Filtering and Smoothing until EM
    convergence - following Doz, Giannone and Reichlin (2012) - or using the adapted EM algorithm of Banbura and Modugno
    (2014), allowing estimation with arbitrary patterns of missing data. The implementation makes heavy use of the
    Armadillo C++ library and the collapse package, providing for particularly speedy estimation. A comprehensive set of
    methods supports interpretation/visualization of the model and forecasting. Information criteria to choose the number
    of factors are also provided - following Bai and Ng (2002).
    --- Key References: ---
    Doz, C., Giannone, D., & Reichlin, L. (2011). A two-step estimator for large approximate dynamic
    factor models based on Kalman filtering. Journal of Econometrics, 164(1), 188-205.
    Doz, C., Giannone, D., & Reichlin, L. (2012). A quasi-maximum likelihood approach for large, approximate
    dynamic factor models. Review of Economics and Statistics, 94(4), 1014-1024.
    Banbura, M., & Modugno, M. (2014). Maximum likelihood estimation of factor models on datasets with arbitrary
    pattern of missing data. Journal of Applied Econometrics, 29(1), 133-160.
URL: https://sebkrantz.github.io/dfms/
BugReports: https://github.com/SebKrantz/dfms/issues
Depends: R (>= 3.0.0)
Imports: Rcpp (>= 1.0.1), collapse (>= 1.8.0)
LinkingTo: Rcpp, RcppArmadillo
Suggests:
    xts,
    vars,
    magrittr,
    testthat (>= 3.0.0),
    knitr,
    rmarkdown,
    covr
License: GPL-3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.1.2
Config/testthat/edition: 3
VignetteBuilder: knitr
```
Scope
- Please indicate which of our statistical package categories this package falls under. (Please check one appropriate box below):
Statistical Packages
- Bayesian and Monte Carlo Routines
- Dimensionality Reduction, Clustering, and Unsupervised Learning
- Machine Learning
- Regression and Supervised Learning
- Exploratory Data Analysis (EDA) and Summary Statistics
- Spatial Analyses
- Time Series Analyses
Pre-submission Inquiry
- A pre-submission inquiry has been approved in issue #555
General Information
- Who is the target audience and what are scientific applications of this package?
Anybody working with time series. The package is useful for dimensionality reduction and forecasting with a large number of time series.
- Paste your responses to our General Standard G1.1 here, describing whether your software is:
- The first implementation of a novel algorithm; or
- The first implementation within R of an algorithm which has previously been implemented in other languages or contexts; or
- An improvement on other implementations of similar algorithms in R.
See README.md. dfms implements simple baseline versions of algorithms that have been around for a while in Matlab and in other languages (R, Python, Julia), but only inside more elaborate nowcasting codes, where they are not directly accessible and are less efficient. It is the only pure baseline implementation for R of the algorithms proposed by the three academic references mentioned in the description that is ready for CRAN.
Please include hyperlinked references to all other relevant software.
The software is a reboot of, and massive improvement upon, dynfactoR, an abandoned software project. Generalizations of the functionality are provided by nowcasting and nowcastDFM, which fit dynamic factor models specific to mixed-frequency nowcasting applications. These packages are currently not on CRAN (they were archived) and are not very well maintained. The MARSS package can be used to fit dynamic factor models, but it has a complicated API and fails on bigger datasets. The only really useful and well-maintained dynamic factor modelling package for R is bayesdfa, which is also on CRAN and fits Bayesian dynamic factor models with Stan. I expect dfms to provide substantially faster estimation than bayesdfa. There are various other codes for Python and Julia on GitHub, including an implementation in the popular statsmodels library, but I did not engage with those, as my primary tool remains R and I wanted to create an efficient baseline implementation for R that follows advances in the econometrics literature (PCA + EM algorithm based estimation).
- (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
Not applicable.
Badging
- What grade of badge are you aiming for? (bronze, silver, gold)
Bronze
- If aiming for silver or gold, describe which of the four aspects listed in the Guide for Authors chapter the package fulfils (at least one aspect for silver; three for gold)
Technical checks
Confirm each of the following by checking the box.
- I have read the rOpenSci packaging guide.
- I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
- I/we have read the Statistical Software Peer Review Guide for Authors.
- I/we have run autotest checks on the package, and ensured no tests fail.
- The srr_stats_pre_submit() function confirms this package may be submitted.
- The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.
There are still some autotest issues, especially for the main DFM() function, but I do not understand them, as all inputs received the maximum extent of checking. See lines 211-226. I also don't understand the note in pkgcheck requesting CI checks. The package receives CI through GitHub Actions (all platforms), and test coverage is uploaded to codecov.io.
This package:
- does not violate the Terms of Service of any service it interacts with.
- has a CRAN and OSI accepted license.
- contains a README with instructions for installing the development version.
Publication options
- Do you intend for this package to go on CRAN?
- Do you intend for this package to go on Bioconductor?
Code of conduct
- I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.