This toolkit is designed for conducting data quality assessments on clinical datasets modeled using the OMOP Common Data Model. It includes a wide variety of data quality checks and a GitHub-based issue reporting mechanism, and it is routinely used by the PEDSnet CDRN.
- Data: the data quality catalog of checks, summaries of the previous data cycle, and acceptable value sets for various fields
- Doc: documentation and setup instructions for the program
- Infrastructure: constants and internal helper functions
- Library: contains data quality checks and utility functions
- Main: single and multi-variable data quality scripts
- Resources: configuration file
- Tools: scripts for GitHub-based feedback generation
R version 3.2.x or above, 64-bit (Comprehensive R Archive Network)
install.packages(c("DBI","yaml","ggplot2","RJDBC","devtools","futile.logger","plyr","dplyr","dbplyr","lubridate", "testthat"))
install.packages("RPostgres")
library(devtools)
install_github("baileych/ohdsi-argos")
- Minimum versions required:
  - R: 3.2
  - DBI: 0.7
  - dplyr: 0.7
  - dbplyr: 1.2
  - readr: 1.1
  - rlang: 0.1.4
  - stringr: 1.2
- The RPostgres package is not required if PostgreSQL is not the target database type
- For Oracle users, the ROracle package should be installed
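
The minimum versions listed above can be checked against the local installation before running the toolkit. The following is a minimal sketch, not part of the toolkit itself: `check_min_versions()` and `min_versions` are hypothetical names introduced here for illustration, and the version numbers are copied from the list above. It relies only on `packageVersion()` and `compareVersion()` from the base `utils` package.

```r
# Minimum package versions, copied from the list above.
min_versions <- c(
  DBI     = "0.7",
  dplyr   = "0.7",
  dbplyr  = "1.2",
  readr   = "1.1",
  rlang   = "0.1.4",
  stringr = "1.2"
)

# Hypothetical helper (an assumption, not a toolkit function).
# Takes a named character vector (package name -> minimum version) and
# returns a data frame with one row per package and a status of
# "ok", "too old", or "missing".
check_min_versions <- function(required) {
  # Look up the installed version; NA if the package is not installed.
  installed <- vapply(names(required), function(pkg) {
    tryCatch(
      as.character(utils::packageVersion(pkg)),
      error = function(e) NA_character_
    )
  }, character(1))

  # TRUE where the installed version meets or exceeds the minimum.
  meets_min <- mapply(function(have, need) {
    !is.na(have) && utils::compareVersion(have, need) >= 0
  }, installed, required)

  status <- ifelse(is.na(installed), "missing",
                   ifelse(meets_min, "ok", "too old"))

  data.frame(
    package   = names(required),
    required  = unname(required),
    installed = unname(installed),
    status    = unname(status),
    stringsAsFactors = FALSE
  )
}

# R itself is checked separately from the packages.
if (getRversion() < "3.2.0") {
  warning("R 3.2 or above is required; found ", as.character(getRversion()))
}

print(check_min_versions(min_versions))
```

Any package reported as "missing" or "too old" can then be installed or updated with install.packages() before proceeding.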
To troubleshoot install_github("ohdsi/SqlRender"), please see here.
Note: if these packages were previously installed, run update.packages() to get the latest version of each.