From edcd26d26c2723426b4060524508d818c2608c3d Mon Sep 17 00:00:00 2001 From: Katie Healy Date: Tue, 22 Nov 2022 15:31:38 -0600 Subject: [PATCH 01/26] Update CONTRIBUTING.Rmd Made minor edits to correct grammar mistakes in CONTRIBUTING vignette. --- vignettes/CONTRIBUTING.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/vignettes/CONTRIBUTING.Rmd b/vignettes/CONTRIBUTING.Rmd index 1e17d74e5..b13df6d65 100644 --- a/vignettes/CONTRIBUTING.Rmd +++ b/vignettes/CONTRIBUTING.Rmd @@ -64,7 +64,7 @@ install.packages(c("devtools", "rmarkdown")) ## Issues - If you see an error or have feedback, the best way to let us know is to file an issue. -- Issues are labeled to help indicate what they are about. For example, we are using "Good First Issue" to indicate issues that be be good first pickings for your first contribution to this open-source project. +- Issues are labeled to help indicate what they are about. For example, we are using "Good First Issue" to indicate issues that might be good first pickings for your first contribution to this open-source project. - Pull requests can be directly linked to a specific issue. If linked, the Repository Administrators can more easily review the pull request and issue at the same time once a contributor submits the pull request. The issue can then be closed once the pull request is merged. ## Branches @@ -75,7 +75,7 @@ To contribute a specific change or new code, outside contributors can fork this "Tasks" should be small in scope. For example, they may pertain to a bug fix or update relevant to a single function. A single "task" may also encompass the same changes made across many functions if needed. Another example of a single "task" could be to make changes to all documentation to improve clarity, for example. Furthermore, a task may include developing a new function, or a series of related functions. 
In some cases, tasks can also be synonymous with issues, and the pull requests can be directly linked to a specific issue (in that case, the Repository Administrators will review the pull request and issue at the same time and the issue can be closed once the pull request is merged). -Complete the pull request by detailing all fixes and contributions, and tagging TADA repo admins who should review the work. For this package, please tag cristinamullin (Cristina Mullin) and mthawley (Shelly Thawley). Repository Administrators will review code contributions from external collaborators and integrate code commits into source code. This is done to ensure code stability and consistency and prevent degradation of code performance. After review, the admin will either accept the submission, recommend specific improvements to the submission, or in some cases reject the submission. To avoid issues, developers contributing code should contact the repository admins (Cristina or) early in the development process and maintain contact throughout to help ensure the submission is compatible with the code base and is a robust addition. +Complete the pull request by detailing all fixes and contributions, and tagging TADA repo admins who should review the work. For this package, please tag cristinamullin (Cristina Mullin) and mthawley (Shelly Thawley). Repository Administrators will review code contributions from external collaborators and integrate code commits into source code. This is done to ensure code stability and consistency and prevent degradation of code performance. After review, the admin will either accept the submission, recommend specific improvements to the submission, or in some cases reject the submission. To avoid issues, developers contributing code should contact the repository admins (Cristina or Shelly) early in the development process and maintain contact throughout to help ensure the submission is compatible with the code base and is a robust addition. 
## Additional References From 50fdc1d829ac4b69b7a99f82c9b6fa702f1c0f3a Mon Sep 17 00:00:00 2001 From: Katie Healy Date: Tue, 22 Nov 2022 16:24:29 -0600 Subject: [PATCH 02/26] Update WQPDataHarmonization.Rmd Made minor edits to correct grammar mistakes in WQPDataHarmonization vignette. --- vignettes/WQPDataHarmonization.Rmd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/vignettes/WQPDataHarmonization.Rmd b/vignettes/WQPDataHarmonization.Rmd index 3e5fac82f..aea81b349 100644 --- a/vignettes/WQPDataHarmonization.Rmd +++ b/vignettes/WQPDataHarmonization.Rmd @@ -83,7 +83,7 @@ library(TADA) ## Retrieve WQP data WQP data is retrieved and processed for compatibility with TADA. This -function, TADAdataRetrieval builds on the USGS dataRetrieval package +function, TADAdataRetrieval, builds on the USGS dataRetrieval package functions. It joins three WQP profiles (i.e., the station, narrow, and phys/chem), changes all data in the Characteristic, Speciation, Fraction, and Unit fields to uppercase, removes true duplicates, removes @@ -136,12 +136,12 @@ the console for more details): preserved in these new fields, "ResultMeasureValue.Original" and "DetectionLimitMeasureValue.Original". Additionally, "TADA.ResultMeasureValue.Flag" and - "TADA.DetectionLimitMeasureValue.Flag" are created to track and + "TADA.DetectionLimitMeasureValue.Flag" are created to track any changes made to the "ResultMeasureValue" and "DetectionLimitMeasureValue" columns; and to provide information about the result values that is needed to address censored data later on (i.e., nondetections). Specifically, these new columns flag - if special characters that are included in result values, and + if special characters are included in result values, and specifies what the special characters are. 
- ResultMeasureValue.Original From e01ad6f06bea98bca5d5462010a736580fdbbf41 Mon Sep 17 00:00:00 2001 From: cristinamullin Date: Fri, 9 Dec 2022 10:15:23 -0500 Subject: [PATCH 03/26] InvalidFraction example Added example to InvalidFraction function in ResultFlagsDependent.R --- R/ResultFlagsDependent.R | 7 +++++++ man/InvalidFraction.Rd | 8 ++++++++ 2 files changed, 15 insertions(+) diff --git a/R/ResultFlagsDependent.R b/R/ResultFlagsDependent.R index 35ccacaae..4f6d7a9be 100644 --- a/R/ResultFlagsDependent.R +++ b/R/ResultFlagsDependent.R @@ -15,6 +15,13 @@ #' "Invalid" rows are removed from the dataframe and no column will be appended. #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' Nutrients_Utah_1 <- InvalidFraction(Nutrients_Utah) +#' +#' Nutrients_Utah_2 <- InvalidFraction(Nutrients_Utah, clean = FALSE) #' diff --git a/man/InvalidFraction.Rd b/man/InvalidFraction.Rd index ad0a1305c..e02e618dc 100644 --- a/man/InvalidFraction.Rd +++ b/man/InvalidFraction.Rd @@ -24,3 +24,11 @@ Function checks the validity of each characteristic-fraction combination in the dataframe. When clean = TRUE, rows with invalid characteristic-fraction combinations are removed. Default is clean = TRUE. } +\examples{ +data(Nutrients_Utah) + +Nutrients_Utah_1 <- InvalidFraction(Nutrients_Utah) + +Nutrients_Utah_2 <- InvalidFraction(Nutrients_Utah, clean = FALSE) + +} From 7b8157609d84970bd8da23a77e8be05cab82b454 Mon Sep 17 00:00:00 2001 From: Katie Healy Date: Fri, 9 Dec 2022 16:32:27 -0600 Subject: [PATCH 04/26] Added examples to completed functions Added examples to functions in ResultFlagsDependent, ResultFlagsIndependent, Transformations (except HarmonizeCensoredData and CensoredDataStats), and Filtering. 
--- R/Filtering.R | 24 +++++++++++++++ R/ResultFlagsDependent.R | 16 ++++++++-- R/ResultFlagsIndependent.R | 61 +++++++++++++++++++++++++++++++++++++- R/Transformations.R | 36 +++++++++++++++++++++- 4 files changed, 133 insertions(+), 4 deletions(-) diff --git a/R/Filtering.R b/R/Filtering.R index 283cb70ac..014fd722c 100644 --- a/R/Filtering.R +++ b/R/Filtering.R @@ -12,6 +12,10 @@ TADA.env <- new.env() #' #' @export #' +#' @examples +#' data(Nutrients_Utah) +#' +#' Fields_Nutrients_Utah <- FilterFields(Nutrients_Utah) FilterFields <- function(.data) { # check .data is data.frame @@ -90,6 +94,10 @@ FilterFields <- function(.data) { #' #' @export #' +#' @examples +#' data(Nutrients_Utah) +#' +#' FieldReview_HydrologicCondition <- FilterFieldReview(field = "HydrologicCondition", Nutrients_Utah) FilterFieldReview <- function(field, .data) { # if provided, check .data is data.frame @@ -136,7 +144,13 @@ FilterFieldReview <- function(field, .data) { #' @return A list of unique characteristics and their counts #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' ParameterList <- FilterParList(Nutrients_Utah) #' + FilterParList <- function(.data) { # count the frequency of each value in CharactersticName field ParValueCount <- data.frame(table(list(.data$CharacteristicName))) @@ -162,6 +176,11 @@ FilterParList <- function(.data) { #' subset by a parameter. #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' AmmoniaFields <- FilterParFields(Nutrients_Utah, parameter = "AMMONIA") #' FilterParFields <- function(.data, parameter) { @@ -261,6 +280,11 @@ FilterParFields <- function(.data, parameter) { #' @return A table and pie chart of unique values in the selected field. 
#' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' AmmoniaMonitoringLocations <- FilterParFieldReview(field = "MonitoringLocationIdentifier", Nutrients_Utah, parameter = "AMMONIA") #' FilterParFieldReview <- function(field, .data, parameter) { diff --git a/R/ResultFlagsDependent.R b/R/ResultFlagsDependent.R index 4f6d7a9be..c3e46482f 100644 --- a/R/ResultFlagsDependent.R +++ b/R/ResultFlagsDependent.R @@ -19,9 +19,9 @@ #' @examples #' data(Nutrients_Utah) #' -#' Nutrients_Utah_1 <- InvalidFraction(Nutrients_Utah) +#' InvalidFraction_clean <- InvalidFraction(Nutrients_Utah) #' -#' Nutrients_Utah_2 <- InvalidFraction(Nutrients_Utah, clean = FALSE) +#' InvalidFraction_flags <- InvalidFraction(Nutrients_Utah, clean = FALSE) #' @@ -115,6 +115,12 @@ InvalidFraction <- function(.data, clean = TRUE) { #' #' @export #' +#' @examples +#' data(Nutrients_Utah) +#' +#' InvalidSpeciation_clean <- InvalidSpeciation(Nutrients_Utah) +#' +#' InvalidSpeciation_flags <- InvalidSpeciation(Nutrients_Utah, clean = FALSE) InvalidSpeciation <- function(.data, clean = TRUE) { @@ -205,6 +211,12 @@ InvalidSpeciation <- function(.data, clean = TRUE) { #' #' @export #' +#' @examples +#' data(Nutrients_Utah) +#' +#' ResultUnitValidity_clean <- InvalidResultUnit(Nutrients_Utah) +#' +#' ResultUnitValidity_flags <- InvalidResultUnit(Nutrients_Utah, clean = FALSE) InvalidResultUnit <- function(.data, clean = TRUE) { diff --git a/R/ResultFlagsIndependent.R b/R/ResultFlagsIndependent.R index 9aa1d42bd..fc10a955c 100644 --- a/R/ResultFlagsIndependent.R +++ b/R/ResultFlagsIndependent.R @@ -19,7 +19,12 @@ #' #' @export #' +#' @examples +#' data(Nutrients_Utah) #' +#' InvalidMethod_clean <- InvalidMethod(Nutrients_Utah) +#' +#' InvalidMethod_flags <- InvalidMethod(Nutrients_Utah, clean = FALSE) InvalidMethod <- function(.data, clean = TRUE) { # check .data is data.frame @@ -123,6 +128,13 @@ InvalidMethod <- function(.data, clean = TRUE) { #' continuous data is removed from the dataframe. 
#' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' AggContinuous_clean <- AggregatedContinuousData(Nutrients_Utah) +#' +#' AggContinuous_flags <- AggregatedContinuousData(Nutrients_Utah, clean = FALSE) AggregatedContinuousData <- function(.data, clean = TRUE) { # check .data is data.frame @@ -197,6 +209,13 @@ AggregatedContinuousData <- function(.data, clean = TRUE) { #' and no column is appended. #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' PotentialDup_clean <- PotentialDuplicateRowID(Nutrients_Utah) +#' +#' PotentialDup_flags <- PotentialDuplicateRowID(Nutrients_Utah, clean = FALSE) PotentialDuplicateRowID <- function(.data, clean = TRUE) { # check .data is data.frame @@ -296,6 +315,13 @@ PotentialDuplicateRowID <- function(.data, clean = TRUE) { #' #' @export #' +#' @examples +#' data(Nutrients_Utah) +#' +#' WQXUpperThreshold_clean <- AboveNationalWQXUpperThreshold(Nutrients_Utah) +#' +#' WQXUpperThreshold_flags <- AboveNationalWQXUpperThreshold(Nutrients_Utah, clean = FALSE) +#' AboveNationalWQXUpperThreshold <- function(.data, clean = TRUE) { # check .data is data.frame @@ -400,6 +426,13 @@ AboveNationalWQXUpperThreshold <- function(.data, clean = TRUE) { #' WQX threshold is removed from the dataframe. #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' WQXLowerThreshold_clean <- BelowNationalWQXLowerThreshold(Nutrients_Utah) +#' +#' WQXLowerThreshold_flags <- BelowNationalWQXLowerThreshold(Nutrients_Utah, clean = FALSE) BelowNationalWQXLowerThreshold <- function(.data, clean = TRUE) { # check .data is data.frame @@ -516,6 +549,13 @@ BelowNationalWQXLowerThreshold <- function(.data, clean = TRUE) { #' dataframe. 
#' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' QAPPapproved_clean <- QAPPapproved(Nutrients_Utah) +#' +#' QAPPapproved_cleanNAs <- QAPPapproved(Nutrients_Utah, cleanNA = TRUE) #' QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { @@ -570,7 +610,15 @@ QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { #' data without an associated QAPP document is removed from the dataframe. #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' QAPP_URLs_added <- QAPPDocAvailable(Nutrients_Utah) +#' +#' QAPP_URLs_clean <- QAPPDocAvailable(Nutrients_Utah, clean = TRUE) #' + QAPPDocAvailable <- function(.data, clean = FALSE) { # check .data is data.frame checkType(.data, "data.frame", "Input object") @@ -629,7 +677,7 @@ QAPPDocAvailable <- function(.data, clean = FALSE) { #' the row will be flagged as "LONG_OutsideUSA", 3) If the latitude or longitude #' contains the string, "999", the row will be flagged as invalid, and 4) Finally, #' precision can be measured by the number of decimal places in the latitude and longitude -#' provided. If either the lattitue or longitude does not have any numbers to the +#' provided. If either the latitude or longitude does not have any numbers to the #' right of the decimal point, the row will be flagged as "Imprecise". #' #' @param .data TADA dataframe @@ -644,6 +692,17 @@ QAPPDocAvailable <- function(.data, clean = FALSE) { #' removed, respectively. 
#' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' InvalidCoord_flags <- InvalidCoordinates(Nutrients_Utah) +#' +#' OutsideUSACoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_outsideUSA = TRUE) +#' +#' ImpreciseCoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_imprecise = TRUE) +#' +#' InvalidCoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_outsideUSA = TRUE, clean_imprecise = TRUE) #' InvalidCoordinates <- function(.data, clean_outsideUSA = FALSE, clean_imprecise = FALSE) { diff --git a/R/Transformations.R b/R/Transformations.R index ba12f8ed9..80798ce6a 100644 --- a/R/Transformations.R +++ b/R/Transformations.R @@ -50,6 +50,13 @@ #' following two fields to the input dataframe: "WQX.ConversionFactor" and "WQX.TargetUnit". #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' ResultUnitsConverted <- ConvertResultUnits(Nutrients_Utah) +#' +#' ResultUnitsNotConverted <- ConvertResultUnits(Nutrients_Utah, transform = FALSE) ConvertResultUnits <- function(.data, transform = TRUE) { # check .data is data.frame @@ -320,12 +327,23 @@ ConvertResultUnits <- function(.data, transform = TRUE) { #' done at this time. A user can review the conversion factor information if #' desired by using this feature. #' -#' @return When transform = true, the input dataframe is returned with all depth +#' @return When transform = TRUE, the input dataframe is returned with all depth #' data converted to the target unit; no additional columns are added. #' When transform = FALSE, the input dataframe is returned with additional #' columns including... be specific here ... 
#' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' DepthUnitsConverted_m <- ConvertDepthUnits(Nutrients_Utah) +#' +#' DepthUnitsConverted_ft <- ConvertDepthUnits(Nutrients_Utah, unit = "ft") +#' +#' TopDepthUnitsConverted_in <- ConvertDepthUnits(Nutrients_Utah, unit = "in", fields = "ActivityTopDepthHeightMeasure") +#' +#' DepthUnitsNotConverted <- ConvertDepthUnits(Nutrients_Utah, transform = FALSE) #' ConvertDepthUnits <- function(.data, @@ -615,6 +633,13 @@ ConvertDepthUnits <- function(.data, #' @return Harmonization Reference Table unique to the input dataframe #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' CreateRefTable <- HarmonizationRefTable(Nutrients_Utah) +#' +#' DownloadRefTable <- HarmonizationRefTable(Nutrients_Utah, download = TRUE) HarmonizationRefTable <- function(.data, download = FALSE) { # check .data is data.frame @@ -721,6 +746,15 @@ HarmonizationRefTable <- function(.data, download = FALSE) { #' would return the input dataframe unchanged if input was allowed). #' #' @export +#' +#' @examples +#' data(Nutrients_Utah) +#' +#' Nutrients_Harmonized <- HarmonizeData(Nutrients_Utah) +#' +#' Nutrients_Harmonized_noflags <- HarmonizeData(Nutrients_Utah, flag = FALSE) +#' +#' Nutrients_NotHarmonized <- HarmonizeData(Nutrients_Utah, transform = FALSE) HarmonizeData <- function(.data, ref, transform = TRUE, flag = TRUE) { # check .data is data.frame From a854527ceb6cb8bb8e1d5981b401f22caf81f4a9 Mon Sep 17 00:00:00 2001 From: Katie Healy Date: Fri, 16 Dec 2022 12:33:55 -0600 Subject: [PATCH 05/26] Update ResultFlagsIndependent.R Updated QAPPapproved function to add warning for when clean & cleanNA are FALSE. Also updated examples for QAPPDocAvailable function to improve clarity. 
--- R/ResultFlagsIndependent.R | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/R/ResultFlagsIndependent.R b/R/ResultFlagsIndependent.R index fc10a955c..4041f7252 100644 --- a/R/ResultFlagsIndependent.R +++ b/R/ResultFlagsIndependent.R @@ -585,6 +585,9 @@ QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { warning("All QAPPApprovedIndicator data is NA") } } + if (clean == FALSE & cleanNA == FALSE) { + warning("No changes were made because clean and cleanNA were FALSE") + } return(.data) } @@ -614,9 +617,9 @@ QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { #' @examples #' data(Nutrients_Utah) #' -#' QAPP_URLs_added <- QAPPDocAvailable(Nutrients_Utah) +#' FlagData_MissingQAPPDocURLs <- QAPPDocAvailable(Nutrients_Utah) #' -#' QAPP_URLs_clean <- QAPPDocAvailable(Nutrients_Utah, clean = TRUE) +#' RemoveData_MissingQAPPDocURLs <- QAPPDocAvailable(Nutrients_Utah, clean = TRUE) #' QAPPDocAvailable <- function(.data, clean = FALSE) { From 829effce9515c8271701fbe5fd205777e48d4caa Mon Sep 17 00:00:00 2001 From: Katie Healy Date: Fri, 16 Dec 2022 17:16:26 -0600 Subject: [PATCH 06/26] Added comments to examples for functions. Added comments to examples for functions: ResultFlagsDependent.R, ResultFlagsIndependent.R, Transformations.R (except HarmonizeCensoredData and CensoredDataStats), and Filtering.R. 
--- R/Filtering.R | 13 ++++++++++--- R/ResultFlagsDependent.R | 12 ++++++++++++ R/ResultFlagsIndependent.R | 35 ++++++++++++++++++++++++++++++++--- R/Transformations.R | 23 ++++++++++++++++++++++- 4 files changed, 76 insertions(+), 7 deletions(-) diff --git a/R/Filtering.R b/R/Filtering.R index 014fd722c..facf63b2d 100644 --- a/R/Filtering.R +++ b/R/Filtering.R @@ -13,8 +13,10 @@ TADA.env <- new.env() #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create a table of fields and count of unique values in each field: #' Fields_Nutrients_Utah <- FilterFields(Nutrients_Utah) FilterFields <- function(.data) { @@ -95,8 +97,10 @@ FilterFields <- function(.data) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create table and pie chart of "Hydrologic Condition" unique values and counts: #' FieldReview_HydrologicCondition <- FilterFieldReview(field = "HydrologicCondition", Nutrients_Utah) FilterFieldReview <- function(field, .data) { @@ -146,10 +150,11 @@ FilterFieldReview <- function(field, .data) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create a list of parameters in the dataset and the number of records of each parameter: #' ParameterList <- FilterParList(Nutrients_Utah) -#' FilterParList <- function(.data) { # count the frequency of each value in CharactersticName field @@ -178,10 +183,11 @@ FilterParList <- function(.data) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create list of fields for parameter "AMMONIA" with number of unique values in each field: #' AmmoniaFields <- FilterParFields(Nutrients_Utah, parameter = "AMMONIA") -#' FilterParFields <- function(.data, parameter) { # check .data is data.frame @@ -282,10 +288,11 @@ FilterParFields <- function(.data, parameter) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create table and pie chart of 
monitoring locations for the parameter "AMMONIA" in dataframe: #' AmmoniaMonitoringLocations <- FilterParFieldReview(field = "MonitoringLocationIdentifier", Nutrients_Utah, parameter = "AMMONIA") -#' FilterParFieldReview <- function(field, .data, parameter) { # if provided, check .data is data.frame diff --git a/R/ResultFlagsDependent.R b/R/ResultFlagsDependent.R index c3e46482f..5ce08168e 100644 --- a/R/ResultFlagsDependent.R +++ b/R/ResultFlagsDependent.R @@ -17,10 +17,14 @@ #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove data with invalid characteristic-fraction combinations: #' InvalidFraction_clean <- InvalidFraction(Nutrients_Utah) #' +#' # Flag, but do not remove, data with invalid characteristic-fraction combinations +#' # in new column titled "WQX.SampleFractionValidity": #' InvalidFraction_flags <- InvalidFraction(Nutrients_Utah, clean = FALSE) #' @@ -116,10 +120,14 @@ InvalidFraction <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove data with invalid characteristic-method speciation combinations from dataframe: #' InvalidSpeciation_clean <- InvalidSpeciation(Nutrients_Utah) #' +#' # Flag, but do not remove, data with invalid characteristic-method speciation +#' # combinations in new column titled "WQX.MethodSpeciationValidity": #' InvalidSpeciation_flags <- InvalidSpeciation(Nutrients_Utah, clean = FALSE) @@ -212,10 +220,14 @@ InvalidSpeciation <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove invalid characteristic-media-result unit combinations from dataframe: #' ResultUnitValidity_clean <- InvalidResultUnit(Nutrients_Utah) #' +#' # Flag, but do not remove, invalid characteristic-media-result unit combinations +#' # in new column titled "WQX.ResultUnitValidity": #' ResultUnitValidity_flags <- InvalidResultUnit(Nutrients_Utah, clean = FALSE) diff 
--git a/R/ResultFlagsIndependent.R b/R/ResultFlagsIndependent.R index 4041f7252..48cc5ef00 100644 --- a/R/ResultFlagsIndependent.R +++ b/R/ResultFlagsIndependent.R @@ -20,10 +20,14 @@ #' @export #' #' @examples +#' # Load example dataset #' data(Nutrients_Utah) #' +#' # Remove invalid characteristic-analytical method combinations from dataframe: #' InvalidMethod_clean <- InvalidMethod(Nutrients_Utah) #' +#' # Flag, but do not remove, invalid characteristic-analytical method combinations +#' # in new column titled "WQX.AnalyticalMethodValidity": #' InvalidMethod_flags <- InvalidMethod(Nutrients_Utah, clean = FALSE) InvalidMethod <- function(.data, clean = TRUE) { @@ -130,10 +134,13 @@ InvalidMethod <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset #' data(Nutrients_Utah) #' +#' # Remove aggregated continuous data from dataframe: #' AggContinuous_clean <- AggregatedContinuousData(Nutrients_Utah) #' +#' # Flag, but do not remove, aggregated continuous data in new column titled "TADA.AggregatedContinuousData": #' AggContinuous_flags <- AggregatedContinuousData(Nutrients_Utah, clean = FALSE) AggregatedContinuousData <- function(.data, clean = TRUE) { @@ -211,10 +218,13 @@ AggregatedContinuousData <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove potential duplicate data from dataframe: #' PotentialDup_clean <- PotentialDuplicateRowID(Nutrients_Utah) #' +#' # Flag, but do not remove, potential duplicate data in new column titled "TADA.PotentialDupRowID": #' PotentialDup_flags <- PotentialDuplicateRowID(Nutrients_Utah, clean = FALSE) PotentialDuplicateRowID <- function(.data, clean = TRUE) { @@ -316,12 +326,15 @@ PotentialDuplicateRowID <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove data that is above the upper WQX threshold from dataframe: #' WQXUpperThreshold_clean <- 
AboveNationalWQXUpperThreshold(Nutrients_Utah) #' +#' # Flag, but do not remove, data that is above the upper WQX threshold in +#' # new column titled "AboveWQXUpperThreshold": #' WQXUpperThreshold_flags <- AboveNationalWQXUpperThreshold(Nutrients_Utah, clean = FALSE) -#' AboveNationalWQXUpperThreshold <- function(.data, clean = TRUE) { # check .data is data.frame @@ -428,10 +441,14 @@ AboveNationalWQXUpperThreshold <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove data that is below the lower WQX threshold from the dataframe: #' WQXLowerThreshold_clean <- BelowNationalWQXLowerThreshold(Nutrients_Utah) #' +#' # Flag, but do not remove, data that is below the lower WQX threshold in +#' # new column titled "BelowWQXLowerThreshold": #' WQXLowerThreshold_flags <- BelowNationalWQXLowerThreshold(Nutrients_Utah, clean = FALSE) BelowNationalWQXLowerThreshold <- function(.data, clean = TRUE) { @@ -551,12 +568,17 @@ BelowNationalWQXLowerThreshold <- function(.data, clean = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Remove data where the QAPPApprovedIndicator equals "N", but retain data +#' # where the QAPPApprovedIndicator equals "NA": #' QAPPapproved_clean <- QAPPapproved(Nutrients_Utah) #' +#' # Remove data where the QAPPApprovedIndicator equals "N" or "NA": #' QAPPapproved_cleanNAs <- QAPPapproved(Nutrients_Utah, cleanNA = TRUE) #' +#' # Note: When clean = FALSE and cleanNA = FALSE, no data is removed QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { # check .data is data.frame @@ -615,12 +637,15 @@ QAPPapproved <- function(.data, clean = TRUE, cleanNA = FALSE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Flag, but do not remove, data without an associated QAPP document in +#' # new column titled "TADA.QAPPDocAvailable": #' FlagData_MissingQAPPDocURLs <- 
QAPPDocAvailable(Nutrients_Utah) #' +#' # Remove data without an associated QAPP document available: #' RemoveData_MissingQAPPDocURLs <- QAPPDocAvailable(Nutrients_Utah, clean = TRUE) -#' QAPPDocAvailable <- function(.data, clean = FALSE) { # check .data is data.frame @@ -697,16 +722,20 @@ QAPPDocAvailable <- function(.data, clean = FALSE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Flag, but do not remove, data with invalid coordinates in new column titled "TADA.InvalidCoordinates": #' InvalidCoord_flags <- InvalidCoordinates(Nutrients_Utah) #' +#' # Remove data with coordinates outside the USA, but keep flagged data with imprecise coordinates: #' OutsideUSACoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_outsideUSA = TRUE) #' +#' # Remove data with imprecise coordinates, but keep flagged data with coordinates outside the USA: #' ImpreciseCoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_imprecise = TRUE) #' +#' # Remove data with imprecise coordinates or coordinates outside the USA from the dataframe: #' InvalidCoord_removed <- InvalidCoordinates(Nutrients_Utah, clean_outsideUSA = TRUE, clean_imprecise = TRUE) -#' InvalidCoordinates <- function(.data, clean_outsideUSA = FALSE, clean_imprecise = FALSE) { # check .data is data.frame diff --git a/R/Transformations.R b/R/Transformations.R index 80798ce6a..cf2a18f41 100644 --- a/R/Transformations.R +++ b/R/Transformations.R @@ -52,10 +52,16 @@ #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Convert result and detection limit values and units to WQX target units and +#' # add two new columns titled "ResultMeasureUnitCode.Original" and +#' # "DetectionLimitMeasureUnitCode.Original" to retain the original result and unit values: #' ResultUnitsConverted <- ConvertResultUnits(Nutrients_Utah) #' +#' # Do not convert result values and units, but add two new columns titled +#' # "WQX.ConversionFactor" and "WQX.TargetUnit": 
#' ResultUnitsNotConverted <- ConvertResultUnits(Nutrients_Utah, transform = FALSE) ConvertResultUnits <- function(.data, transform = TRUE) { @@ -335,16 +341,21 @@ ConvertResultUnits <- function(.data, transform = TRUE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Convert all depth units to meters: #' DepthUnitsConverted_m <- ConvertDepthUnits(Nutrients_Utah) #' +#' # Convert all depth units to feet: #' DepthUnitsConverted_ft <- ConvertDepthUnits(Nutrients_Utah, unit = "ft") #' +#' # Convert only the "ActivityTopDepthHeightMeasure" field to inches: #' TopDepthUnitsConverted_in <- ConvertDepthUnits(Nutrients_Utah, unit = "in", fields = "ActivityTopDepthHeightMeasure") #' +#' # Do not convert any depth units, but add columns for target units and +#' # conversion factors for each depth measure: #' DepthUnitsNotConverted <- ConvertDepthUnits(Nutrients_Utah, transform = FALSE) -#' ConvertDepthUnits <- function(.data, unit = "m", @@ -635,10 +646,13 @@ ConvertDepthUnits <- function(.data, #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Create a harmonization reference table for dataframe: #' CreateRefTable <- HarmonizationRefTable(Nutrients_Utah) #' +#' # Create and download a harmonization reference table for dataframe: #' DownloadRefTable <- HarmonizationRefTable(Nutrients_Utah, download = TRUE) HarmonizationRefTable <- function(.data, download = FALSE) { @@ -748,12 +762,19 @@ HarmonizationRefTable <- function(.data, download = FALSE) { #' @export #' #' @examples +#' # Load example dataset: #' data(Nutrients_Utah) #' +#' # Append harmonization reference table columns to dataframe and transform/convert +#' # data to the reference table values: #' Nutrients_Harmonized <- HarmonizeData(Nutrients_Utah) #' +#' # Transform/convert data to the harmonization reference table values, but +#' # do not append any columns to dataframe: #' Nutrients_Harmonized_noflags <- HarmonizeData(Nutrients_Utah, 
flag = FALSE) #' +#' # Append harmonization reference table columns to dataframe, but do not +#' # transform/convert data to the reference table values: #' Nutrients_NotHarmonized <- HarmonizeData(Nutrients_Utah, transform = FALSE) HarmonizeData <- function(.data, ref, transform = TRUE, flag = TRUE) { From 2a6bdac465a9fe39d0fd7c378a0bc14c0ce1c894 Mon Sep 17 00:00:00 2001 From: cristinamullin Date: Fri, 16 Dec 2022 18:28:51 -0500 Subject: [PATCH 07/26] small changes Added more co-author information. Changed "assessment" to "analysis" in TADA --- DESCRIPTION | 8 +- docs/404.html | 2 +- docs/LICENSE.html | 2 +- docs/articles/CONTRIBUTING.html | 14 +-- docs/articles/WQPDataHarmonization.html | 30 +++--- .../figure-html/unnamed-chunk-10-1.png | Bin 48622 -> 48713 bytes .../figure-html/unnamed-chunk-20-1.png | Bin 44374 -> 44882 bytes .../figure-html/unnamed-chunk-22-1.png | Bin 19997 -> 19955 bytes .../figure-html/unnamed-chunk-9-1.png | Bin 112004 -> 112004 bytes docs/articles/index.html | 2 +- docs/authors.html | 12 +-- docs/index.html | 8 +- docs/pkgdown.yml | 2 +- docs/readme.html | 2 +- .../AboveNationalWQXUpperThreshold.html | 2 +- docs/reference/AggregatedContinuousData.html | 2 +- docs/reference/AutoFilter.html | 2 +- .../BelowNationalWQXLowerThreshold.html | 2 +- docs/reference/CensoredDataStats.html | 2 +- docs/reference/ConvertDepthUnits.html | 2 +- docs/reference/ConvertResultUnits.html | 2 +- docs/reference/CreateAnimatedMap.html | 2 +- docs/reference/FilterFieldReview.html | 2 +- docs/reference/FilterFields.html | 2 +- docs/reference/FilterParFieldReview.html | 2 +- docs/reference/FilterParFields.html | 2 +- docs/reference/FilterParList.html | 2 +- docs/reference/GenerateMap.html | 2 +- docs/reference/GetMeasureUnitRef.html | 2 +- docs/reference/GetWQXCharValRef.html | 2 +- docs/reference/HarmonizationRefTable.html | 2 +- docs/reference/HarmonizeCensoredData.html | 2 +- docs/reference/HarmonizeData.html | 2 +- docs/reference/InvalidCoordinates.html | 2 +- 
docs/reference/InvalidFraction.html | 15 ++- docs/reference/InvalidMethod.html | 2 +- docs/reference/InvalidResultUnit.html | 2 +- docs/reference/InvalidSpeciation.html | 2 +- docs/reference/JoinWQPProfiles.html | 2 +- .../MeasureValueSpecialCharacters.html | 2 +- docs/reference/Nutrients_Utah.html | 94 +++++++++++++++++ docs/reference/PotentialDuplicateRowID.html | 2 +- docs/reference/QAPPDocAvailable.html | 2 +- docs/reference/QAPPapproved.html | 2 +- docs/reference/RemoveEmptyColumns.html | 2 +- docs/reference/SummarizeCharacteristics.html | 2 +- docs/reference/TADABigdataRetrieval.html | 2 +- docs/reference/TADAdataRetrieval.html | 2 +- docs/reference/TADAprofileCheck.html | 2 +- docs/reference/UpdateMeasureUnitRef.html | 2 +- docs/reference/UpdateWQXCharValRef.html | 2 +- docs/reference/WQXCharValRef_Cached.html | 2 +- docs/reference/WQXunitRef_Cached.html | 2 +- docs/reference/WaterTemp_US.html | 98 ++++++++++++++++++ docs/reference/autoclean.html | 2 +- docs/reference/checkColumns.html | 2 +- docs/reference/checkType.html | 2 +- docs/reference/decimalnumcount.html | 2 +- docs/reference/decimalplaces.html | 2 +- docs/reference/index.html | 22 ++-- docs/reference/pipe.html | 2 +- docs/reference/readWQPwebservice.html | 2 +- docs/search.json | 2 +- docs/sitemap.xml | 6 ++ inst/CITATION | 12 ++- vignettes/WQPDataHarmonization.Rmd | 2 +- 66 files changed, 318 insertions(+), 103 deletions(-) create mode 100644 docs/reference/Nutrients_Utah.html create mode 100644 docs/reference/WaterTemp_US.html diff --git a/DESCRIPTION b/DESCRIPTION index 08ac4007e..423f18018 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: TADA Type: Package -Title: Tools for Automated Data Assessment R Package +Title: Tools for Automated Data Analysis R Package Version: 0.0.1 Organization: U.S. 
Environmental Protection Agency Authors@R: @@ -9,11 +9,11 @@ Authors@R: role = c("aut", "cre"), email = "mullin.cristina@epa.gov", comment = c(ORCID = "0000-0002-0615-6087")), + person(given = "Michelle", + family = "Thawley", + role = "aut"), person(given = "Jacob", family = "Greif", - role = "aut"), - person(given = "Michelle", - family = "Thawley", role = "aut"), person(given = "Laura", family = "Shumway", diff --git a/docs/404.html b/docs/404.html index 78cdd2392..c81211b92 100644 --- a/docs/404.html +++ b/docs/404.html @@ -75,7 +75,7 @@