Conversation

@wokenny13
Collaborator

Draft functions for the first two reference files have been pushed for testing.

I am still working on the Mod 3 vignettes. In the meantime, the R document contains a detailed explanation of the goals of the two functions and can be reviewed while I finish the vignettes.
 

  • Check whether setting the argument excel = TRUE creates the myfileRef spreadsheet in your downloads folder.
  • Try different argument inputs and test on other datasets.
  • Pass invalid inputs to the functions and check the resulting warning/error messages to make sure there are no additional bugs.
  • Test the general usability and user interface of the two functions.

First 2 reference files draft functions have been pushed through.
@wokenny13
Collaborator Author

Working on addressing some issues found by the package check.

Collaborator

@hillarymarler hillarymarler left a comment

Comments about requested changes are in-line or discussed in our working session call.

Added a tab for org_name filtered parameter and use names from ATTAINS

Modified return values of ATTAINS.FlagParameterName
@wokenny13
Collaborator Author

@cristinamullin @hillarymarler There are still some edits and formatting I would like to make, but I think this is in a good spot to review now. I will be out of the office tomorrow but will be back on Friday the 20th.

consolidate ATTAINS functions into one .R
fix mod 3, update example data and ref tables
@cristinamullin
Collaborator

@wokenny13 can you look into this check note?

checking R code for possible problems ... [18s] NOTE
  TADA_GetEPA304aRef: no visible binding for global variable 'UNIT_NAME'
  Undefined global functions or variables:
    UNIT_NAME
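
For reference, two common ways to clear this kind of NOTE are sketched below. This assumes UNIT_NAME is a data frame column that TADA_GetEPA304aRef refers to as a bare symbol; the `cst` data frame and the idea of a separate globals file are made up for illustration and are not taken from the package.

```r
# Option 1: declare the name once at package level, for example in a
# hypothetical R/globals.R file, so R CMD check stops flagging it.
utils::globalVariables(c("UNIT_NAME"))

# Option 2: avoid the bare symbol by indexing the column with a string.
# `cst` is a made-up stand-in for the criteria table.
cst <- data.frame(
  POLLUTANT_NAME = c("toluene", "toxaphene"),
  UNIT_NAME = c("ug/l", "ug/l"),
  stringsAsFactors = FALSE
)
unit_values <- cst[["UNIT_NAME"]]
```

Inside dplyr pipelines, the `.data$UNIT_NAME` pronoun is another widely used option.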

cristinamullin and others added 6 commits February 5, 2025 19:45
Made code more concise and readable.
Reduced lines to less than 100 characters
global variable update.
pkgdown currently only supports html format
qmd is not supported by pkgdown
#' EPA 304a criteria) also need to include an additional column name:
#' 'organization_identifier'.
#'
#' @return An Excel file or data frame which contains the columns:
Collaborator

@wokenny13 this still returns TADA.CharacteristicName as well. I think that should be excluded from the crosswalk since we have TADA.ComparableDataIdentifier.

Collaborator Author

I can remove the column. The only reason I kept it was that I wasn't sure whether it would be useful to see it as a separate column in the output (even though TADA.ComparableDataIdentifier already includes TADA.CharacteristicName).

For example, the EPA 304a pollutant names are crosswalked using the list of TADA priority characteristic names, so I wasn't sure whether it would be useful to have TADA.CharacteristicName as a separate column at this step, or whether it would help to compare the TADA/WQX characteristic name directly against an ATTAINS.ParameterName.

I don't see much value in either case, though, and agree it should be removed.

@cristinamullin
Collaborator

cristinamullin commented Feb 10, 2025

@hillarymarler @wokenny13 To make it easier to move back and forth seamlessly between WQP and ATTAINS, do you think we should name/format the TADA versions of the ATTAINS columns the exact same way they are already formatted in ATTAINS batch upload files... ? We can still include "ATTAINS." at the beginning so we can identify them easily, and that prefix could then be removed from all columns at once. This would be consistent with how we create the TADA. versions of the WQP columns.

For example:
ATTAINS.organization_identifier
ATTAINS.organization_name
ATTAINS.organization_type_text
ATTAINS.use_name
ATTAINS.parameter

Update: the same comment applies to the CST columns.
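
A minimal sketch of the "remove the prefix from all columns at once" idea, in base R. The `strip_prefix` helper and the example values are hypothetical; only the column names come from the list above.

```r
# Made-up example row using the column names proposed above.
df <- data.frame(
  ATTAINS.organization_identifier = "EXAMPLE_ORG",
  ATTAINS.use_name = "Example use",
  ATTAINS.parameter = "EXAMPLE PARAMETER",
  TADA.ComparableDataIdentifier = "EXAMPLE_ID",
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop a literal prefix from every column that has it
# and leave all other columns untouched.
strip_prefix <- function(df, prefix = "ATTAINS.") {
  has_prefix <- startsWith(names(df), prefix)
  names(df)[has_prefix] <- substring(names(df)[has_prefix], nchar(prefix) + 1)
  df
}

batch <- strip_prefix(df)
names(batch)
```

Matching the prefix literally with startsWith() avoids treating the "." as a regular-expression wildcard.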

NA,"toluene","304A","Human Health","520","H",NA,NA,"O","µg/l"
NA,"toxaphene","304A","Human Health","7.0E-4","H",NA,NA,"W","µg/l"
NA,"toxaphene","304A","Human Health","7.1E-4","H",NA,NA,"O","µg/l"
NA,"trichloroethylene","304A","Human Health","0.6","H",NA,NA,"W","µg/l"
Collaborator

The units are not exporting to the csv correctly here. We may need to change these to match the WQP unit formatting, for example UG/L instead of µg/l.

Collaborator Author

The Greek letters do cause trouble when exporting to csv. When users get to the 'defining the magnitude/criteria' piece, matching the CST units for EPA 304a pollutant names to the TADA.ComparableDataIdentifier units also becomes important.

Matching the WQP unit formatting would be a good fix.
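
One possible way to do that conversion, as a rough sketch: replace the micro sign with a plain "u" and upper-case the result, so that "µg/l" becomes "UG/L" as in the example above. The `to_wqp_unit` helper is hypothetical, and it assumes upper-casing is enough to match WQP formatting for the units in the CST file, which would need to be checked against the actual unit list.

```r
# Hypothetical helper: normalize CST unit strings toward WQP-style formatting.
to_wqp_unit <- function(x) {
  # Two different code points render as a micro symbol:
  # U+00B5 (micro sign) and U+03BC (Greek small letter mu). Replace both.
  x <- gsub("\u00b5", "u", x, fixed = TRUE)
  x <- gsub("\u03bc", "u", x, fixed = TRUE)
  toupper(x)
}

wqp_units <- to_wqp_unit(c("\u00b5g/l", "\u03bcg/l", "mg/l"))
```

Writing the micro sign as a \u escape keeps the source file ASCII-only, which also sidesteps encoding problems in the package code itself.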

# rm(use_attainments, use_parameters)
#

ATTAINSParamUseOrgRef <- utils::read.csv(system.file("extdata", "ATTAINSParamUseEntityRef.csv", package = "EPATADA"))
Collaborator

This function appears to be incomplete. It should query ATTAINS to get the most up to date version of the ATTAINS Param Use Org Ref.

Collaborator Author

For now we have a static snapshot of the ATTAINS Param-Use-Org reference as a csv file. The rATTAINS call can error out because of long runtimes for some orgs, such as Pennsylvania, which is why the function was commented out.

The plan is to use the National Extracts from EQ once those functions are in a good place, but for the time being we are using an old csv snapshot file that has this ATTAINS Param-Use-Org information.
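
As an illustration only, a middle ground between the two positions could be to try the live query first and fall back to the snapshot when it fails. Everything in this sketch is hypothetical: `query_live` stands in for whatever function would perform the ATTAINS request, and the temporary csv stands in for the packaged snapshot file.

```r
# Hypothetical wrapper: try the live query, fall back to a static csv on error.
get_param_use_org_ref <- function(query_live, snapshot_path) {
  tryCatch(
    query_live(),
    error = function(e) {
      message("Live query failed (", conditionMessage(e), "); using snapshot.")
      utils::read.csv(snapshot_path, stringsAsFactors = FALSE)
    }
  )
}

# Demo with a made-up snapshot and a query that always fails.
snapshot <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(organization_identifier = "EXAMPLE_ORG", parameter = "EXAMPLE PARAMETER"),
  snapshot,
  row.names = FALSE
)
failing_query <- function() stop("request timed out")
ref <- get_param_use_org_ref(failing_query, snapshot)
```

This only catches R errors; a request that hangs without erroring would still need a timeout on the request itself.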

@hillarymarler
Collaborator

@hillarymarler @wokenny13 To make it easier to move back and forth seamlessly between WQP and ATTAINS, do you think we should name/format the TADA versions of the ATTAINS columns the exact same way they are already formatted in ATTAINS batch upload files... ?

Yes, I think this is a good approach for both ATTAINS and CST columns.

The data.table::setnames function might be helpful, as it allows renaming based on two columns in a df (original name, new name), so you can read in that crosswalk from a reference file. I think there may be instances where the column name that you would read in from an ATTAINS profile is slightly different from the column name that is required for the batch upload file.

(https://tim-tiefenbach.de/post/2022-rename-columns/#datatable)
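
A small sketch of the crosswalk-based rename, written in base R so it runs without extra packages. The crosswalk contents and the `rename_from_crosswalk` helper are made up for illustration; a real crosswalk would be read from a reference file.

```r
# Made-up crosswalk: name as read from a profile vs. name needed for upload.
xwalk <- data.frame(
  original = c("OrgId", "UseName", "ParamName"),
  new = c("organization_identifier", "use_name", "parameter"),
  stringsAsFactors = FALSE
)

# Made-up input: two columns appear in the crosswalk, one does not.
df <- data.frame(OrgId = "EXAMPLE_ORG", UseName = "Example use", Other = 1,
                 stringsAsFactors = FALSE)

# Hypothetical helper: rename only the columns found in the crosswalk.
rename_from_crosswalk <- function(df, xwalk) {
  idx <- match(names(df), xwalk$original)
  found <- !is.na(idx)
  names(df)[found] <- xwalk$new[idx[found]]
  df
}

renamed <- rename_from_crosswalk(df, xwalk)

# With data.table installed, a comparable call would be:
# data.table::setnames(df, old = xwalk$original, new = xwalk$new, skip_absent = TRUE)
```

Note that data.table::setnames modifies its input by reference, whereas the base R helper returns a renamed copy.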

@wokenny13
Collaborator Author

@hillarymarler @wokenny13 To make it easier to move back and forth seamlessly between WQP and ATTAINS, do you think we should name/format the TADA versions of the ATTAINS columns the exact same way they are already formatted in ATTAINS batch upload files... ? We can still include "ATTAINS." at the beginning so we can identify them easily, and that prefix could then be removed from all columns at once. This would be consistent with how we create the TADA. versions of the WQP columns.

For example:
ATTAINS.organization_identifier
ATTAINS.organization_name
ATTAINS.organization_type_text
ATTAINS.use_name
ATTAINS.parameter

Update: the same comment applies to the CST columns.

One thing to consider is that we are including the EPA 304a standards as part of organization_identifier, which is why the column was left named organization_identifier rather than ATTAINS.organization_identifier. The same goes for the use_name column, since the use names for the EPA 304a standards would not be found in ATTAINS.

This formatting lets users define a magnitude for each parameter-use-org combination in its own row. It also means we do not have to perform an additional crosswalk between ATTAINS use names and CST use_names. We may want to consider that crosswalk in the future, but it would be done in a separate function.

(This image has additional columns not shown and is a draft output of the final table in which users will need to fill out the magnitude components of a param-use. EPA 304a magnitude values are autopopulated values from the CST. This work is in development on a separate branch.)
image

@cristinamullin cristinamullin merged commit 7348834 into develop Feb 11, 2025
7 checks passed
@cristinamullin cristinamullin deleted the Create-Module-3-Reference-Tables branch February 11, 2025 16:12