TADA_CreateParamRef() and TADA_CreateParamUseRef() #555
First 2 reference files draft functions have been pushed through.
Working on addressing some check issues that were found.
TADA_CreateParamRef and TADA_CreateParamUseRef .RD files
removal of some intermediate objects
Update - only print df if there are failing URLs
documentation suggestions
hillarymarler
left a comment
Comments about requested changes are in-line or discussed in our working session call.
Added a tab for org_name filtered parameter and use names from ATTAINS
Modified return values of ATTAINS.FlagParameterName
@cristinamullin @hillarymarler There are still some edits and formatting changes I would like to make, but I think this is in a good spot to review now. I will be out of the office tomorrow but will be back on Friday the 20th.
…/USEPA/EPATADA into Create-Module-3-Reference-Tables
consolidate ATTAINS functions into one .R
fix mod 3, update example data and ref tables
@wokenny13 can you look into this check note?
❯ checking R code for possible problems ... [18s] NOTE
TADA_GetEPA304aRef: no visible binding for global variable 'UNIT_NAME'
Undefined global functions or variables:
UNIT_NAME
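For context on the note above, here is a minimal sketch of two common ways to silence a "no visible binding for global variable" NOTE. It assumes the NOTE comes from a bare column name used inside the function. `keep_unit()` and the example data are hypothetical and are not the actual body of `TADA_GetEPA304aRef()`.

```r
# Option 1: declare the name once at package level (for example in a
# globals file) so R CMD check no longer flags the bare symbol.
utils::globalVariables(c("UNIT_NAME"))

# Option 2: avoid the bare symbol by indexing the column by its name.
# keep_unit() is a hypothetical helper used only for illustration.
keep_unit <- function(df, unit) {
  df[df[["UNIT_NAME"]] == unit, , drop = FALSE]
}

# Made-up example data.
ref <- data.frame(
  POLLUTANT = c("toluene", "toxaphene"),
  UNIT_NAME = c("ug/L", "mg/L"),
  stringsAsFactors = FALSE
)
res <- keep_unit(ref, "ug/L")
```

Either option on its own is usually enough; which one fits depends on how the column is referenced in the real function.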
Made code more concise and readable. Reduced lines to fewer than 100 characters. Global variable update.
pkgdown currently only supports html format
qmd is not supported by pkgdown
R/ATTAINSCrosswalks.R
Outdated
| #' EPA 304a criteria) also need to include an additional column name: | ||
| #' 'organization_identifier'. | ||
| #' | ||
| #' @return A excel file or data frame which contains the columns: |
@wokenny13 this still returns TADA.CharacteristicName as well. I think that should be excluded from the crosswalk since we have TADA.ComparableDataIdentifier.
I can remove the column. I think the only reason I kept it was that I wasn't sure whether it would be useful to see it as a separate column in the output (even though TADA.ComparableDataIdentifier already includes the TADA.CharacteristicName).
For example, the EPA 304a pollutant names are being crosswalked using the list of TADA priority characteristic names, so I wasn't sure whether it would be useful to have TADA.CharacteristicName as a separate column in this step, or whether it would help to compare the characteristic name from TADA/WQX directly to an ATTAINS.ParameterName.
I don't see much value in those examples though and agree it should be removed.
@hillarymarler @wokenny13 To make it easier to move back and forth seamlessly between WQP and ATTAINS, do you think we should name/format the TADA versions of the ATTAINS columns exactly the same way they are already formatted in ATTAINS batch upload files? We can still include "ATTAINS." at the beginning so we can identify them easily, and that prefix could then be removed from all of them at once. This would be consistent with how we create the TADA. versions of the WQP columns. For example:
Update: The same comment applies to the CST columns.
| NA,"toluene","304A","Human Health","520","H",NA,NA,"O","µg/l" | ||
| NA,"toxaphene","304A","Human Health","7.0E-4","H",NA,NA,"W","µg/l" | ||
| NA,"toxaphene","304A","Human Health","7.1E-4","H",NA,NA,"O","µg/l" | ||
| NA,"trichloroethylene","304A","Human Health","0.6","H",NA,NA,"W","µg/l" |
The Greek letters do cause trouble when exporting to csv. When users get to the 'defining the magnitude/criteria' piece, matching the units of the EPA304A pollutant names from the CST to the TADA.ComparableDataIdentifier units also becomes important.
Matching the WQP unit formatting will be good to do.
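As a rough illustration of the unit cleanup discussed above, the sketch below replaces the micro sign with a plain "u" before export. `normalize_micro()` is a hypothetical helper, and the target spelling is an assumption; the real target should follow whatever unit formatting WQP/TADA uses.

```r
# normalize_micro() is a hypothetical helper, not an EPATADA function.
# It replaces both Unicode forms of "micro" with an ASCII "u" so the
# unit strings survive a csv export.
normalize_micro <- function(x) {
  x <- gsub("\u00b5", "u", x, fixed = TRUE) # U+00B5 micro sign
  x <- gsub("\u03bc", "u", x, fixed = TRUE) # U+03BC Greek small letter mu
  x
}

# Made-up unit strings for illustration.
cst_units <- c("\u00b5g/l", "mg/l", "\u03bcg/l")
clean_units <- normalize_micro(cst_units)
```

Any further case conversion (for example to match the WQP spelling exactly) would be a separate step and is not covered here.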
| # rm(use_attainments, use_parameters) | ||
| # | ||
|
|
||
| ATTAINSParamUseOrgRef <- utils::read.csv(system.file("extdata", "ATTAINSParamUseEntityRef.csv", package = "EPATADA")) |
This function appears to be incomplete. It should query ATTAINS to get the most up to date version of the ATTAINS Param Use Org Ref.
We have a static snapshot of the ATTAINS Param-Use-Org reference as a csv file for now. The rATTAINS call may run into errors at times because of long runtimes for some orgs like Pennsylvania, which is why the function was commented out.
The plan is to use the National Extracts from EQ once those functions are in a good place, but for the time being we are using an old csv snapshot file that has this information for ATTAINS Param-Use-Org.
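One possible interim pattern, sketched below, is to try a live query first and fall back to the static snapshot if the query errors. This is only an assumption about how the two could be combined, not the approach taken in this PR. `get_ref_with_fallback()` and `query_fn` are hypothetical names, and the failing query here is simulated rather than a real rATTAINS call.

```r
# Hypothetical helper: try a live query, fall back to a csv snapshot.
# query_fn is any zero-argument function that returns a data frame.
get_ref_with_fallback <- function(query_fn, snapshot_path) {
  tryCatch(
    query_fn(),
    error = function(e) {
      message("Live query failed, using snapshot: ", conditionMessage(e))
      utils::read.csv(snapshot_path, stringsAsFactors = FALSE)
    }
  )
}

# Made-up snapshot written to a temporary file for illustration.
snapshot_path <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(organization_identifier = "ORG1", use_name = "Example Use"),
  snapshot_path,
  row.names = FALSE
)

# Simulated failing query standing in for a long-running web service call.
failing_query <- function() stop("request timed out")
ref <- get_ref_with_fallback(failing_query, snapshot_path)
```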
Yes, I think this is a good approach for both ATTAINS and CST columns. The data.table::setnames function might be helpful, as it allows renaming based on two columns in a df (original name, new name), so you can read in that crosswalk from a reference file. I think there may be instances where the column name that you would read in from an ATTAINS profile is slightly different from the column name that is required for the batch upload file. (https://tim-tiefenbach.de/post/2022-rename-columns/#datatable)
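A minimal sketch of the crosswalk-based renaming described above, using `data.table::setnames()`. The crosswalk contents and the new column names are made up for illustration and are not the actual ATTAINS batch upload names. In practice the crosswalk would be read from a reference file.

```r
library(data.table)

# Hypothetical crosswalk: original name in one column, new name in the other.
crosswalk <- data.frame(
  original = c("organization_identifier", "parameter_name", "use_name"),
  new = c("ATTAINS.ORG_ID", "ATTAINS.PARAM_NAME", "ATTAINS.USE_NAME"),
  stringsAsFactors = FALSE
)

# Made-up profile data.
profile <- data.frame(
  organization_identifier = "ORG1",
  parameter_name = "TOLUENE",
  use_name = "Example Use",
  stringsAsFactors = FALSE
)

# setnames() renames by reference; skip_absent = TRUE ignores crosswalk
# rows whose original name is not present in the data.
data.table::setnames(
  profile,
  old = crosswalk$original,
  new = crosswalk$new,
  skip_absent = TRUE
)
```

Because the rename happens by reference, `profile` is modified in place and no reassignment is needed.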
One item to consider is how we include the EPA304a standards as part of organization_identifier, which is why the column was left named organization_identifier rather than ATTAINS.organization_identifier. The same goes for the use_name column, since the use_names for the EPA304a standards would not be found in ATTAINS. This formatting allows users to define a magnitude for a parameter-use-org in each row separately. It also means we do not have to perform an additional crosswalk between ATTAINS use names and CST use_names. In the future, we may want to consider doing this crosswalk, but it would be done in a separate function if we decide to do so. (This image has additional columns not shown and is a draft output of the final table in which users will need to fill out the magnitude components of a param-use. EPA 304a magnitude values are autopopulated from the CST. This work is in development on a separate branch.)
First 2 reference files draft functions have been pushed through to test.
I am still working on the Mod 3 vignettes. However, the R documentation contains a detailed explanation of the goals of the two functions and can be reviewed in the meantime.