Conversation

@wokenny13
Collaborator

  • Changed the name of the reference file to ATTAINSParamToWQXCharRef.
  • Updated the ATTAINSParamToWQXCharRef file from the WQX Characteristic Alias Table.
  • Created the CST internal ref file. It is in a clean format for use by future TADA functions.
  • Created an optional review table of potential additional ATTAINS-to-WQX aliases that could be added as rows to the WQX Characteristic Alias Table (these will need to be reviewed). See TADA_AdditionalCharAliasForReview(). This does not create an internal ref file of any kind. Should we consider moving this function to utilities? The function uses exact/like match logic to help identify potential additional matches between ATTAINS Parameter Names and WQX Characteristic Names.
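
The exact/like matching described in the last bullet could look roughly like the sketch below. This is not the code in TADA_AdditionalCharAliasForReview(); the helper name find_alias_candidates and the sample names are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: compare one ATTAINS parameter name against a vector of
# WQX characteristic names. All names and values here are illustrative only.
find_alias_candidates <- function(param, chars) {
  p <- toupper(trimws(param))
  ch <- toupper(trimws(chars))
  # Case-insensitive exact match
  exact <- chars[ch == p]
  # "Like" match: the characteristic name contains the parameter name
  like <- chars[grepl(p, ch, fixed = TRUE) & ch != p]
  list(exact = exact, like = like)
}

candidates <- find_alias_candidates(
  "Iron",
  c("Iron", "Iron, dissolved", "Environmental sample", "pH")
)
print(candidates)
```

In this toy input the like match also picks up "Environmental sample", because that string contains the letters "iron". False positives of this kind are one reason the candidates need manual review.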

Changed the name of the reference file to ATTAINSParamToWQXCharRef.

Updated the ATTAINSParamToWQXCharRef file from the WQX Characteristic Alias Table

Created the CST internal ref file. This is in a clean format to be used in the future.

Created an optional review for potential additional ATTAINS to WQX alias that could be added as additional rows to the WQX Characteristic Alias Table (these will need to be reviewed).
@wokenny13
Collaborator Author

wokenny13 commented Oct 17, 2025

When auto_assign = TRUE is used to create the automatic crosswalk between ATTAINS.ParameterName and TADA.CharacteristicName/TADA.ComparableDataIdentifier in TADA_CreateParamRef and TADA_DefineCriteriaMethodology, the TADA alias table used to give 1-to-1 matches, but there will likely be 1-to-many matches now.

We will need to think through how this will impact the Mod 3 criteria and methods workflow and how to handle it appropriately.
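
As a starting point for that discussion, a crosswalk could be checked for one-to-many matches before it is auto-assigned. The sketch below assumes a crosswalk data frame with the two column names mentioned above; the rows are made up and are not taken from the alias table.

```r
# Illustrative crosswalk; the rows are invented for this example.
crosswalk <- data.frame(
  ATTAINS.ParameterName = c("AMMONIA", "AMMONIA", "PH"),
  TADA.CharacteristicName = c("AMMONIA", "AMMONIA-NITROGEN", "PH"),
  stringsAsFactors = FALSE
)

# Count how many characteristics each ATTAINS parameter maps to
match_counts <- table(crosswalk$ATTAINS.ParameterName)
one_to_many <- names(match_counts)[match_counts > 1]

# Tell the user which parameters need a closer look
if (length(one_to_many) > 0) {
  message(
    "ATTAINS parameters with more than one matching characteristic: ",
    paste(one_to_many, collapse = ", ")
  )
}
```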

…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
@github-actions
Contributor

github-actions bot commented Oct 17, 2025

coverage-report

| File | Coverage | Missing |
| --- | --- | --- |
| All files | 35% | |
| R/ATTAINSCrosswalks.R | 22% | 64-741 930-940 944-948 953-956 961 969-972 979-982 987-1008 1016-1026 1048-1051 1175-1250 1262 1269-1447 1616-1628 1632-1635 1640-1642 1670-1673 1684-1687 1695-1699 1711-1715 1727-1733 1738-1749 1755 1763-1766 1856-1862 1867-1869 1974 1981-2152 2332-2717 2842-2845 2860 2868-2871 2878-2896 2904-2908 2919-2923 2939-3001 3048-3138 3143-3146 3151-3245 |
| R/ATTAINSRefTables.R | 0% | 23-517 |
| R/autoClean.R | 85% | 145-146 230-236 364-365 375-379 |
| R/autoFilter.R | 0% | 27-430 |
| R/CensoredDataSuite.R | 88% | 52-53 143 173-174 216-218 304-305 408-409 414 417 425 453-457 460-461 467-469 486-488 |
| R/CriteriaComparison.R | 89% | 159-161 166 176 223-241 |
| R/CriteriaMethods.R | 23% | 155 175 185 191-250 404-406 422-424 529-669 675-711 733-984 1003-1171 |
| R/CriteriaRefTables.R | 0% | 26-138 |
| R/DataDiscoveryRetrieval.R | 33% | 197 206-211 225-230 240-245 251 255 270-676 690 698 700 704-707 710 716 718 722 724 730 734 736 746 748 752 758 760 764 766 771 773 777 783-786 789 801-809 822-828 857-865 879-887 907 1001-1033 1138-1145 1236-1272 1360-1363 1415-1574 |
| R/DepthProfile.R | 0% | 96-1471 |
| R/Figures.R | 0% | 64-1519 |
| R/GeospatialFunctions.R | 30% | 163-165 176-180 184 239 277 321-547 688-689 695-698 761-766 770 778-921 926-1110 1278-1289 1306-1308 1310 1338 1344-1388 1415-1417 1495-1578 1588-2122 2202 2245 2266-2587 2783-2791 2839 2843 2847 2868-2912 2976-3035 3040-3064 3141-3733 |
| R/MaintenanceScheduled.R | 0% | 42-366 |
| R/RequiredCols.R | 16% | 358-579 |
| R/ResultFlagsDependent.R | 64% | 57 62 95-99 119-123 219 248-250 267-269 276 292-303 398 405 412 420 480 486 502-511 577 589-596 618 629-633 700-787 882 931-939 960 964 970-974 |
| R/ResultFlagsIndependent.R | 66% | 69 75 111-137 240 245 249 253 263 332-333 345-371 459 464 471 565-574 586-602 689 694 701 800-804 816-979 1024 1043-1062 1073-1076 1177 1181 1224-1235 1240 1244-1247 1314-1315 1391-1443 1541-1547 |
| R/Tables.R | 77% | 19-30 86 |
| R/TADAGeospatialRefLayers.R | 0% | 8-13 |
| R/TADARefTables.R | 83% | 72 82-86 |
| R/Transformations.R | 88% | 76-77 82 179-185 343-344 380-381 465 540-541 709-722 807-808 819-820 825 840-843 846-855 859 |
| R/UnitConversions.R | 81% | 128 344 351 358 365 372 379 386 393-394 520-584 608-645 668 708 722-731 923-925 1002 1059-1062 1092 |
| R/Utilities.R | 50% | 28-32 187 300-301 305 310 383 485-493 554-555 562 624-625 655-659 762-763 767-768 778-782 787-788 830-1123 1149-1150 1161-1162 1222-1571 1664-1697 |
| R/WQPWQXRefTables.R | 62% | 21-86 110 120 126-128 140 162 172 178-180 255 277 287 293-295 379 400 410 416-420 564 586 596 602-604 621 644 654 660-662 767-970 |

Minimum allowed coverage is 10%

Generated by 🐒 cobertura-action against 7690a4b

see examples for output dataframe which shows potential additional alias matches.

added @export

added draft testthat test to ensure the ATTAINS param domain and CST pollutant name domain are up to date in the ref file
@hillarymarler
Collaborator

@wokenny13 - would you like any help with the checks, or should I just review the new changes?

@wokenny13
Collaborator Author

It is fine to just review the changes. I can work on the checks.

Collaborator

@hillarymarler hillarymarler left a comment

I left comments on a few minor things - warning messages for many-to-many relationships, intermediate objects, etc.

I'm excited to see all of this incorporated into the mod 3 workflow!

return(ATTAINSParameterWQPCharRef_Cached)
TADA_GetATTAINSParamToWQPCharRef <- function(charAliasType = c("All", "ATTAINS")) {

charAliasType <- match.arg(charAliasType)
Collaborator

I can run the whole function successfully and return the crosswalk, but if I run through step by step, I get this error here:

image

Collaborator Author

This occurs in other TADA functions that use the match.arg function. I am unsure whether this is an expected error or not.

Ex. in ResultFlagsDependent.R
image

ATTAINSParamRef <- ATTAINS.raw[, "name", drop = FALSE]

# Create the initial ATTAINS param to WQX char crosswalk
if(charAliasType == "ATTAINS") {
Collaborator

The condition has length > 1 error appears in both the "ATTAINS" and "All" sections too.
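
A possible explanation for this error when stepping through, offered as an assumption that is not confirmed in this thread: inside a function call, match.arg() collapses the default vector c("All", "ATTAINS") to its first element. When the body is run line by line with charAliasType set by hand to the full default vector, the if() condition receives a length-2 logical, which R 4.2.0 and later treat as an error. The function name pick_alias_type below is hypothetical.

```r
# Hypothetical stand-in for a function that uses match.arg() on its default.
pick_alias_type <- function(charAliasType = c("All", "ATTAINS")) {
  charAliasType <- match.arg(charAliasType) # default collapses to "All"
  if (charAliasType == "ATTAINS") "ATTAINS aliases only" else "all aliases"
}
called <- pick_alias_type()

# Stepping through by hand: the variable still holds both choices, so the
# if() condition has length 2 (an error in R >= 4.2.0, a warning before).
charAliasType <- c("All", "ATTAINS")
stepped <- tryCatch(
  if (charAliasType == "ATTAINS") "ATTAINS aliases only" else "all aliases",
  error = function(e) conditionMessage(e),
  warning = function(w) conditionMessage(w)
)
print(stepped)

# Passing the choices explicitly lets match.arg() run outside the function.
charAliasType <- match.arg(charAliasType, choices = c("All", "ATTAINS"))
print(charAliasType)
```

When debugging interactively, assigning a single value first (for example charAliasType <- "ATTAINS") avoids the problem as well.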

return(ATTAINSWQX2.0_non_matched3)
}


Collaborator

@wokenny13 - this is a great function! The different steps you use to match/compare are logical and well thought out. I think some intermediate objects can be removed from this one too during/at the end of the workflow. Also, a few more comments on what is happening in some of the intermediate tables generated might be helpful for future maintenance and review.


# Find the first row that has all values populated. This will indicate the column names of the CST data frame.
# Note: Why not use a static row number? The CST may get new entries that change the row where the data frame starts.
first_filled_row_index <- which(rowSums(is.na(CST.raw)) == 0)[1]
Collaborator

Would it make sense to write any of this as an internal function so it can be used in both the function and this test?
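
One way to share this logic between the function and the test would be a small internal helper, sketched below. The helper name promote_first_filled_row, the column names, and the sample values are hypothetical; the real CST layout may differ.

```r
# Hypothetical internal helper: use the first fully populated row as the
# header and keep only the rows below it.
promote_first_filled_row <- function(raw) {
  first_filled_row_index <- which(rowSums(is.na(raw)) == 0)[1]
  if (is.na(first_filled_row_index)) {
    stop("No fully populated row found.")
  }
  header <- as.character(unlist(raw[first_filled_row_index, ], use.names = FALSE))
  out <- raw[-seq_len(first_filled_row_index), , drop = FALSE]
  names(out) <- header
  rownames(out) <- NULL
  out
}

# Made-up stand-in for the raw CST download: title row, empty row, header, data
CST.example <- data.frame(
  V1 = c("Export title", NA, "POLLUTANT_NAME", "Ammonia"),
  V2 = c(NA, NA, "CRITERION_VALUE", "1.9"),
  stringsAsFactors = FALSE
)
CST.clean <- promote_first_filled_row(CST.example)
print(CST.clean)
```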

hillarymarler and others added 10 commits October 20, 2025 11:45
add quiet to rExpertQuery function used in test
…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
The current placeholder for EPA304a criteria and methods using the CST. Changed the extdata name to EPACST to clarify this.

+minor text fixes
condensed and cleaned up code and updated documentation (examples and params) in TADA_AdditionalCharAliasForReview

moved TADA CriteriaSearchTool to CriteriaRefTables.R
…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
Collaborator

@hillarymarler hillarymarler left a comment

Overall, the updates look good to me. I was able to run TADA_CreateParamRef successfully on random data sets from a few different states and the output made sense.

I think we could take a look as a team at some of the documentation related to the Mod 3 functions and work to make it a little more concise and clear, especially some of the text regarding review by the TADA team or future development. But that could be done as part of a future PR.

I pushed a couple of small commits - a minor grammar change or two in documentation and the addition of spsUtil::quiet. I suggest applying that when we use rExpertQuery functions as part of larger TADA functions as the printed rExpertQuery messages are probably less useful in the greater TADA context.

#' drop down list of all ATTAINS parameters that have been listed as a cause in
#' prior ATTAINS cycle for the organization selected in the function input 'org_id'.
#' It also highlights the cells in which users should input information. The excel
#' spreadsheet will be automatically downloaded to a user's downloads folder path.
Collaborator

Should users have the option to specify a path for the download? (With the download folder set as the default). I'm thinking about situations where users might be using this and other TADA functions as part of their own assessment package or tool.

Collaborator Author

At some point in the past, it was decided to restrict the path to the downloads folder. A custom path would require an extra argument that would then have to be carried through all of the other TADA functions, adding an extra step for any user who chooses a different path.

I am thinking it could still be useful to allow users to change the path and this is something we can discuss more in future meetings!
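
For that discussion, an optional path argument with the downloads folder as its default could be sketched as follows. Everything here is an assumption for illustration: the function name save_ref_table is hypothetical, the default location is guessed as a "Downloads" folder under the home directory, and a CSV is written instead of the Excel spreadsheet to keep the example self-contained.

```r
# Hypothetical writer with an optional output directory. The default
# ("Downloads" under the user's home directory) is an assumption.
save_ref_table <- function(df, file_name,
                           path = file.path(path.expand("~"), "Downloads")) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  out_file <- file.path(path, file_name)
  utils::write.csv(df, out_file, row.names = FALSE)
  invisible(out_file)
}

# A user building their own tool could redirect the output
out <- save_ref_table(data.frame(a = 1:2), "example_ref.csv", path = tempdir())
print(out)
```

Because the argument has a default, existing calls that omit it would keep writing to the downloads folder.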

Collaborator

At line 782-784: "Future development efforts may allow users to pull in magnitude values
#' for an ATTAINS parameter through the Criteria Search Tool depending on a
#' users quality control and review of these metrics."

Does this mean that future efforts will allow users to pull in magnitude values (with those values requiring user review), or that we will go through a quality control/review process as part of the function development?

Collaborator Author

Users would be the ones required to carry out the review if we do pull in the magnitude values. This would depend on crosswalking CST pollutant names with ATTAINS.ParameterName. The WQX Characteristic alias table in ATTAINSParamToWQPCharRef could also contain this crosswalk if we decide to take this route in the future (we would need to source from "All" to show which matches have been found for WQX Characteristics and CST pollutant names, and to see whether an ATTAINS.Parameter is also listed as an alias for the same source).

Collaborator

At line 789 - should we have a link to a doc listing the TADA priority characteristics?

Collaborator Author

I have edited line 793 to read:

'TADAPriorityChar <- utils::read.csv(system.file("extdata", "TADAPriorityCharUnitRef.csv", package = "EPATADA"))'.

Collaborator

At line 817 - could users also run TADA_GetATTAINSOrgIDsRef() to see the list?

#' so subsequent calls will be faster.
#'
#' @return Updated sysdata.rda with updated ATTAINSParameterWQPCharRef object
#' @param charAliasType A string value to indicate the WQX data source to use
Collaborator

Should there be an example of a use case where charAliasType is something other than "ATTAINS"? I'm having trouble picturing how this would look.

Collaborator

Good question. @wokenny13 are any of the other alias types useful/logical for use in this function? I bolded a few to consider.

Here is the full list of alias types from WQX:

ATTAINS.PARAMETER
CAS NUMBER
CST.POLLUTANT
CST.STD.POLLUTANT
EPA ID (SUBSTANCE REGISTRY #)
ITIS TAXON SERIAL NUMBER
MOLECULAR WEIGHT
NOAA - National Center for Environm
NWIS PARM CODE
ONTOLOGY - HYDRO.GEODAB.EU
RETIRED NAME
STANDARDIZE NAME (Normalized)
STORET CHARACTERISTIC NAME
STORET PARM CODE
SYSTEMATIC NAME
TAXON COMMON NAME
WQP COMPARABLE NAME
WQX SYNONYM REGISTRY (validation)

Collaborator

note: WQP COMPARABLE NAME can be populated by the TADA team to support any additional matches that would otherwise be added manually.

No longer requires a one to one match for WQX Characteristics to ATTAINS Parameter.
changed autofill in TADA_DefineCriteriaMethodology to "Org" for ATTAINS ParameterName to WQX Char crosswalk. We should discuss how to handle this in a future TADA team meeting.

Minor updates to the EPACST.csv (R8 seems to be working on a more up-to-date and reviewed EPA304a criteria table; this table holds the information crosswalked from the CST for the time being)

shorten TADA mod 3 example data frame names in Mod3Vignette - AtlOptions
…lias-Table-Update-and-Create-the-CST-Internal-Ref-File
@cristinamullin cristinamullin merged commit bc77acb into develop Nov 20, 2025
6 of 7 checks passed
@cristinamullin cristinamullin deleted the ATTAINS-Parameter-To-WQX-Characteristic-Alias-Table-Update-and-Create-the-CST-Internal-Ref-File branch November 20, 2025 18:58
4 participants