Mini Release #578

cristinamullin · 2025-03-13T20:32:43Z

Updates the DOMAINS link for ATTAINS. Hillary is looking into making that link a stable URL on Drupal, which would benefit both ATTAINS and TADA in the future, but the updated link is correct for now.
Fixes a bug in the harmonization reference file: now assumes "NITRATE + NITRITE" speciation is "as N" for all known combinations if left blank.
Incorporates USGS dataRetrieval updates: we are now leveraging their "develop" branch for EPATADA instead of CRAN.
Updates the citation: now includes new collaborators.
Updates TADA_FindNearbySites: the function now uses an adjacency matrix approach. The main improvement is that it no longer assigns the same site to multiple groups. The major disadvantage is that it takes longer to find groups of related sites than simply identifying sites within a buffer distance of each other, so we may need to discuss that tradeoff more as a group.
Creates TADA.MonitoringLocationIdentifier in TADA_AutoClean.
Includes significant updates to TADA_DataRetrieval:

adds sf option
adds tribal options
updates big data options
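To make the harmonization fix above concrete, here is a minimal base R sketch of the "assume as N when blank" rule. It is not the code in the reference file: the function name `fill_blank_speciation`, the toy table, and the output column `Filled.Speciation` are all made up for illustration, and the two input column names are only assumed to resemble the real ones.

```r
# Toy stand-in for a few rows of a harmonization reference table.
# Column names and values are assumptions for illustration only.
ref <- data.frame(
  TADA.CharacteristicName = c("NITRATE + NITRITE", "NITRATE + NITRITE", "AMMONIA"),
  TADA.MethodSpeciationName = c(NA, "AS NO3", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill blank speciation for one characteristic only.
fill_blank_speciation <- function(df,
                                  characteristic = "NITRATE + NITRITE",
                                  default = "AS N") {
  spec <- df$TADA.MethodSpeciationName
  # Treat NA and empty/whitespace-only strings as "left blank".
  blank <- is.na(spec) | trimws(spec) == ""
  target <- df$TADA.CharacteristicName == characteristic
  df$Filled.Speciation <- spec
  df$Filled.Speciation[blank & target] <- default
  df
}

filled <- fill_blank_speciation(ref)
print(filled)
```

Only blank rows for the named characteristic are changed; a reported speciation is left alone, and blanks for other characteristics stay blank.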

Overview

Addition of {sf} methods to allow users to query WQP data using {sf} objects
Addition of options to allow tribal lands to be more directly queried using TADA_DataRetrieval
New function, TADA_TribalOptions, to assist users with identifying and querying tribal lands
Folding the processes in TADA_BigDataRetrieval into TADA_DataRetrieval and removing TADA_BigDataRetrieval to avoid confusion
Adding a progress bar to large data pulls, a user prompt to confirm the download, silencing of {dataRetrieval} messages plus error handling for HTTP errors, and a vignette update
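For reviewers who want a mental model of the large-pull behaviour listed above, here is a rough base R sketch of chunked downloading with a confirmation step, a progress bar, and errors passed back to the user. It is not the code in this PR: `fetch_in_chunks`, `fetch_fun`, and `confirm` are hypothetical names, and splitting by 300 sites per chunk is an assumption that merely echoes the maxsites default.

```r
# Hypothetical helper: download data for many sites in smaller requests.
# fetch_fun is any function that takes a vector of site IDs and returns a data frame.
# confirm stands in for a user prompt; the default always says yes so the
# sketch runs non-interactively.
fetch_in_chunks <- function(site_ids, fetch_fun, maxsites = 300,
                            confirm = function(n_sites) TRUE) {
  if (length(site_ids) == 0) {
    return(data.frame())
  }
  if (!confirm(length(site_ids))) {
    message("Download cancelled.")
    return(invisible(NULL))
  }

  # Split the IDs into consecutive groups of at most maxsites.
  chunks <- split(site_ids, ceiling(seq_along(site_ids) / maxsites))

  pb <- utils::txtProgressBar(min = 0, max = length(chunks), style = 3)
  on.exit(close(pb))

  results <- vector("list", length(chunks))
  for (i in seq_along(chunks)) {
    results[[i]] <- tryCatch(
      fetch_fun(chunks[[i]]),
      error = function(e) {
        # Surface the failure (for example an HTTP error) with some context.
        stop("Request failed for chunk ", i, ": ", conditionMessage(e), call. = FALSE)
      }
    )
    utils::setTxtProgressBar(pb, i)
  }
  do.call(rbind, results)
}

# Usage with a fake fetcher that counts how often it is called.
n_calls <- 0
fake_fetch <- function(ids) {
  n_calls <<- n_calls + 1
  data.frame(MonitoringLocationIdentifier = ids, stringsAsFactors = FALSE)
}
sites <- paste0("SITE-", 1:650)
pulled <- fetch_in_chunks(sites, fake_fetch)
```

In an interactive session, `confirm` could wrap `readline()` so the user sees the request size before anything is downloaded.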

Additional info

{sf} methods use the aoi_sf argument. The function first checks what data are available for the bounding box of the {sf} object provided, then uses only the MonitoringLocationIdentifiers inside the {sf} object when running the full query.
Tribal land queries use the tribal_area_type and tribe_name_parcel arguments and are handled alongside {sf} because they use EPA spatial data. Both tribal_area_type and tribe_name_parcel are required. {sf} and tribal arguments can't be used at the same time (this returns an error), and if geographic info such as statecode is provided in addition to either {sf} or tribal arguments, a warning is returned.
tribal_area_type refers to one of the EMEF/Tribal MapServer layers. tribe_name_parcel refers to either TRIBE_NAME or PARCEL_NO entries from that layer. The TADA_TribalOptions function is included to help users see TRIBE_NAME/PARCEL_NO options available to them and check punctuation, etc.
TADA_BigDataHelper is now used to handle "big" data requests within TADA_DataRetrieval. By default this is triggered with maxrecs = 250000 & maxsites = 300.
Two progress bars are included inside TADA_BigDataHelper
The ask_user function is used to confirm that the user wants to download the dataset after the number of records is determined
In general the messages from {dataRetrieval} are now silenced because they were returning a lot of information that was hiding (what we considered) more useful information from TADA_DataRetrieval. But we've made sure to include checks for HTTP errors, which will then be communicated back to the user
Additional info now in vignette 1 to explain the new {sf}, tribal, and big data functionality
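The argument rules described in this list can be summarised in a short base R sketch. This is not the validation code in TADA_DataRetrieval: `check_spatial_args` is a hypothetical helper, the messages are invented, and for simplicity every argument defaults to NULL here, which may differ from the real defaults.

```r
# Hypothetical helper that mirrors the rules described above:
#   - aoi_sf and the tribal arguments are mutually exclusive (error)
#   - tribal_area_type and tribe_name_parcel must be supplied together (error)
#   - extra geographic filters such as statecode trigger a warning
check_spatial_args <- function(aoi_sf = NULL,
                               tribal_area_type = NULL,
                               tribe_name_parcel = NULL,
                               statecode = NULL) {
  has_sf <- !is.null(aoi_sf)
  has_tribal <- !is.null(tribal_area_type) || !is.null(tribe_name_parcel)

  if (has_sf && has_tribal) {
    stop("Supply either aoi_sf or the tribal arguments, not both.", call. = FALSE)
  }
  if (has_tribal && (is.null(tribal_area_type) || is.null(tribe_name_parcel))) {
    stop("Both tribal_area_type and tribe_name_parcel are required.", call. = FALSE)
  }
  if ((has_sf || has_tribal) && !is.null(statecode)) {
    warning("statecode was supplied together with a spatial or tribal query.",
            call. = FALSE)
  }

  # Report which query path would be taken.
  if (has_sf) "sf" else if (has_tribal) "tribal" else "standard"
}

# A plain query with only a state code takes the standard path.
query_mode <- check_spatial_args(statecode = "UT")
```

The values passed for the tribal arguments in any real call would need to match entries from the map layer, which is what TADA_TribalOptions is meant to help with.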

A few notes:

I left NULL as the default for the aoi_sf argument instead of "null" because the character version didn't work properly
I had hoped to work on issues related to character length limits in queries, as discussed with Cristina, but ran out of time
From my tests it didn't seem like the way that data are indexed by calendar date affected query speed
Please let me know if I can provide any other info on any of this! For example I didn't include any info from speed tests to avoid overwhelming amounts of info here. Thanks for your help.

Closes #361, closes #427, closes #345, closes #159

hillarymarler · 2025-03-14T11:25:11Z

For TADA_FindNearbySites, the change that is causing the much longer run time is the inclusion of fetchNHD to only group sites that are within the same catchment. I have not tried this yet, but it would be possible to use the adjacency matrix approach with only the buffer distance, without grouping by catchment.

We could offer users the option as to whether they want to group by catchment or not and explain that grouping by catchment will result in a longer run time. I can make the updates for whatever route we choose.

That is a potential option to prevent returning multiple groups for a single monitoring location and reduce the current run time. So I think the three options are really: 1) incorporate catchments (longer run time), 2) rely only on buffer distance for grouping (shorter run time), or 3) let users decide whether to incorporate catchments when finding nearby sites.
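To illustrate option 2, here is a small base R sketch of buffer-only grouping with an adjacency matrix. It is not the TADA_FindNearbySites implementation: `haversine_m` and `group_nearby_sites` are hypothetical names, the 100 m buffer is an arbitrary default, and treating groups as connected components is just one way to guarantee that each site lands in exactly one group.

```r
# Great-circle distance in metres between two points given in decimal degrees.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: returns one integer group id per site.
group_nearby_sites <- function(lat, lon, buffer_m = 100) {
  idx <- seq_along(lat)
  # Pairwise distance matrix, then a logical adjacency matrix.
  d <- outer(idx, idx, function(i, j) haversine_m(lat[i], lon[i], lat[j], lon[j]))
  adj <- d <= buffer_m

  group <- rep(NA_integer_, length(lat))
  next_id <- 0L
  for (start in idx) {
    if (!is.na(group[start])) next
    # Breadth-first search labels everything reachable from this site.
    next_id <- next_id + 1L
    group[start] <- next_id
    queue <- start
    while (length(queue) > 0) {
      current <- queue[1]
      queue <- queue[-1]
      neighbors <- which(adj[current, ] & is.na(group))
      group[neighbors] <- next_id
      queue <- c(queue, neighbors)
    }
  }
  group
}

# Three sites in a chain along the equator plus one far away.
lat <- c(0, 0, 0, 1)
lon <- c(0, 0.0005, 0.001, 1)
groups <- group_nearby_sites(lat, lon, buffer_m = 100)
```

Note that chaining is a side effect of this design: the first and third sites end up in the same group through the middle site even though they are more than 100 m apart. Sites with no neighbors get a group of their own here, which a real function might filter out.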

hillarymarler · 2025-03-14T12:23:35Z

There are occasional test failures of ('test-ResultFlagsIndependent.R:137:3'): QC results are not flagged as Continuous ──
unique(cont_QC$TADA.ActivityType.Flag) == "Non_QC" is not TRUE

I am working on troubleshooting this one.

Update: I updated the test to be more inclusive of allowable values in TADA.ActivityType.Flag (anything that is not one of the QC variations is acceptable)

Updated the test to check for all QC options and fail if it finds any of them. The previous test was too restrictive because it expected "Non_QC" as the only valid TADA.ActivityType.Flag value.
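The difference between the two checks can be shown with a few lines of base R. This is only a sketch of the logic, not the test file itself: `has_no_qc_flags` is a hypothetical helper, and the QC flag values and the "Some_Other_Flag" value are assumed placeholders rather than the package's actual list.

```r
# Assumed QC flag values, for illustration only.
qc_values <- c("QC_replicate", "QC_blank", "QC_calibration", "QC_other")

# Hypothetical helper: TRUE when none of the flags is a QC value.
has_no_qc_flags <- function(flags, qc_values) {
  !any(flags %in% qc_values)
}

# Flags that contain no QC values but are not all "Non_QC".
flags <- c("Non_QC", "Some_Other_Flag", "Non_QC")

# Old-style check: passes only when every unique value is exactly "Non_QC".
old_check <- isTRUE(all(unique(flags) == "Non_QC"))

# New-style check: passes as long as no QC value is present.
new_check <- has_no_qc_flags(flags, qc_values)
```

One caveat with the exclusion approach is that an empty vector passes vacuously, so a separate check that the test data are non-empty may be worthwhile.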

hillarymarler

The changes look good to me. Are there any specific functions you'd like me to test?

hillarymarler · 2025-03-14T11:59:07Z

R/ATTAINSCrosswalks.R

        EPA304A.PollutantName, ATTAINS.ParameterName
      ) %>%
-      dplyr::arrange(organization_identifier, TADA.CharacteristicName) %>%
+      dplyr::arrange(organization_identifier) %>%


Would it make sense to also arrange by TADA.ComparableDataIdentifier so it is easier for users to find a specific row?
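As a quick illustration of the suggested ordering, the sketch below sorts a toy crosswalk by organization and then by comparable data identifier. The data are made up, and base R `order()` is used only to keep the example dependency-free; in the package the equivalent would presumably be adding the second column to the existing `dplyr::arrange()` call.

```r
# Toy crosswalk rows; identifiers are invented placeholders.
crosswalk <- data.frame(
  organization_identifier = c("ORG_B", "ORG_A", "ORG_A"),
  TADA.ComparableDataIdentifier = c("PH_ID", "ZINC_ID", "AMMONIA_ID"),
  stringsAsFactors = FALSE
)

# Sort by organization first, then by identifier within each organization.
sorted <- crosswalk[order(
  crosswalk$organization_identifier,
  crosswalk$TADA.ComparableDataIdentifier
), ]
rownames(sorted) <- NULL
print(sorted)
```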

hillarymarler · 2025-03-14T13:53:00Z

I am still seeing occasional failures of the continuous/QC results test. They seem to be related to TADA_RandomTestingData returning empty data frames (which it should not do), so I'm taking a look at that now.

hillarymarler and others added 30 commits September 18, 2024 13:26

Update TADAModule1_BeginnerTraining.Rmd

6dbbbba

Update TADAModule1_BeginnerTraining.Rmd

2bc5b93

Update TADAModule1_BeginnerTraining.Rmd

84c76bc

Update TADAModule1_BeginnerTraining.Rmd

cb1693e

Update TADAModule1_BeginnerTraining.Rmd

8d43b6c

Update TADAModule1_BeginnerTraining.Rmd

2139687

Update example data

f9a3c54

Update Filtering.R

ad9029b

Potential fix for Field table bug

Update TADAModule1_AdvancedTraining.Rmd

bbfd1f7

I included a few spots where TADA.MonitoringLocationIdentifier or TADA.MonitoringLocationType may be appropriate, but if they don't seem appropriate for the change, please feel free to revert.

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

9bf9db0

…er-in-tada_autoclean

Update from develop

e3d1ff2

Update Utilities.R

d6d80ad

Updates to nearby sites

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

940bcdc

…er-in-tada_autoclean

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

a2ebe60

…er-in-tada_autoclean

Merge updates from develop

6ea8b11

Update Utilities.R

3ff94e1

Merge remote-tracking branch 'upstream/develop' into develop

a96064f

tribal options edits

49ddc83

TADA_DR rewrite

57e5a03

Helper function for large queries

7d9ac2a

Document bigdatahelper

2b06dbd

update geospatial funs

929dc37

warning -> message & add dplyr::

001c2eb

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

464b20a

…er-in-tada_autoclean

Fix conflicts

93b23d5

Example data updates

59e1b23

odds and ends

19d80a2

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

0d18494

…er-in-tada_autoclean

Update Utilities.R

01e23f6

Merge branch 'develop' into 482-create-tadamonitoringlocationidentifi…

a451b9e

…er-in-tada_autoclean

cristinamullin and others added 15 commits March 11, 2025 15:21

Update test-ResultFlagsIndependent.R

47e0991

remove TADA_FindPotentialDuplicatsMultipleOrgs tests and examples, takes too long

Merge branch 'update_attains_domain_link' into prerelease

3667c8d

url updates

84d0c9f

Url and url checker updates

Update pchIcons.Rd

2e18a8a

dont eval

5de782b

temporary change

address check notes

08fb337

update dependencies in desc, fix long examples

run document

b272866

Update NMCWorkshop.Rmd

39a4189

Update DESCRIPTION

fe41fbd

Update pchIcons.Rd

bb9bc2a

documentation update to match URL change

Merge branch 'prerelease' of https://github.com/USEPA/EPATADA into pr…

e352f27

…erelease

specify to use DR dev branch

fda0a7c

Update DESCRIPTION

a2f08ec

Update test-ResultFlagsIndependent.R

a419655

remove data call

fix check notes

0758e98

cristinamullin requested review from hillarymarler and wokenny13 March 13, 2025 20:33

only polygon queries

ddbeacd

hillarymarler added 2 commits March 14, 2025 07:37

Merge branch 'prerelease' of https://github.com/USEPA/EPATADA into pr…

48a7065

…erelease

Revert "Update pchIcons.Rd"

d9208ce

This reverts commit bb9bc2a.

Update test-ResultFlagsIndependent.R

3ec4b96

updated test to check for all QC options and fail if it finds them, previous test was too restrictive as it expected only "Non_QC" as a valid TADA.ActivityType.Flag

hillarymarler approved these changes Mar 14, 2025

hillarymarler and others added 3 commits March 14, 2025 10:10

Bug fix for TADA_RandomTestingData and document()

b7af4c4

RandomTestingData bug fix

ad81feb

remove workshop draft

ac0f33b

cristinamullin merged commit e090929 into develop Mar 14, 2025
6 of 7 checks passed

cristinamullin deleted the prerelease branch March 14, 2025 18:53
