513-occasional-test-failure-tada_findpotentialduplicates-does-not-grow-dataset #514

wokenny13 · 2024-08-23T19:02:40Z

TADA_FindNearbySites() appears to create a logical (missing) NA for TADA.MonitoringLocationIdentifier, whereas the dupsdat dataframe in this function holds the character value "NA" for the same column.

Because the two values do not match, the join adds extra rows to the return value.

Fix: convert the logical NA to the character "NA".

If TADA_FindNearbySites() does not return NA for TADA.MonitoringLocationIdentifier, the existing TADA.MonitoringLocationIdentifier is still returned.
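The row growth described above can be reproduced on toy data. This is only a minimal sketch, not the TADA code: the data frames `nearby` and `dups` and their values are hypothetical, and `NA_character_` is used for the missing value so the example does not depend on how the join casts a logical column.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the thread, values are made up.
nearby <- tibble(
  ResultIdentifier = c("r1", "r2"),
  TADA.MonitoringLocationIdentifier = NA_character_ # missing value
)

# paste() turns a missing value into the literal string "NA".
dups <- nearby %>%
  mutate(
    TADA.MonitoringLocationIdentifier =
      paste(TADA.MonitoringLocationIdentifier, sep = ","),
    TADA.ResultSelectedMultipleOrgs = "N"
  )

# A missing NA never equals the string "NA", so no rows match and the
# full join keeps every row from both sides.
joined <- full_join(
  nearby, dups,
  by = c("ResultIdentifier", "TADA.MonitoringLocationIdentifier")
)

nrow(nearby) # 2
nrow(joined) # 4
```

Two input rows become four because each side's rows are kept unmatched, which is consistent with the dataset unexpectedly growing.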


cristinamullin · 2024-08-23T19:23:21Z

Good catch Kenny. We've run into similar issues before and applied a similar approach to resolve. We changed NA to "NA - Not Available" (character).

Should we convert to "NA - Not Available" here as well for consistency with other TADA functions?

wokenny13 · 2024-08-23T20:07:06Z

Converting any NA values to "NA - Not Available" for consistency sounds good.

In general, does storing a value as a logical/missing NA versus a character NA make any difference in speed/performance/memory?

Where is the best spot to handle this change?

I think dupsdat converts the logical NA for TADA.MonitoringLocationIdentifier to a character in the chunk of code below, from TADA_FindPotentialDuplicatesMultipleOrgs. Since a full join is used at the end of that code, I believe we would then need to change the "NA" found in dupsdat to "NA - Not Available".

dupsdat <- dupsdat %>%
  dplyr::rename(SingleNearbyGroup = TADA.MonitoringLocationIdentifier) %>%
  dplyr::mutate(
    TADA.MonitoringLocationIdentifier = paste(SingleNearbyGroup, sep = ","),
    TADA.ResultSelectedMultipleOrgs = ifelse(ResultIdentifier %in% duppicks$ResultIdentifier, "Y", "N")
  ) %>%
  dplyr::select(-SingleNearbyGroup)

Or would we want to edit the NA value at line 841 in TADA_FindNearbySites(), where the column is created:

if (!"TADA.MonitoringLocationIdentifier" %in% colnames(.data)) {
  .data$TADA.MonitoringLocationIdentifier <- NA
}
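Whichever spot is chosen, the conversion itself could look roughly like the sketch below. `normalize_id()` and `na_label` are hypothetical names used for illustration and are not part of TADA. The helper maps both a missing NA and the literal string "NA" (which `paste()` produces) to the agreed label, so both sides of a later join carry the same value.

```r
# Hypothetical helper, not part of TADA.
na_label <- "NA - Not Available"

normalize_id <- function(x) {
  x <- as.character(x) # a logical NA becomes NA_character_
  # Replace missing values and the literal string "NA"; keep real identifiers.
  ifelse(is.na(x) | x == "NA", na_label, x)
}

ids <- normalize_id(c(NA, "NA", "site-1"))
all_missing <- normalize_id(NA)
```

Applying the same helper to the identifier column on both sides before the full join is one way to keep the keys comparable. Real identifiers pass through unchanged.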

hillarymarler · 2024-08-26T15:14:50Z

@wokenny13 - is this ready for review?

wokenny13 · 2024-08-26T15:32:26Z

@hillarymarler

> Good catch Kenny. We've run into similar issues before and applied a similar approach to resolve. We changed NA to "NA - Not Available" (character).
>
> Should we convert to "NA - Not Available" here as well for consistency with other TADA functions?

I will work on converting it to "NA - Not Available" for consistency, and I will let you know when it is finished and ready for review.

hillarymarler · 2024-08-26T15:34:41Z

Sounds great - thank you!

wokenny13 · 2024-08-26T18:48:11Z

@hillarymarler this is ready for review

wokenny13 requested review from cristinamullin and hillarymarler August 23, 2024 19:02

wokenny13 linked an issue Aug 23, 2024 that may be closed by this pull request

Occasional test failure - TADA_FindPotentialDuplicatesMultipleOrgs does not grow dataset #513

Closed

wokenny13 changed the title ~~Update ResultFlagsIndependent.R~~ 513-occasional-test-failure-tada_findpotentialduplicates-does-not-grow-dataset Aug 23, 2024

TADA.MonitoringLocationIdentifier column returns "NA - Not Available" for consistency

695b3bf

Merge branch 'develop' into 513-occasional-test-failure-tada_findpotentialduplicates-does-not-grow-dataset

479a734

hillarymarler approved these changes Aug 26, 2024

hillarymarler merged commit 436a98f into develop Aug 26, 2024

hillarymarler deleted the 513-occasional-test-failure-tada_findpotentialduplicates-does-not-grow-dataset branch August 26, 2024 20:17
