513-occasional-test-failure-tada_findpotentialduplicates-does-not-grow-dataset #514
TADA_FindNearbySites() appears to create a logical NA for TADA.MonitoringLocationIdentifier, whereas the dupsdat dataframe in this function stores the character value "NA". Joining the two creates additional rows because the values do not match. This change converts the logical NA to the character "NA". If TADA_FindNearbySites() does not return NA for TADA.MonitoringLocationIdentifier, the respective TADA.MonitoringLocationIdentifier is still returned unchanged.
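For reviewers, here is a minimal base R sketch of the mismatch described above. It is not the TADA code: the data frames `nearby` and `dups`, their extra columns, and the site values are hypothetical stand-ins, and `merge(..., all = TRUE)` is used as a stand-in for the full join.

```r
# Hypothetical stand-ins, not the real TADA objects.
# `nearby` holds a true missing value; `dups` holds the literal string "NA".
nearby <- data.frame(
  TADA.MonitoringLocationIdentifier = c("SITE-1", NA),
  source = c("nearby", "nearby"),
  stringsAsFactors = FALSE
)
dups <- data.frame(
  TADA.MonitoringLocationIdentifier = c("SITE-1", "NA"),
  flag = c("dup", "unique"),
  stringsAsFactors = FALSE
)

# A missing NA and the string "NA" are different values.
is.na(nearby$TADA.MonitoringLocationIdentifier[2])  # TRUE
is.na(dups$TADA.MonitoringLocationIdentifier[2])    # FALSE

# Full join before the fix: the NA row and the "NA" row do not match,
# so each survives as its own row.
before <- merge(nearby, dups, by = "TADA.MonitoringLocationIdentifier", all = TRUE)
nrow(before)  # 3

# Convert the missing NA to the character "NA", then join again.
key <- nearby$TADA.MonitoringLocationIdentifier
nearby$TADA.MonitoringLocationIdentifier[is.na(key)] <- "NA"
after <- merge(nearby, dups, by = "TADA.MonitoringLocationIdentifier", all = TRUE)
nrow(after)  # 2
```

Non-missing identifiers such as "SITE-1" pass through the conversion untouched, which matches the behavior described in the last sentence of the description.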
Good catch Kenny. We've run into similar issues before and resolved them with a similar approach: we changed NA to "NA - Not Available" (character). Should we convert to "NA - Not Available" here as well for consistency with other TADA functions?
Keeping all NA changes as "NA - Not Available" for consistency sounds good. In general, does storing a value as a logical (missing) NA versus a character "NA" make any difference in speed, performance, or memory? Where is the best spot to handle this change? I think we could convert the logical NA for TADA.MonitoringLocationIdentifier to a character in dupsdata, in this chunk of the code in TADA_FindPotentialDuplicatesMultipleOrgs. Since a full join is used at the end of the code, I believe that would then require changing the "NA" found in dupsdata to "NA - Not Available" as well.
Or would we want to edit the NA value at line 841 in TADA_FindNearbySites(), where
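To make the "NA - Not Available" option concrete, here is a rough base R sketch that maps both the missing NA and the literal "NA" to one sentinel on each side before the full join. The helper `normalize_na()` and the sample data are hypothetical and do not exist in TADA. Where the conversion should live in the package is still the open question above.

```r
# Hypothetical helper (not part of TADA): map a missing NA and the literal
# string "NA" to a single character sentinel.
normalize_na <- function(x, sentinel = "NA - Not Available") {
  x <- as.character(x)
  x[is.na(x) | x == "NA"] <- sentinel
  x
}

# Hypothetical stand-ins for the two sides of the full join.
nearby <- data.frame(
  TADA.MonitoringLocationIdentifier = c("SITE-1", NA),
  source = c("nearby", "nearby"),
  stringsAsFactors = FALSE
)
dupsdata <- data.frame(
  TADA.MonitoringLocationIdentifier = c("SITE-1", "NA"),
  flag = c("dup", "unique"),
  stringsAsFactors = FALSE
)

# Normalize the key on both sides so the join compares like with like.
nearby$TADA.MonitoringLocationIdentifier <-
  normalize_na(nearby$TADA.MonitoringLocationIdentifier)
dupsdata$TADA.MonitoringLocationIdentifier <-
  normalize_na(dupsdata$TADA.MonitoringLocationIdentifier)

joined <- merge(nearby, dupsdata,
                by = "TADA.MonitoringLocationIdentifier", all = TRUE)
nrow(joined)  # 2: the dataset no longer grows
```

Normalizing both sides with the same helper means the join does not depend on which function produced the NA.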
@wokenny13 - is this ready for review?
I will work on converting it to "NA - Not Available" for consistency. I will let you know when it is finished and ready for review.
Sounds great - thank you!
@hillarymarler this is ready for review |
…ntialduplicates-does-not-grow-dataset
TADA_FindNearbySites() appears to create a logical NA for TADA.MonitoringLocationIdentifier,
versus
"NA" character values for TADA.MonitoringLocationIdentifier in the dupsdat dataframe in this function.
Joining creates additional rows in the return value because the types do not match.
Convert logical NA to character "NA".
If TADA_FindNearbySites() does not return NA for TADA.MonitoringLocationIdentifier, the respective TADA.MonitoringLocationIdentifier is still returned.