Censored data eh #217
Merged
31 commits
e8610eb First commit to censored data branch (ehinman)
c8e4107 Update Transformations.R (ehinman)
ba6de94 created ref tables for censored data columns (ehinman)
f2240a1 censored data functions (ehinman)
cf5acb4 tweeks to detection limit analysis (ehinman)
96595ee TADA columns in utilities (ehinman)
f0d3825 Update Utilities.R (ehinman)
c0d4f4a visualizations file change and documentation (ehinman)
9178b38 updates to documentation (ehinman)
95c44c9 Update HarmonizationTemplate.csv (ehinman)
3ec49d7 conversion to new naming convention in functions (ehinman)
c5c2bb7 update tests (ehinman)
55cfd9b Update test-Transformations.R (ehinman)
2b2961a update tests (ehinman)
5bbc5a3 added ordering function to other functions (ehinman)
b85c0ce depth function, hamonization, ordering (ehinman)
56c49ce comments on reordering function (ehinman)
6a882ba Update CensoredDataSuite.R (ehinman)
84bbe53 updated documentation (ehinman)
4311c99 updates to build markdown (ehinman)
158ebd1 Update Transformations.R (ehinman)
bf07daa Merge branch 'develop' into censored_data_eh (ehinman)
cf31182 Update ResultFlagsDependent.R (ehinman)
be29e95 fix warnings (ehinman)
b608d54 added global variables (ehinman)
e7261ac Update WQPDataHarmonization.Rmd (ehinman)
1bec6d0 Update WQPDataHarmonization.Rmd (ehinman)
d8f5006 Update WQPDataHarmonization.Rmd (ehinman)
447e097 Small changes (cristinamullin)
84d2880 update docs (cristinamullin)
920e95a Update Utilities.R (cristinamullin)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,4 +18,4 @@ TADA.Rproj | |
| .DS_Store | ||
|
|
||
| # | ||
| _snaps | ||
| _snaps | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,104 @@ | ||
| #' Simple Tools for Censored Data Handling | ||
| #' | ||
| #' This function determines if detection limit type and detection condition are consistent with each other | ||
| #' before applying simple tools for non-detect and over-detect data handling, including filling | ||
| #' in the values as-is, X times the detection limit, or a random number between 0 | ||
| #' and the LOWER detection limit. These methods do NOT depend upon censored data frequency | ||
| #' in the dataset. | ||
| #' | ||
| #' @param .data A post-idCensoredData() TADA dataframe | ||
| #' @param nd_method A text string indicating the type of method used to populate a non-detect (lower limit) data value. Can be set to "multiplier" (default), "randombelowlimit", or "as-is". | ||
| #' @param nd_multiplier A number to be multiplied by the LOWER detection limit for each entry to obtain the censored data value. Must be supplied if nd_method = "multiplier". Defaults to 0.5, or half the detection limit. | ||
| #' @param od_method A text string indicating the type of method used to populate an over-detect (upper limit) data value. Can be set to "multiplier" or "as-is" (default). | ||
| #' @param od_multiplier A number to be multiplied by the UPPER detection limit for each entry to obtain the censored data value. Must be supplied if od_method = "multiplier"; there is no numeric default (the argument defaults to the placeholder string "null"). | ||
| #' | ||
| #' @return A TADA dataframe with an additional column named TADA.Censored_Method, which documents the method used to fill censored data values. | ||
| #' | ||
| #' | ||
| #' @export | ||
|
|
||
|
|
||
| simpleCensoredMethods <- function(.data, nd_method = "multiplier", nd_multiplier = 0.5, od_method = "as-is", od_multiplier = "null"){ | ||
| # check .data has all of the required columns | ||
| expected_cols <- c( | ||
| "ResultDetectionConditionText", | ||
| "DetectionQuantitationLimitTypeName", | ||
| "TADA.ResultMeasureValue.DataTypeFlag" | ||
| ) | ||
|
|
||
| # check that multiplier is provided if method = "multiplier" | ||
| if(nd_method=="multiplier"&nd_multiplier=="null"){ | ||
| stop("Please provide a multiplier for the lower detection limit handling method of 'multiplier'. Typically, the multiplier value is between 0 and 1.") | ||
| } | ||
| if(od_method=="multiplier"&od_multiplier=="null"){ | ||
| stop("Please provide a multiplier for the upper detection limit handling method of 'multiplier'. Typically, the multiplier value is between 0 and 1.") | ||
| } | ||
|
|
||
| ## First step: identify censored data | ||
| cens = .data%>%dplyr::filter(TADA.ResultMeasureValue.DataTypeFlag=="Result Value/Unit Copied from Detection Limit") | ||
| not_cens = .data%>%dplyr::filter(!ResultIdentifier%in%cens$ResultIdentifier) | ||
|
|
||
| ## Bring in det cond reference table | ||
| cond.ref = GetDetCondRef()%>%dplyr::rename(ResultDetectionConditionText = Name)%>%dplyr::select(ResultDetectionConditionText, TADA.Detection_Type) | ||
|
|
||
| ## Join to censored data | ||
| cens = dplyr::left_join(cens, cond.ref) | ||
|
|
||
| ## Bring in det limit type reference table | ||
| limtype.ref = GetDetLimitRef()%>%dplyr::rename(DetectionQuantitationLimitTypeName = Name)%>%dplyr::select(DetectionQuantitationLimitTypeName, TADA.Limit_Type) | ||
|
|
||
| ## Join to censored data | ||
| cens = dplyr::left_join(cens, limtype.ref) | ||
|
|
||
| ## Create flag for condition and limit type combinations | ||
| cens = cens%>%dplyr::mutate(TADA.Censored_Flag = dplyr::case_when( | ||
| TADA.Detection_Type=="Non-Detect"&TADA.Limit_Type=="Non-Detect" ~ as.character("Non-Detect"), | ||
| TADA.Detection_Type=="Over-Detect"&TADA.Limit_Type=="Over-Detect" ~ as.character("Over-Detect"), | ||
| TADA.Detection_Type=="Other"&TADA.Limit_Type=="Other" ~ as.character("Other Condition/Limit Populated"), | ||
| !TADA.Detection_Type==TADA.Limit_Type ~ as.character("Conflict between Condition and Limit") | ||
| )) | ||
|
|
||
| ## warn when some limit metadata may be problematic | ||
| if("Conflict between Condition and Limit"%in%cens$TADA.Censored_Flag){ | ||
| num = length(cens$TADA.Censored_Flag[cens$TADA.Censored_Flag=="Conflict between Condition and Limit"]) | ||
| warning(paste0(num," records in supplied dataset have conflicting detection condition and detection limit type information. These records will not be included in detection limit handling calculations.")) | ||
| } | ||
|
|
||
| cens = cens%>%dplyr::select(-TADA.Detection_Type, -TADA.Limit_Type) | ||
|
|
||
| # split out over detects and non detects | ||
| nd = subset(cens, cens$TADA.Censored_Flag=="Non-Detect") | ||
| od = subset(cens, cens$TADA.Censored_Flag=="Over-Detect") | ||
| other = subset(cens, !cens$ResultIdentifier%in%c(nd$ResultIdentifier,od$ResultIdentifier)) | ||
|
|
||
| # ND handling | ||
| if(dim(nd)[1]>0){ | ||
| if(nd_method=="multiplier"){ | ||
| nd$TADA.ResultMeasureValue = nd$TADA.ResultMeasureValue*nd_multiplier | ||
| nd$TADA.Censored_Method = paste0("Detection Limit Value Multiplied by ",nd_multiplier) | ||
| } | ||
| if(nd_method=="randombelowlimit"){ | ||
| nd$multiplier = stats::runif(dim(nd)[1],0,1) | ||
| nd$TADA.ResultMeasureValue = nd$TADA.ResultMeasureValue*nd$multiplier | ||
| nd$TADA.Censored_Method = paste0("Random Value Between 0 and Detection Limit Using this Multiplier: ",round(nd$multiplier,digits=3)) | ||
| nd = nd%>%dplyr::select(-multiplier) | ||
| } | ||
| if(nd_method=="as-is"){ | ||
| nd$TADA.Censored_Method = "Detection Limit Value Unchanged" | ||
| } | ||
| } | ||
| # OD handling | ||
| if(dim(od)[1]>0){ | ||
| if(od_method=="multiplier"){ | ||
| od$TADA.ResultMeasureValue = od$TADA.ResultMeasureValue*od_multiplier | ||
| od$TADA.Censored_Method = paste0("Detection Limit Value Multiplied by ",od_multiplier) | ||
| } | ||
| if(od_method=="as-is"){ | ||
| od$TADA.Censored_Method = "Detection Limit Value Unchanged" | ||
| } | ||
| } | ||
|
|
||
| .data = plyr::rbind.fill(not_cens, nd, od, other) | ||
| .data = OrderTADACols(.data) | ||
| return(.data) | ||
| } |
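For reviewers who want to see the three non-detect options side by side, here is a minimal base-R sketch. It is not code from this PR: `fill_nondetects()` and the toy `limits` vector are hypothetical names used only for illustration, and the sketch works on a plain numeric vector rather than a TADA dataframe. It assumes, as in `simpleCensoredMethods()`, that the value column already holds the detection limit copied in for each censored record.

```r
# Hypothetical helper (not part of TADA): applies one of the three
# non-detect methods to a numeric vector of lower detection limits.
fill_nondetects <- function(limits, method = "multiplier", multiplier = 0.5) {
  method <- match.arg(method, c("multiplier", "randombelowlimit", "as-is"))
  if (method == "multiplier") {
    # scale every limit by the same fixed factor
    values <- limits * multiplier
    label <- paste0("Detection Limit Value Multiplied by ", multiplier)
  } else if (method == "randombelowlimit") {
    # one uniform draw per record, so each value lies between 0 and its own limit
    draws <- stats::runif(length(limits), 0, 1)
    values <- limits * draws
    label <- paste0(
      "Random Value Between 0 and Detection Limit Using this Multiplier: ",
      round(draws, digits = 3)
    )
  } else {
    # "as-is": keep the detection limit as the result value
    values <- limits
    label <- "Detection Limit Value Unchanged"
  }
  data.frame(
    value = values,
    method = rep_len(label, length(values)),
    stringsAsFactors = FALSE
  )
}

limits <- c(0.10, 0.50, 2.00) # toy lower detection limits

half <- fill_nondetects(limits) # default: half the limit
set.seed(1) # only so the random draws are repeatable
rand <- fill_nondetects(limits, method = "randombelowlimit")
asis <- fill_nondetects(limits, method = "as-is")

print(half)
print(rand)
print(asis)
```

The `method` column plays the same role as `TADA.Censored_Method` in the diff: it records, per row, how the reported value was derived, so a substituted value can be told apart from a measured one later.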