Description
Prework
- Read and agree to the code of conduct and contributing guidelines.
- If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
- New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.
- Format your code according to the tidyverse style guide.
Problem
- (detailed error description & reprex at the bottom)
- Rare errors left unhandled by aws.s3 (e.g. caused by temporary network connectivity issues) propagate back through targets & produce an unexpected fatal error taking down the whole pipeline.
Why is this a desirable feature for `targets`?
- The fix is of course to handle each specific error in `aws.s3` (Edit: looks like this may have originated from a malformed call to `aws.s3` functions from `targets` - see next comment), but if it were being called straight from user code we could implement our own error handling.
- Relying on upstream packages to fix their bugs to prevent fatal pipeline errors like this leaves `targets` vulnerable to new bugs in external packages.
Proposal
- As per title - handle errors in dependencies gracefully, so that individual jobs affected by upstream package errors give an `error` result for that individual target rather than taking down the whole pipeline.
- Specifically I'm discussing `aws.s3` here, as it seems the most frequently-touched external package after `future` & `clustermq`, but the same could be said for any other.
- I realise this feature request could require a lot of work, so no big deal if it's not at the top of the list!
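As a stopgap on the user side, transient storage errors could be absorbed with a retry wrapper - a minimal sketch, assuming nothing about `targets` internals (`retry_storage` is an invented name, not a `targets` or `aws.s3` function):

```r
# Hypothetical sketch: retry a storage operation a few times so a
# transient upstream failure does not immediately become fatal.
retry_storage <- function(f, retries = 3L, delay = 1) {
  last_error <- NULL
  for (attempt in seq_len(retries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    Sys.sleep(delay) # back off before retrying a transient failure
  }
  # After exhausting retries, re-signal the last error; the pipeline
  # could then record it as an errored target instead of crashing.
  stop(last_error)
}

# Usage (hypothetical):
# retry_storage(function() aws.s3::save_object("key", bucket = "b"))
```

A persistent failure still surfaces as an error, but only for the one target whose storage call failed.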
Error Description
- I've twice seen this error take down the `targets` pipeline, losing all ongoing work. The latest time was 13hr into a run of ~950 targets, after ~620 (all with `format = 'aws_qs'`) had completed successfully.
- I believe this specific error traces back to `aws.s3::s3HTTP`, where a call to `aws.s3::get_bucketname` somehow returned `NA`, probably through a temporary failure in network connectivity.
```
Error in if (isTRUE(check_region) && (bucketname != "")) { :
  missing value where TRUE/FALSE needed
Error: callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Run `rlang::last_error()` to see where the error occurred.
```
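The mechanics of the failure are easy to reproduce in plain R: `NA != ""` evaluates to `NA`, and an `if()` condition that evaluates to `NA` is a fatal error:

```r
check_region <- TRUE
bucketname <- NA # what get_bucketname() effectively returned

# NA != "" is NA, and `if (NA)` throws rather than branching:
msg <- tryCatch(
  if (isTRUE(check_region) && (bucketname != "")) "would check region",
  error = function(e) conditionMessage(e)
)
msg
#> [1] "missing value where TRUE/FALSE needed"
```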
```r
rlang::last_error()
<error/tar_condition_run>
callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
 1. targets::tar_make_future(reporter = "summary", workers = 90L)
 2. targets:::callr_outer(...)
 3. base::tryCatch(...)
 4. base:::tryCatchList(expr, classes, parentenv, handlers)
 5. base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
 6. value[[3L]](cond)
 7. targets::tar_throw_run(...)
Run `rlang::last_trace()` to see the full context.
```
```r
rlang::last_trace()
<error/tar_condition_run>
callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
█
1. └─targets::tar_make_future(reporter = "summary", workers = 90L)
2. └─targets:::callr_outer(...)
3. └─base::tryCatch(...)
4. └─base:::tryCatchList(expr, classes, parentenv, handlers)
5. └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
6. └─value[[3L]](cond)
 7. └─targets::tar_throw_run(...)
```
Reprex
- Set these environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION`, `AWS_BUCKET`
- Run with: `targets::tar_make_future(workers = 2L)`
```r
# _targets.R
library(targets)
library(future)
library(future.batchtools)
library(aws.s3)

# Stub out get_bucketname() to return NA, simulating the failure mode
# (e.g. a temporary loss of network connectivity).
get_bucketname <- function(x, ...) NA
assignInNamespace("get_bucketname", get_bucketname, ns = "aws.s3")

future::plan(multisession)

tar_option_set(
  memory = "transient",
  storage = "worker",
  resources = tar_resources(
    aws = tar_resources_aws(bucket = Sys.getenv("AWS_BUCKET")),
    future = tar_resources_future(
      plan = future::plan(multisession),
      resources = list(
        memory = 256,
        ncpus = 1,
        ntasks = 1,
        walltime = 60L
      )
    )
  )
)

list(
  tar_target(
    "job_should_error",
    {
      success <- FALSE
      Sys.sleep(1)
      success <- TRUE
    },
    deployment = "main",
    format = "aws_qs", # stored via aws.s3, which now fails
    error = "continue",
    priority = 0,
    cue = tar_cue(mode = "always")
  ),
  tar_target(
    "job_should_complete",
    {
      success <- FALSE
      Sys.sleep(10)
      success <- TRUE
    },
    deployment = "worker",
    error = "continue",
    priority = 1,
    cue = tar_cue(mode = "always")
  )
)
```
```
• start target job_should_complete
• start target job_should_error
• end pipeline
Error in if (isTRUE(check_region) && (bucketname != "")) { :
  missing value where TRUE/FALSE needed
Error: callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Run `rlang::last_error()` to see where the error occurred.
```
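The output above shows why `error = "continue"` does not help here: that option guards the target's own command, not the framework code that stores the result afterwards. A toy model of that distinction in plain R (no `targets` required; `run_target` and its arguments are invented for illustration):

```r
# Toy model of a pipeline job: per-target error handling wraps only
# the user's command, not the storage step that follows it.
run_target <- function(command, save_result) {
  value <- tryCatch(
    command(),
    error = function(e) {
      # error = "continue": record the failure, keep the pipeline alive
      structure(list(msg = conditionMessage(e)), class = "target_error")
    }
  )
  if (inherits(value, "target_error")) {
    return(value)
  }
  save_result(value) # framework code: an error here escapes the handler
  value
}

# An error in the command is contained to this one target...
run_target(function() stop("bad data"), function(v) v)

# ...but an error while saving (e.g. aws.s3 failing) crashes the caller:
# run_target(function() 42, function(v) stop("network down"))
```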
SessionInfo
- `targets` at commit cb20f61
```
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /apps/R/4.1.0/lib64/R/lib/libRblas.so
LAPACK: /apps/R/4.1.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       [2] LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        [4] LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    [6] LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       [8] LC_NAME=C
 [9] LC_ADDRESS=C              [10] LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 [12] LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets
[6] methods   base

other attached packages:
[1] aws.s3_0.3.21            future.batchtools_0.10.0
[3] future_1.21.0            targets_0.6.0.9000

loaded via a namespace (and not attached):
 [1] pillar_1.6.1          compiler_4.1.0
 [3] base64enc_0.1-3       prettyunits_1.1.1
 [5] progress_1.2.2        tools_4.1.0
 [7] digest_0.6.27         jsonlite_1.7.2
 [9] debugme_1.1.0         lifecycle_1.0.0
[11] tibble_3.1.2          checkmate_2.0.0
[13] pkgconfig_2.0.3       rlang_0.4.11
[15] igraph_1.2.6          cli_3.0.0
[17] curl_4.3.2            yaml_2.2.1
[19] parallel_4.1.0        xfun_0.24
[21] xml2_1.3.2            httr_1.4.2
[23] withr_2.4.2           knitr_1.33
[25] rappdirs_0.3.3        hms_1.1.0
[27] vctrs_0.3.8           globals_0.14.0
[29] tidyselect_1.1.1      glue_1.4.2
[31] data.table_1.14.0     listenv_0.8.0
[33] R6_2.5.0              processx_3.5.2
[35] fansi_0.5.0           parallelly_1.26.1
[37] base64url_1.4         aws.ec2metadata_0.2.0
[39] callr_3.7.0           purrr_0.3.4
[41] magrittr_2.0.1        backports_1.2.1
[43] codetools_0.2-18      ps_1.6.0
[45] batchtools_0.9.16     ellipsis_0.3.2
[47] aws.signature_0.6.0   brew_1.0-6
[49] utf8_1.2.1            stringi_1.6.2
[51] crayon_1.4.1
```