Handle upstream errors from aws.s3 gracefully #571

@stuvet

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.
  • Format your code according to the tidyverse style guide.

Problem

  • (detailed error description & reprex at the bottom)
  • Rare errors left unhandled by aws.s3 (e.g. caused by temporary network connectivity issues) propagate back through targets & produce an unexpected fatal error taking down the whole pipeline.

Why is this a desirable feature for targets?

  • The fix is of course to handle each specific error in aws.s3 (Edit: looks like this may have originated from a malformed call to aws.s3 functions from targets - see next comment), but if it were being called straight from user code we could implement our own error handling.
  • Relying on upstream packages to fix their bugs to prevent fatal pipeline errors like this leaves targets vulnerable to new bugs in external packages.

Proposal

  • As per title - handle errors in dependencies gracefully so that individual jobs which are affected by upstream package errors give an error result for that individual target, rather than taking down the whole pipeline.
  • Specifically I'm discussing aws.s3 here as it seems the most frequently-touched external package after future & clustermq, but the same could be said for any other.
  • I realise this feature request could require a lot of work, so no big deal if it's not at the top of the list!
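
To make the proposal concrete, here is a minimal sketch in base R of the behaviour I have in mind. It is not how targets is implemented: `safe_upstream()` and `flaky_upload()` are hypothetical names I made up for illustration, and `flaky_upload()` only mimics the NA comparison from the error described in this issue. The idea is that an upstream failure is captured and recorded against one target while the other targets carry on.

```r
# Hypothetical helper (not part of targets or aws.s3): run an upstream call
# and turn any error into a per-target record instead of stopping everything.
safe_upstream <- function(expr_fun, target_name) {
  tryCatch(
    list(target = target_name, value = expr_fun(), error = NULL),
    error = function(e) {
      list(target = target_name, value = NULL, error = conditionMessage(e))
    }
  )
}

# Stand-in for a failing upload: comparing NA in `if ()` raises an error.
flaky_upload <- function() {
  bucketname <- NA
  if (bucketname != "") "uploaded"
}

results <- list(
  safe_upstream(flaky_upload, "job_should_error"),
  safe_upstream(function() "ok", "job_should_complete")
)

# Flag which targets failed; the second target is unaffected by the first.
failed <- vapply(results, function(r) !is.null(r$error), logical(1))
```

Here only the first entry of `failed` is flagged, and its error message is kept alongside the target name so it could be reported for that target alone.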

Error Description

  • I've twice seen this error, which takes down the targets pipeline and loses all ongoing work. The latest time was 13 hr into a run of ~950 targets, after ~620 (all with format = 'aws_qs') had completed successfully.
  • I believe this specific error traces back to aws.s3::s3.HTTP where a call to aws.s3::get_bucketname somehow resulted in an NA, probably through a temporary failure in network connectivity.
Error in if (isTRUE(check_region) && (bucketname != "")) { : 
  missing value where TRUE/FALSE needed
Error: callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Run `rlang::last_error()` to see where the error occurred.
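
For anyone unfamiliar with this message, the sketch below shows in base R why an NA bucket name produces it. The condition is copied from the error text above; the rest is my assumption, including the `bucketname <- NA` value and the guarded variant, which is a hypothetical illustration rather than a proposed patch to aws.s3.

```r
check_region <- TRUE
bucketname <- NA  # assumed value, as if the bucket name lookup had failed

# `NA != ""` is NA and `TRUE && NA` is NA, so `if ()` has nothing to branch on.
unguarded <- isTRUE(check_region) && (bucketname != "")

# Hypothetical guard: treat a missing bucket name as "no bucket" instead of failing.
guarded <- isTRUE(check_region) && !is.na(bucketname) && nzchar(bucketname)

# Capture the error message that `if (NA)` raises rather than stopping.
msg <- tryCatch(
  if (unguarded) "region checked",
  error = function(e) conditionMessage(e)
)
```

With these values `unguarded` is NA, `guarded` is FALSE, and `msg` holds the captured error text.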

rlang::last_error()
<error/tar_condition_run>
callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
 1. targets::tar_make_future(reporter = "summary", workers = 90L)
 2. targets:::callr_outer(...)
 3. base::tryCatch(...)
 4. base:::tryCatchList(expr, classes, parentenv, handlers)
 5. base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
 6. value[[3L]](cond)
 7. targets::tar_throw_run(...)
Run `rlang::last_trace()` to see the full context.

```r
rlang::last_trace()
<error/tar_condition_run>
callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
    █
 1. └─targets::tar_make_future(reporter = "summary", workers = 90L)
 2.   └─targets:::callr_outer(...)
 3.     └─base::tryCatch(...)
 4.       └─base:::tryCatchList(expr, classes, parentenv, handlers)
 5.         └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
 6.           └─value[[3L]](cond)
 7.             └─targets::tar_throw_run(...)
```

Reprex

  1. Set these environment variables:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
AWS_BUCKET
  2. Run with:
targets::tar_make_future(workers = 2L)
# _targets.R

library(targets)
library(future)
library(future.batchtools)

library(aws.s3)

get_bucketname <- function(x, ...) NA

assignInNamespace('get_bucketname', get_bucketname, ns = 'aws.s3')

future::plan(multisession)

tar_option_set(
  memory = 'transient',
  storage = 'worker',
  resources = tar_resources(
    aws = tar_resources_aws(bucket = Sys.getenv('AWS_BUCKET')),
    future = tar_resources_future(
      plan = future::plan(multisession),
      resources = list(
        memory = 256,
        ncpus = 1,
        ntasks = 1,
        walltime = 60L
      )
    )
  )
)

list(
  tar_target(
    'job_should_error',
    {
      success <- FALSE
      Sys.sleep(1)
      success <- TRUE
    },
    deployment = 'main',
    format = 'aws_qs',
    error = 'continue',
    priority = 0,
    cue = tar_cue(mode = 'always')
  ),
  tar_target(
    'job_should_complete',
    {
      success <- FALSE
      Sys.sleep(10)
      success <- TRUE
    },
    deployment = 'worker',
    error = 'continue',
    priority = 1,
    cue = tar_cue(mode = 'always')
  )
)

• start target job_should_complete
• start target job_should_error
• end pipeline
Error in if (isTRUE(check_region) && (bucketname != "")) { : 
  missing value where TRUE/FALSE needed
Error: callr subprocess failed: missing value where TRUE/FALSE needed
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Run `rlang::last_error()` to see where the error occurred.

SessionInfo

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /apps/R/4.1.0/lib64/R/lib/libRblas.so
LAPACK: /apps/R/4.1.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8      
 [2] LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                 
 [9] LC_ADDRESS=C              
[10] LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8
[12] LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

other attached packages:
[1] aws.s3_0.3.21            future.batchtools_0.10.0
[3] future_1.21.0            targets_0.6.0.9000     

loaded via a namespace (and not attached):
 [1] pillar_1.6.1          compiler_4.1.0       
 [3] base64enc_0.1-3       prettyunits_1.1.1    
 [5] progress_1.2.2        tools_4.1.0          
 [7] digest_0.6.27         jsonlite_1.7.2       
 [9] debugme_1.1.0         lifecycle_1.0.0      
[11] tibble_3.1.2          checkmate_2.0.0      
[13] pkgconfig_2.0.3       rlang_0.4.11         
[15] igraph_1.2.6          cli_3.0.0            
[17] curl_4.3.2            yaml_2.2.1           
[19] parallel_4.1.0        xfun_0.24            
[21] xml2_1.3.2            httr_1.4.2           
[23] withr_2.4.2           knitr_1.33           
[25] rappdirs_0.3.3        hms_1.1.0            
[27] vctrs_0.3.8           globals_0.14.0       
[29] tidyselect_1.1.1      glue_1.4.2           
[31] data.table_1.14.0     listenv_0.8.0        
[33] R6_2.5.0              processx_3.5.2       
[35] fansi_0.5.0           parallelly_1.26.1    
[37] base64url_1.4         aws.ec2metadata_0.2.0
[39] callr_3.7.0           purrr_0.3.4          
[41] magrittr_2.0.1        backports_1.2.1      
[43] codetools_0.2-18      ps_1.6.0             
[45] batchtools_0.9.16     ellipsis_0.3.2       
[47] aws.signature_0.6.0   brew_1.0-6           
[49] utf8_1.2.1            stringi_1.6.2        
[51] crayon_1.4.1
