Unevaluated promise inflates data sent to workers

## Prework

* [x] Read and agree to the [code of conduct](https://ropensci.org/code-of-conduct/) and [contributing guidelines](https://github.com/ropensci/targets/blob/main/CONTRIBUTING.md).
* [x] If there is [already a relevant issue](https://github.com/ropensci/targets/issues), whether open or closed, comment on the existing thread instead of posting a new issue.
* [x] Post a [minimal reproducible example](https://www.tidyverse.org/help/) like [this one](https://github.com/ropensci/targets/issues/256#issuecomment-754229683) so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    * [x] **Runnable**: post enough R code and data so any onlooker can create the error on their own computer.
    * [x] **Minimal**: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    * [x] **Readable**: format your code according to the [tidyverse style guide](https://style.tidyverse.org/).

## Description

In this example on an SGE cluster, targets deploy to the workers very slowly.

```{r}
# _targets.R
library(targets)
library(tarchetypes)
options(clustermq.scheduler = "sge")
options(clustermq.template = "cmq.tmpl")
options(crayon.enabled = FALSE)
tar_rep(x, 0, batches = 1000, reps = 1)
```

```sh
# cmq.tmpl
#$ -N {{ job_name }}
#$ -t 1-{{ n_jobs }}
#$ -j y
#$ -cwd
#$ -V
module load R/4.0.3
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
```

Profiling the pipeline took several minutes.

```r
px <- proffer::pprof(
  tar_make_clustermq(workers = 10, callr_function = NULL),
  host = "0.0.0.0"
)
```

I saw this flamegraph:

![Screen Shot 2021-01-21 at 12 39 01 PM](https://user-images.githubusercontent.com/22958003/105389561-f6221800-5be5-11eb-8e19-166ff0d5763d.png)

The flame graph shows exactly where the bottleneck is:

https://github.com/ropensci/targets/blob/66a0655527f28c3fa3eca101fe30d0742bced957/R/class_clustermq.R#L98-L101

And sure enough, when I changed just `retrieval` to `"worker"`, everything went much faster.

```r
# _targets.R
library(targets)
library(tarchetypes)
tar_option_set(retrieval = "worker") # worker retrieval
options(clustermq.scheduler = "sge")
options(clustermq.template = "cmq.tmpl")
options(crayon.enabled = FALSE)
tar_rep(x, 0, batches = 1000, reps = 1)
```

![Screen Shot 2021-01-21 at 12 44 37 PM](https://user-images.githubusercontent.com/22958003/105389950-7ea0b880-5be6-11eb-931e-65d071b4e42b.png)

## Solution

Apparently, all I need to do is `force()` the pre-loaded objects in the subpipeline. It looks like the target carries an unevaluated promise object, and that promise inflates the size of the data sent to the worker. Here I am debugging this example at `$run_worker()`:

```r
> tar_make_clustermq(callr_function = NULL)
● run target x_batch
● run branch x_be02823c
Called from: self$run_worker(target)
Browse[1]> object_size(target) # promise object
9.32 MB # way too big.
Browse[1]> tmp <- force(target$subpipeline$targets$x_batch_084b9b29$value$object)
Browse[1]> object_size(target) # evaluated object
193 kB # much better
Browse[1]> 
```
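As a rough illustration of the mechanism (not the actual `targets` internals), the sketch below shows how an unevaluated promise can inflate a serialized object in base R. The names `make_holder`, `big`, and `holder` are hypothetical. The assumption is that a pending promise keeps a reference to its evaluation environment, so serializing the holder also serializes everything in that environment, and that forcing the promise drops the reference.

```r
# Hypothetical stand-in for a target whose value is stored lazily.
make_holder <- function() {
  big <- rnorm(1e6) # roughly 8 MB living in the function's environment
  holder <- new.env(parent = emptyenv())
  # Create a promise in `holder` that would be evaluated in this function's
  # environment. Until it is forced, it keeps that environment (and `big`) alive.
  delayedAssign(
    "object",
    length(big),
    eval.env = environment(),
    assign.env = holder
  )
  holder
}

holder <- make_holder()

# Serialized size while the promise is still pending.
size_before <- length(serialize(holder, NULL))

# Accessing the binding forces the promise; afterwards only the small
# value (and the promise's expression) remain attached to `holder`.
invisible(force(holder$object))
size_after <- length(serialize(holder, NULL))

c(before = size_before, after = size_after)
```

If the assumption holds, `size_before` should be on the order of megabytes and `size_after` only a few hundred bytes, which mirrors the drop seen in the debugging session above.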

For reference, the lines behind the permalink in the Description:

	self$crew$send_call(
	  expr = target_run_worker(target),
	  env = list(target = target)
	)