Hi, apparently using

```r
library(dplyr)
library(targets)
library(tidyr)

list(
  tar_target(
    nested_df,
    tibble(
      id = 1:3,
      nest = rep(list(letters[1:3]), 3)
    )
  ),
  tar_target(
    nested_row,
    # do something expensive per nested row
    nested_df %>% mutate(res = paste0("res", id)),
    pattern = map(nested_df)
  ),
  tar_target(
    unnested_row,
    # expensive result gets replicated
    unnest(nested_row, nest),
    pattern = map(nested_row)
  ),
  tar_target(
    grouped_row,
    # introduce groups
    unnested_row %>%
      group_by(id, nest) %>%
      tar_group(),
    pattern = map(unnested_row),
    iteration = "group"
  ),
  tar_target(
    over_rows,
    # try to do further expensive things per separate row
    # (not actually separate -> risk of e.g. memory overflow)
    rep(
      numeric(10^9),
      length(grouped_row$tar_group)
    ),
    pattern = map(grouped_row),
    iteration = "list"
  )
)
```

I don't necessarily expect this to work the way I want it to. Another sensible outcome (though undesirable for me) would be to get multiple rows all with the same

I had a similar situation in #1113, where Will suggested that cheap operations could be performed before the actual pipeline and arranged as desired into a dataframe. That worked at the time, but no longer applies here as some of the steps I want to unnest over are rather complicated.

I've also looked over #348, #554, #971. It seems like his recommendation for these situations is to use static branching, but I'd like to know if there is some other alternative as this results in a network viz with redundant clutter.

I've considered saving the actual big object as a side effect and passing along the filename so I can perform the splits I want on a pattern-free target but it feels a bit hacky (and doing it directly is not an option since it would involve loading all the large objects in memory at once).
Replies: 1 comment 3 replies
`iteration = "group"` is meant for non-dynamic targets. It controls how a single data frame target is split into row groups for downstream dynamic targets. Because `grouped_row` is already dynamic with `pattern = map(unnested_row)`, the target is already split into branches, and `iteration = "group"` no longer makes sense. So I recommend removing `pattern = map(unnested_row)` from `grouped_row` and keeping `iteration = "group"`. As long as `iteration` is `"vector"` (default) for `unnested_row`, the branches of `unnested_row` will be automatically aggregated into a data frame. This should be fine in your example because things don't get demanding on memory or computation until the `over_rows` target.
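
For reference, here is a minimal sketch of how the pipeline might look with that change applied. It is an illustration, not a tested drop-in replacement: the body of `over_rows` is swapped for a cheap placeholder (an assumption, so the sketch does not allocate gigabytes), and the `pipeline` variable name is only there so the list can be inspected. In a real `_targets.R` the list just needs to be the last expression.

```r
library(dplyr)
library(targets)
library(tidyr)

pipeline <- list(
  tar_target(
    nested_df,
    tibble(
      id = 1:3,
      nest = rep(list(letters[1:3]), 3)
    )
  ),
  tar_target(
    nested_row,
    # expensive per-row step, one branch per row of nested_df
    nested_df %>% mutate(res = paste0("res", id)),
    pattern = map(nested_df)
  ),
  tar_target(
    unnested_row,
    # default iteration = "vector": branches aggregate into one data frame
    unnest(nested_row, nest),
    pattern = map(nested_row)
  ),
  tar_target(
    grouped_row,
    # no pattern here: a single non-dynamic target that defines the groups
    unnested_row %>%
      group_by(id, nest) %>%
      tar_group(),
    iteration = "group"
  ),
  tar_target(
    over_rows,
    # placeholder for the expensive step (assumption, not the original code);
    # each branch only sees the rows of one group
    nrow(grouped_row),
    pattern = map(grouped_row),
    iteration = "list"
  )
)

# The pipeline definition must be the last expression in _targets.R
pipeline
```

With this layout, `over_rows` should get one branch per `(id, nest)` group, so each branch only loads its own slice of `grouped_row`.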