Description
When `expand` is called on a `win_tbl`, values are repeated for all time steps that fall into the interval starting at `index_var` and lasting until `index_var + dur_var`. This works when `index_var + dur_var` is non-negative, but whenever it is negative, the end of the interval is simply set to `0` by the following code.
Lines 140 to 148 in 7f2cc42
if (is_win_tbl(x) && !end_var %in% colnames(x)) {
  on.exit(rm_cols(x, end_var, by_ref = TRUE))
  dura_var <- dur_var(x)
  x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
  x <- x[get(end_var) < 0, c(end_var) := as.difftime(0, units = time_unit)]
}
The effect of this can be seen with the following example from `gcs`. Here, `ett_gcs` is processed with `expand`, but this leads to false results for some patients.
sed = load_concepts("ett_gcs", src = "mimic_demo")
#> ── Loading 1 concept ───────────────────────────────────────────────────────────
#> • ett_gcs
#> ────────────────────────────────────────────────────────────────────────────────
sed[icustay_id == 234989]
#> # A `win_tbl`: 11 ✖ 4
#> # Id var: `icustay_id`
#> # Index var: `charttime` (1 hours)
#> # Duration var: `dur_var`
#> icustay_id charttime dur_var ett_gcs
#> <int> <drtn> <drtn> <lgl>
#> 1 234989 -2 hours 1 mins TRUE <---- negative `index_var + dur_var`
#> 2 234989 7 hours 1 mins TRUE
#> 3 234989 14 hours 1 mins TRUE
#> 4 234989 18 hours 1 mins TRUE
#> 5 234989 24 hours 1 mins TRUE
#> 6 234989 35 hours 1 mins TRUE
#> 7 234989 39 hours 1 mins TRUE
#> 8 234989 43 hours 1 mins TRUE
#> 9 234989 47 hours 1 mins TRUE
#> 10 234989 52 hours 1 mins TRUE
#> 11 234989 55 hours 1 mins TRUE
sed = expand(sed, aggregate = "any")
sed[icustay_id == 234989]
#> # A `ts_tbl`: 13 ✖ 3
#> # Id var: `icustay_id`
#> # Index var: `charttime` (1 hours)
#> icustay_id charttime ett_gcs
#> <int> <drtn> <lgl>
#> 1 234989 -2 hours TRUE
#> 2 234989 -1 hours TRUE <--- artificially added
#> 3 234989 0 hours TRUE <--- artificially added
#> 4 234989 7 hours TRUE
#> 5 234989 14 hours TRUE
#> 6 234989 18 hours TRUE
#> 7 234989 24 hours TRUE
#> 8 234989 35 hours TRUE
#> 9 234989 39 hours TRUE
#> 10 234989 43 hours TRUE
#> 11 234989 47 hours TRUE
#> 12 234989 52 hours TRUE
#> 13 234989 55 hours TRUE
Created on 2024-04-12 with reprex v2.1.0
It is not entirely clear to me why `end_var` would need to be set to zero in the code below.
Lines 146 to 147 in 7f2cc42
x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
x <- x[get(end_var) < 0, c(end_var) := as.difftime(0, units = time_unit)]
Maybe the intent was to prevent negative `dur_var`s? In that case, the following code would be needed instead.
x <- x[get(dura_var) < 0, c(dura_var) := as.difftime(0, units = time_unit)]
x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
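
To isolate the arithmetic from the package internals, here is a minimal base R sketch of the two variants. It is not the actual implementation: `floor_to_hours` and `expand_steps` are hypothetical helpers, and it assumes that `re_time` floors to the 1-hour grid, which is consistent with the output above but which I have not verified in the source.

```r
# Hypothetical stand-in for re_time(): floor a time given in minutes
# to whole hours (assumption: re_time floors to the interval grid).
floor_to_hours <- function(minutes) floor(minutes / 60)

# Hypothetical stand-in for the expansion step: one time step per hour
# from start to end (inclusive) for every row.
expand_steps <- function(start, end) unlist(Map(seq, start, end))

start_hr <- c(-2, 7)  # index_var in hours (first two rows of the example)
dur_min  <- c(1, 1)   # dur_var in minutes

# Current behaviour: compute the end, then clamp a negative END to 0.
end_hr      <- floor_to_hours(start_hr * 60 + dur_min)
end_clamped <- ifelse(end_hr < 0, 0, end_hr)
steps_current <- expand_steps(start_hr, end_clamped)

# Proposed behaviour: clamp a negative DURATION to 0, then compute the end.
dur_clamped    <- pmax(dur_min, 0)
end_proposed   <- floor_to_hours(start_hr * 60 + dur_clamped)
steps_proposed <- expand_steps(start_hr, end_proposed)

print(steps_current)   # contains the extra steps -1 and 0
print(steps_proposed)  # only the observed time steps
```

Under these assumptions, the current variant stretches the row at -2 hours up to hour 0, while the proposed variant leaves it as a single time step.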