expand on win_tbl arbitrarily sets negative end times to zero #63

@prockenschaub

When expand is called on a win_tbl, values are repeated for all time steps that fall into the interval starting at index_var and lasting until index_var + dur_var. This works when index_var + dur_var is non-negative, but whenever it is negative, the end time is simply set to 0 by the following code.

ricu/R/utils-ts.R

Lines 140 to 148 in 7f2cc42

if (is_win_tbl(x) && !end_var %in% colnames(x)) {
  on.exit(rm_cols(x, end_var, by_ref = TRUE))
  dura_var <- dur_var(x)
  x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
  x <- x[get(end_var) < 0, c(end_var) := as.difftime(0, units = time_unit)]
}
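
To make the effect concrete, here is a minimal base-R sketch of the arithmetic for a single window, using plain numbers of hours instead of difftime columns. `floor_to_grid()` is a hypothetical stand-in for `re_time()` (assumed here to round down to the 1-hour interval), and the `seq()` calls only approximate how expand enumerates time steps; none of this is ricu's actual implementation.

```r
# Toy window in hours: starts at -2 h and lasts 1 minute.
start_h <- -2
dur_h   <- 1 / 60

# Hypothetical stand-in for re_time(): round down to the 1-hour grid (assumption).
floor_to_grid <- function(x, step = 1) floor(x / step) * step

# End of the window on the hourly grid.
end_h <- floor_to_grid(start_h + dur_h)

# Mimics the quoted code: a negative end time is replaced by 0.
clamped_end_h <- if (end_h < 0) 0 else end_h

# Time steps covered by the window without and with the clamp.
steps_unclamped <- seq(start_h, end_h, by = 1)
steps_clamped   <- seq(start_h, clamped_end_h, by = 1)

print(steps_unclamped)
print(steps_clamped)
```

Under these assumptions, the unclamped window covers only the -2 hour step, while the clamped one is stretched to also cover -1 and 0 hours.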

The effect of this can be seen in the following example from gcs. Here, ett_gcs is processed with expand, but this leads to false results for some patients.

sed = load_concepts("ett_gcs", src = "mimic_demo")
#> ── Loading 1 concept ───────────────────────────────────────────────────────────
#> • ett_gcs
#> ────────────────────────────────────────────────────────────────────────────────

sed[icustay_id == 234989]
#> # A `win_tbl`:  11 ✖ 4
#> # Id var:       `icustay_id`
#> # Index var:    `charttime` (1 hours)
#> # Duration var: `dur_var`
#>    icustay_id charttime dur_var ett_gcs
#>         <int> <drtn>    <drtn>  <lgl>
#> 1      234989 -2 hours  1 mins  TRUE       <---- negative `index_var + dur_var`
#> 2      234989  7 hours  1 mins  TRUE
#> 3      234989 14 hours  1 mins  TRUE
#> 4      234989 18 hours  1 mins  TRUE
#> 5      234989 24 hours  1 mins  TRUE
#> 6      234989 35 hours  1 mins  TRUE
#> 7      234989 39 hours  1 mins  TRUE
#> 8      234989 43 hours  1 mins  TRUE
#> 9      234989 47 hours  1 mins  TRUE
#> 10     234989 52 hours  1 mins  TRUE
#> 11     234989 55 hours  1 mins  TRUE

sed = expand(sed, aggregate = "any")
sed[icustay_id == 234989]
#> # A `ts_tbl`: 13 ✖ 3
#> # Id var:     `icustay_id`
#> # Index var:  `charttime` (1 hours)
#>    icustay_id charttime ett_gcs
#>         <int> <drtn>    <lgl>
#> 1      234989 -2 hours  TRUE
#> 2      234989 -1 hours  TRUE    <--- artificially added
#> 3      234989  0 hours  TRUE    <--- artificially added
#> 4      234989  7 hours  TRUE
#> 5      234989 14 hours  TRUE
#> 6      234989 18 hours  TRUE
#> 7      234989 24 hours  TRUE
#> 8      234989 35 hours  TRUE
#> 9      234989 39 hours  TRUE
#> 10     234989 43 hours  TRUE
#> 11     234989 47 hours  TRUE
#> 12     234989 52 hours  TRUE
#> 13     234989 55 hours  TRUE

Created on 2024-04-12 with reprex v2.1.0

It is not entirely clear to me why end_var would need to be set to zero in the code below.

ricu/R/utils-ts.R

Lines 146 to 147 in 7f2cc42

x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
x <- x[get(end_var) < 0, c(end_var) := as.difftime(0, units = time_unit)]

Maybe the intent was to prevent negative dur_vars? In that case, the following code would be needed instead.

x <- x[get(dura_var) < 0, c(dura_var) := as.difftime(0, units = time_unit)]
x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
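
As a rough illustration of that alternative, the base-R sketch below clamps negative durations before deriving the end time. The values are made up (the third row with a negative duration is hypothetical and does not come from the data above), and `floor_to_grid()` is again an assumed stand-in for `re_time()`.

```r
# Hypothetical stand-in for re_time(): round down to the 1-hour grid (assumption).
floor_to_grid <- function(x, step = 1) floor(x / step) * step

# Made-up start times and durations in hours; the last duration is negative.
start_h <- c(-2, 7, 5)
dur_h   <- c(1 / 60, 1 / 60, -3)

# Clamp negative durations to zero first, then compute the end time.
dur_fixed <- pmax(dur_h, 0)
end_h     <- floor_to_grid(start_h + dur_fixed)

print(end_h)
```

With this ordering, the window starting at -2 hours keeps its negative end time and is not stretched to 0, while the row with the negative duration collapses to a zero-length window at its start.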

Labels: bug