Missing checks on number of rows in ResamplingCV / do we handle grouping & stratification properly? #1293

@sebffischer

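ResamplingCV can be instantiated on a task that has fewer observations than the requested number of folds.
Instantiation succeeds silently, and the problem only surfaces later as a misleading error message when the resampling is used with resample().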

library(mlr3)

task = tsk("iris")$filter(c(1:5, 51:54))
task$nrow
#> [1] 9
learner = lrn("classif.debug")
learner$encapsulate("try", lrn("classif.featureless"))
res = rsmp("cv", folds = 10)

# no error?
res$instantiate(task)
res$instance
#> Key: <fold>
#>    row_id  fold
#>     <int> <int>
#> 1:      2     1
#> 2:      4     2
#> 3:     52     3
#> 4:      1     4
#> 5:      5     5
#> 6:     51     6
#> 7:     53     7
#> 8:      3     8
#> 9:     54     9

rr = resample(task, learner, res)
#> LOG OUTPUT
#> ...
#> Error in self$data(rows, cols = self$target_names): DataBackend did not return the queried rows correctly: 1 requested, 0 received.
#>         The resampling was probably instantiated on a different task.

Created on 2025-04-23 with reprex v2.1.1
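
To make the failure mode concrete, here is a minimal base-R sketch that does not use mlr3 at all. Both `assign_folds()` and `assert_enough_rows()` are hypothetical helpers written for illustration. The round-robin fold assignment is an assumption chosen to reproduce the layout of `res$instance` above (9 rows, folds 1 to 9); it is not taken from the ResamplingCV source. It shows that fold 10 ends up empty, and what an early check during instantiation could look like.

```r
# Hypothetical helper, not part of mlr3: assign row ids to folds round-robin,
# then shuffle the fold labels. This is an assumption that reproduces the
# layout of `res$instance` above, not the actual ResamplingCV code.
assign_folds = function(row_ids, folds) {
  fold = (seq_along(row_ids) - 1L) %% folds + 1L
  fold = fold[sample.int(length(fold))]
  data.frame(row_id = row_ids, fold = fold)
}

# Hypothetical guard: fail early, with a clear message, when the task has
# fewer observations than requested folds.
assert_enough_rows = function(n_rows, folds) {
  if (n_rows < folds) {
    stop(sprintf(
      "Cannot create %d folds from only %d observations.", folds, n_rows
    ))
  }
  invisible(TRUE)
}

row_ids = c(1:5, 51:54)  # the 9 rows kept by task$filter() above
instance = assign_folds(row_ids, folds = 10L)

# Folds that received no observation at all
empty_folds = setdiff(seq_len(10L), instance$fold)
print(empty_folds)

# The guard turns the silent problem into an informative error
msg = tryCatch(
  assert_enough_rows(length(row_ids), 10L),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Under these assumptions, the empty fold is the one whose test set has no rows, which is the kind of situation a check in `$instantiate()` could reject up front instead of leaving it to resample().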
