PipeOp to try repair predicting with unseen factor levels

problem: quite often, a learner breaks, because it sees SOME prediction in a larger table, which contains new, unseen factor levels. in such a case the predict of the underlying learner fails, completely.

see reprex here:
https://github.com/mlr-org/mlr3/issues/97

this is really annoying. especially as this can happen on only a few observations, but we still 100% fail the complete prediction. 

current options are: the mlr3 fallback learner. that does not really help. because this produces now fallback predictions on the complete test set. 

here is MAYBE a better option.

PipOpUnseenLevels

before we go into the learner, we can on-training, store which levels are present in each factor. 

PipOpUnseenLevels
train: task--stored-levels--->task
predict: task-->stored-levels-->task

train: simply stores a list, one element per factor feature, with the seen level
predict: does through all observations. for each observation where we see "unseen" levels, we create a random row, by sampling from the marginals of the columns.
that is a bit hacky, but should work? 






Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

PipeOp to try repair predicting with unseen factor levels #71

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

PipeOp to try repair predicting with unseen factor levels #71

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions