PipeOp to try repair predicting with unseen factor levels #71

@berndbischl

problem: quite often a learner breaks because a larger prediction table contains SOME rows with new factor levels that were not seen during training. in that case the predict step of the underlying learner fails completely.

see reprex here:
mlr-org/mlr3#97

this is really annoying, especially since it can be caused by only a few observations, yet the prediction still fails for the complete table.

the current option is the mlr3 fallback learner. that does not really help, because it then produces fallback predictions for the complete test set, not just for the affected rows.

here is MAYBE a better option.

PipeOpUnseenLevels

before we go into the learner, we can store, during training, which levels are present in each factor.

PipeOpUnseenLevels
train: task --> store levels --> task
predict: task --> check against stored levels --> task

train: simply stores a list with one element per factor feature, containing the levels seen in training.
predict: goes through all observations. for each observation that contains an "unseen" level, we create a random row by sampling from the marginals of the columns.
that is a bit hacky, but should work?
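
a minimal sketch of the train/predict idea above, written in plain Python so it can be run without mlr3. everything here is an assumption for illustration: the class name `UnseenLevelsRepair`, the row-as-dict representation, and the choice to keep the training feature values around as the empirical marginals (the proposal only says that train stores the seen levels). it is not an mlr3pipelines API.

```python
import random


class UnseenLevelsRepair:
    """Hypothetical stand-in for the proposed PipeOp (not an mlr3pipelines class)."""

    def __init__(self, factor_cols, seed=0):
        self.factor_cols = list(factor_cols)
        self.rng = random.Random(seed)
        self.levels = {}     # factor column -> set of levels seen in training
        self.marginals = {}  # every column -> training values (empirical marginal)

    def train(self, rows):
        # rows: non-empty list of dicts holding feature columns only (assumption)
        cols = list(rows[0].keys())
        self.marginals = {c: [r[c] for r in rows] for c in cols}
        self.levels = {c: set(self.marginals[c]) for c in self.factor_cols}
        return rows  # the task itself passes through unchanged

    def predict(self, rows):
        repaired = []
        for r in rows:
            has_unseen = any(r[c] not in self.levels[c] for c in self.factor_cols)
            if has_unseen:
                # replace the whole row: draw each column independently
                # from the values observed for that column in training
                r = {c: self.rng.choice(v) for c, v in self.marginals.items()}
            repaired.append(dict(r))
        return repaired


train_rows = [{"color": "red", "size": 1.0}, {"color": "blue", "size": 2.0}]
op = UnseenLevelsRepair(factor_cols=["color"], seed=1)
op.train(train_rows)

# the second row carries the unseen level "green" and gets replaced
pred = op.predict([{"color": "red", "size": 5.0}, {"color": "green", "size": 9.0}])
```

rows without unseen levels pass through untouched, so only the affected observations get a "random" prediction. sampling each column independently ignores correlations between features, which is the hacky part mentioned above.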

Labels

Priority: Medium
Status: Contrib (unprepared): In someone's opinion, this is an issue that could be handled by a contributor with the right support
Status: Needs Design: Needs some thought and design decisions.
Type: New PipeOp: Issue suggests a new PipeOp
