
Conversation

@sebffischer (Member) commented Apr 16, 2025

Resolves #374

@tdhock With this PR, mlr3torch will now expect an output of shape (batch_size, 1) for binary classification problems.
t_loss("cross_entropy") will automatically select the appropriate loss depending on the number of classes.

The example below shows it in action: we train the same learner first on a binary classification problem (iris subset to two classes) and then on a multi-class classification problem (the full iris task).

Depending on the task, the correct output dimension is set, the targets are encoded correctly during training, and the appropriate loss function is instantiated. All mlr3torch learners now respect this.

library(mlr3torch)

# callback that stores the instantiated loss function in the trained model,
# so we can inspect afterwards which loss was selected
cb = torch_callback("loss_fn",
  state_dict = function() {
    self$ctx$loss_fn
  },
  load_state_dict = function(state_dict) {
    self$loss_fn = state_dict
  }
)

learner = lrn("classif.mlp", neurons = 100, batch_size = 32, epochs = 10,
  callbacks = cb)

tsk_binary = tsk("iris")$filter(1:100)$droplevels()
tsk_multi = tsk("iris")

learner$train(tsk_binary)
learner$network(torch_randn(1, 4))
#> torch_tensor
#> -0.3322
#> [ CPUFloatType{1,1} ][ grad_fn = <AddmmBackward0> ]
class(learner$model$callbacks$loss_fn)
#> [1] "nn_bce_with_logits_loss" "nn_loss"                
#> [3] "nn_module"

learner$train(tsk_multi)
learner$network(torch_randn(1, 4))
#> torch_tensor
#>  1.0422  0.3267 -0.5658
#> [ CPUFloatType{1,3} ][ grad_fn = <AddmmBackward0> ]
class(learner$model$callbacks$loss_fn)
#> [1] "nn_cross_entropy_loss" "nn_weighted_loss"      "nn_loss"              
#> [4] "nn_module"

Created on 2025-04-16 with reprex v2.1.1
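For reference, a minimal standalone sketch in plain torch (no mlr3torch; shapes and values are illustrative) of the pairing that the output above reflects: a (batch_size, 1) head goes with nn_bce_with_logits_loss, a (batch_size, n_classes) head with nn_cross_entropy_loss.

library(torch)

# binary head: (batch_size, 1) logits, 0/1 float targets of the same shape
bce = nn_bce_with_logits_loss()
bce(torch_randn(4, 1), torch_tensor(matrix(c(1, 0, 1, 0), ncol = 1)))

# multi-class head: (batch_size, n_classes) logits, 1-based class-index targets
ce = nn_cross_entropy_loss()
ce(torch_randn(4, 3), torch_tensor(c(1L, 2L, 3L, 1L)))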

Note that after this it is no longer possible to have a neural network with output shape (batch_size, 2) for binary classification. I think this is okay, as (batch_size, 1) is clearly the better convention.

glrn = po("torch_ingress_num") %>>%
  nn("linear", out_features = 2) %>>%
  po("torch_loss", loss = "cross_entropy") %>>%
  po("torch_optimizer", "adamw") %>>%
  po("torch_model_classif", epochs = 1, batch_size = 32) |> as_learner()

glrn$train(tsk("sonar"))
#> Error in (function (self, target, weight, pos_weight, reduction) : output with shape [32, 1] doesn't match the broadcast shape [32, 2]
#> Exception raised from mark_resize_outputs at /Users/runner/work/libtorch-mac-m1/libtorch-mac-m1/pytorch/aten/src/ATen/TensorIterator.cpp:1208 (most recent call first):
#> frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 52 (0x10422811c in libc10.dylib)
#> frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 140 (0x104224d6c in libc10.dylib)
#> ... 
#> This happened PipeOp torch_model_classif's $train()

Created on 2025-04-16 with reprex v2.1.1
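For completeness, a hedged sketch of the working counterpart: with out_features = 1 the same graph should train under the new convention (not run here, but it matches the (batch_size, 1) expectation above).

glrn = po("torch_ingress_num") %>>%
  nn("linear", out_features = 1) %>>%
  po("torch_loss", loss = "cross_entropy") %>>%
  po("torch_optimizer", "adamw") %>>%
  po("torch_model_classif", epochs = 1, batch_size = 32) |> as_learner()

glrn$train(tsk("sonar"))  # nn_bce_with_logits_loss is selected for the binary task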

@sebffischer
Copy link
Member Author

@tdhock let me know if this is what you had in mind :)

@tdhock (Contributor) commented Apr 16, 2025

The code seems to work as I would expect, thanks!
I tried this code:

remotes::install_github("mlr-org/mlr3torch@c03d61a18e9785e2dbb5b20e2b6dada74a9b58b8")
stask <- mlr3::tsk("sonar")
po_list <- list(
  mlr3torch::PipeOpTorchIngressNumeric$new(),
  mlr3torch::nn("head"),
  mlr3pipelines::po(
    "torch_loss",
    loss = torch::nn_bce_with_logits_loss),
  mlr3pipelines::po("torch_optimizer"),
  mlr3pipelines::po(
    "torch_model_classif",
    epochs = 1,
    batch_size = 3,
    predict_type="prob"))
graph <- Reduce(mlr3pipelines::concat_graphs, po_list)
glrn <- mlr3::as_learner(graph)
glrn$train(stask)
glrn$predict(stask)

And I got this result:

> glrn$predict(stask)
<PredictionClassif> for 208 observations:
 row_ids truth response    prob.M    prob.R
       1     R        M 0.5324456 0.4675544
       2     R        M 0.5387607 0.4612393
       3     R        M 0.5339928 0.4660072
     ---   ---      ---       ---       ---
     206     M        M 0.5342165 0.4657835
     207     M        M 0.5279804 0.4720196
     208     M        M 0.5211176 0.4788824

@tdhock (Contributor) commented Apr 16, 2025

overall looks good!
I suggested some comments for the docs.

@sebffischer (Member Author)

@tdhock thanks for the review!

@sebffischer sebffischer merged commit 307ab40 into main Apr 16, 2025
5 of 6 checks passed
@sebffischer sebffischer deleted the binary-head branch April 16, 2025 15:24
Comment on lines +41 to +43
#' * binary classification: The `factor` target variable of a [`TaskClassif`][mlr3::TaskClassif] is encoded as a
#' [`torch_float`][torch::torch_float] with shape `(batch_size, 1)` where the positive class is `1` and the negative
#' class is `0`.
@tdhock (Contributor):

ok, but what is the positive and negative class at the R-level?
In R it is a factor with two levels. Is the first factor level always considered the positive class?

@sebffischer (Member Author) replied Apr 17, 2025:

mlr3 binary classification tasks have a field $positive. Will add that to the docs!

@sebffischer (Member Author):

And yes, it is also ensured that the positive class is the first level.

@tdhock (Contributor):

thanks
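To make this concrete, a small illustrative snippet (the concrete labels depend on the task):

library(mlr3)
task = tsk("sonar")      # binary TaskClassif
task$positive            # the class that is encoded as 1
task$negative            # the class that is encoded as 0
levels(task$truth())[1]  # per the comment above, equal to task$positive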

#'
#' Furthermore, the target encoding is expected to be as follows:
#' * regression: The `numeric` target variable of a [`TaskRegr`][mlr3::TaskRegr] is encoded as a
#' [`torch_float`][torch::torch_float] with shape `c(batch_size, 1)`.
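A minimal sketch of that encoding in plain torch (the numeric vector is made up for illustration):

library(torch)
y = c(1.5, 2.0, 3.2)                   # numeric target of a TaskRegr
target = torch_tensor(y)$unsqueeze(2)  # float tensor of shape (batch_size, 1)
target$shape
#> [1] 3 1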
@tdhock (Contributor):

great!
Maybe this is more of a question/issue for mlr3 rather than mlr3torch, but is it possible to do multi-task regression? (more than one output to predict?)

@sebffischer (Member Author):

Not yet, but we could implement a Task where the target is a lazy_tensor, which would allow this.

@tdhock (Contributor):

setting the target to a lazy_tensor would be specific to torch, right? other non-torch learners can't use lazy_tensors, right?
it would be better if there were support for any learner (for example glmnet or rpart), not just torch.

@sebffischer (Member Author):

yeah that’s true; if you want this, it should be a feature request in mlr3.

Merging this pull request may close the linked issue: predict_type="prob" does not work with out_features=1