diff --git a/R/LearnerTorch.R b/R/LearnerTorch.R
index fe6526012..4902b25b6 100644
--- a/R/LearnerTorch.R
+++ b/R/LearnerTorch.R
@@ -44,6 +44,17 @@
 #' * multi-class classification: The `factor` target variable of a [`TaskClassif`][mlr3::TaskClassif] is a label-encoded
 #'   [`torch_long`][torch::torch_long] with shape `(batch_size)` where the label-encoding goes from `1` to `n_classes`.
 #'
+#' @section Important Runtime Considerations:
+#' There are a few hyperparameter settings that can have a considerable impact on the runtime of the learner.
+#' These include:
+#'
+#' * `device`: Use a GPU if possible.
+#' * `num_threads`: Set this to the number of CPU cores available if training on CPU.
+#' * `tensor_dataset`: Set this to `TRUE` (or `"device"` if on a GPU) if the dataset fits into memory.
+#' * `batch_size`: Especially for very small models, choose a larger batch size.
+#'
+#' Also, see the *Early Stopping and Internal Tuning* section for how to terminate training early.
+#'
 #' @template param_id
 #' @template param_task_type
 #' @template param_param_vals
diff --git a/man-roxygen/paramset_torchlearner.R b/man-roxygen/paramset_torchlearner.R
index 8753f3843..46e62e5cb 100644
--- a/man-roxygen/paramset_torchlearner.R
+++ b/man-roxygen/paramset_torchlearner.R
@@ -62,7 +62,6 @@
 #' **Dataloader**:
 #' * `batch_size` :: `integer(1)`\cr
 #'   The batch size (required).
-#'   When working with small models or datasets, choosing a larger batch size can considerably speed up training.
 #' * `shuffle` :: `logical(1)`\cr
 #'   Whether to shuffle the instances in the dataset. This is initialized to `TRUE`,
 #'   which differs from the default (`FALSE`).
diff --git a/man/mlr_learners_torch.Rd b/man/mlr_learners_torch.Rd
index 6f2988498..ae9d042b9 100644
--- a/man/mlr_learners_torch.Rd
+++ b/man/mlr_learners_torch.Rd
@@ -60,6 +60,20 @@ is also ensured to be the first factor level) is \code{1} and the negative class
 }
 }
 
+\section{Important Runtime Considerations}{
+
+There are a few hyperparameter settings that can have a considerable impact on the runtime of the learner.
+These include:
+\itemize{
+\item \code{device}: Use a GPU if possible.
+\item \code{num_threads}: Set this to the number of CPU cores available if training on CPU.
+\item \code{tensor_dataset}: Set this to \code{TRUE} (or \code{"device"} if on a GPU) if the dataset fits into memory.
+\item \code{batch_size}: Especially for very small models, choose a larger batch size.
+}
+
+Also, see the \emph{Early Stopping and Internal Tuning} section for how to terminate training early.
+}
+
 \section{Model}{
 
 The Model is a list of class \code{"learner_torch_model"} with the following elements:
@@ -145,7 +159,6 @@ Is initialized to 0.
 \itemize{
 \item \code{batch_size} :: \code{integer(1)}\cr
 The batch size (required).
-When working with small models or datasets, choosing a larger batch size can considerably speed up training.
 \item \code{shuffle} :: \code{logical(1)}\cr
 Whether to shuffle the instances in the dataset. This is initialized to \code{TRUE},
 which differs from the default (\code{FALSE}).
diff --git a/man/mlr_pipeops_torch_model.Rd b/man/mlr_pipeops_torch_model.Rd
index 3998458bf..8d3c6e47f 100644
--- a/man/mlr_pipeops_torch_model.Rd
+++ b/man/mlr_pipeops_torch_model.Rd
@@ -79,7 +79,6 @@ Is initialized to 0.
 \itemize{
 \item \code{batch_size} :: \code{integer(1)}\cr
 The batch size (required).
-When working with small models or datasets, choosing a larger batch size can considerably speed up training.
 \item \code{shuffle} :: \code{logical(1)}\cr
 Whether to shuffle the instances in the dataset. This is initialized to \code{TRUE},
 which differs from the default (\code{FALSE}).
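As a usage sketch (not part of the patch itself), the runtime-related hyperparameters documented above could be set together when constructing a torch learner. The learner id `"classif.mlp"` and all concrete values below are illustrative assumptions, not taken from the diff:

```r
# Hedged sketch: applying the documented runtime hints to a torch learner.
# The learner id "classif.mlp" and every value here are assumptions.
library(mlr3)
library(mlr3torch)

learner = lrn("classif.mlp",
  epochs         = 10,     # training budget (required)
  batch_size     = 256,    # larger batches can speed up very small models
  device         = "cuda", # use a GPU if one is available
  num_threads    = 8,      # number of CPU cores when training on CPU
  tensor_dataset = TRUE    # keep the dataset in memory ("device" on a GPU)
)
```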