mlr::arsq gives 1.0 always and greater than mlr::rsq #2711

@ManuKulamkombil

Prework

  • Didn't find any duplicates on this.

Description

After training and predicting, arsq (adjusted R-squared) always returns 1, which is also greater than the rsq value. I suspect the formula used for arsq differs from the one I know. Computing the value with custom R code gave the right answer. I also defined a modified measure, arsq.v2, on my local system with the required changes, but it returned a value equal to rsq rather than the adjusted value. I am currently using mlr version 2.14.0.

Reproducible example

I found that the formula given in the link

 1 - (1 - rsq) * (p / (n - p - 1L))

is not the one I expected. My understanding of adjusted R-squared is

 1 - (1 - rsq) * ((n - 1) / (n - p - 1L))
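
To make the difference between the two formulas concrete, here is a minimal sketch that evaluates both on the same inputs. The function names arsq_quoted and arsq_textbook and the values of rsq, n and p are hypothetical and chosen only for illustration; they are not taken from mlr or from the data in this issue.

```r
# Two candidate definitions of adjusted R-squared.
# n = number of observations, p = number of features.
arsq_quoted <- function(rsq, n, p) {
  1 - (1 - rsq) * (p / (n - p - 1))
}
arsq_textbook <- function(rsq, n, p) {
  1 - (1 - rsq) * ((n - 1) / (n - p - 1))
}

# Illustrative values only (assumption, not from the issue).
rsq <- 0.5
n <- 101
p <- 4

quoted <- arsq_quoted(rsq, n, p)
textbook <- arsq_textbook(rsq, n, p)
print(c(rsq = rsq, quoted = quoted, textbook = textbook))
```

With these inputs the quoted formula yields a value above rsq, while the (n - 1) version yields a value below it, which is the behaviour usually expected from an adjustment that penalises additional features.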

This is the inbuilt arsq:

#' @export arsq

I tried to change that as follows:

arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
                   properties = c("regr", "req.pred", "req.truth"),
                   name = "Adjusted coefficient of determination",
                   note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
                   fun = function(task, model, pred, feats, extra.args) {
                       n = length(pred$data$truth)
                       p = length(model$features)
                       if (n == p + 1) {
                           warning("Adjusted R-squared is undefined if the number of observations is equal to the number of independent variables plus one.")
                           return(NA_real_)
                       }
                       1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
                   })

I compared both measures with:

meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas

This gave the following results:

    rmse      mae      rsq     arsq  arsq.v2 
3.147790 1.187279 0.479620 1.000000 0.479620 
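
One reading consistent with these numbers is that p evaluated to 0 inside both measures; this is an inference from the arithmetic and is not confirmed anywhere in the issue. The sketch below plugs p = 0 into both formulas with the reported rsq. The value of n is hypothetical, since any n > 1 gives the same result when p is 0.

```r
rsq <- 0.479620  # value reported in the results above
p <- 0           # assumption: the feature count evaluated to zero
n <- 50          # hypothetical; the result does not depend on n when p = 0

# Formula quoted for the inbuilt arsq: the factor p / (n - p - 1) becomes 0.
arsq_builtin_formula <- 1 - (1 - rsq) * (p / (n - p - 1))

# Formula used in arsq.v2: the factor (n - 1) / (n - p - 1) becomes 1.
arsq_v2_formula <- 1 - (1 - rsq) * ((n - 1) / (n - p - 1))

print(c(arsq = arsq_builtin_formula, arsq.v2 = arsq_v2_formula))
```

Under that assumption the first expression collapses to 1 and the second to rsq, matching the table. Printing p from inside the measure function would confirm or rule this out.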

I then computed the adjusted R-squared from a manually computed rsq:

preds = testPred$data$response
actual = testPred$data$truth
rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
rsq <- 1 - rss/tss; rsq

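## n (number of observations) and p (number of features) are not defined in this snippet; they must already exist in the session
adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared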

Expected output

> actual = testPred$data$truth
> rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
> tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
> rsq <- 1 - rss/tss; rsq
[1] 0.47962
> 
> adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared
[1] 0.4594142
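
The manual calculation can also be wrapped in a small standalone function so that n and p are explicit inputs. This is a sketch only: adj_rsq is a hypothetical helper, not part of mlr, and the toy vectors below are made up for illustration rather than taken from testPred.

```r
# Adjusted R-squared from truth/response vectors and a feature count p.
# Hypothetical helper, not an mlr function.
adj_rsq <- function(truth, response, p) {
  n <- length(truth)
  # The adjustment needs matching lengths and n > p + 1.
  stopifnot(length(response) == n, n > p + 1)
  rss <- sum((response - truth)^2)     # residual sum of squares
  tss <- sum((truth - mean(truth))^2)  # total sum of squares
  rsq <- 1 - rss / tss
  1 - (1 - rsq) * ((n - 1) / (n - p - 1))
}

# Toy data (assumption, for illustration only).
truth <- c(1, 2, 3, 4, 5, 6)
response <- c(1.1, 1.9, 3.2, 3.8, 5.1, 6.2)
result <- adj_rsq(truth, response, p = 2)
print(result)
```

With the issue's data this would be called as adj_rsq(testPred$data$truth, testPred$data$response, p), where p is the number of features used by the model.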
