Closed
Prework
- I did not find any duplicates of this issue.
Description
After training and predicting, arsq (adjusted R squared) always returns 1, which is also greater than the rsq value. I suspect the formula used for arsq differs from the one I know. Computing it with custom R code gives the answer I expect. I also defined a modified measure, arsq.v2, locally with the required changes, but it returns a value equal to rsq rather than the adjusted value. I am currently using mlr version 2.14.0.
Reproducible example
The formula given in the link is
1 - (1 - rsq) * (p / (n - p - 1L))
which is not what I expected. My understanding of adjusted R squared is
1 - (1 - rsq) * ((n - 1) / (n - p - 1L))
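To make the difference between the two formulas concrete, here is a minimal base R sketch that evaluates both on the same inputs. The function names `arsq_reported` and `arsq_textbook` are made up for illustration, and `n = 50` and `p = 3` are hypothetical values, not taken from the data in this issue; only the rsq value comes from the results reported further down.

```r
# Formula quoted above from the linked source, as reported in this issue.
arsq_reported <- function(rsq, n, p) 1 - (1 - rsq) * (p / (n - p - 1))

# Standard adjusted R-squared, as expected by the reporter.
arsq_textbook <- function(rsq, n, p) 1 - (1 - rsq) * ((n - 1) / (n - p - 1))

rsq <- 0.47962  # rsq value reported in this issue
n <- 50         # hypothetical number of observations
p <- 3          # hypothetical number of features

old <- arsq_reported(rsq, n, p)
new <- arsq_textbook(rsq, n, p)

# The quoted formula lands above rsq, while the standard one
# penalises the extra features and lands below rsq.
print(c(rsq = rsq, reported = old, textbook = new))
```

With these inputs the quoted formula yields a value above rsq, which an adjustment for the number of features should not do when p is at least 1.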
This is the built-in arsq (line 268 in e124753):
#' @export arsq
I tried changing it as follows:
arsq.v2 = makeMeasure(id = "arsq.v2", minimize = FALSE, best = 1, worst = 0,
  properties = c("regr", "req.pred", "req.truth"),
  name = "Adjusted coefficient of determination",
  note = "Defined as: 1 - (1 - rsq) * ((n - 1) / (n - p - 1L)). Adjusted R-squared is only defined for normal linear regression.",
  fun = function(task, model, pred, feats, extra.args) {
    n = length(pred$data$truth)   # number of observations in the prediction
    p = length(model$features)    # number of features used by the model
    if (n == p + 1) {
      warning("Adjusted R-squared is undefined if the number of observations is equal to the number of independent variables plus one.")
      return(NA_real_)
    }
    1 - (1 - measureRSQ(pred$data$truth, pred$data$response)) * ((n - 1) / (n - p - 1L))
  })
I compared both measures with this call:
meas = mlr::performance(testPred, measures = list(mlr::rmse, mlr::mae, mlr::rsq, arsq, arsq.v2)); meas
I got the following results:
    rmse      mae      rsq     arsq  arsq.v2
3.147790 1.187279 0.479620 1.000000 0.479620
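One reading consistent with these numbers is that `p` evaluated to 0 inside both measures: with `p = 0` the quoted formula gives exactly 1 and the arsq.v2 formula collapses to rsq. The base R sketch below only checks that arithmetic. The idea that `model` was `NULL` (so that `length(model$features)` is 0) is an assumption, not something confirmed in this report, and `n = 100` is a hypothetical value that does not affect the result when `p` is 0.

```r
rsq <- 0.479620    # rsq from the results above
p <- length(NULL)  # 0; assumed to mirror length(model$features) when model is NULL
n <- 100           # hypothetical; cancels out when p is 0

# Quoted built-in formula: the factor p / (n - p - 1) is 0, so the result is 1.
arsq_builtin <- 1 - (1 - rsq) * (p / (n - p - 1))

# arsq.v2 formula: the factor (n - 1) / (n - p - 1) is 1, so the result is rsq.
arsq_v2 <- 1 - (1 - rsq) * ((n - 1) / (n - p - 1))

print(c(arsq = arsq_builtin, arsq.v2 = arsq_v2))
```

If that assumption holds, checking the value of `p` inside the measure (for example by printing it) would show whether the feature count is reaching the function at all.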
Then I computed adjusted R squared from a hand-computed rsq:
preds = testPred$data$response
actual = testPred$data$truth
rss <- sum((preds - actual) ^ 2) ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2) ## total sum of squares
rsq <- 1 - rss/tss; rsq
## n (number of observations) and p (number of features) must already be defined
adj.r.squared = 1 - (1 - rsq) * ((n - 1) / (n - p - 1)); adj.r.squared
Expected output
> actual = testPred$data$truth
> rss <- sum((preds - actual) ^ 2) ## residual sum of squares
> tss <- sum((actual - mean(actual)) ^ 2) ## total sum of squares
> rsq <- 1 - rss/tss; rsq
[1] 0.47962
>
> adj.r.squared = 1 - (1 - rsq) * ((n - 1)/(n-p-1)); adj.r.squared
[1] 0.4594142