henrifnk (Contributor) commented Mar 7, 2021

task_data contains a row_ids column, and the whole table is then passed to stats::prcomp() to perform the PCA:

if (is.null(row_ids)) {
  task_data = data.table(task$data(), row_ids = task$row_ids)
} else {
  task_data = data.table(task$data(rows = row_ids), row_ids = row_ids)
}
plot_data = merge(task_data, d, by = "row_ids")
# task_data still carries the row_ids column here, so it enters the PCA
ggplot2::autoplot(stats::prcomp(task_data),
  data = plot_data,
  colour = "cluster", ...)
},

This PR prevents pathological principal components, which are a problem especially for scaled data, because row_ids grows to large values as data sets get bigger...
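
The effect can be reproduced outside of mlr3viz. The following is a minimal sketch and not the code from this PR: the synthetic data, the column names x1 and x2, and the setdiff() approach to dropping the id column are assumptions made for illustration.

```r
# Hypothetical illustration with synthetic data (not the mlr3viz code).
set.seed(1)
n = 1000

# Two features on a unit scale, standing in for scaled task data.
features = data.frame(x1 = rnorm(n), x2 = rnorm(n))
task_data = cbind(features, row_ids = seq_len(n))

# PCA on the full table: the row_ids column has a far larger variance
# than the features, so it dominates the first component.
pca_with_ids = stats::prcomp(task_data)

# PCA on the features only: row_ids is dropped before calling prcomp().
feature_cols = setdiff(names(task_data), "row_ids")
pca_features = stats::prcomp(task_data[, feature_cols, drop = FALSE])

# Absolute loading of row_ids on PC1; a value near 1 means PC1 is
# essentially just the row index.
loading_ids = abs(pca_with_ids$rotation["row_ids", "PC1"])

# Share of variance explained by PC1 in each case.
var_with_ids = summary(pca_with_ids)$importance["Proportion of Variance", "PC1"]
var_features = summary(pca_features)$importance["Proportion of Variance", "PC1"]

print(c(loading_ids = loading_ids,
  var_with_ids = var_with_ids,
  var_features = var_features))
```

With the id column included, the first component mostly reflects row order rather than any structure in the features, which is why the resulting cluster plot is misleading. The id column is still needed for the merge() with the cluster assignments, so it only has to be excluded from the input to prcomp().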

mllg merged commit 743b9dd into mlr-org:main on Mar 7, 2021
mllg (Member) commented Mar 7, 2021

Thanks!

giuseppec added a commit that referenced this pull request Mar 14, 2021
mllg added a commit that referenced this pull request Mar 15, 2021