ENH Add parameter return_X_y to make_classification
#30196
Conversation
@adrinjalali could you review this?
adrinjalali
left a comment
Please have a look at `fetch_openml` on how to document `return_X_y` and the documentation of the return values.
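For reference, existing loaders document this option with a numpydoc pattern along the following lines. This is a paraphrased sketch, not the verbatim `fetch_openml` docstring, and the default value chosen for `make_classification` is decided in the PR itself.

```python
def some_loader(*, return_X_y=False):
    """Load a dataset (sketch of the documentation pattern only).

    Parameters
    ----------
    return_X_y : bool, default=False
        If True, return ``(data, target)`` instead of a Bunch object.

    Returns
    -------
    data : :class:`~sklearn.utils.Bunch`
        Dictionary-like object with attributes such as ``data`` and ``target``.

    (data, target) : tuple if ``return_X_y`` is True
        Tuple of the data matrix and the target values.
    """
```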
doc/whats_new/upcoming_changes/sklearn.datasets/30196.enhancement.rst
Co-authored-by: Adrin Jalali <[email protected]>
OmarManzoor
left a comment
Thanks for the PR @SuccessMoses
I added a few comments.
@OmarManzoor Thanks for the review. I am working on it.
OmarManzoor
left a comment
LGTM. Thanks @SuccessMoses
CC: @adrinjalali
adrinjalali
left a comment
Otherwise LGTM.
The inline comments below refer to this snippet:

```python
            )
        if len(weights) == n_classes - 1:
            if isinstance(weights, list):
                weights = weights + [1.0 - sum(weights)]
```
this isn't modifying the existing variable, it's allocating a new chunk of memory, and the name `weights` refers to that new chunk. So the original data passed to this function is never changed. Therefore this change in this PR is unnecessary.
I did not intend to modify the variable `weights` passed by the user; this is why I created the new variable `weights_`.
You won't be changing the original variable, that's not how Python works 😉
Investigate this example:
```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

def f(a):
    # `a = a + 1` rebinds the local name to a new array; the caller's array is untouched.
    a = a + 1
    return a

print(f(a))  # [ 2  3  4  5  6  7  8  9 10 11]
print(a)     # [ 1  2  3  4  5  6  7  8  9 10]
```
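The same distinction matters for the plain list handled in the diff above: `weights = weights + [...]` rebinds the local name, whereas `+=` or `append` would modify the caller's list in place. A small sketch to illustrate (the function names and example values here are illustrative only):

```python
def rebind(w):
    # Creates a new list and rebinds the local name; the caller's list is untouched.
    w = w + [1.0 - sum(w)]
    return w

def mutate(w):
    # += on a list extends it in place; the caller's list is modified.
    w += [1.0 - sum(w)]
    return w

weights = [0.2, 0.3]
print(rebind(weights), weights)  # [0.2, 0.3, 0.5] [0.2, 0.3]
print(mutate(weights), weights)  # [0.2, 0.3, 0.5] [0.2, 0.3, 0.5]
```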
> this isn't modifying the existing variable, it's allocating a new chunk of memory,
Thank you @adrinjalali for the correction. `make_classification` returns a Bunch object which also contains a dictionary of the original values of parameters like `n_samples`, `n_features` and `weights` that were used to generate the data.
Reassigning the variable `weights` will change what ends up stored in that dictionary. Is there a way to work around this?
I see, I'd missed that
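One way to address the concern about the stored parameters, sketched here as an illustration rather than what the PR necessarily does: copy the user-supplied list before padding it, so the original value can still be recorded unchanged. `_pad_weights` is a hypothetical helper, not part of scikit-learn.

```python
def _pad_weights(weights, n_classes):
    # Hypothetical helper: work on a copy so the user-supplied list,
    # which may also be stored in the returned Bunch, is never modified.
    weights_ = list(weights)
    if len(weights_) == n_classes - 1:
        weights_.append(1.0 - sum(weights_))
    return weights_

user_weights = [0.2, 0.3]
print(_pad_weights(user_weights, 3))  # [0.2, 0.3, 0.5]
print(user_weights)                   # [0.2, 0.3] -- unchanged
```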
Reference Issues/PRs
Fixes #16532
What does this implement/fix? Explain your changes.
Adds a `return_X_y` parameter to `make_classification`.

Any other comments?
The dataset returned by `load_iris` is a Bunch, which is more descriptive. #16532 proposes the same.
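As an illustration of the Bunch-versus-tuple distinction the description refers to, using `load_iris`, which already supports both return styles:

```python
from sklearn.datasets import load_iris

# Default: a Bunch, a dict-like container with self-describing attributes.
iris = load_iris()
print(iris.data.shape, iris.target.shape)  # (150, 4) (150,)
print(iris.feature_names[:2])              # ['sepal length (cm)', 'sepal width (cm)']

# return_X_y=True: just the raw (data, target) arrays.
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)                    # (150, 4) (150,)
```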