TST Change expected result type np.int64 -> np.int #18089

Merged on Aug 7, 2020 (3 commits)

Conversation

Contributor

@ckastner ckastner commented Aug 4, 2020

The function being tested uses np.int, so match that in the test. The
test otherwise fails on 32-bit architectures.

Closes: #18084

Member

@rth rth left a comment

Thanks @ckastner !

It would be good to run this test in CI, likely by specifying the version of pandas to install so that it is not skipped:

PANDAS_VERSION: 'none'

Member

@thomasjpfan thomasjpfan left a comment

Thank you for the PR @ckastner !

@@ -186,12 +186,12 @@ def test_loader(loader_func, data_shape, target_shape, n_target, has_descr,

 @pytest.mark.parametrize("loader_func, data_dtype, target_dtype", [
-    (load_breast_cancer, np.float64, np.int64),
+    (load_breast_cancer, np.float64, np.int),
Member

np.int is being deprecated in numpy. Does the following work?

Suggested change
-    (load_breast_cancer, np.float64, np.int),
+    (load_breast_cancer, np.float64, np.int_),

Contributor Author

Actually, my use of np.int was mistaken anyway. The data-loading function uses plain (builtin) int, not np.int as I originally claimed:

target[i] = np.asarray(ir[-1], dtype=int)

I propose switching to that (PR updated), so that the data-loading function and the test are consistent. If you'd prefer np.int_, please let me know. Sorry about the confusion.
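
The difference between the fixed-width type and the platform-dependent one can be checked directly. The following is a minimal sketch, not code from this PR: `make_target` is a hypothetical helper that only mirrors the `dtype=int` call quoted above.

```python
import numpy as np


def make_target(values):
    # Hypothetical helper: mirrors the loader's use of the builtin int as dtype.
    return np.asarray(values, dtype=int)


target = make_target([0, 1, 1, 0])

# dtype=int resolves to NumPy's default integer type for the platform,
# which is also what np.int_ names.
platform_int = np.dtype(int)
matches_platform_int = target.dtype == platform_int
matches_np_int_ = target.dtype == np.dtype(np.int_)

# The default integer is only int64 where it is 8 bytes wide; where it is
# 4 bytes wide, a test that hard-codes np.int64 would fail.
is_int64 = target.dtype == np.dtype(np.int64)

print(target.dtype, platform_int.itemsize, is_int64)
```

Comparing against `int` (or `np.int_`) in the test keeps the expectation tied to whatever the loader actually produces on the current platform.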

On a side note: grep -R 'np\.int[^_1368a-z]' sklearn/* gives me ~100 occurrences of np.int in the codebase which will probably need to be updated at some point, along with the other deprecated types.
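
To illustrate what that pattern does and does not match, here is a small sketch using Python's `re` module with the same expression. The helper name `find_bare_np_int` and the sample lines are made up for illustration and are not taken from the scikit-learn codebase.

```python
import re

# Same pattern as the grep command: "np.int" followed by one character that is
# not "_", one of the digits 1, 3, 6, 8, or a lowercase letter.
BARE_NP_INT = re.compile(r"np\.int[^_1368a-z]")


def find_bare_np_int(lines):
    # Hypothetical helper: return the lines that use the bare np.int alias.
    return [line for line in lines if BARE_NP_INT.search(line)]


sample = [
    "a = np.asarray(x, dtype=np.int)",  # bare alias, should be reported
    "b = np.zeros(3, dtype=np.int64)",  # fixed-width type, excluded by "6"
    "c = np.int_(7)",                   # excluded by "_"
    "d = np.int32(7)",                  # excluded by "3"
    "e = isinstance(v, np.integer)",    # excluded by the lowercase letter
    "f = np.intp(7)",                   # excluded by the lowercase letter
]

hits = find_bare_np_int(sample)
print(hits)
```

Because the pattern requires one character after `np.int`, a line that ends exactly with `np.int` is not reported, so the count should be treated as an estimate.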

Member

int works as well.

@thomasjpfan
Member

As for the PANDAS_VERSION in config.yml, removing the line will test on the latest version of pandas.

(I hope we do not run into 32 bit + pandas issues)

The function being tested uses int, so match that in the test. The
test otherwise fails on 32-bit architectures.

Closes: scikit-learn#18084
@ckastner
Contributor Author

ckastner commented Aug 5, 2020

> As for the PANDAS_VERSION in config.yml, removing the line will test on the latest version of pandas.
>
> (I hope we do not run into 32 bit + pandas issues)

To add a data point, the packages we build for Debian are built on a number of 32-bit architectures (x86, ARM, MIPS), and all builds use pandas. All but very few tests pass, and IIRC the few failing tests point to causes other than pandas.

@rth
Member

rth commented Aug 6, 2020

Thanks for the information! Could you still enable pandas in the 32 bit build? It's not currently in the diff.

@ckastner
Contributor Author

ckastner commented Aug 6, 2020

Sorry about that, I thought that comment wasn't meant for me. Enabled and pushed.

Just for my own understanding: I assume this enabling is limited to this particular PR and won't be merged? Because if the intention is to merge, then it should probably also be enabled for the 64-bit build, where it is currently disabled.

@ckastner
Contributor Author

ckastner commented Aug 6, 2020

The change had no effect on CI: the tests involving pandas were skipped again. I assume that deploying this change involves further steps?

Member

@thomasjpfan thomasjpfan left a comment

I updated this PR to enable pandas.

LGTM

@@ -50,7 +50,8 @@ elif [[ "$DISTRIB" == "ubuntu" ]]; then

 elif [[ "$DISTRIB" == "ubuntu-32" ]]; then
     apt-get update
-    apt-get install -y python3-dev python3-scipy python3-matplotlib libatlas3-base libatlas-base-dev python3-virtualenv
+    apt-get install -y python3-dev python3-scipy python3-matplotlib libatlas3-base libatlas-base-dev python3-virtualenv python3-pandas
Member

We use numpy from apt-get, so we need to install pandas from there as well.

@@ -1200,7 +1200,7 @@ def test_check_fit_params(indices):
 def test_check_sparse_pandas_sp_format(sp_format):
     # check_array converts pandas dataframe with only sparse arrays into
     # sparse matrix
-    pd = pytest.importorskip("pandas")
+    pd = pytest.importorskip("pandas", minversion="0.25.0")
Member

Sparse support was added in pandas 0.25, and the pandas version installed on Ubuntu 18.04 is 0.22.0.
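
The effect of a minimum-version gate on these two versions can be sketched in plain Python. This assumes versions are plain dotted numbers; `parse_version` and `would_skip` are hypothetical names for illustration, and this is not how pytest implements its own check.

```python
def parse_version(version):
    # Assumes a plain dotted numeric version such as "0.22.0".
    return tuple(int(part) for part in version.split("."))


def would_skip(installed, minversion):
    # Hypothetical helper: True when the installed version is older than required.
    return parse_version(installed) < parse_version(minversion)


# The apt-provided pandas is older than the required minimum, so the test is skipped.
print(would_skip("0.22.0", "0.25.0"))

# A newer pandas meets the requirement, so the test runs.
print(would_skip("1.1.0", "0.25.0"))
```

With the gate in place, the test is skipped on the 32-bit build rather than failing on a pandas feature that the installed version does not have.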

@thomasjpfan
Member

The error in CI is on OSX; it is a timeout issue unrelated to this PR.

Member

@rth rth left a comment

Thanks @ckastner and @thomasjpfan !

@rth rth merged commit 5af6561 into scikit-learn:master Aug 7, 2020
@ckastner ckastner deleted the numpy-int branch August 7, 2020 06:55
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
Development

Successfully merging this pull request may close these issues.

Failing tests on 32-bit architectures from use of numpy.int
3 participants