MNT skip test falling on master in legacy platforms #12382

Merged
merged 4 commits into from
Oct 15, 2018
Conversation

qinhanmin2014
Member

Apologies, an accident in AppVeyor prevented me from seeing this error (Python 2.7 32-bit on AppVeyor).

_____________________________ test_count_nonzero ______________________________
    def test_count_nonzero():
        X = np.array([[0, 3, 0],
                      [2, -1, 0],
                      [0, 0, 0],
                      [9, 8, 7],
                      [4, 0, 5]], dtype=np.float64)
        X_csr = sp.csr_matrix(X)
        X_csc = sp.csc_matrix(X)
        X_nonzero = X != 0
        sample_weight = [.5, .2, .3, .1, .1]
        X_nonzero_weighted = X_nonzero * np.array(sample_weight)[:, None]
    
        for axis in [0, 1, -1, -2, None]:
            assert_array_almost_equal(count_nonzero(X_csr, axis=axis),
                                      X_nonzero.sum(axis=axis))
            assert_array_almost_equal(count_nonzero(X_csr, axis=axis,
                                                    sample_weight=sample_weight),
                                      X_nonzero_weighted.sum(axis=axis))
    
        assert_raises(TypeError, count_nonzero, X_csc)
        assert_raises(ValueError, count_nonzero, X_csr, axis=2)
    
        assert (count_nonzero(X_csr, axis=0).dtype ==
                count_nonzero(X_csr, axis=1).dtype)
        assert (count_nonzero(X_csr, axis=0, sample_weight=sample_weight).dtype ==
                count_nonzero(X_csr, axis=1, sample_weight=sample_weight).dtype)
    
        # Check dtypes with large sparse matrices too
        X_csr.indices = X_csr.indices.astype(np.int64)
        X_csr.indptr = X_csr.indptr.astype(np.int64)
>       assert (count_nonzero(X_csr, axis=0).dtype ==
                count_nonzero(X_csr, axis=1).dtype)
X          = array([[ 0.,  3.,  0.],
       [ 2., -1.,  0.],
       [ 0.,  0.,  0.],
       [ 9.,  8.,  7.],
       [ 4.,  0.,  5.]])
X_csc      = <5x3 sparse matrix of type '<type 'numpy.float64'>'
	with 8 stored elements in Compressed Sparse Column format>
X_csr      = <5x3 sparse matrix of type '<type 'numpy.float64'>'
	with 8 stored elements in Compressed Sparse Row format>
X_nonzero  = array([[False,  True, False],
       [ True,  True, False],
       [False, False, False],
       [ True,  True,  True],
       [ True, False,  True]])
X_nonzero_weighted = array([[0. , 0.5, 0. ],
       [0.2, 0.2, 0. ],
       [0. , 0. , 0. ],
       [0.1, 0.1, 0.1],
       [0.1, 0. , 0.1]])
axis       = None
sample_weight = [0.5, 0.2, 0.3, 0.1, 0.1]
c:\python27\lib\site-packages\sklearn\utils\tests\test_sparsefuncs.py:454: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
X = <5x3 sparse matrix of type '<type 'numpy.float64'>'
	with 8 stored elements in Compressed Sparse Row format>
axis = 0, sample_weight = None
    def count_nonzero(X, axis=None, sample_weight=None):
        """A variant of X.getnnz() with extension to weighting on axis 0
    
        Useful in efficiently calculating multilabel metrics.
    
        Parameters
        ----------
        X : CSR sparse matrix, shape = (n_samples, n_labels)
            Input data.
    
        axis : None, 0 or 1
            The axis on which the data is aggregated.
    
        sample_weight : array, shape = (n_samples,), optional
            Weight for each row of X.
        """
        if axis == -1:
            axis = 1
        elif axis == -2:
            axis = 0
        elif X.format != 'csr':
            raise TypeError('Expected CSR sparse format, got {0}'.format(X.format))
    
        # We rely here on the fact that np.diff(Y.indptr) for a CSR
        # will return the number of nonzero entries in each row.
        # A bincount over Y.indices will return the number of nonzeros
        # in each column. See ``csr_matrix.getnnz`` in scipy >= 0.14.
        if axis is None:
            if sample_weight is None:
                return X.nnz
            else:
                return np.dot(np.diff(X.indptr), sample_weight)
        elif axis == 1:
            out = np.diff(X.indptr)
            if sample_weight is None:
                # astype here is for consistency with axis=0 dtype
                return out.astype('intp')
            return out * sample_weight
        elif axis == 0:
            if sample_weight is None:
>               return np.bincount(X.indices, minlength=X.shape[1])
E               TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'
X          = <5x3 sparse matrix of type '<type 'numpy.float64'>'
	with 8 stored elements in Compressed Sparse Row format>
axis       = 0
sample_weight = None

I'm unable to come up with a good solution. Maybe removing the test is acceptable, since we already have one for a normal CSR matrix; or we could skip it on 32-bit machines, but I'm not sure under which conditions the error occurs.
ping @jnothman

@jnothman
Member

I wonder why that operation is safe on 32-bit Linux, but not on 32-bit Windows.

@jnothman
Member

Or rather, why Linux does not require safe casting.
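For reference, the cast rule named in the traceback can be inspected directly with `np.can_cast`. The sketch below only illustrates what NumPy's `'safe'` rule permits for the dtypes involved. It does not explain the Linux/Windows difference raised here, and the variable names are illustrative rather than taken from scikit-learn.

```python
import numpy as np

# Under the 'safe' rule, a cast is allowed only if no value can be lost.
narrowing_ok = np.can_cast(np.int64, np.int32, casting="safe")  # int64 -> int32
widening_ok = np.can_cast(np.int32, np.int64, casting="safe")   # int32 -> int64

# np.intp is the platform's index integer: 4 bytes on 32-bit builds,
# 8 bytes on 64-bit builds.
intp_bytes = np.dtype(np.intp).itemsize

# The test in the traceback forces the CSR index arrays to int64.
# Whether such an array can be safely cast to the platform index type
# therefore depends on the width of np.intp.
indices = np.array([1, 0, 1, 2, 0], dtype=np.int64)
fits_in_intp = np.can_cast(indices.dtype, np.intp, casting="safe")

print("int64 -> int32 safe:", narrowing_ok)
print("int32 -> int64 safe:", widening_ok)
print("np.intp bytes:", intp_bytes, "| int64 -> intp safe:", fits_in_intp)
```

On a build where `np.intp` is 4 bytes, the last check reports that the cast is not safe, which matches the dtypes named in the error message above.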

@@ -448,14 +448,6 @@ def test_count_nonzero():
     assert (count_nonzero(X_csr, axis=0, sample_weight=sample_weight).dtype ==
             count_nonzero(X_csr, axis=1, sample_weight=sample_weight).dtype)

     # Check dtypes with large sparse matrices too
Member

Better than this (but perhaps still not ideal) would be to wrap all of this in a try-except block that catches the error and passes if the error message contains "according to the rule 'safe'" and np.intp().nbytes < 8.
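A minimal sketch of that suggestion, under stated assumptions: the helper names `is_known_32bit_cast_error` and `run_tolerating_known_failure` are hypothetical, and the `intp_nbytes` parameter exists only so the logic can be exercised on any platform. This is not the code that was merged.

```python
import numpy as np

SAFE_CAST_MARKER = "according to the rule 'safe'"


def is_known_32bit_cast_error(exc, intp_nbytes=None):
    """Return True if exc looks like the safe-cast failure on a 32-bit build.

    intp_nbytes defaults to the real platform value, np.intp().nbytes.
    """
    if intp_nbytes is None:
        intp_nbytes = np.intp().nbytes
    return (isinstance(exc, TypeError)
            and SAFE_CAST_MARKER in str(exc)
            and intp_nbytes < 8)


def run_tolerating_known_failure(check, intp_nbytes=None):
    """Run check(); swallow only the known 32-bit safe-cast TypeError.

    Returns True if check() ran to completion, False if the known error
    was caught. Any other exception propagates unchanged.
    """
    try:
        check()
    except TypeError as exc:
        if not is_known_32bit_cast_error(exc, intp_nbytes):
            raise
        return False
    return True
```

In the test, the dtype assertions on the int64-indexed matrix would go inside the callable passed as `check`, so that a genuine failure on a 64-bit platform still surfaces.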

Member Author

This LGTM from my side and I can't figure out a better solution.

@qinhanmin2014
Member Author

ping @jnothman: CIs are green (though I'm not sure why Codecov fails), ready for another review :)

Member

@jnothman jnothman left a comment

Lgtm

@jnothman
Member

Codecov doesn't get stats from Windows runs.

@jnothman jnothman changed the title MNT Remove failed test on master MNT skip test falling on master in legacy platforms Oct 15, 2018
@jnothman
Member

I'm merging this to fix master and avoid confusion for contributors, especially since it seems to apply to Python 2, which is reaching its end of life.

@jnothman jnothman merged commit afe0a9b into scikit-learn:master Oct 15, 2018
@qinhanmin2014 qinhanmin2014 deleted the appveyor-failure branch October 17, 2018 03:55
anuragkapale pushed a commit to anuragkapale/scikit-learn that referenced this pull request Oct 23, 2018
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Nov 14, 2018
@amueller amueller mentioned this pull request Nov 20, 2018
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019