BUG liblinear/libsvm-based learners segfault when passed large sparse matrices #9545

Closed
@jnothman

Description

From #2969 (comment):
"Anything that uses liblinear (and possibly other bundled C as opposed to Cython code) will segfault when given CSR arrays with 64 bit indices (e.g. LogisticRegression(), LinearSVC() etc). This is fairly critical IMO, and even if sparse arrays with 64 bit indices won't be supported there in the near future (or at all), it would be good to check for indices dtype and raise a python exception when appropriate. This is also the reason these tests need to be run with pytest-xdist using the -n 1 option, so that pytest could recover from a crashed interpreter."

I assume the same is true of SVC, SVR.

The issue is that scipy.sparse only relatively recently began to support large sparse matrices, i.e. those where the indptr and indices arrays of a csr_matrix may be 64-bit ints. This case should be ruled out for the liblinear/libsvm solvers. I think the best solution (so that we can later support or reject large sparse matrices more systematically) is to add a boolean parameter such as accept_large_sparse to sklearn.utils.check_array.
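As a rough illustration of the kind of check proposed here, the sketch below inspects the index arrays of a sparse matrix and raises a Python exception instead of letting the C code crash. The function name check_sparse_index_dtype is hypothetical and is not part of scikit-learn; the accept_large_sparse keyword only borrows the name suggested above, and the assumption that anything other than int32 indices should be rejected is mine.

```python
import numpy as np
import scipy.sparse as sp


def check_sparse_index_dtype(X, accept_large_sparse=True):
    """Hypothetical helper (not scikit-learn API).

    Raise ValueError if X is a sparse matrix whose index arrays are not
    int32 and accept_large_sparse is False. Dense input is returned as-is.
    """
    if accept_large_sparse or not sp.issparse(X):
        return X
    # CSR, CSC and BSR matrices store their structure in `indices` and
    # `indptr`; formats without these attributes are skipped here.
    for name in ("indices", "indptr"):
        arr = getattr(X, name, None)
        if arr is not None and arr.dtype != np.int32:
            raise ValueError(
                "Only sparse matrices with 32-bit integer indices are "
                "accepted. Got %s %s." % (arr.dtype, name)
            )
    return X


# Build a small CSR matrix, then force its index arrays to 64-bit.
# (The constructor alone may downcast small index arrays to int32.)
X = sp.csr_matrix(np.eye(3))
X.indices = X.indices.astype(np.int64)
X.indptr = X.indptr.astype(np.int64)

try:
    check_sparse_index_dtype(X, accept_large_sparse=False)
except ValueError as exc:
    print("rejected:", exc)
```

An estimator backed by liblinear or libsvm would call something like this with accept_large_sparse=False during input validation, so the user gets a ValueError rather than a segfault.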

Metadata

    Labels

    Bug, Easy (well-defined and straightforward way to resolve), help wanted
