

ENH: np.linalg.inv: Allow disabling error when one matrix is singular in a stack #28782



Open
math-hiyoko wants to merge 4 commits into main

Conversation

math-hiyoko

Description

This PR introduces a new optional parameter noerr (default: False) to the numpy.linalg.inv function. The purpose of this parameter is to allow users to compute the inverses of multiple matrices simultaneously (stacked matrices with shape (m, n, n)) without raising a LinAlgError when at least one of the matrices is singular. #27035

Example

>>> import numpy as np
>>> a = np.array([
...    [[1, 0], [0, 1]],   # invertible
...    [[0, 0], [0, 0]],   # singular
... ])
>>> np.linalg.inv(a, noerr=True)
array([[[ 1.,  0.],
        [ 0.,  1.]],

       [[nan, nan],
        [nan, nan]]])
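
A possible follow-up check (a sketch, not part of the PR itself): with the proposed noerr keyword, a caller could flag the slices that failed by looking for the NaN fill.

>>> result = np.linalg.inv(a, noerr=True)
>>> np.isnan(result).any(axis=(-2, -1))   # True for slices that could not be inverted
array([False,  True])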

Closes #27035.

@charris (Member) commented Apr 21, 2025

The PyPy failure can be ignored.

@seberg added the "triage review" label (Issue/PR to be discussed at the next triage meeting) on Apr 23, 2025
@seberg (Member) commented Apr 23, 2025

We should probably discuss how to do this (i.e. how to name the noerr kwarg, or whether to re-use np.errstate()?).
If we do this for one of the functions, we should eventually (ideally soon) also do it for all others that can raise.

I thought I recalled some discussion on this in SciPy (@tylerjereddy?); if they have a plan on how (or whether) to do this, then we need to align.

For the implementation/code: You also need to test that the result for the invalid entries is as expected; you are currently testing only the valid ones!

@tylerjereddy (Contributor) left a comment

I thought I recalled some discussion on this in SciPy (@tylerjereddy?); if they have a plan on how (or whether) to do this, then we need to align.

Yup, I was pretty vocal about wanting to have the default behavior remain erroring out if something is "wrong" (fail fast and early, Pythonic, etc.). Here is the main discussion that comes to mind: scipy/scipy#22476

@ilayn @mdhaber and @ev-br had some back and forth there. I think Ralf had a similar view to mine.

A conceptually related thing is "lazy" backends (e.g., JAX) where you can't introspect, so your only real choice is to return NaNs for busted axes/slices. One other thing I remember being discussed was wanting to avoid certain synchronization scenarios on GPUs, even for eager backends, so avoiding the error checks in those cases was also discussed.

Anyway, that's a bit wider than what NumPy needs to care about, but those are the cases where recent discussions came up I think. Perhaps NumPy only wants to be involved in the "batching" part of the discussion, though some loose awareness of the other scenarios where NaNs get returned instead of errors may be helpful context for the broader ecosystem.


assert_almost_equal(result[0], np.array([[1.0, 0.0], [0.0, 1.0]]))
assert_(np.isnan(result[1]).all())
assert_almost_equal(result[2], np.array([[2 / 3, -1 / 3], [-1 / 3, 2 / 3]]))
Contributor:

Minor points, but assert_allclose() should probably be preferred as noted in the docs for assert_almost_equal (in newly-written code, even if you see the old way in some source files).

Plain assert is probably slightly preferable to the old assert_ as well, now that we use pytest.

Member:

Yes, and assert_almost_equal should deal with NaNs, so there is no need to do this in 3 asserts; creating the full expected result and comparing is a bit nicer.
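
A minimal sketch of what the combined suggestion could look like, assuming the PR's proposed noerr keyword (the input matrices here are illustrative, not necessarily the ones in the PR's test):

import numpy as np

a = np.array([
    [[1.0, 0.0], [0.0, 1.0]],              # invertible
    [[0.0, 0.0], [0.0, 0.0]],              # singular
])
expected = np.array([
    [[1.0, 0.0], [0.0, 1.0]],              # inverse of the identity
    [[np.nan, np.nan], [np.nan, np.nan]],  # NaN fill for the singular slice
])
result = np.linalg.inv(a, noerr=True)
# One comparison instead of three asserts; equal_nan=True treats NaNs as equal.
np.testing.assert_allclose(result, expected, equal_nan=True)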

Author:

I've updated the tests to use the suggested assertion functions, following the feedback provided.

Thank you for your suggestions!

Author:

It’s strange—tests are failing because the expected error isn’t being raised correctly, but this issue appears only in a single environment.

@seberg (Member) commented Apr 24, 2025

Yup, I was pretty vocal about wanting to have the default behavior remain erroring out if something is "wrong" (fail fast and early, Pythonic, etc.).

Yeah, even if we were to change the default, it would maybe make sense to try to allow the old behavior or go to a warning.
Which again is an argument for a with errstate()-style option (at least for what the default is).

Anyway, that's a bit wider than what NumPy needs to care about, but those are the cases where recent discussions came up I think.

Those are arguments for making it available and slightly normalizing non-erroring behavior. It's a bit hard for SciPy, I guess, but I am not sure I would worry too much about behavior changing here if you pass a cupy/torch/... array rather than a NumPy one to a library function.

Anyway, right now I lean towards np.errstate() being the nicer API here (a kwarg seems very optional then, more like a probably-not-useful micro-optimization).
SciPy could re-use NumPy's state (and also introduce its own context as a stop-gap on older versions).

That is slightly harder to implement, and we need a name for that option of course. "Linalg"-related would make sense. Something more general could actually also make sense to me, but maybe it's best to stay specific (one can always add a "group" if we end up with multiple such errstates).

N.B.: The fast-math discussion is slightly similar, as it could also be tagged on to errstate (think of it as "ufunc/global-state" then) even if that is a misnomer for the user API.

@ilayn (Contributor) commented Apr 24, 2025

Returning an error is fine if there is nothing else to be done. But this is not just "there is an error, I'm quitting": there is still work to do in a batch array, and that is the crux of the issue. Not every job is "if broken, it must be fixed" when one slice is singular and the rest are OK.

We are definitely not trying to hide or silence any error, and we are not forgiving bad LAPACK argument errors etc. Those are fundamental errors that mean the code cannot carry out the prescribed algorithm, so an exception is raised and work stops; there is no change there, those remain hard errors. Having a LinAlgError in one of the slices is a different issue and hence a bit more nuanced in that sense. Python's exception-based model does not allow us to return the partial output AND raise an exception, and the alternative is warning or printing stuff, which is always ignored, hence the whole discussion.

@ev-br also mentioned the idea of talking to np.errstate in scipy/scipy#22838, but I don't know how to do that in non-NumPy C code yet. That would also fix this issue without any extra keywords and still maintain the explicit style of numpy.errstate. Any pointers would be appreciated.

@ev-br (Contributor) commented Apr 24, 2025

+1 for np.errstate; it's IMO strictly better than a noerr=True keyword.

Ideally, there's a context manager to allow user control of at least three scenarios:

  • fail fast with a LinAlgError
  • finish no matter what, silently return nans for failing slices
  • emit useful diagnostics, "internal GETRF returned info=-4 for slice 42" (as a LinAlgWarning?).

I don't know if np.errstate is flexible enough to allow all three, and to make the warning "always" rather than "once". And ideally it's something scipy can reuse.
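
A heavily hypothetical sketch of what such a single context manager could look like; the linalg option name and its values are placeholders, not existing NumPy API:

import numpy as np

stack = np.zeros((3, 2, 2)) + np.eye(2)  # a (m, n, n) batch of matrices

# Hypothetical API only -- np.errstate has no "linalg" option today.
with np.errstate(linalg="raise"):    # fail fast with a LinAlgError
    np.linalg.inv(stack)

with np.errstate(linalg="ignore"):   # finish no matter what, NaN-fill singular slices
    np.linalg.inv(stack)

with np.errstate(linalg="warn"):     # as above, plus a warning per failing slice
    np.linalg.inv(stack)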

@ilayn (Contributor) commented Apr 24, 2025

emit useful diagnostics, "internal GETRF returned info=-4 for slice 42" (as a LinAlgWarning?).

This one should be out of scope in my opinion. That's a hard error, independent of the data, and it should never happen unless there is an issue somewhere else, so it is better to quit.

@seberg (Member) commented Apr 24, 2025

So I think this is very clearly leaning towards errstate. Let's ignore what the default should be; I think there is a philosophical and historic component, but it's just a separate issue.
(Historic, because if you never operate on stacks an error is a far more obvious choice. While if you do operate on "stacks", like math ufuncs, returning invalid values becomes interesting in practice.)


@ilayn currently there is basically no useful C-API for errstate beyond "giving an error of type X" (where the function will check errstate for you).

But, I don't think that matters. We can clearly expand this, but we should focus on user API first, IMO.
You can query the state from Python (or hard-code/backport NumPy implementation details maybe); if we add API like np.get_errstate("linalg"), then C or not is likely irrelevant (unless you need to know within the ufunc loop).
As this would be a new state anyway, there obviously can't be any existing API for it :).

We may not even want to include it in np.errstate(), if only because errstate(all="ignore") should maybe not affect such a new option.
(Implementation wise, I would include it into the same state object on the C level in either case.)

Ideally, there's a context manager to allow user control of at least three scenarios:

These are exactly the options we have for all other warning classes, of course (plus print and call, but I would be happy to omit them for new things, TBH).

"Fail early" would require the loop to know about the state. That shouldn't be too hard, but I would consider a later step. (I am not convinced fail-fast is a very important feature here and it's orthogonal to the user API again.)

EDIT: Ah, I missed the additional info attached to "something went wrong". That may depend on the infrastructure we create, but I am not sure it is vital.

and to make the warning "always" rather than "once"

The "always"/"once" is a Python warning option. I think it's best to just keep it that way and not overcomplicate things here. (It's also too difficult anyway, since Python "once" takes into account the call site.)

And ideally it's something scipy can reuse.

I mean, it's just a context that you can query from Python, and we can trivially make it a bit faster/easier to access with new C-API.
For ufuncs, I always assumed we will eventually pass in the state to the ufunc loop selector and inner-loop (e.g. also due to the overlap with fast-math loop selection).

@ilayn (Contributor) commented Apr 24, 2025

But, I don't think that matters. We can clearly expand this, but we should focus on user API first, IMO.

I agree with this. I probably didn't elaborate properly earlier.

From the UX side, SciPy does try to be helpful in solve by warning about ill-conditioned 2D arrays and returning the condition number in the warning. We want to carry this over to inv as well. If a hard singularity/exact 0.0 is found, then we raise LinAlgError.

Now we want to extend this to nD arrays. Let's keep the example simple: A has shape (10, 5, 5), slice 3 (0-indexed) is ill-conditioned, and slice 7 is exactly singular. Currently this happens in a Python loop in SciPy and a C loop in NumPy, solving each 5x5 case and pushing it into a suitable output array. And if any singularity occurs, it's done.

A = np.zeros([10, 5, 5])
A += np.eye(5)
A[7, :, :] = np.arange(25).reshape(5,5)
A[3, 4, 4] = 1e-16  # Currently no effect, since neither NumPy nor SciPy has an ill-conditioning check in inv
np.linalg.inv(A)

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[43], line 1
----> 1 np.linalg.inv(A)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\numpy\linalg\_linalg.py:609, in inv(a)
    606 signature = 'D->D' if isComplexType(t) else 'd->d'
    607 with errstate(call=_raise_linalgerror_singular, invalid='call',
    608               over='ignore', divide='ignore', under='ignore'):
--> 609     ainv = _umath_linalg.inv(a, signature=signature)
    610 return wrap(ainv.astype(result_t, copy=False))

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\numpy\linalg\_linalg.py:104, in _raise_linalgerror_singular(err, flag)
    103 def _raise_linalgerror_singular(err, flag):
--> 104     raise LinAlgError("Singular matrix")

LinAlgError: Singular matrix

So the argument is then to make it such that if we do

with np.errstate(<some setting tbd>='ignore'):
    np.linalg.inv(A)

LinAlgWarning: input [3, ..., ...] is ill-conditioned. The results may be inaccurate (rcond=....)

it returns a partially NaN-filled array. Am I seeing the picture correctly?

@seberg (Member) commented Apr 24, 2025

Well, clearly, something="ignore" wouldn't give a warning, but something="warn" would. I suppose my all="ignore" suddenly also applying here is probably a harmless change in practice.
The question may just be, should this be linalg=, something broader, or just a different name (even np.linalg.<...>)?
(Implementation-wise it would always be a part of errstate, although FPEs are still different beasts as you don't have to signal them manually; for ufuncs/casts, the ufunc machinery does it for you.)

@ilayn (Contributor) commented Apr 24, 2025

That's a very juicy decision discussion to have. I guess one detail for us to understand, since we will probably follow your lead:

Can we get the context before we start the operations? Meaning: if raising LinAlgError is selected and we will have a hard stop on an error, then at the C level we can skip the rest of the array computations since they would be wasted anyway. In more concrete terms, if we know the context and A[2, :, :] is singular, we had better skip A[3:, :, :] since we know there is no point in continuing.

If we can get the context beforehand then we can pass this information to the actual algorithm for an early exit. I think this would affect the final decision.
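
In Python terms, the fail-fast idea being asked about could be sketched roughly as follows; the "linalg" errstate key is hypothetical (np.geterr() has no such entry today, so the .get() fallback simply reproduces current behavior):

import numpy as np

A = np.zeros((10, 5, 5)) + np.eye(5)
A[7] = np.arange(25).reshape(5, 5)             # exactly singular slice

mode = np.geterr().get("linalg", "raise")      # hypothetical key; falls back to "raise"
out = np.full_like(A, np.nan)
for i, a_i in enumerate(A):
    try:
        out[i] = np.linalg.inv(a_i)
    except np.linalg.LinAlgError:
        if mode == "raise":
            raise                              # fail fast: skip the remaining slices
        # mode == "ignore": leave this slice NaN-filled and continue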

@ev-br (Contributor) commented Apr 24, 2025

if ... A[2, :, :] is singular we better skip A[3:, :, :] since we know there is no point in continuing.

I still think this is wrong.
IMO the "fail fast" option is to raise if any slice is singular. Better not depend on the ordering of slices.
So the UI options IMO should be:

  • raise if any slice is singular
  • compute all, silently fill singular ones with nans
  • as above, with diagnostics.

The "always"/"once" is a Python warning option. I think it's best to just keep it that way and not overcomplicate things here. (It's also too difficult anyway, since Python "once" takes into account the call site.)

Yes it is. My point is that if we're designing a context manager, it had better include the warning control, so that a user can use a single context manager instead of two: one for the linalg state and the other for with warnings.catch_warnings(...).

@seberg (Member) commented Apr 24, 2025

Can we get the context before we start the operations

Let's separate this out:

  1. For existing FPEs/errstate, you can already query the state in Python (if slowish/clunky). So that is clearly no problem.
  2. For ufuncs you can't easily currently (of course you could just query the state via Python). The future here is probably:
    • We have a "context" now. Simply tag a pointer to the state on to that context; this will make it easy to query the state both at loop selection time and within the inner-loop itself (all with new ABI only, obviously).
    • There might be a possibility via thread-locals, but since context should be easy not sure there is a point.
  3. For non-ufuncs, just create a simple API to query the errstate from C.

So the UI options IMO should be: * raise if any slice is singular

Yes, implementation-wise for linalg functions a fail-fast may be the right thing to do though (a difference only for in-place out= anyway). UI wise, I don't consider it as important.
(I don't think we disagree on any of this? Besides maybe whether fail-fast is important or not.)

My point is that if we're designing a context manager, it had better include the warning control

Well, np.errstate doesn't include it. I still lean towards thinking it is unnecessary complexity, but even if you consider it seriously, I think it is orthogonal? You could extend errstate to also mutate the Python warning state after all, although it may be hard to convince me personally :).

@ilayn (Contributor) commented Apr 24, 2025

IMO the "fail fast" option is to raise if any slice is singular. Better not depend on the ordering of slices.

Yes, if A[2, :, :] is singular we raise the error, so I'm not sure I understand your objection. I'm trying to skip the unnecessary wait until the error is raised. So it is a quick exit: "raise as soon as you encounter singularity". The point is not to wait 10 seconds only to realize the first slice was already singular.

Since we don't implement these as ufuncs, this detail is important for us and not so much for NumPy.

@ilayn (Contributor) commented Apr 24, 2025

Let's separate this out:

OK, item 2 is also not so relevant for linalg batches anyway, so we are good to go in that regard.

@seberg (Member) commented Apr 25, 2025

OK, let me just argue for these 2-3 things to start with:

  1. Expand the internal extobj (and python errstate) to store a linalg errstate.
    • reject print and call for it.
    • But do update it for all=....
    • all="call"/"print" needs thought. The practical thing is probably to just ignore it for linalg, but maybe deprecate it also. We could have an fpe= to replace the current all= but I am not sure it's worthwhile.
      (I would be happy to try deprecating call/print entirely, however we need a solution for linalg. OTOH, a try/except to replace the error is probably good enough in practice.)
  2. As a first iteration, linalg can roughly keep doing what it's doing, but choose different paths based on np.geterr()["linalg"].
  3. (We may want to:)
    • Add something like np.geterr("linalg") to simplify/micro-optimize np.geterr()["linalg"] (creating a dict is a bit much).
    • Add a C-API function to query the errstate. This requires thinking about its ABI (right now it's an int, it should be an uint32 or uint64 probably).

This should all be pretty straightforward, and 3. is only there to make things a bit nicer, especially for SciPy. The trickiest part should really be dancing around the call/print option (and deciding how to perform that dance).
(However, it does require C changes.)
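
To make item 2 above a bit more concrete, a rough Python-level sketch; the "linalg" geterr entry, the inv_with_errstate name, and the warn mode are all hypothetical placeholders, not NumPy API:

import warnings
import numpy as np

def inv_with_errstate(a):
    # Branch on a (not yet existing) "linalg" errstate entry; a missing key means today's behavior.
    mode = np.geterr().get("linalg", "raise")
    if mode == "raise":
        return np.linalg.inv(a)                # LinAlgError on any singular slice, as today
    out = np.full(a.shape, np.nan, dtype=np.result_type(a, 1.0))
    for idx in np.ndindex(a.shape[:-2]):       # loop over the stacked matrices
        try:
            out[idx] = np.linalg.inv(a[idx])
        except np.linalg.LinAlgError:
            if mode == "warn":
                warnings.warn(f"slice {idx} is singular; result left as NaN")
    return out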

Labels: 01 - Enhancement, triage review
Linked issue: ENH: Allow disabling error for numpy.linalg.inv when one matrix is singular in a stack (#27035)