WIP: MAINT: Made clip into an ufunc #7876

pimdh · 2016-07-28T00:30:02Z

In order to solve issue #7633, it was suggested at #7873 to make clip into a ufunc.
This is a partial implementation of that. Left to do:

- Implement proper typecasting
- Properly handle None and NaN
- Add a faster loop to loops.c.src to speed up if min or max are not arrays
- Update the docs
- Create benchmarks and test the new implementation's performance
- Remove remnants of the old implementation from
  - numeric.py
  - fromnumeric.py
  - numpy_api.py
  - multiarray_api.*
  - arraytypes.*.src
  - calculation.*

But before I continue, I'd like to ask the following:

Am I on the right track?
How do I implement proper type casting? Currently, for example, test_clip_with_out_array_int32 fails because not all the arguments are of the same type. The message is: TypeError: ufunc 'clip' output (typecode 'd') could not be coerced to provided output parameter (typecode 'i') according to the casting rule ''same_kind''
In arraytypes.c fastclip is written down very concisely by doing almost all types at once. In loops.c.src, however, all definitions are split into the main categories (ints, floats, etc.). So far, I've followed that expanded approach, but the concise version seems nicer. Is it acceptable to add the concise combined version to the bottom of loops.c.src instead?

seberg · 2016-07-28T11:39:55Z

Nice efforts! As is, I think what might work is if you create a special type resolver for it (there may already be one which does it also).

We could consider creating the special type resolver also, and give a FutureWarning that in the future it will not always cast to a floating value magically, but will also honor integer loops, etc.

I don't have the time right now, but if you can't find it, just make a note or I will probably forget to give more hints.

seberg · 2016-08-05T07:13:54Z

A bit of a look, it looks like you are on the right track. I would say that you can put it into a single loop if possible. Not sure about the error on first sight, it sounds a bit like it does not have an iii->i loop?
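To illustrate why a missing iii->i loop could produce the reported error, here is a toy model in plain Python. It is a sketch only and not NumPy's actual type resolver: the names SAFE_CASTS, KIND, resolve and out_is_acceptable are hypothetical, and only the two type codes 'i' and 'd' are modelled. The idea is that, with only a ddd->d loop registered, integer inputs are safely promoted to double, and the double result can then no longer be written into an integer out array under the same_kind rule.

```python
# Toy model only: NOT NumPy's real type resolver. All names are hypothetical.
SAFE_CASTS = {("i", "i"), ("i", "d"), ("d", "d")}  # int -> double is safe, not the reverse
KIND = {"i": "int", "d": "float"}


def can_cast_same_kind(src, dst):
    """A cast is 'same_kind' if it is safe or stays within one kind."""
    return (src, dst) in SAFE_CASTS or KIND[src] == KIND[dst]


def resolve(loops, in_types):
    """Return the first loop signature whose inputs accept in_types by safe casting."""
    for signature in loops:
        loop_inputs, _ = signature.split("->")
        if all((have, want) in SAFE_CASTS
               for have, want in zip(in_types, loop_inputs)):
            return signature
    raise TypeError("no loop accepts input types %r" % (in_types,))


def out_is_acceptable(loops, in_types, out_type):
    """Check whether the chosen loop's output may be cast to out_type."""
    loop_output = resolve(loops, in_types).split("->")[1]
    return can_cast_same_kind(loop_output, out_type)


# Only a double loop: int inputs resolve to 'ddd->d', and d -> i is rejected.
only_double = ["ddd->d"]
# With an integer loop registered first, the int output matches the out array.
with_int = ["iii->i", "ddd->d"]
```

In this model, out_is_acceptable(only_double, "iii", "i") is False, which mirrors the TypeError above, while out_is_acceptable(with_int, "iii", "i") is True.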

pimdh · 2016-08-13T23:03:37Z

Thanks for your comments. I can replicate the previous type behaviour by creating a type resolver that forces the casting to be unsafe. I'm not sure how to introduce any warnings, because the type resolver doesn't seem to have access to the target array type, so the error can't be raised there.

I come across another problem:
The expected behaviour concerning NaN would be to propagate NaN for the first argument, but not for the min or max arguments. This can be easily implemented in the ufunc loops. However, a ufunc does not allow for a variable number of arguments, and neither fmin/fmax nor minimum/maximum allows for this asymmetrical NaN propagation. The only solution I can think of would be to create two more ufuncs that follow this behaviour, but that seems very inelegant.

Does anyone have any ideas how to address these issues? Thanks
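The asymmetrical NaN behaviour described above can be sketched for a single element in plain Python. This is a minimal illustration rather than the PR's loop; the function name clip_asymmetric is hypothetical, and treating a NaN bound as "no bound" is an assumption about what "not propagating" should mean.

```python
import math


def clip_asymmetric(x, lo, hi):
    """Clip one float: NaN in x propagates, NaN in lo or hi is ignored."""
    if math.isnan(x):
        return x  # NaN in the first argument is propagated
    if x < lo:    # comparison is False when lo is NaN, so a NaN bound is skipped
        return lo
    if x > hi:    # likewise False when hi is NaN
        return hi
    return x
```

A NaN value stays NaN, whereas clip_asymmetric(3.0, float("nan"), 2.0) still clips against the upper bound and returns 2.0.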

pimdh · 2016-08-14T16:11:55Z

There seems to be another NaN issue. All Travis builds except number 5 show the error
RuntimeWarning: invalid value encountered in clip, which I managed to reproduce on Ubuntu 12.04, Python 2.7.3 with:
np.array([np.NaN]).clip(1,2)

However, when I change

NPY_NO_EXPORT void
@TYPE@_clip(char **args, npy_intp *dimensions, npy_intp *steps, void *NPY_UNUSED(func))
{
    TERNARY_LOOP {
        const @type@ in = *((@type@ *)ip1);
        const @type@ min = *((@type@ *)ip2);
        const @type@ max = *((@type@ *)ip3);

        if (@lt@(in, min)) {
            *((@type@ *)op1) = min;
        }
        else if (@gt@(in, max)) {
            *((@type@ *)op1) = max;
        }
        else {
            *((@type@ *)op1) = in;
        }
    }
}

, which yields

>>> np.array([np.NaN]).clip(1,2)
__console__:1: RuntimeWarning: invalid value encountered in clip
array([ nan])

, to

NPY_NO_EXPORT void
@TYPE@_clip(char **args, npy_intp *dimensions, npy_intp *steps, void *NPY_UNUSED(func))
{
    TERNARY_LOOP {
        const @type@ in = *((@type@ *)ip1);
        const @type@ min = *((@type@ *)ip2);
        const @type@ max = *((@type@ *)ip3);

        printf(in < min ? "a" : "b");
        printf(in > max ? "c" : "d");
        if (@lt@(in, min)) {
            *((@type@ *)op1) = min;
        }
        else if (@gt@(in, max)) {
            *((@type@ *)op1) = max;
        }
        else {
            *((@type@ *)op1) = in;
        }
    }
}

, it shows

>>> np.array([np.NaN]).clip(1,2)
bdarray([ nan])

Can anyone advise on what causes this? Thanks

seberg · 2016-08-15T15:00:56Z

Hmm, I don't really know the details of floating point flags. You could probably check for NaN first to avoid the flag being set, or possibly unset the flag (@juliantaylor might know without thinking much).
Things changing with the prints might be because of different vectorization or so; it is nothing unusual, I think.

About giving a warning: one thing we did before was to introduce a special casting flag like "FORCE_CAST_BUT_WARN". It is a bit of a kludge, but it works.

anntzer · 2016-11-20T03:04:11Z

Could #5142 (behavior of clip when amin > amax) be fixed during this rewrite too? Just a suggestion from the peanut gallery :-)
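To show why the amin > amax case is sensitive to how the loop is written, here is a small plain-Python comparison of two natural formulations. It is a sketch under assumptions: both function names are hypothetical, and it makes no claim about which result NumPy produces or which one #5142 asks for.

```python
def clip_branching(x, lo, hi):
    """Mirror of a loop that tests the lower bound first, then the upper bound."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def clip_min_max(x, lo, hi):
    """Composition of maximum and minimum: min(max(x, lo), hi)."""
    return min(max(x, lo), hi)


# With lo > hi the two formulations disagree for small inputs.
inverted = [(x, clip_branching(x, 5, 2), clip_min_max(x, 5, 2)) for x in (1, 3, 6)]
```

For lo=5 and hi=2, the branching version returns 5, 5, 2 for the inputs 1, 3, 6, while the min/max composition returns 2 for all three, so the choice of implementation fixes the behaviour in this edge case.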

homu · 2017-01-16T20:02:23Z

☔ The latest upstream changes (presumably #8475) made this pull request unmergeable. Please resolve the merge conflicts.

seberg · 2017-01-16T21:32:58Z

Since homu mentioned it, looked at it again. In principle I still really like it, and I think the NaN problems can probably be gotten around (mostly) (on linux there is a comparison function which will not set the floating point error flags for example).

One problem I still see is that the clip function may allow None to mean that no clipping should be done, which is something that the ufunc cannot support (except by using the maximum value). Or maybe I am being silly and this is only true in the C-API, though...

If you ever pick it up again or decide it's too complex, we can still warm up the old PR to just fix the out problem...
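The workaround of replacing a missing bound with an extreme value can be sketched for floats in plain Python. This is an illustration only: normalize_bounds and clip_scalar are hypothetical names, and using infinities is an assumption that fits floating point values. For integer types the sentinel would have to be the smallest or largest value of that type instead.

```python
import math


def normalize_bounds(lo=None, hi=None):
    """Replace a missing bound by an infinite sentinel so three arguments always exist."""
    lo = -math.inf if lo is None else lo
    hi = math.inf if hi is None else hi
    return lo, hi


def clip_scalar(x, lo=None, hi=None):
    """Clip one float; None for a bound means no clipping on that side."""
    lo, hi = normalize_bounds(lo, hi)
    return min(max(x, lo), hi)
```

With this wrapper, clip_scalar(5.0, hi=2.0) clips only from above, and clip_scalar(7.0) returns its input unchanged.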

pimdh · 2017-01-16T22:01:04Z

I'll take another look at this this weekend, see if I can finish it / make some progress.

eric-wieser · 2017-04-26T21:43:49Z

@drumstok: Think you'll return to this, or do you want someone else to take over?


In order to solve issue #7633, it was suggested to make clip into a ufunc. This is a partial implementation of that. The tests should pass except for test_clip_nan (test_numeric.TestClip), which tests NaN behaviour if either min or max is missing. Ufuncs don't seem to allow for missing arguments, so ideally (f)min/max are used. However, the expected behaviour is asymmetrical: propagate NaN for the first, but not for the other arguments. None of the existing ufuncs has this behaviour. Left to do:

- Handle asymmetrical NaN behaviour if either min or max is missing
- Implement warning if unsafe typecasting is used
- Add a faster loop to loops.c.src to speed up if min or max are not arrays
- Update the docs
- Create benchmarks and test the new implementation
- Remove remnants of the old implementation from
  - numeric.py
  - fromnumeric.py
  - numpy_api.py
  - multiarray_api.*
  - arraytypes.*.src
  - calculation.*

pimdh · 2017-06-10T15:56:15Z

I have rebased and incorporated #8475 in the ufunc docstrings. However, having looked again at the work that still needs to be done, I am fine with someone taking over. The complications that arise from the NaNs and from allowing missing arguments make this a bit too complicated for me at the moment. (@eric-wieser)

eric-wieser · 2018-12-19T17:08:34Z

Continued in #12519

seberg · 2018-12-19T17:30:30Z

I guess we can close it for now then in favor of the new approach. The 3 ufunc approach seems more realistic anyway.

Thanks all!

pimdh mentioned this pull request Jul 28, 2016

BUG: Fix handling of clip out= argument order properly #7873

Closed

charris added 01 - Enhancement component: numpy._core 03 - Maintenance labels Jul 28, 2016

eric-wieser mentioned this pull request May 1, 2017

ENH: Vectorize INT_FastClip operation using AVX2 #9037

Closed

eric-wieser mentioned this pull request May 9, 2017

Adding support for one-sided clipping #8990

Closed

pimdh added 3 commits June 10, 2017 16:03

Bugfix and incorporated docstring improvement #8475

1d14f84

seberg closed this Dec 19, 2018

eric-wieser mentioned this pull request Feb 4, 2019

MAINT: Merge together the unary and binary type resolvers #12928

Merged

eric-wieser mentioned this pull request Feb 25, 2019

MAINT: Replace if statement with a dictionary lookup for ease of extensibility in ufunc generator #13031

Merged
