ENH: Added FuncNorm #7631

Closed · wants to merge 4 commits

Conversation

@alvarosg (Contributor) commented Dec 16, 2016:

This PR is part of a larger PR originally proposed to include FuncNorm and PiecewiseNorm. It was decided to split it into several PRs for easier review, starting from the simpler classes.

In this case the functionality added is FuncNorm, a normalization class (inheriting from Normalize) that allows using any arbitrary function as a normalization, by specifying a callable (and a callable for the inverse function), or a string compatible with the brand-new _StringFuncParser.

Examples of usage for log normalization:

    norm_log = colors.FuncNorm(f='log10', vmin=0.01)

the same can be achieved with

    norm_log = colors.FuncNorm(f=np.log10,
                               finv=lambda x: 10.**(x), vmin=0.01)

For root normalization:

    norm_sqrt = colors.FuncNorm(f='sqrt', vmin=0.0)

the same can be achieved with

    norm_sqrt = colors.FuncNorm(f='root{2}', vmin=0.)

or with

    norm_sqrt = colors.FuncNorm(f=lambda x: x**0.5,
                                finv=lambda x: x**2, vmin=0.0)
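
For context, a minimal end-to-end sketch assuming the API proposed in this PR (FuncNorm and its ticks method exist only on this branch; pcolormesh and colorbar are standard Matplotlib):

    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.colors as colors

    # Strictly positive data, suitable for a log normalization.
    X, Y = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
    data = 0.01 + 2 * X * Y

    norm_log = colors.FuncNorm(f='log10', vmin=0.01)  # API proposed in this PR

    fig, ax = plt.subplots()
    cax = ax.pcolormesh(X, Y, data, norm=norm_log)
    # ticks() is the new method discussed below; 5 is the requested tick count.
    fig.colorbar(cax, format='%.3g', ticks=cax.norm.ticks(5), ax=ax)
    plt.show()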

Tests have been added, as well as an example producing this output:

[figure: FuncNorm example, showing each normalization curve next to the resulting colormapped image]

Possible caveat:

Most of the behaviour provided by this class does not change the existing interfaces for normalizations, as most public methods of the new class are just overridden methods from Normalize.

The only exception to this is the ticks method, which returns an educated guess on where to add tick values, and which would be a new public method in the class.

On the other hand, I think it works quite well (even for very non-linear normalizations), and it is important to give the user a good guess at where to put the ticks, so they are not required to manually enter values (or come up with their own automated algorithm). This way the only thing the user needs to specify is a number of ticks, which may also easily be given a default value. We should think about whether we want to include this or not, or maybe do it in a different way.

The way it is done now is by an explicit call to norm.ticks():

    fig.colorbar(cax, format='%.3g', ticks=cax.norm.ticks(5), ax=ax2)

Update: a FuncLocator class has now been implemented, so ticks are set automatically at the suggested positions without explicit user input, in exactly the same way it was done for other normalizations like LogNorm.

@alvarosg (Contributor, Author):

@efiring @story645 @QuLogic
In case you are also interested in helping review this one :)

(norm_log, 'Log normalization'),
(norm_sqrt, 'Root normalization')]

for i, (norm, title) in enumerate(normalizations):
Member:

No need for i: for ax_row, (norm, title) in zip(axes, normalizations):

Contributor (Author):

great idea :)

plt.show()


def get_data(_cache=[]):
Member:

I don't understand why the data needs to be produced in the loop if it's just going to be cached. Seems a bit over-engineered for an example.

Contributor (Author):

Yes, you are right. The reason it is like this is because originally each of the plots was done independently inside a function, but I changed it to a loop to comply with feedback from @story645, and I forgot to change that. Thanks :)
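
For readers unfamiliar with the idiom in the excerpt above, a minimal sketch of caching via a mutable default argument (the actual example's data generation differs):

    import numpy as np

    def get_data(_cache=[]):
        # The default list is created once and persists across calls,
        # so it can hold a computed-once result.
        if not _cache:
            _cache.append(np.random.rand(100, 100))
        return _cache[0]

    a = get_data()
    b = get_data()
    assert a is b  # the second call returns the cached array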

"""
Specify the function to be used, and its inverse, as well as other
parameters to be passed to `Normalize`. The normalization will be
calculated as (f(x)-f(vmin))/(f(max)-f(vmin)).
Member:

max or vmax?

values in the [0,1] range.
"""

def __init__(self, f, finv=None, **normalize_kw):
Member:

This should be **kwargs.

Contributor (Author):

I understand that kwargs is what is used in the general case; however, I used normalize_kw because all of these parameters are to be passed to the parent class Normalize. This is the same naming convention used for subplots.

Member:

That's not the same; those are dictionaries passed as individual arguments. In this case, it's not an argument, it's the placeholder that accepts all other non-explicit keyword arguments.

@alvarosg (Contributor, Author) Dec 16, 2016:

You are completely right. In that case I am thinking that it may just be better to take vmin, vmax, and clip directly, and pass them explicitly to the parent class. Any downside to doing it like that?

Member:

We should probably just follow the example of the other classes; LogNorm, BoundaryNorm and NoNorm only accept clip and the rest accept all three explicitly, so being explicit seems to be the best choice.

Contributor (Author):

Actually LogNorm does not have an initialization function, so it implicitly takes all three of them as well. So yes, I am convinced that it may be better to include them explicitly. It may be worth in that case putting the documentation for vmin, vmax, and clip in common variables so it can be reused across different classes, similarly to what they do here.
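
A sketch of what the explicit signature the thread converges on might look like (assuming the PR's internals; attribute names here are illustrative):

    from matplotlib.colors import Normalize

    class FuncNorm(Normalize):
        def __init__(self, f, finv=None, vmin=None, vmax=None, clip=False):
            # Forward the three Normalize parameters explicitly
            # instead of collecting them in **normalize_kw.
            super(FuncNorm, self).__init__(vmin=vmin, vmax=vmax, clip=clip)
            self._f = f        # forward function (callable or parsed string)
            self._finv = finv  # inverse: finv(f(x)) == x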

Inverse function of `f` that satisfies finv(f(x))==x. It is
optional in cases where `f` is provided as a string.
normalize_kw : dict, optional
Dict with keywords (`vmin`,`vmax`,`clip`) passed
Member:

It may be a dict in this function, but it's just all-other-keyword-args to any caller.

Member:

Agree with @QuLogic that these should be individually documented.

resultnorm[mask] = (self._f(result[mask]) - self._f(vmin)) / \
(self._f(vmax) - self._f(vmin))

return np.ma.array(resultnorm)
Member:

Why is it a MaskedArray? Is that just what other Norms do? It doesn't seem like anything is actually masked.

Contributor (Author):

Precisely; the parent class returns masked arrays, even though it does not ever really set the mask to anything. It would make sense to use them for values outside the range; the problem is that in that case there would not be a way to say whether they are above the maximum value or below the minimum, and the plotting methods need this to use the under and over colours.

if clip:
result = np.clip(result, vmin, vmax)
resultnorm = (self._f(result) - self._f(vmin)) / \
(self._f(vmax) - self._f(vmin))
Member:

Would it be useful to cache any of these?

Contributor (Author):

I have been considering this, but the problem is that the cache would depend on vmin and vmax, and checking whether the cache is up to date, along with having to include new variables for the cache, would make the code much uglier.

I guess it would make sense in cases where evaluating f is expensive, but even in those cases, we would still have to evaluate f(result), which typically consists of many values. Also, in general, the functions typically used for normalization should not be very expensive to evaluate... (although we should never underestimate the user, hehehe)
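
To illustrate the trade-off being weighed here, a hedged sketch of what caching the endpoint evaluations might look like; the names are hypothetical, and the invalidation bookkeeping is the "ugliness" referred to above:

    class _EndpointCache(object):
        """Cache f(vmin) and f(vmax), recomputing only when limits change."""

        def __init__(self, f):
            self._f = f
            self._key = None  # the (vmin, vmax) pair the cache was built for

        def get(self, vmin, vmax):
            if self._key != (vmin, vmax):
                # Limits changed since the last call: recompute endpoints.
                self._fvmin = self._f(vmin)
                self._fvmax = self._f(vmax)
                self._key = (vmin, vmax)
            return self._fvmin, self._fvmax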

return value

@staticmethod
def _fun_normalizer(fun):
Member:

This appears unused?

Contributor (Author):

Oh, yes, this was used by some of the derived classes in the original PR, and I figured this was the best place for all of them to have access, as it is a general-purpose normalization feature. I will remove it for now, and then we can decide where to include it when it is first needed.

assert_array_equal(norm([0.01, 2]), [0, 1.0])

def test_limits_without_vmin(self):
norm = mcolors.FuncNorm(f='log10', vmax=2.)
Member:

This is the same vmax you would get if you didn't set it, so I guess it doesn't really test that it's working.

Contributor (Author):

Yes, but that is a test in itself :P
You are right though; I will include tests where the values go above and below vmin and vmax, with and without the clip option.

Contributor (Author):

I have added test_clip_true, test_clip_false, and test_clip__default_false to test the clipping behavior.

Contributor (Author):

Those changes should pretty much address the original comment

if self.vmin > self.vmax:
raise ValueError("vmin must be smaller than vmax")

def ticks(self, nticks=13):
Member:

Not sure about the mixing of concerns here, but I'll leave that to @efiring to determine.

@alvarosg (Contributor, Author) Dec 16, 2016:

Yeah, I also was not sure about this, because technically vmin and vmax do not belong to this class. Actually the only thing my autoscale methods do differently is convert to float, so maybe I should just make a tiny change to the autoscale methods of Normalize to resemble this:

Methods in Normalize:

    def autoscale(self, A):
        self.vmin = np.ma.min(A)
        self.vmax = np.ma.max(A)

    def autoscale_None(self, A):
        ' autoscale only None-valued vmin or vmax'
        if self.vmin is None and np.size(A) > 0:
            self.vmin = np.ma.min(A)
        if self.vmax is None and np.size(A) > 0:
            self.vmax = np.ma.max(A)

Methods in FuncNorm:

    def autoscale(self, A):
        self.vmin = float(np.ma.min(A))
        self.vmax = float(np.ma.max(A))

    def autoscale_None(self, A):
        if self.vmin is None:
            self.vmin = float(np.ma.min(A))
        if self.vmax is None:
            self.vmax = float(np.ma.max(A))
        self.vmin = float(self.vmin)
        self.vmax = float(self.vmax)
        if self.vmin > self.vmax:
            raise ValueError("vmin must be smaller than vmax")

@efiring would it be OK to include those changes (casting to float and the vmax > vmin check) in Normalize, and remove the methods from FuncNorm?


return _cache[0]

main()
Member:

feel like this should be in a main block, and might as well just directly put all the plotting code there instead of shoving it into a function...so

if __name__ == '__main__':
    all the code currently in main()

@alvarosg (Contributor, Author) Dec 17, 2016:

The reason I did not do the main originally is that the examples now generate the figures automatically through some process, and @efiring and I were not sure whether that process would run the file as main.

Yeah, of course; I guess the only reason to have things in a function is so the data generation could come after the rest, but maybe now that it is much shorter, it will not look that bad right in between where the norms are generated and where the loop starts.

@QuLogic (Member) Dec 17, 2016:

The vast majority of examples (like >90%) use neither main, nor __main__ stuff, though I'm sure some examples use classes derived from backend-specific things that I didn't count properly.

fig, axes = plt.subplots(3, 2, gridspec_kw={
'width_ratios': [1, 3.5]}, figsize=plt.figaspect(0.6))

# Example of logarithm normalization using FuncNorm
Member:

extraneous comment 'cause code explicitly shows this


# Example of logarithm normalization using FuncNorm
norm_log = colors.FuncNorm(f='log10', vmin=0.01)
# The same can be achieved with
Member:

this comment should be dropped and there should be a standalone example for that feature


# Example of root normalization using FuncNorm
norm_sqrt = colors.FuncNorm(f='sqrt', vmin=0.0)
# The same can be achieved with
Member:

Same as above; presenting users with 3 ways to do things off the bat can be super confusing. Also, is it really necessary to use two examples of the same norm in 1 example? I know you'll likely tell me it's more realistic, but I think examples should fundamentally be as small/basic as possible while still showing the functionality.

Member:

two examples of the same norm in 1 example?

If you're referring to the two Axes, one is the norm function, the other is the actual usage for a colormap; see the figure at the top.

Contributor (Author):

I do not really see a problem with having an example of multiple uses in this case (both the log and the sqrt), but I am happy to change it if everyone thinks that the example image is not appropriate.

Contributor (Author):

I also think that presenting multiple ways of doing the same thing (with the extra comment) gives the user extra insight into what can be done with the class, at a very low cost. But again, if everyone agrees that the comments are inappropriate, I am also happy to remove them.

Member:

@QuLogic I was talking about the log and the sqrt norms and then also providing multiple methods of doing things. @alvarosg is there a colleague you can "hallway test" these docs (and warning messages) with? You have them read the doc/message and just ask what they think (and how they think it could be improved).

ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax2)
ax2.set_title(title)
ax2.axes.get_xaxis().set_ticks([])
Member:

ax2.xaxis.set_ticks([])
ax2.yaxis.set_ticks([])

f = func_parser.function
finv = func_parser.inverse
if not callable(f):
raise ValueError("`f` must be a callable or a string.")
Member:

f must be a function or a string (I don't like using callable in user facing docs 'cause I think it's a little too dev space)

Contributor (Author):

I always assume the user of a Python module will also be a developer, and callable is a Python keyword, so IMO it is clearer than function.

Member:

That's not true at all in this case, though. You've got plenty of users for matplotlib in particular who are scientists but not devs, who aren't gonna be familiar with any Python keyword they don't use all the time (and callable is rarely in that set).

Member:

I think that at least when they are native English speakers they can figure it out quickly enough from the context and the structure of the word itself, "callable" -> "call" "able" -> "something that can be called". The word "string" would be much harder to understand than "callable"--it's pure comp-sci jargon, not used anywhere else in this way, and not something that can be figured out from the word itself. We are not going to delete uses of "string" or "callable".

@story645 (Member) Dec 24, 2016:

Callable is equivalent to function. You'd still need to mention it was a string. And string is different 'cause it's used in every single intro-Python everything; callable isn't. Honestly, callable trips me up all the time, and I'm a native English speaker with a CS background.

Member:

Basically, I dunno, I see your point, but a) I'm always wary of straight transcriptions of the if statements that triggered the exceptions being the error messages; b) I sort of think there should maybe be a bigger discussion of who matplotlib's expected audience is.

Contributor (Author):

I would leave it callable, because I think it is a more accurate term. I think anyone able to use a callable (to pass it to the function) should know the term, and if not should be able to do a 5-second Google search. In any case, let's not waste our energy discussing this, as I think it is pretty irrelevant.

@story645 (Member) Jan 16, 2017:

While I agree with you that this specific thing probably isn't worth fighting about, I feel in a general sense that it's bad practice to dismiss a usability concern as "well they should know what it's called and how to search for it" 'cause rarely are either of those statements true.

self._f = f
self._finv = finv

super(FuncNorm, self).__init__(**normalize_kw)
Member:

any particular reason this is put at the end rather than upfront?

Contributor (Author):

Nope, I will move it up

result, is_scalar = self.process_value(value)
self.autoscale_None(result)

vmin = float(self.vmin)
Member:

Does self.vmin/self.vmax need to be converted to float? I think there's an import at the top that forces division to always be floating point...

Contributor (Author):

This is a very good point; I had not noticed that. I guess there is no need, in that case. Thanks!

resultnorm = result.copy()
mask_over = result > vmax
mask_under = result < vmin
mask = (result >= vmin) * (result <= vmax)
Member:

What about something like this, 'cause I feel like this is a bit too much on the clever-but-obfuscating side?
mask = mask_over | mask_under
and then just use ~mask everywhere you're using mask.

Member:

Or mask = ~(mask_over | mask_under) or mask = ~mask_over & ~mask_under?

Contributor (Author):

Sure, I am always up for improving the efficiency!
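
For concreteness, the suggested constructions are equivalent (for non-NaN data) on a toy array:

    import numpy as np

    result = np.array([-1.0, 0.5, 2.5])
    vmin, vmax = 0.0, 2.0
    mask_over = result > vmax
    mask_under = result < vmin

    mask_a = (result >= vmin) * (result <= vmax)   # original: * as logical AND
    mask_b = ~(mask_over | mask_under)             # suggested form
    mask_c = ~mask_over & ~mask_under              # equivalent form

    assert (mask_a == mask_b).all() and (mask_b == mask_c).all()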

return ticks

@staticmethod
def _round_ticks(ticks, permanenttick):
Member:

Same as @QuLogic, question for @efiring about mixing concerns. Wonder if all the tick stuff should be in a private class (or public) in ticker, and then normalize should just point to the default formatter and locators it should use. (This is an issue I ran headlong into w/ categorical norming too...)

Contributor (Author):

Yes, I am definitely not very happy with the way this currently is...

and then normalize should just point to the default formatter and locators it should use

How does Normalize communicate with the formatters and tickers? Is there any good example around?

@story645 (Member) Dec 17, 2016:

How does Normalize communicate with the formatters and tickers? Is there any good example around?

It's really messy currently (code is in colorbar.py) and would probably require a refactor in the call stack. But that doesn't really matter to you, since you're having them explicitly get ticks via

    ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
    fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax_right)

cax.norm.ticks should really be its own tick Locator method that locates ticks based on some input (I guess convoluted functions). The downside is that it can't rely on the attributes in the norm (unless it's something like FuncNormLocator(norm)), but I think that prevents scope creep in norms.

Contributor (Author):

Yes, but ideally I would not like the user to have to call ticks manually; they should get those ticks automatically. I was not sure how to change the default ticker, though, to maybe implement a FuncTicker class...

Member:

It would be easy to modify colorbar to use the ticks method of its Norm, if it exists, and if ticks are not provided by the user. The alternative of having all tick locators and formatters in tickers.py, and having Norms include a method or attributes for default locators and formatters, is also reasonable. I'm going to leave this question open for the moment, but we will need to return to it. I suspect the second of these two approaches will turn out to be the best.

Contributor (Author):

Yes, it would, but the problem is that the colorbar has two different ways to represent itself, depending on spacing:

  • 'uniform': Represent the colorbar uniformly between 0 and 1 and then assign values for the ticks that are not uniform, according to the normalization. This is the one I normally use, and the one that should use the tick values returned by ticks.
  • 'proportional': Stretch/compress the colorbar to represent the non-linearities given by the normalization, so the actual axis in the colorbar is uniform in the data values. In this case selecting the ticks is the same as with any linear axis.

What if I just make a FuncNorm locator class, and then add the corresponding line here?

Contributor (Author):

@efiring @story645 I have now made a new class, FuncLocator (and tests), to put the behavior of proposing tick locations in a more appropriate place.

Instead of taking the norm itself as a parameter of the locator, I decided to just pass methods for the direct and inverse transformations; references to the methods of the norm object are passed at initialization. This way the ticker module does not depend directly on colors, or on FuncNorm in particular, but still, if the FuncNorm instance is modified (its limits, for example, after calling clim), FuncNorm will adapt the behaviour of its methods and this will be available to the locator.

If you could please take a look, I am sure you can provide useful feedback.

PS: Happy new year :D
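
A hedged sketch of the idea described above, not the PR's exact code: a Locator built from the norm's direct and inverse transformations, placing ticks uniformly in normalized space and mapping them back to data space:

    import numpy as np
    from matplotlib import ticker

    class FuncLocator(ticker.Locator):
        def __init__(self, forward, inverse, nticks=5):
            self._forward = forward  # e.g. a bound method of the norm
            self._inverse = inverse  # its inverse transformation
            self._nticks = nticks

        def tick_values(self, vmin, vmax):
            fmin, fmax = self._forward(vmin), self._forward(vmax)
            # Uniform in transformed space, mapped back to data space.
            return self._inverse(np.linspace(fmin, fmax, self._nticks))

        def __call__(self):
            vmin, vmax = self.axis.get_view_interval()
            return self.tick_values(vmin, vmax)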

@phobson (Member) commented Dec 16, 2016:

I wonder if supporting/documenting the string arguments adds more complexity than usability and benefit.

From my perspective, we're relying on the private _StringFuncParser, which is fairly new (not battle-tested?). But since we're exposing it through this, we're effectively committing to documenting it and making its API public.

What's a use case where a string is preferable to just using a numpy function or simply one of your own?

@alvarosg (Contributor, Author):

@phobson
About the use case, the main advantage is not having to specify both the function and the inverse: by specifying the string, it can be parsed and both can be obtained.

This may not be very relevant here (even though the first post already shows the simplicity of using a string vs. two callables), because there is only one function, and because if the user is advanced enough to want to use this, they may not mind using two callables.

However, one of the next steps is to implement other classes which take more than one function (and more than one inverse). Some of those may never get implemented, but I am interested in particular in MirrorPiecewiseNorm. This one will normalize a scale symmetrically (or not) around a value, e.g. zero. By setting different functions on each side, one can control how data is stretched around zero, on the positive side and on the negative side, and the easiest way to get a qualitative change is to apply polynomials, or roots of different degrees, to each side. By providing the string option, the user still gets most of the functions they would normally use, and does not need to specify the inverse function for each of them.

I think the key is not exposing _StringFuncParser to the user directly at all, so we can switch to another service in the future if we find something better. About battle-testing the parser, I guess it is simple enough that if there is a problem it can be fixed, and since the range of inputs is very limited, the tests can actually cover most cases.


def main():
fig, axes = plt.subplots(3, 2, gridspec_kw={
'width_ratios': [1, 3.5]}, figsize=plt.figaspect(0.6))
Member:

The indent is kind of funny here; I'd break before the gridspec_kw, not in the middle of its value, if possible. Also, you can add sharex='col' and this will automatically remove the tick labels in between plots.

@phobson (Member) commented Dec 17, 2016:

About the use case, the main advantage is not having to specify both the function and the inverse: by specifying the string, it can be parsed and both can be obtained.

That's pretty compelling. Thanks for clearing that up.

I think the key is not exposing _StringFuncParser to the user directly at all, so we can switch to another service in the future if we find something better

I guess I'm thinking that since we're accepting input and directly passing it to _StringFuncParser, we are exposing the mini-language-ness of it (e.g., braces and the like). So any future change to a new parser will have to implement that same syntax. But I don't really see a way around that, ATM.

@tacaswell tacaswell added this to the 2.1 (next point release) milestone Dec 17, 2016
norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2.)
x = np.linspace(0.01, 2, 10)
assert_array_almost_equal(x, norm.inverse(norm(x)))

Contributor (Author):

Add tests for scalar values

Contributor (Author):

this is now added

@codecov-io commented Dec 24, 2016:

Current coverage is 63.57% (diff: 88.17%)

Merging #7631 into master will increase coverage by 1.51%

@@             master      #7631   diff @@
==========================================
  Files           174        174           
  Lines         56120      65699   +9579   
  Methods           0          0           
  Messages          0          0           
  Branches          0          0           
==========================================
+ Hits          34826      41765   +6939   
- Misses        21294      23934   +2640   
  Partials          0          0           

Powered by Codecov. Last update 841a427...44658d4

@efiring (Member) left a review:

Mostly minor recommendations; but there are a couple of major questions that I need to return to:

  • How to handle ticking?
  • Should masked array inputs with masked points be handled differently?

I almost forgot: I am also wondering about whether more restrictions on functions, and checks on values, are needed, specifically to ensure that functions are monotonic, bounded, (and strictly increasing?) over the range of normalization. For example, if a user asks for 'square' and feeds in data from -1 to 1, it won't be good...

"""
Creates a normalizer using a custom function

The normalizer will be a function mapping the data values into colormap
Member:

I would start the docstring (which is describing a class) with "A norm based on a monotonic function". Then a blank line, followed by the second sentence of the present init docstring, followed by the remainder of the present init docstring (Parameters, etc.). This is in accord with the numpydoc specification for classes: the init args and kwargs are described in the class docstring, and there is no need for an init docstring at all.
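
A skeleton of the numpydoc class layout being described, using this PR's parameters and the wording suggested in this review (illustrative, not the merged text):

    from matplotlib.colors import Normalize

    class FuncNorm(Normalize):
        """
        A norm based on a monotonic function.

        Specify the function to be used, and its inverse, as well as other
        parameters to be passed to `Normalize`.

        Parameters
        ----------
        f : callable or str
            Forward function, compatible with scalars and ndarrays.
        finv : callable, optional
            Inverse of `f`: finv(f(x)) == x. Ignored when `f` is a string.
        vmin, vmax : None or float, optional
            Data values to be mapped to 0 and 1.
        clip : bool, optional, default: False
            If True, clip data values to [vmin, vmax].
        """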

f : callable or string
Function to be used for the normalization receiving a single
parameter, compatible with scalar values and ndarrays.
Alternatively a string from the list ['linear', 'quadratic',
Member:

An alternative would be to put the list of strings and the explanation of "p" in the Notes section. The advantage is that it would set it apart, and keep the Parameters block from being so long. The disadvantage is that it might be separating it too much from its parameter. It's up to you.

can be used, replacing 'p' by the corresponding value of the
parameter, when present.
finv : callable, optional
Inverse function of `f` that satisfies finv(f(x))==x.
Member:

I would prefer to see more concise docstrings, comments, and code, in general. I won't try to identify every opportunity for shortening things, but I will make some suggestions. Here, the line could be "Inverse of f: finv(f(x)) == x." Below, clarify by saying "Optional and ignored when f is a string; otherwise, required."

vmin : float or None, optional
Value assigned to the lower limit of the colormap. If None, it
will be assigned to the minimum value of the data provided.
Default None.
Member:

Here you could combine vmin with vmax, reverse the order (-> "vmin, vmax: None or float, optional") and delete the "Default None" line. Then, just "Data values to be mapped to 0 and 1. If either is None, it is assigned the minimum or maximum value of the data supplied to the first call of the norm." Let's leave the word "colormap" out, using it only where necessary, as in the clip explanation.

Default None.
clip : bool, optional
If True, any value below `vmin` will be clipped to `vmin`, and
any value above `vmax` will be clip to `vmin`. This effectively
Member:

'clip : bool, optional, default is False' and then delete the last line of the docstring. In addition to being more concise, having the default up front makes it more obvious. Then, 'If True, clip data values to [vmin, vmax]. This defeats ... colormap. If False, ... respectively.'

Contributor (Author):

As far as I know, the default option is specified in the description, not in the specification, right?
From numpydoc:

Optional keyword parameters have default values, which are displayed as part of the function signature. They can also be detailed in the description:

# the limits vmin and vmax may require changing/updating the
# function depending on vmin/vmax, for example rescaling it
# to accommodate to the new interval.
return
Member:

This should be "pass", not "return". "pass" is the "do nothing" word.


self._check_vmin_vmax()
vmin = float(self.vmin)
vmax = float(self.vmax)
Member:

Should _check_vmin_vmax do the float conversion and return the two values, so you can write vmin, vmax = self._check_vmin_vmax()?


Parameters
----------
value : float or ndarray of floats
Member:

It can be a masked array, to handle missing values, or a python sequence, and it doesn't have to be float. So maybe just say "scalar or array-like".

resultnorm[mask] = (self._f(result[mask]) - self._f(vmin)) / \
(self._f(vmax) - self._f(vmin))

resultnorm = np.ma.array(resultnorm)
Member:

I don't think this is necessary, because process_value() makes result a masked array, and all the operations you are doing after that appear to preserve the masked array type. Since your string-based functions like log10 are np.log10 and not np.ma.log10, however, they are preserving the original mask but not suppressing the warnings as the ma versions would do. (I'm actually surprised that the np versions are returning with the invalid values masked; maybe this has been added in newer numpy versions.)


@alvarosg (Contributor, Author):

@efiring

Thanks for the review; I essentially agree with pretty much everything, and will implement those changes when I have some time :)

About the major questions:

How to handle ticking?

I do not have a strong opinion on what to do here. I think the functionality in the ticks function is interesting, but as for where it should live, I would leave that up to the people with deeper knowledge of the library.

Should masked array inputs with masked points be handled differently?

I tried to do it the same way it was done for the other normalizations; I will double-check, though.

I am also wondering about whether more restrictions on functions, and checks on values, are needed, specifically to ensure that functions are monotonic, bounded, (and strictly increasing?) over the range of normalization. For example, if a user asks for 'square' and feeds in data from -1 to 1, it won't be good...

I completely agree on this. The problem here is that the issue is quite different for callables than for predefined string functions:

  • Callables: It is numerically impossible to check whether a given callable is bounded and strictly increasing with a finite number of operations. I had long discussions about this on a PR for scipy to include functionality for providing the numerical inverse of a function (it also needs to be strictly monotonic), and that was the main conclusion. The only thing we can do is explicitly include the conditions on the callable in the documentation.
  • Strings: In this case, since we have the analytical form of the function, we could actually do those checks. The main problem is checking that the data is within the bounds ((0, Inf) for the 'sqrt' case). The tricky part is that this check is data dependent, and would need to be done for each possible calculation, so it would involve storing some extra data for the checks.

This would be my approach to solve it:

  • Include an extra parameter to FuncNorm: validity_range, which by default would be set to [-np.inf, np.inf].
  • This parameter is stored as an attribute, and data is always checked against those boundaries. Worst-case scenario, the user forgets to set it (or in some cases it is not necessary to set it, which is why I would rather leave it optional), and the behaviour is exactly like now.
  • We add a new method to _FuncInfo (maybe) that returns the validity range of the predefined functions. When the input is a string, these values are used to fill the new validity_range attribute.
  • Some of the tricky aspects of this are related to the cases where the validity ranges are open intervals. For example, in a logarithm normalization, the function is not bounded in the [0, inf) interval, but it is bounded in the [epsilon, inf) interval. This complicates things a little bit, because it would also involve storing another variable indicating whether validity_range represents an open or closed interval on each end. It would be something similar to this (domain and open_domain arguments).

Taking all this into account, I may still prefer to just specify very clearly in the documentation that the function must be strictly increasing and bounded in the [vmin, vmax] interval, which by default will be the bounds of the data to be normalized.
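
A hedged sketch of the validity_range proposal above (the parameter and the _check_domain helper are the author's idea rendered as code, not an implemented API):

    import numpy as np
    from matplotlib.colors import Normalize

    class FuncNorm(Normalize):
        def __init__(self, f, finv=None, vmin=None, vmax=None, clip=False,
                     validity_range=(-np.inf, np.inf)):
            super(FuncNorm, self).__init__(vmin, vmax, clip)
            self._f, self._finv = f, finv
            # Closed interval on which f is assumed monotonic and bounded;
            # open endpoints would need an extra flag, as noted above.
            self._validity_range = validity_range

        def _check_domain(self, data):
            lo, hi = self._validity_range
            if np.ma.min(data) < lo or np.ma.max(data) > hi:
                raise ValueError("data outside the validity range "
                                 "[%g, %g] of the function" % (lo, hi))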

@alvarosg (Contributor, Author) commented Feb 5, 2017:

@efiring Did you get a chance to look at the changes I implemented a couple of weeks ago?

@efiring (Member) commented Feb 6, 2017:

@alvarosg I apologize for having neglected this for so long--it has been on my conscience. I think I can get to it on Saturday, but probably not before then.

@QuLogic (Member) commented May 13, 2017:

Ping @efiring.

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.2 (next next feature release) Aug 29, 2017
@anntzer (Contributor) left a review:

(copied from #7294)

I strongly

  • oppose adding a new domain-specific language to describe functions (root{n}, etc.). Yes, the parser has already been merged into cbook, but it is not actually used right now, and I would prefer removing it.
  • believe that we should still first sort out the overlap between scales, norms, locators, and formatters that has been (much) discussed above, before adding more complex functionality.

As usual, other devs should feel free to override this review if there is consensus to do so.

@anntzer anntzer modified the milestones: needs sorting, unassigned Feb 17, 2018
@anntzer (Contributor) commented Feb 17, 2018:

Closing based on the lack of comments since the rejection two weeks ago.

@anntzer anntzer closed this Feb 17, 2018
@jklymak (Member) commented Feb 18, 2018:

I agree w/ @anntzer closing this, given the current API proposed by this PR.

I think this could be re-opened a) if the "string" representation went away, and b) if some thought was put into whether all the norms should be implemented with this mechanism, which would require a bit of refactoring but doesn't seem un-fathomably difficult.

I only somewhat agree that tick locators should be an issue. The obvious default is just equally-spaced ticks in normalized space that will fall where they may in data space, unless a special Locator is provided. I don't see what else a general tool is supposed to do.

I disagree w/ the original author's suggestion to have a bunch of extra parameters to specify the range over which normalization is valid; just specify the ranges in the user-supplied function. Yet another argument against the string representation of the norms.

I am not interested in doing this work myself. But I'd happily re-open if someone else wanted to refactor this a bit.

@story645 (Member) commented Feb 19, 2018:

if some thought was put into if all the norms should be implemented with this mechanism, which would require a bit of refactoring, but doesn't seem un-fathomably difficult.

While I think this is doable, I dunno that a variant of this PR should be held up because of that. A version of FuncNorm could always go in first and then the other norms refactored against it (which I think will be smoother from a reviewing point of view anyway).

@jklymak (Member) commented Feb 19, 2018:

@story645 I agree, that's possible, but ideally some thought would be given to the API of this PR to make sure that works.
