ENH: Added FuncNorm #7631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Added FuncNorm #7631
Conversation
```python
    (norm_log, 'Log normalization'),
    (norm_sqrt, 'Root normalization')]

for i, (norm, title) in enumerate(normalizations):
```
No need for `i`:

```python
for ax_row, (norm, title) in zip(axes, normalizations):
```
great idea :)
```python
plt.show()


def get_data(_cache=[]):
```
I don't understand why the data needs to be produced in the loop if it's just going to be cached. Seems a bit over-engineered for an example.
Yes, you are right. The reason it is like this is that originally each of the plots was done independently inside a function, but I changed it to a loop to comply with feedback from @story645, and I forgot to change that. Thanks :)
""" | ||
Specify the function to be used, and its inverse, as well as other | ||
parameters to be passed to `Normalize`. The normalization will be | ||
calculated as (f(x)-f(vmin))/(f(max)-f(vmin)). |
`max` or `vmax`?
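For context, the formula under discussion (written with `vmax`, as the reviewer suggests) can be sketched numerically. This is a minimal illustration, not the PR's implementation; the helper name `func_norm` and the choice of `np.log10` for `f` are assumptions:

```python
import numpy as np

def func_norm(x, f, vmin, vmax):
    # Normalize data to [0, 1] as (f(x) - f(vmin)) / (f(vmax) - f(vmin)),
    # assuming f is monotonically increasing on [vmin, vmax].
    x = np.asarray(x, dtype=float)
    return (f(x) - f(vmin)) / (f(vmax) - f(vmin))

vals = func_norm([0.01, 0.1, 2.0], np.log10, vmin=0.01, vmax=2.0)
```

With these limits, `vmin` maps to 0 and `vmax` maps to 1, and interior values fall strictly in between.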
```python
    values in the [0,1] range.
    """

def __init__(self, f, finv=None, **normalize_kw):
```
This should be `**kwargs`.
I understand that `kwargs` is what is used in the general case; however, I used `normalize_kw` because all of these parameters are to be passed to the parent class `Normalize`. This is the same naming convention used for `subplots`.
That's not the same; those are dictionaries that are individual arguments. In this case, it's not an argument, it's the placeholder that accepts all other non-explicit keyword arguments.
You are completely right. In that case I am thinking that it may just be better to take vmin, vmax, and clip directly, and pass them explicitly to the parent class. Any downside to doing it like that?
We should probably just follow the example of the other classes; `LogNorm`, `BoundaryNorm` and `NoNorm` only accept `clip` and the rest accept all three explicitly, so being explicit seems to be the best choice.
Actually, `LogNorm` does not have an initialization function, so it implicitly takes all three of them as well. So yes, I am convinced that it is better to include them explicitly. It may be worth in that case putting the documentation for `vmin`, `vmax`, and `clip` in common variables so it can be reused across different classes, similarly to what they do here.
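The explicit-keyword signature the thread converges on could look roughly like this. This is a hedged sketch: the stripped-down `Normalize` stand-in below is hypothetical and only mimics the attribute handling of `matplotlib.colors.Normalize`, not its full behavior:

```python
class Normalize:
    # Minimal stand-in for matplotlib.colors.Normalize (assumption:
    # only the vmin/vmax/clip attributes matter for this sketch).
    def __init__(self, vmin=None, vmax=None, clip=False):
        self.vmin = vmin
        self.vmax = vmax
        self.clip = clip


class FuncNorm(Normalize):
    # Explicit keywords instead of **normalize_kw, as suggested in review.
    def __init__(self, f, finv=None, vmin=None, vmax=None, clip=False):
        super(FuncNorm, self).__init__(vmin=vmin, vmax=vmax, clip=clip)
        self._f = f
        self._finv = finv


n = FuncNorm(abs, vmin=0.0, vmax=2.0)
```

Being explicit also lets documentation tools show the real signature instead of an opaque `**kwargs`.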
```python
    Inverse function of `f` that satisfies finv(f(x))==x. It is
    optional in cases where `f` is provided as a string.
normalize_kw : dict, optional
    Dict with keywords (`vmin`,`vmax`,`clip`) passed
```
It may be a `dict` in this function, but it's just all-other-keyword-args to any caller.
Agree with @QuLogic that these should be individually documented.
```python
resultnorm[mask] = (self._f(result[mask]) - self._f(vmin)) / \
    (self._f(vmax) - self._f(vmin))

return np.ma.array(resultnorm)
```
Why is it a `MaskedArray`? Is that just what other `Norm`s do? It doesn't seem like anything is actually masked.
Precisely; the parent class returns masked arrays, even though it does not ever really set the mask to anything. It would make sense to use them for values outside the range; the problem is that in that case there would not be a way to say whether they are above the maximum value or below the minimum, and the plotting methods need this to use the under and over colours.
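The point about a single mask losing the over/under distinction can be seen directly with NumPy (a minimal sketch; the variable names are illustrative, not from the PR):

```python
import numpy as np

data = np.array([-1.0, 0.5, 3.0])
vmin, vmax = 0.0, 1.0

# Separate boolean masks preserve which side of the range a value is on,
# so a colormap can apply its "under" and "over" colors correctly.
mask_under = data < vmin
mask_over = data > vmax

# A single masked array only records "outside the range", losing that info.
masked = np.ma.masked_outside(data, vmin, vmax)
```

Both out-of-range values end up identically masked in `masked`, while `mask_under` / `mask_over` still tell them apart.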
```python
if clip:
    result = np.clip(result, vmin, vmax)
resultnorm = (self._f(result) - self._f(vmin)) / \
    (self._f(vmax) - self._f(vmin))
```
Would it be useful to cache any of these?
I have been considering this, but the problem is that the cache would depend on vmin and vmax, and checking whether the cache is up to date, as well as having to include new variables for the cache, would make the code much uglier.
I guess it would make sense in cases where evaluating f is expensive, but even in those cases we would still have to evaluate f(result), which will typically consist of many values. Also, in general, the functions typically used for normalization should not be very expensive to evaluate... (although we should never underestimate the user, hehehe)
```python
    return value

@staticmethod
def _fun_normalizer(fun):
```
This appears unused?
Oh, yes, this was used by some of the derived classes in the original PR, and I figured this was the best place for all of them to have access to it, as it is a general-purpose normalization feature. I will remove it for now, and then we can decide where to include it when it becomes necessary for the first time.
```python
    assert_array_equal(norm([0.01, 2]), [0, 1.0])

def test_limits_without_vmin(self):
    norm = mcolors.FuncNorm(f='log10', vmax=2.)
```
This is the same `vmax` you would get if you didn't set it, so I guess it doesn't really test that it's working.
Yes, but that is a test in itself :P
You are right though; I will include tests where the values go above and below vmin and vmax, with and without the clip option.
I have added `test_clip_true`, `test_clip_false`, and `test_clip__default_false` to test the clipping behavior.
Those changes should pretty much address the original comment
```python
    if self.vmin > self.vmax:
        raise ValueError("vmin must be smaller than vmax")

def ticks(self, nticks=13):
```
Not sure about the mixing of concerns here, but I'll leave that to @efiring to determine.
Yeah, I was also not sure about this, because technically vmin and vmax do not belong to this class. Actually, the only thing my autoscale methods do differently is convert to float, so maybe I should just make a tiny change to the autoscale methods of Normalize to resemble this:

Methods in `Normalize`:

```python
def autoscale(self, A):
    self.vmin = np.ma.min(A)
    self.vmax = np.ma.max(A)

def autoscale_None(self, A):
    'autoscale only None-valued vmin or vmax'
    if self.vmin is None and np.size(A) > 0:
        self.vmin = np.ma.min(A)
    if self.vmax is None and np.size(A) > 0:
        self.vmax = np.ma.max(A)
```

Methods in `FuncNorm`:

```python
def autoscale(self, A):
    self.vmin = float(np.ma.min(A))
    self.vmax = float(np.ma.max(A))

def autoscale_None(self, A):
    if self.vmin is None:
        self.vmin = float(np.ma.min(A))
    if self.vmax is None:
        self.vmax = float(np.ma.max(A))
    self.vmin = float(self.vmin)
    self.vmax = float(self.vmax)
    if self.vmin > self.vmax:
        raise ValueError("vmin must be smaller than vmax")
```

@efiring would it be ok to include those changes (casting to float and the vmax > vmin check) in Normalize, and remove the methods from FuncNorm?
```python
    return _cache[0]

main()
```
feel like this should be in a main block, and might as well just directly put all the plotting code there instead of shoving it into a function... so:

```python
if __name__ == '__main__':
    # all the code currently in main()
```
The reason I did not do the main originally is that we know the examples generate the figures automatically through some process, and @efiring and I were not sure whether that process would actually run the file as main.
Yeah, of course. I guess the only reason to have things in a function is so the data generation could come after the rest, but maybe now that it is much shortened it will not look that bad right in between where the norms are generated and where the loop starts.
The vast majority of examples (like >90%) use neither `main` nor `__main__` stuff, though I'm sure some examples use classes derived from backend-specific things that I didn't count properly.
```python
fig, axes = plt.subplots(3, 2, gridspec_kw={
    'width_ratios': [1, 3.5]}, figsize=plt.figaspect(0.6))

# Example of logarithm normalization using FuncNorm
```
extraneous comment 'cause code explicitly shows this
```python
# Example of logarithm normalization using FuncNorm
norm_log = colors.FuncNorm(f='log10', vmin=0.01)
# The same can be achieved with
```
this comment should be dropped and there should be a standalone example for that feature
```python
# Example of root normalization using FuncNorm
norm_sqrt = colors.FuncNorm(f='sqrt', vmin=0.0)
# The same can be achieved with
```
same as above; presenting users with 3 ways to do things off the bat can be super confusing. Also, is it really necessary to use two examples of the same norm in 1 example? I know you'll likely tell me it's more realistic, but I think examples should fundamentally be as small/basic as possible while still showing the functionality
> two examples of the same norm in 1 example?

If you're referring to the two Axes, one is the norm function, the other is the actual usage for a colormap; see the figure at the top.
I do not really see a problem with having an example of multiple uses in this case (both the log and the sqrt), but I am happy to change it if everyone thinks that the example image is not appropriate.
I also think that presenting multiple ways of doing the same thing (with the extra comment) gives the user extra insight into what can be done with the class, at a very low cost. But again, if everyone agrees that the comments are inappropriate, I am also happy to remove them.
@QuLogic I was talking about the log and the sqrt norms, and then also providing multiple methods of doing things. @alvarosg is there a colleague you can "hallway test" these docs (and warning messages) with? You have them read the doc/message and just ask what they think (and how they think it could be improved).
```python
ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax2)
ax2.set_title(title)
ax2.axes.get_xaxis().set_ticks([])
```
```python
ax2.xaxis.set_ticks([])
ax2.yaxis.set_ticks([])
```
```python
f = func_parser.function
finv = func_parser.inverse
if not callable(f):
    raise ValueError("`f` must be a callable or a string.")
```
`f` must be a function or a string (I don't like using `callable` in user-facing docs 'cause I think it's a little too dev space)
I always assume the user of a python module will also be a developer, and callable is a python builtin, so IMO it is clearer than function.
That's not true at all though in this case. You've got plenty of users for matplotlib in particular who are scientists but not devs who aren't gonna be familiar with any python keyword they don't use all the time (and callable is rarely in that set)
I think that at least when they are native English speakers they can figure it out quickly enough from the context and the structure of the word itself, "callable" -> "call" "able" -> "something that can be called". The word "string" would be much harder to understand than "callable"--it's pure comp-sci jargon, not used anywhere else in this way, and not something that can be figured out from the word itself. We are not going to delete uses of "string" or "callable".
Callable is equivalent to function. You'd still need to mention it was a string. And string is different 'cause it's used in every single intro python everything; callable isn't. Honestly, callable trips me up all the time and I'm a native English speaker with a CS background.
Basically, I dunno, I see your point but a) I'm always wary of straight transcriptions of the if statements that triggered the exceptions being the error messages, and b) I sort of think there should maybe be a bigger discussion of who matplotlib's expected audience is.
I would leave it as callable, because I think it is a more accurate term. I think anyone able to use a callable (to pass it to the function) should know the term, and if not should be able to do a 5 s google search. In any case, let's not waste our energy discussing this, as I think it is pretty irrelevant.
While I agree with you that this specific thing probably isn't worth fighting about, I feel in a general sense that it's bad practice to dismiss a usability concern as "well they should know what it's called and how to search for it" 'cause rarely are either of those statements true.
```python
self._f = f
self._finv = finv

super(FuncNorm, self).__init__(**normalize_kw)
```
any particular reason this is put at the end rather than upfront?
Nope, I will move it up
```python
result, is_scalar = self.process_value(value)
self.autoscale_None(result)

vmin = float(self.vmin)
```
does self.vmin/self.vmax need to be converted to float? I think there's an import at the top that forces division to always be floating point...
This is a very good point, I had not noticed that; I guess there is no need, in that case. Thanks!
```python
resultnorm = result.copy()
mask_over = result > vmax
mask_under = result < vmin
mask = (result >= vmin) * (result <= vmax)
```
What about something like this, 'cause I feel like this is a bit too much on the clever-but-obfuscating side?

```python
mask = mask_over | mask_under
```

and then just use `~mask` everywhere you're using `mask`.
Or `mask = ~(mask_over | mask_under)` or `mask = ~mask_over & ~mask_under`?
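All three spellings are equivalent for boolean arrays; a quick sketch (sample values are illustrative only). Note that NumPy needs the element-wise operators `&`, `|`, `~` here, since Python's `and`/`or` do not broadcast:

```python
import numpy as np

result = np.array([-0.5, 0.2, 0.8, 1.5])
vmin, vmax = 0.0, 1.0

mask_over = result > vmax
mask_under = result < vmin

mask_a = (result >= vmin) * (result <= vmax)   # original: bool multiplication
mask_b = ~(mask_over | mask_under)             # suggested rewrite
mask_c = ~mask_over & ~mask_under              # De Morgan equivalent
```

The rewrite also reuses the masks already computed, which is arguably clearer than re-deriving the in-range condition from `result`.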
Sure, I am always up for improving the efficiency!
```python
    return ticks

@staticmethod
def _round_ticks(ticks, permanenttick):
```
Same as @QuLogic: question for @efiring about mixing concerns. Wonder if all the tick stuff should be in a private (or public) class in ticker, and then normalize should just point to the default formatter and locators it should use. (This is an issue I ran headlong into w/ categorical norming too...)
Yes, I am definitely not very happy with the way this currently is...

> and then normalize should just point to the default formatter and locators it should use

How does Normalize communicate with the formatters and tickers? Is there any good example around?
> How does Normalize communicate with the formatters and tickers? Is there any good example around?

It's really messy currently (code is in colorbar.py) and would probably require a refactor in the call stack. But that doesn't really matter to you. Since you're having them explicitly get ticks via

```python
ticks = cax.norm.ticks(5) if norm else np.linspace(0, 1, 6)
fig.colorbar(cax, format='%.3g', ticks=ticks, ax=ax_right)
```

`cax.norm.ticks` should really likely be its own tick Locator method that locates ticks based on some input (I guess convoluted functions). The downside is that it can't rely on the attributes in the norm (unless it's something like `FuncNormLocator(norm)`), but I think that prevents scope creep in norms.
Yes, but ideally I would not like the user to have to call ticks manually; they should get those ticks automatically. I was not sure how to change the default ticker, though, to maybe implement a FuncTicker class...
It would be easy to modify colorbar to use the ticks method of its Norm, if it exists, and if ticks are not provided by the user. The alternative of having all tick locators and formatters in tickers.py, and having Norms include a method or attributes for default locators and formatters, is also reasonable. I'm going to leave this question open for the moment, but we will need to return to it. I suspect the second of these two approaches will turn out to be the best.
Yes, it would, but the problem is that the colorbar has two different ways to represent itself, depending on `spacing`:

- `'uniform'`: Represent the colorbar uniformly between 0 and 1 and then assign values for the ticks that are not uniform according to the normalization. This is the one I normally use, and the one that should use the tick values returned by ticks.
- `'proportional'`: Stretch/compress the colorbar to represent the non-linearities given by the normalization, so the actual axis in the colorbar is uniform on the data values. In this case selecting the ticks is the same as with any linear axis.

What if I just make a FuncNorm locator class, and then add the corresponding line here?
@efiring @story645 I have now made a new class FuncLocator (and tests), to include the behavior of proposing tick locations in a more appropriate place.
Instead of taking the norm itself as a parameter of the locator, I decided to just pass methods with the direct and inverse transformations; references to the methods of the norm object are passed in the initialization. This way the ticker module does not depend directly on colors, or on FuncNorm in particular, but still, if the FuncNorm instance is modified (limits, for example, after calling clim), FuncNorm will adapt the behaviour of its methods and this will be available to the locator.
If you could please take a look, I am sure you can provide useful feedback.
PS: Happy new year :D
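The design described (a locator holding forward/inverse transforms rather than the norm itself) can be sketched roughly as below. This is a hypothetical illustration, not the actual `FuncLocator` class from the PR; the `tick_values` method name and the square/sqrt transform pair are assumptions:

```python
import numpy as np

class FuncLocator:
    # Hypothetical sketch: the locator holds references to the
    # transform callables instead of the whole norm object, so the
    # ticker module need not depend on the colors module.
    def __init__(self, forward, inverse):
        self.forward = forward   # data -> [0, 1]
        self.inverse = inverse   # [0, 1] -> data

    def tick_values(self, nticks):
        # Place ticks uniformly in normalized space, then map them
        # back to data space through the inverse transform.
        return self.inverse(np.linspace(0, 1, nticks))

loc = FuncLocator(forward=np.square, inverse=np.sqrt)
ticks = loc.tick_values(5)
```

Because the locator calls the norm's bound methods at tick time, later changes to the norm's limits are picked up automatically, as the comment above argues.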
I wonder if supporting/documenting the string arguments adds more complexity than usability and benefit. From my perspective, we're relying on the private `_StringFuncParser`. What's a use case where a string is preferable to just using a numpy function or simply one of your own?
@phobson This may not be very relevant here (even though the first post already shows the simplicity of using a string vs two callables), because there is only one function, and because if the user is advanced enough to want to use this, they may not mind using two callables. However, one of the next steps is to implement other classes which take more than one function (and more than one inverse). Some of those may never get implemented, but I am interested in some of them in particular. I think the key is not exposing _StringFuncParser to the user directly at all, so we can switch to another service in the future if we find something better. About battle testing the parser, I guess it is simple enough that in case there is a problem it can be fixed, but since the range of inputs is very limited, the tests can actually cover most cases.
```python
def main():
    fig, axes = plt.subplots(3, 2, gridspec_kw={
        'width_ratios': [1, 3.5]}, figsize=plt.figaspect(0.6))
```
The indent is kind of funny here; I'd break before the `gridspec_kw`, not in the middle of its value, if possible. Also, you can add `sharex='col'` and this will automatically remove the tick labels in between plots.
That's pretty compelling. Thanks for clearing that up.
I guess I'm thinking that since we're accepting input and directly passing it to
```python
norm = mcolors.FuncNorm(f='log10', vmin=0.01, vmax=2.)
x = np.linspace(0.01, 2, 10)
assert_array_almost_equal(x, norm.inverse(norm(x)))
```
Add tests for scalar values
this is now added
Current coverage is 63.57% (diff: 88.17%)

```
@@            master    #7631    diff @@
========================================
  Files          174      174
  Lines        56120    65699    +9579
  Methods          0        0
  Messages         0        0
  Branches         0        0
========================================
+ Hits         34826    41765    +6939
- Misses       21294    23934    +2640
  Partials         0        0
```
Mostly minor recommendations; but there are a couple of major questions that I need to return to:

- How to handle ticking?
- Should masked array inputs with masked points be handled differently?

I almost forgot: I am also wondering whether more restrictions on functions, and checks on values, are needed, specifically to ensure that functions are monotonic, bounded (and strictly increasing?) over the range of normalization. For example, if a user asks for 'square' and feeds in data from -1 to 1, it won't be good...
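The monotonicity concern could be guarded with a cheap sampling check along these lines. This is a hedged sketch of one possible validation, not code from the PR; the helper name and the sample count are assumptions, and sampling can of course miss pathologies between sample points:

```python
import numpy as np

def is_strictly_increasing(f, vmin, vmax, num=100):
    # Sample f over [vmin, vmax] and check the samples are strictly
    # increasing; a cheap guard against e.g. 'square' on [-1, 1].
    x = np.linspace(vmin, vmax, num)
    return bool(np.all(np.diff(f(x)) > 0))

ok = is_strictly_increasing(np.square, 0.0, 1.0)    # monotonic here
bad = is_strictly_increasing(np.square, -1.0, 1.0)  # not monotonic here
```

A check like this, run once when the limits are known, would catch exactly the 'square' on [-1, 1] case mentioned above.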
""" | ||
Creates a normalizer using a custom function | ||
|
||
The normalizer will be a function mapping the data values into colormap |
I would start the docstring (which is describing a class) with "A norm based on a monotonic function". Then a blank line, followed by the second sentence of the present init docstring, followed by the remainder of the present init docstring (Parameters, etc.). This is in accord with the numpydoc specification for classes: the init args and kwargs are described in the class docstring, and there is no need for an init docstring at all.
```python
f : callable or string
    Function to be used for the normalization receiving a single
    parameter, compatible with scalar values and ndarrays.
    Alternatively a string from the list ['linear', 'quadratic',
```
An alternative would be to put the list of strings and the explanation of "p" in the Notes section. The advantage is that it would set it apart, and keep the Parameters block from being so long. The disadvantage is that it might be separating it too much from its parameter. It's up to you.
```python
    can be used, replacing 'p' by the corresponding value of the
    parameter, when present.
finv : callable, optional
    Inverse function of `f` that satisfies finv(f(x))==x.
```
I would prefer to see more concise docstrings, comments, and code, in general. I won't try to identify every opportunity for shortening things, but I will make some suggestions. Here, the line could be "Inverse of `f`: finv(f(x)) == x." Below, clarify by saying "Optional and ignored when `f` is a string; otherwise, required."
```python
vmin : float or None, optional
    Value assigned to the lower limit of the colormap. If None, it
    will be assigned to the minimum value of the data provided.
    Default None.
```
Here you could combine vmin with vmax, reverse the order (-> "vmin, vmax: None or float, optional") and delete the "Default None" line. Then, just "Data values to be mapped to 0 and 1. If either is None, it is assigned the minimum or maximum value of the data supplied to the first call of the norm." Let's leave the word "colormap" out, using it only where necessary, as in the clip explanation.
```python
    Default None.
clip : bool, optional
    If True, any value below `vmin` will be clipped to `vmin`, and
    any value above `vmax` will be clip to `vmin`. This effectively
```
'clip : bool, optional, default is False' and then delete the last line of the docstring. In addition to being more concise, having the default up front makes it more obvious. Then, 'If True, clip data values to [vmin, vmax]. This defeats ... colormap. If False, ... respectively.'
As far as I know, the default option is specified in the description, not in the specification, right?
From numpydoc:
Optional keyword parameters have default values, which are displayed as part of the function signature. They can also be detailed in the description:
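A parameter block in that numpydoc style, combining both suggestions from the review, might look like this (the function name is made up for illustration; this is not the PR's code):

```python
def example_norm_init(vmin=None, vmax=None, clip=False):
    """
    Illustrative only: a docstring in the style suggested in this review.

    Parameters
    ----------
    vmin, vmax : None or float, optional
        Data values to be mapped to 0 and 1. If either is None, it is
        assigned the minimum or maximum value of the data supplied to
        the first call of the norm.
    clip : bool, optional, default: False
        If True, clip data values to [vmin, vmax].
    """
    return vmin, vmax, clip
```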
```
    # the limits vmin and vmax may require changing/updating the
    # function depending on vmin/vmax, for example rescaling it
    # to accommodate to the new interval.
    return
```
This should be "pass", not "return". "pass" is the "do nothing" word.
```
self._check_vmin_vmax()
vmin = float(self.vmin)
vmax = float(self.vmax)
```
Should `_check_vmin_vmax` do the float conversion and return the two values, so you can write `vmin, vmax = self._check_vmin_vmax()`?
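A sketch of what that refactor might look like, using a minimal stand-in class rather than the PR's actual `Normalize` subclass:

```python
class NormSketch:
    """Minimal stand-in with the vmin/vmax attributes of Normalize."""

    def __init__(self, vmin=None, vmax=None):
        self.vmin = vmin
        self.vmax = vmax

    def _check_vmin_vmax(self):
        # Validate the limits and hand them back as floats, so callers
        # can write ``vmin, vmax = self._check_vmin_vmax()`` directly.
        if self.vmin is None or self.vmax is None:
            raise ValueError("vmin and vmax must both be set")
        vmin, vmax = float(self.vmin), float(self.vmax)
        if vmin >= vmax:
            raise ValueError("vmin must be smaller than vmax")
        return vmin, vmax

vmin, vmax = NormSketch(0, 5)._check_vmin_vmax()   # (0.0, 5.0)
```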
```
Parameters
----------
value : float or ndarray of floats
```
It can be a masked array, to handle missing values, or a Python sequence, and it doesn't have to be float. So maybe just say "scalar or array-like".
```
resultnorm[mask] = (self._f(result[mask]) - self._f(vmin)) / \
                   (self._f(vmax) - self._f(vmin))

resultnorm = np.ma.array(resultnorm)
```
I don't think this is necessary, because `process_value()` makes `result` a masked array, and all the operations you are doing after that appear to preserve the masked array type. Since your string-based functions like `log10` are `np.log10` and not `np.ma.log10`, however, they are preserving the original mask but not suppressing the warnings as the ma versions would do. (I'm actually surprised that the np versions are returning with the invalid values masked; maybe this has been added in newer numpy versions.)
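The difference between the two families of functions is easy to check directly (plain NumPy, independent of this PR):

```python
import numpy as np

a = np.ma.masked_array([1.0, 100.0, -5.0], mask=[False, True, False])

# np.ma.log10 knows the domain of log10 and extends the mask over the
# invalid entry (-5.0) without emitting a RuntimeWarning; plain
# np.log10 computes through the raw data and warns instead.
r = np.ma.log10(a)
print(r.mask)   # the -5.0 entry is masked in addition to the original mask
```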
```
    return ticks

@staticmethod
def _round_ticks(ticks, permanenttick):
```
It would be easy to modify colorbar to use the ticks method of its Norm, if it exists, and if ticks are not provided by the user. The alternative of having all tick locators and formatters in tickers.py, and having Norms include a method or attributes for default locators and formatters, is also reasonable. I'm going to leave this question open for the moment, but we will need to return to it. I suspect the second of these two approaches will turn out to be the best.
Thanks for the review, I essentially agree with pretty much everything, and will implement those changes when I have some time :)
I do not have a strong opinion on what to do here. I think the functionality in the ticks function is interesting, but as for where it should live, I would leave that up to the people with deeper knowledge of the library.
I tried to do it the same way it was done for the other normalizations, I will double check though.
I completely agree on this. The problem here is that the issue is quite different for callables than for predefined string functions:
This would be my approach to solve it:
Taking all this into account, I may still prefer just to specify very clearly in the documentation that the function must be strictly increasing and bounded in the [vmin, vmax] interval, which by default will be the bounds of the data to be normalized. |
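One cheap safeguard, instead of (or in addition to) documentation, would be a sampled check that the callable is finite and strictly increasing on the interval. A sketch (not part of the PR; a passing check is necessary but not sufficient):

```python
import numpy as np

def check_norm_function(f, vmin, vmax, n=100):
    """Sampled sanity check of a candidate normalization function."""
    y = f(np.linspace(vmin, vmax, n))
    if not np.all(np.isfinite(y)):
        raise ValueError("f is not bounded on [vmin, vmax]")
    if not np.all(np.diff(y) > 0):
        raise ValueError("f is not strictly increasing on [vmin, vmax]")

check_norm_function(np.log10, 1.0, 100.0)    # fine
# check_norm_function(np.sin, 0.0, 10.0)     # would raise: not monotonic
```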
@efiring Did you get a chance to look at the changes I implemented about a couple of weeks ago? |
@alvarosg I apologize for having neglected this for so long; it has been on my conscience. I think I can get to it on Saturday, but probably not before then. |
Ping @efiring. |
(copied from #7294)
I strongly
- oppose adding a new domain-specific language to describe functions (root{n}, etc). Yes, the parser has already been merged into cbook, but it is not actually used right now, and I would prefer removing it.
- believe that we should still first sort out the overlap between scales, norms, locators, and formatters that has been (much) discussed above, before adding more complex functionality.
As usual, other devs should feel free to override this review if there is consensus to do so.
Closing based on lack of comments over the rejection two weeks ago. |
I agree w/ @anntzer closing this given the current API proposed by this PR. I think this could be re-opened a) if the "string" representation went away, and b) if some thought was put into whether all the norms should be implemented with this mechanism, which would require a bit of refactoring, but doesn't seem un-fathomably difficult.

I only somewhat agree that tick locators should be an issue. The obvious default is just equally-spaced ticks in normalized space that will fall where they may in data space, unless a special Locator is provided. I don't see what else a general tool is supposed to do.

I disagree w/ the original author's suggestion to have a bunch of extra parameters to specify the range over which normalization is valid; just specify the ranges in the user-supplied function. Yet another argument against the string-representation of the norms.

I am not interested in doing this work myself. But I'd happily re-open if someone else wanted to refactor this a bit. |
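The equally-spaced-in-normalized-space default described above needs nothing beyond the norm's inverse. A sketch (the function name and example inverse are mine, not from the PR):

```python
import numpy as np

def default_ticks(inverse, n=5):
    # Evenly spaced positions in normalized [0, 1] space, mapped back
    # into data space through the norm's inverse; in data space they
    # simply fall where they may.
    return inverse(np.linspace(0.0, 1.0, n))

# e.g. the inverse of a log10 norm over [1, 10000]:
inv = lambda y: 10.0 ** (y * 4.0)
print(default_ticks(inv, 5))   # values 1, 10, 100, 1000, 10000
```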
While I think this is doable, I dunno that a variant of this PR should be held up because of that. A version of
@story645 I agree, that's possible, but ideally some thought would be given to the API of this PR to make sure that works. |
This PR is part of a larger PR originally proposed to include FuncNorm and PiecewiseNorm. It was decided to split it into several PRs for easier review starting from the simpler classes.
In this case the functionality added is `FuncNorm`, a normalization class (inheriting from `Normalize`) that allows using any arbitrary function as a normalization, by specifying a callable (and a callable of the inverse function), or a string compatible with the brand new `_StringFuncParser`.
.Examples of usage for log normalization:
the same can be achieved with
For root normalization:
the same can be achieved with
or with
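The code snippets for these examples did not survive extraction, and this PR was ultimately closed, so `FuncNorm` is not available to call. As a rough stand-in, the normalization itself (the formula visible in the diff excerpt earlier) can be sketched in plain NumPy:

```python
import numpy as np

def func_norm(values, f, vmin, vmax):
    # (f(x) - f(vmin)) / (f(vmax) - f(vmin)): maps [vmin, vmax] -> [0, 1]
    values = np.asarray(values, dtype=float)
    return (f(values) - f(vmin)) / (f(vmax) - f(vmin))

# Log normalization, in the spirit of the proposed FuncNorm(f='log10'):
print(func_norm([1.0, 10.0, 100.0], np.log10, 1.0, 100.0))   # 0.0, 0.5, 1.0
```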
Tests have been added, as well as an example producing this output:
Possible caveat:
Most of the behaviour provided by this class does not change the existing interfaces for normalizations, as most public methods of the new class are just overridden methods from `Normalize`. The only exception to this is the ticks method, returning an educated guess on where to add tick values, which would be a new public method in the class. On the other hand, I think it works quite well (even for very non-linear normalizations), and it is important to give the user a good guess at where to put the ticks, so it is not required to manually enter values (or come up with his own automated algorithm). This way the only thing the user needs to specify is a number of ticks, which may also be easily set to a default value. We should think whether we want to include this or not, or maybe do it in a different way. The way it is done now is by an explicit call to `norm.ticks()`:

~~`fig.colorbar(cax, format='%.3g', ticks=cax.norm.ticks(5), ax=ax2)`~~
A `FuncLocator` class has now been implemented, so ticks are set automatically to the suggested positions without explicit user input, in exactly the same way it was done for other normalizations like `LogNorm`.