Re-write sym-log-norm #16391
Suggest we let the base be an option. We will need to know the base if we make a scale for the colorbar. The docstring probably needs to change as well. The linear region using is not exactly two decades, which I think is done to make the derivative continuous. I was working on this and found the existing test should be improved to actually test some numbers we could calculate by hand rather than three random numbers. I’m happy for you to work on this but I would like to see substantive improvements to the tests and the documentation. After we fix this we can get back to fixing the broken colorbars, which I’d propose fixing by associating a scale with the norm. Any norms that don’t have a scale could fall back to manual ticking.
This was done with a modified SymLogNorm that takes a base. So for sure, using base = np.e is different than base=10. Essentially a "decade" in np.e is smaller than one in base 10, and so the linear region is smaller. Note however, that even the base=10 case does not yield equally spaced "ticks".
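To put a rough number on the base dependence, here is a minimal sketch. It assumes the transform has the same form as `SymmetricalLogTransform` in `matplotlib.scale`; the helper names `symlog` and `linear_fraction` are made up for illustration and are not Matplotlib API. It computes what fraction of the normalized range the linear region occupies for a symmetric data range.

```python
import math


def symlog(x, base=10.0, linthresh=1.0, linscale=1.0):
    """Symlog transform, assumed to mirror matplotlib.scale.SymmetricalLogTransform."""
    linscale_adj = linscale / (1.0 - 1.0 / base)
    if abs(x) <= linthresh:
        # Linear region: a straight line through the origin.
        return x * linscale_adj
    # Logarithmic region: offset so the two pieces meet at |x| == linthresh.
    value = linthresh * (linscale_adj + math.log(abs(x) / linthresh) / math.log(base))
    return math.copysign(value, x)


def linear_fraction(vmax, base, linthresh=1.0):
    """Fraction of the normalized [-vmax, vmax] range covered by |x| <= linthresh."""
    return symlog(linthresh, base, linthresh) / symlog(vmax, base, linthresh)


frac_10 = linear_fraction(100.0, 10.0)
frac_e = linear_fraction(100.0, math.e)
print(f"base 10: {frac_10:.3f}, base e: {frac_e:.3f}")
```

Under these assumptions the linear region takes up a smaller share of the range with base `e` than with base 10, which matches the observation above.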
vals = np.array([-30, -1, 2, 6], dtype=float)
normed_vals = norm(vals)
expected = [0., 0.53980074, 0.826991, 1.02758204]
expected = [-0.842119, 0.450236, 0.599528, 1.277676]
I think these values were just plain wrong before...
- The first value (-30) is less than vmax (5), so should come out as less than zero
- The second value (-1) is less than 0, so should come out as less than 0.5
The problem I see here is that the code found in My thought is that there should be one, and only one, math Scales could inherit from that This would guarantee consistency between the underlying math function chosen.
@anntzer is working towards that. We could also change this to use
I think this is ready for review. Particularly interested in opinions on
This implementation is different than what we had before, even for base=np.e. I don't have any skin in the game whether this one is better or worse but it needs more than this API note.
It is also inconsistent with SymmetricalLogTransform, and hence SymmetricalLogScale, so that needs to be cleaned up or we can't make a proper axes for it when it gets turned into a colorbar. My understanding is that the current implementation has a smooth derivative at the transition. Does this new implementation? Do we care about that? Not sure we do...
Overall, I think this requires its own gallery page clearly explaining the properties of the transform, or cite a reference that explains exactly what the algorithm here is. People need to be able to cite what this thing is in their papers.
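The derivative question can be checked numerically. The sketch below is not taken from the PR: `current_symlog` assumes the formula used by `SymmetricalLogTransform`, and `simple_symlog` is a hypothetical simpler form (linear inside `linthresh`, `1 + log_base` outside). It compares one-sided finite-difference slopes at the transition.

```python
import math


def current_symlog(x, base, linthresh=1.0, linscale=1.0):
    # Assumed to mirror matplotlib.scale.SymmetricalLogTransform.
    adj = linscale / (1.0 - 1.0 / base)
    if abs(x) <= linthresh:
        return x * adj
    value = linthresh * (adj + math.log(abs(x) / linthresh) / math.log(base))
    return math.copysign(value, x)


def simple_symlog(x, base, linthresh=1.0):
    # Hypothetical simpler form: linear inside, 1 + log_base(|x| / linthresh) outside.
    if abs(x) <= linthresh:
        return x / linthresh
    return math.copysign(1.0 + math.log(abs(x) / linthresh, base), x)


def one_sided_slopes(f, x0, h=1e-6):
    """Backward and forward finite-difference slopes of f at x0."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right


results = {}
for name, func in [("current", current_symlog), ("simple", simple_symlog)]:
    for base in (10.0, math.e):
        left, right = one_sided_slopes(lambda x: func(x, base), 1.0)
        results[(name, base)] = (left, right)
        print(f"{name:8s} base={base:.3f}: inside={left:.3f}, outside={right:.3f}")
```

With the formulas assumed here, the slopes on the two sides of `linthresh` differ for base 10 in both variants; the simple form only happens to match at the transition when the base is `e`. If a continuous derivative matters, that is worth stating explicitly in the docs.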
normed_vals = norm(vals)
assert_array_almost_equal(normed_vals, expected)
expected = [0, 0.25, 0.5, 0.75, 1]
I don't think this is consistent with SymmetricalLogTransform yet. I'm OK if we want to change that as well to be consistent with this code, but they can't be inconsistent!
import matplotlib.scale as mscale
trans = mscale.SymmetricalLogTransform(10, 1, 1)
new = trans.transform([-10, -1, 0, 1, 10])
new = (new - new[0]) / (new[-1] - new[0])
print(new)
[0.0 0.23684210526315788 0.5 0.7631578947368421 1.0]
I'm fine if this implementation is desired, but then we need to change SymmetricalLogTransform.
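The discrepancy can be reproduced without Matplotlib. In this sketch `transform_style` assumes the formula used by `SymmetricalLogTransform`, and `simple_style` is one hypothetical form that reproduces the `[0, 0.25, 0.5, 0.75, 1]` values expected by the new test; whether it is exactly what the PR implements is an assumption.

```python
import math


def transform_style(x, base=10.0, linthresh=1.0, linscale=1.0):
    # Assumed to mirror matplotlib.scale.SymmetricalLogTransform.
    adj = linscale / (1.0 - 1.0 / base)
    if abs(x) <= linthresh:
        return x * adj
    value = linthresh * (adj + math.log(abs(x) / linthresh) / math.log(base))
    return math.copysign(value, x)


def simple_style(x, base=10.0, linthresh=1.0):
    # Hypothetical form: linear inside, 1 + log_base(|x| / linthresh) outside.
    if abs(x) <= linthresh:
        return x / linthresh
    return math.copysign(1.0 + math.log(abs(x) / linthresh, base), x)


def normalize(func, vals):
    """Rescale func(vals) so the first value maps to 0 and the last to 1."""
    lo, hi = func(vals[0]), func(vals[-1])
    return [(func(v) - lo) / (hi - lo) for v in vals]


vals = [-10, -1, 0, 1, 10]
from_transform = normalize(transform_style, vals)
from_simple = normalize(simple_style, vals)
print([round(v, 4) for v in from_transform])  # [0.0, 0.2368, 0.5, 0.7632, 1.0]
print([round(v, 4) for v in from_simple])     # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The only difference between the two is the `linscale / (1 - 1/base)` factor applied to the linear region, so aligning the norm and the transform comes down to agreeing on that factor.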
Thanks for the comments. re.
I'm not sure what this means or how it could be implemented? Totally agreed that there should be a gallery example going through this carefully.
I mean, I think that's why the current version is the way it is. https://iopscience.iop.org/article/10.1088/0957-0233/24/2/027001
Ah nice - thoughts on using the approach in that paper (open access here) instead of (what I am interpreting as) the current approach? I think I'm pro using the published one because
Well, I prefer yours because it's simpler and does what we say it should. But open to using either so long as it's explained clearly...
@dstansby we discussed this on the call; something like this could/should go in v3.3. For 3.2, we need to add a base kwarg, and deprecate np.e as the base so the default is consistent with our docs and what most folks have probably been assuming. I'll open a PR for that right now.
I wonder whether we really need to support a symlog norm and whether we could consider deprecating it instead. We could also consider moving it to a separate package, similar to mpl-probscale.
Sure; regardless of cosmetic changes I still think the maths is throwing out incorrect numbers: #16391 (comment) If we're not going to overhaul in 3.2, I think we should put a big warning on it that the function is un-tested and suspected wrong.
Judging by the tests, though, I don't think the original authors thought it was spitting out the wrong maths. Unfortunately they failed to write down what they thought the correct maths are. That all said, overall I agree with @anntzer. This seems to be something we invented, versus something the scientific community is asking for. If it could be made its own package that would be fabulous. But if we do persist in providing it in core, it has to be well-defined and documented. At the very least, the cited paper does that.
I think that everyone brings up really good points about the use of this function and that it is very subjective as to what different people would want out of it (or whether they should be using it at all). My specific use-case... I have a vector field that spans orders of magnitude and then I dot that into some other vector path (integral of v dot dx) to get a scalar field that has positive and negative values (spanning orders of magnitude) that depend on the direction of your integration path (dx). For the vector field, I write my own symlog vector scaling function that preserves angles, and I somewhat arbitrarily use a form close to what the referenced paper proposes.
(note that I don't think the linked paper will work with vector quantities even though they say it is bisymmetric because of the changing angles, but I didn't read it that closely).
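For readers wondering what an angle-preserving symlog scaling might look like, here is a minimal sketch. It is not the commenter's function: the `log10(1 + r/c)` magnitude compression and the parameter `c` are assumptions chosen for illustration. The idea is to compress only the vector's magnitude and leave its direction untouched.

```python
import math


def symlog_vector(vx, vy, c=1.0):
    """Compress the magnitude of (vx, vy) with log10(1 + r/c), keeping its direction."""
    r = math.hypot(vx, vy)
    if r == 0.0:
        # The zero vector has no direction; map it to itself.
        return 0.0, 0.0
    scale = math.log10(1.0 + r / c) / r
    return vx * scale, vy * scale


vx, vy = symlog_vector(-300.0, 400.0)
angle_before = math.atan2(400.0, -300.0)
angle_after = math.atan2(vy, vx)
print(f"scaled vector: ({vx:.4f}, {vy:.4f})")
print(f"angle before: {angle_before:.6f}, after: {angle_after:.6f}")
```

Applying a scalar symlog to each component separately would generally change the ratio `vy / vx`, which is the angle problem mentioned above.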
Given that you already wrote your custom normalization (which is certainly the "responsible" thing to do), I think what Matplotlib should do is really just making sure that you can easily use it as scale/norm, not providing its own symlog?
Yes, I agree, and your previous point about using my own
Back in the olden days of Matlab, we had one Norm and we liked it, and we just plotted our data using that linear norm, i.e. you transform the array first and then pcolor it. Surely that's good enough for exploratory data analysis:
X = myData()
Y = mytransform(X)
pcolormesh(Y)
I don't disagree... ¯_(ツ)_/¯ The nicety in mpl is now adding to your simple example: Again, I'm not opposed to deprecation, and I think most people are in agreement that this is a pretty niche norm.
Fully agreed, the point of using a Norm is that the colorbar gives you proper numbers. But that means that each norm should have an associated scale, or the advantage is pretty much nullified.
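What "proper numbers" on the colorbar requires is a forward mapping together with its inverse. The sketch below is a standalone illustration rather than Matplotlib code; `forward`, `inverse` and `tick_positions` are hypothetical names, and the simple symlog form used is an assumption. It shows where data-valued ticks would land on a normalized colorbar axis, and that positions map back to data values.

```python
import math


def forward(x, base=10.0, linthresh=1.0):
    # Hypothetical symlog: linear inside linthresh, 1 + log_base outside.
    if abs(x) <= linthresh:
        return x / linthresh
    return math.copysign(1.0 + math.log(abs(x) / linthresh, base), x)


def inverse(y, base=10.0, linthresh=1.0):
    # Exact inverse of forward().
    if abs(y) <= 1.0:
        return y * linthresh
    return math.copysign(linthresh * base ** (abs(y) - 1.0), y)


def tick_positions(ticks, vmin, vmax):
    """Normalized [0, 1] positions at which data-valued ticks would sit."""
    lo, hi = forward(vmin), forward(vmax)
    return [(forward(t) - lo) / (hi - lo) for t in ticks]


ticks = [-100, -10, -1, 0, 1, 10, 100]
positions = tick_positions(ticks, -100, 100)
round_trip = [inverse(forward(t)) for t in ticks]
print([round(p, 3) for p in positions])
```

Transforming the data by hand and plotting with a linear norm loses this inverse, so the colorbar can only show transformed values.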
@anntzer, do you have more details on the proposed unification of scales and norms? I think this should be done ASAP so that most norms have a scale associated with it that the colorbar can just use. Will your proposed factory give us that?
In the meantime, are we decided on deprecating and removing symlognorm? I am happy to do this, with extensive documentation on alternatives.
Well maybe a quick poll and try to get main developers to vote? 👍 keep SymLogNorm and symlogscale
My patch makes it possible to derive norms from scales (in which case the scale can know about the parent norm). Obviously you can still construct independent norms (e.g. BoundaryNorm) so not all norms will have an associated scale.
Well, #14916 is not exactly recent :)
If you break symlog scale you break a lot of code that I and a lot of other people in time-resolved spectroscopy use; please don't do that. Matplotlib is generally very careful about breaking code, so I don't see why this is not the case for scales. As the original author of the, in retrospect, faulty SymLogNorm, I don't care too much about it anymore. I fully agree with the deficiencies found, but my usage at the time of coding did not depend on exactly reproducing the absolute data from a colormap. Instead, the relative amplitude of the values is important, something which is in some cases better reproduced by a symlog scale than a linear scale. Note that the data has both positive and negative signals and varies quite a lot in amplitude. Hence I never cared about the base, since the final differences in the map were not that visible because the scale is normalized. Again, in retrospect this was wrong.
I have only just noticed this PR and haven't had time to understand all the issues involved, but I would like to strongly urge (aka beg) that SymLogNorm is not deprecated. A new x-ray scattering method called ΔPDF generates both positive and negative probability maps, which are ideally viewed in symmetric (often log) plots, e.g., see Krogstad, M. J. et al. Nat Mater 19, 63–68 (2020). We use MPL to plot with symmetric limits that are automatically enforced in NeXpy by choosing a divergent color map. If this is a niche, I think that it will be a growing one, since we are already collaborating with a number of research groups to produce ΔPDF data on a routine basis.
@rayosborn as discussed on gitter, would
I actually like the
Edit: I just implemented this and realized it works fine for colorbar normalization, but not for x/y plots with symlog scales due to the clipping creating hard cutoffs rather than smoother transitions.
I'm not too bothered about what method is used to generate the symmetric "log" plots, since we don't analyze the images themselves. They are used to guide our interpretation and to present the data in talks and publications. If we fit any models, it is to the actual data, not the plotted representation.
My concern is reproducibility. I’ve hand-digitized a good number of figures from papers where the data was no longer available, and if the scale was not clearly defined in the paper it would lead to errors.
This is work I started a while ago to re-write SymLogNorm into code I could understand and read. As I wrote this I realised that the original code was just plain wrong... I have now added tests that include values that are easy to calculate and verify by hand. Suggestions for more tests welcome.
Fixes #16376