Move axisartist towards untransposed transforms (operating on (N, 2) arrays instead of (2, N) arrays). #27551

Merged
merged 1 commit into from
Dec 19, 2024

Conversation

anntzer
Contributor

@anntzer anntzer commented Dec 20, 2023

While Matplotlib normally represents lists of (x, y) coordinates as (N, 2) arrays, and its transforms (which we'll call "trans") have shape signature (N, 2) -> (N, 2), axisartist uses the opposite convention: (2, N) arrays (or size-2 tuples of 1D arrays) and correspondingly transposed transforms (which it typically calls "transform_xy"). Change that and go back to Matplotlib's standard representation in some of axisartist's internal representations, for consistency. Also replace some uses of (x1, y1, x2, y2) quadruplets by single Bbox objects, which avoids having to keep track of the order of the points (is it x1, y1, x2, y2 or x1, x2, y1, y2?).

  • Add a _find_transformed_bbox(trans, bbox) API to ExtremeFinderSimple and its subclasses, replacing __call__(transform_xy, x1, y1, x2, y2). (I intentionally did not overload __call__'s signature yet nor did I deprecate it for now; we can consider doing that later.)
  • Deprecate GridFinder.{,inv_}transform_xy, which implement the transposed transform API.
  • Switch grid_info["extremes"] from quadruplet representation to Bbox.
  • Switch grid_info["lon"]["lines"] and likewise for "lat" from list-of-size-1-lists-of-pairs-of-1D-arrays to list-of-(N, 2)-arrays.
  • Switch grid_info["line_xy"] from pair-of-1D-arrays to a (N, 2) array.
  • Let _get_raw_grid_lines take a Bbox as last argument instead of 4 coordinates.
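
The representation changes listed above can be illustrated with a small NumPy sketch. The variable names (`old_lines`, `new_lines`, `old_line_xy`, `new_line_xy`) are made up for illustration and are not taken from axisartist; only the shapes follow the description in the list.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
y = x ** 2

# Old style (as described above): a list of size-1 lists, each holding
# a pair of 1D arrays (xs, ys).
old_lines = [[(x, y)], [(x, 2 * y)]]

# New style: a flat list of (N, 2) arrays, one per grid line.
new_lines = [np.column_stack(pair) for (pair,) in old_lines]

# Likewise, a pair of 1D arrays becomes a single (N, 2) array.
old_line_xy = (x, y)
new_line_xy = np.column_stack(old_line_xy)

print([line.shape for line in new_lines])
print(new_line_xy.shape)
```

With the new layout, `line[:, 0]` and `line[:, 1]` recover the x and y columns, so no information is lost in the conversion.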

Note that I intentionally mostly didn't touch (transpose) public-facing APIs for now, this may happen later.
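
As a rough illustration of the two conventions contrasted in this description, the sketch below applies a standard Matplotlib transform to an (N, 2) array and then wraps it in a transposed-style function. The `transform_xy` function here is a hypothetical stand-in written for this example, not axisartist's code.

```python
import numpy as np
from matplotlib.transforms import Affine2D

trans = Affine2D().rotate_deg(90)  # any Matplotlib Transform would do

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 0.0])

# Matplotlib convention: (N, 2) in, (N, 2) out.
pts = np.column_stack([x, y])
out = trans.transform(pts)

# Transposed convention: separate x and y arrays in, a (2, N) result out.
# Hypothetical wrapper, for illustration only.
def transform_xy(x, y):
    return trans.transform(np.column_stack([x, y])).T

tx, ty = transform_xy(x, y)

print(out.shape)  # (3, 2)
print(np.allclose(out[:, 0], tx), np.allclose(out[:, 1], ty))
```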

Contributor

@greglucas greglucas left a comment

Overall this is a nice refactor to use our internal transform tooling rather than reinventing it all. Just some minor suggestions from me.

Comment on lines 200 to 202
lon_min, lat_min = tbbox.min
lon_max, lat_max = tbbox.max
grid_info["extremes"] = Bbox.from_extents(lon_min, lat_min, lon_max, lat_max)
Contributor

You use .extents later on, so it seems you could use it here too?

Perhaps just store the frozen version, which essentially does this for you. Or do we even need it to be "frozen", or can you put the bbox directly into the grid_info dictionary? (Similarly for the from_extents() call later on.)

Suggested change
- lon_min, lat_min = tbbox.min
- lon_max, lat_max = tbbox.max
- grid_info["extremes"] = Bbox.from_extents(lon_min, lat_min, lon_max, lat_max)
+ lon_min, lat_min, lon_max, lat_max = tbbox.extents
+ grid_info["extremes"] = tbbox.frozen()
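
The Bbox properties involved in this suggestion can be checked in isolation. The sketch below uses made-up extents and a plain dict standing in for `grid_info`; it only shows that `.extents` yields the same four numbers as `.min` plus `.max`, and that `frozen()` returns an independent copy.

```python
import numpy as np
from matplotlib.transforms import Bbox

tbbox = Bbox.from_extents(-10.0, 20.0, 30.0, 40.0)  # x0, y0, x1, y1

# Two-step unpacking via .min / .max ...
lon_min, lat_min = tbbox.min
lon_max, lat_max = tbbox.max

# ... gives the same four numbers as .extents in one step.
assert np.allclose(tbbox.extents, [lon_min, lat_min, lon_max, lat_max])

# frozen() returns an independent copy, so later changes to tbbox
# do not leak into the stored value.
grid_info = {"extremes": tbbox.frozen()}  # plain dict, illustration only
tbbox.set_points(np.array([[0.0, 0.0], [1.0, 1.0]]))

print(grid_info["extremes"].extents)
print(tbbox.extents)
```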

Contributor Author

Good catch, changed.
I also changed the signature of _update_grid so that it takes a single Bbox as parameter.

@timhoffm
Member

timhoffm commented Dec 18, 2024

> While Matplotlib normally represents lists of (x, y) coordinates as (N, 2) arrays [...], axisartist uses the opposite convention of using (2, N) arrays (or size-2 tuples of 1D arrays)
> Note that I intentionally mostly didn't touch (transpose) public-facing APIs for now, this may happen later.

As a side note, we can (and maybe should?) distinguish between (N, 2) arrays and size-2 tuples of 1D arrays in the API. That a size-2 tuple of 1D arrays is technically a 2D array-like does not mean we have to interpret it the same way as a 2D array.

Of course internally, (N, 2) arrays are the way to go, but for user-facing APIs not all array-likes have to be interpreted the same way. For example, one currently has to do

import numpy as np
from matplotlib.path import Path

x = np.linspace(0, 6.28)
y = np.sin(x)
points = np.array([x, y]).T  # (2, N) -> (N, 2)
Path(points)

The np.array(...).T feels quite awkward and is something that new users have to learn.

(Disregarding backward compatibility for now), I think it's not unreasonable to also allow

points = (x, y)
Path(points)

Note that the distinction would be very specifically limited to tuple of array-like, i.e. an implementation would likely coerce using

if isinstance(points, tuple):
    # assume tuple of coordinate arrays (x, y, ...)
    points = np.array(points).T
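
A minimal sketch of such a coercion rule follows. The helper name `as_points` is hypothetical and nothing like it is part of Matplotlib; it only makes the proposed tuple-versus-array distinction concrete.

```python
import numpy as np

def as_points(points):
    """Hypothetical helper: coerce input to an (N, ndim) array."""
    if isinstance(points, tuple):
        # Tuple of coordinate arrays (x, y, ...): stack them as columns.
        return np.column_stack(points)
    # Anything else is taken as already being laid out as (N, ndim).
    return np.asarray(points)

x = np.linspace(0, 6.28, 50)
y = np.sin(x)

from_tuple = as_points((x, y))
from_array = as_points(np.array([x, y]).T)
from_list = as_points([x, y])  # a list is NOT transposed under this rule

print(from_tuple.shape, from_array.shape, from_list.shape)
```

Under this rule a list `[x, y]` and a tuple `(x, y)` are interpreted differently, which is exactly the kind of distinction the proposal would introduce.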

@anntzer
Contributor Author

anntzer commented Dec 18, 2024

> The np.array(...).T feels quite awkward and is something that new users have to learn.

Personally I always write np.column_stack([x, y]) which makes more semantic sense to me.
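
For reference, the two spellings build the same (N, 2) array; the short check below, using the sine-curve data from the earlier comment with an explicit sample count, is only meant to confirm that equivalence.

```python
import numpy as np
from matplotlib.path import Path

x = np.linspace(0, 6.28, 50)
y = np.sin(x)

a = np.array([x, y]).T       # transpose of a (2, N) array
b = np.column_stack([x, y])  # x and y placed side by side as columns

assert a.shape == b.shape == (50, 2)
assert np.array_equal(a, b)

path = Path(b)               # Path takes (N, 2) vertices
print(path.vertices.shape)   # (50, 2)
```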

> Note that the distinction would be very specifically limited to tuple of array-like, i.e. an implementation would likely coerce using

There are currently some APIs (e.g., boxplot) that distinguish between lists (or tuples) of lists and 2d ndarrays, and they cause endless confusion... See e.g. #2539, #8092.

@timhoffm
Member

> Personally I always write np.column_stack([x, y]) which makes more semantic sense to me.

Indeed.

> There are currently some APIs (e.g., boxplot) that distinguish between lists (or tuples) of lists and 2d ndarrays, and they cause endless confusion... See e.g. #2539, #8092.

Noted, though I'm unclear how endless this trouble actually is. #2539 is a single report. #8092 is specifically about an array of arrays, which is not really specified in the docs; I'd consider the implementation a bug and would simply disallow arrays of arrays. Also note that boxplot() specifically must support a different number of data points per box, so a 2D array is not sufficient. OTOH, data often come as 2D arrays, so accepting those is reasonable as well (a related point, which I won't discuss here, is aggregating along columns).
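
To make the list-versus-ndarray distinction concrete, here is a small sketch using `matplotlib.cbook.boxplot_stats`, the statistics helper, as a stand-in for `boxplot()` so that no figure is needed. That both interpret their input the same way is my understanding of the documented behavior (one box per column of a 2D array, one box per element of a sequence of vectors); the data are made up.

```python
import numpy as np
from matplotlib import cbook

# Three groups of five values each.
groups = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

# A list of lists is read as one dataset per inner list.
stats_from_list = cbook.boxplot_stats(groups)

# The same numbers as a (3, 5) ndarray are read column by column.
stats_from_array = cbook.boxplot_stats(np.array(groups))

# Ragged input can only be expressed as a sequence of sequences.
stats_ragged = cbook.boxplot_stats([[1, 2, 3], [4, 5, 6, 7, 8]])

print(len(stats_from_list), len(stats_from_array), len(stats_ragged))
```

The same nested data thus produces a different number of boxes depending on whether it is wrapped in `np.array`, while the ragged case shows why a plain 2D array cannot be the only accepted form.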

I don't want to open a full discussion now. I agree with the scope of the PR that we don't want 2D (2, N) arrays.

        dy = (y_max - y_min) / self.ny
        return x_min - dx, x_max + dx, y_min - dy, y_max + dy

    def _find_transformed_bbox(self, trans, bbox):
        grid = np.reshape(np.meshgrid(np.linspace(bbox.x0, bbox.x1, self.nx),
Member

Even though this is private, it should have a docstring.

In particular it should explain the padding (expanded()) and it should mention that it's semantically equivalent to __call__ but with a better API.

Contributor Author

Sure.
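
The padding discussed in this thread can be sketched as a standalone function. This is not the PR's implementation: `approx_transformed_bbox` and its defaults are hypothetical, and the padding factor is inferred from the `x_min - dx, x_max + dx` return value in the diff context above, assuming `dx` is defined analogously to `dy`.

```python
import numpy as np
from matplotlib.transforms import Affine2D, Bbox

def approx_transformed_bbox(trans, bbox, nx=20, ny=20):
    """
    Approximate the bounding box of *bbox* after applying *trans*.

    Samples an nx-by-ny grid over *bbox*, transforms the samples, takes
    their bounding box, and pads it by width/nx (height/ny) on each side.
    """
    xs = np.linspace(bbox.x0, bbox.x1, nx)
    ys = np.linspace(bbox.y0, bbox.y1, ny)
    grid = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)  # (N, 2)
    tpts = trans.transform(grid)
    tight = Bbox(np.array([tpts.min(axis=0), tpts.max(axis=0)]))
    # Scaling the width by (1 + 2/nx) about the center adds width/nx
    # on each side; likewise for the height.
    return tight.expanded(1 + 2 / nx, 1 + 2 / ny)

result = approx_transformed_bbox(
    Affine2D().scale(2, 1), Bbox.from_extents(0, 0, 5, 20))
print(result.extents)
```

Because the grid only samples the transformed region, the padding acts as a safety margin for extremes that fall between sample points.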

While Matplotlib normally represents lists of (x, y) coordinates as
(N, 2) arrays and transforms (which we'll call "trans") have shape
signature (N, 2) -> (N, 2), axisartist uses the opposite convention
of using (2, N) arrays (or size-2 tuples of 1D arrays) and transforms
(which it typically calls "transform_xy").  Change that and go back to
Matplotlib's standard representation in some of axisartist's internal
representations for consistency.  Also replace some uses of (x1, y1, x2,
y2) quadruplets by single Bbox objects, which avoids having to keep track
of the order of the points (is it x1, y1, x2, y2 or x1, x2, y1, y2?).

- Add a `_find_transformed_bbox(trans, bbox)` API to ExtremeFinderSimple
  and its subclasses, replacing `__call__(transform_xy, x1, y1, x2, y2)`.
  (I intentionally did not overload `__call__`'s signature yet nor did I
  deprecate it for now; we can consider doing that later.)
- Deprecate `GridFinder.{,inv_}transform_xy`, which implement the
  transposed transform API.
- Switch `grid_info["extremes"]` from quadruplet representation to Bbox.
- Switch `grid_info["lon"]["lines"]` and likewise for "lat" from
  list-of-size-1-lists-of-pairs-of-1D-arrays to list-of-(N, 2)-arrays.
- Switch `grid_info["line_xy"]` from pair-of-1D-arrays to a (N, 2) array.
- Let `_get_grid_info` and `_get_raw_grid_lines` take a Bbox as (last)
  argument instead of 4 coordinates.

Note that I intentionally mostly didn't touch (transpose) public-facing
APIs for now, this may happen later.
@timhoffm timhoffm added this to the v3.11.0 milestone Dec 19, 2024
@timhoffm timhoffm merged commit 82adc45 into matplotlib:main Dec 19, 2024
39 checks passed
@anntzer anntzer deleted the aaut branch December 20, 2024 09:23