enh: Support passing selection dictionary to Dataset.select #6617
I don't see any problem with this approach.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #6617 +/- ##
=======================================
Coverage 88.83% 88.83%
=======================================
Files 327 328 +1
Lines 69708 69743 +35
=======================================
+ Hits 61922 61954 +32
- Misses 7786 7789 +3
Found out that
@@ -415,6 +415,13 @@ def select(self, selection_specs=None, **selection):
specs match the selected object.

"""
if isinstance(selection_expr, dict):
This should be Mapping. Never check isinstance(..., list) or isinstance(..., dict) unless you know exactly why you want to treat these differently from other Sequences and Mappings.
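To illustrate the point of this review comment, here is a minimal sketch of why a Mapping check is more permissive than a dict check. The helper name is_selection_mapping is hypothetical and not part of the PR; only collections.abc.Mapping and types.MappingProxyType come from the standard library.

```python
from collections.abc import Mapping
from types import MappingProxyType


def is_selection_mapping(selection_expr):
    """Hypothetical helper: accept any dict-like selection, not only plain dicts."""
    return isinstance(selection_expr, Mapping)


plain = {"x": (0, 10)}
# A read-only view of a dict: behaves like a mapping but is not a dict subclass.
read_only = MappingProxyType(plain)

dict_check = isinstance(read_only, dict)         # rejected by a dict check
mapping_check = is_selection_mapping(read_only)  # accepted by a Mapping check
```

With the dict check, a caller passing a read-only or custom mapping would silently fall through to the non-dictionary code path.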
@@ -629,6 +635,13 @@ def select(self, selection_expr=None, selection_specs=None, **selection):
or a scalar if a single value was selected
"""
from ...util.transform import dim
if isinstance(selection_expr, dict):
ditto
This doesn't work at all when actually passing dimensions, see https://github.com/holoviz-topics/hv-anndata/blob/586e34beabeed853be172b9982e3efa7737834cd/tests/test_interface.py#L97-L121
Thanks, rather than changing all the
This PR proposes an API change (or rather an additional signature) for Dataset.select.
Currently we support passing per-dimension select specifications as keyword arguments. This is generally quite convenient because in most cases dimension names are valid identifiers, so the keyword syntax, e.g. .select(x=(0, 10)), is the shortest and most convenient option. However, when the dimension name is not a valid identifier, e.g. it is a digit string or contains characters that are not allowed in identifiers, you have to write it out using dictionary unpacking. This is not even particularly contrived, because it is what happens when you construct an element from a pandas DataFrame with default column names. While a little cumbersome, this use case at least works.
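As a rough illustration of the two calling styles described above, the sketch below uses a toy select function that only echoes its keyword arguments. It is a hypothetical stand-in, not the HoloViews implementation, and the column names are made up for the example.

```python
def select(**selection):
    """Hypothetical stand-in for Dataset.select: return the keyword selections."""
    return selection


# Dimension name is a valid identifier: plain keyword syntax works.
by_keyword = select(x=(0, 10))

# Dimension names "0" and "my column" are strings but not valid identifiers,
# so they can only be passed through dictionary unpacking.
by_unpacking = select(**{"0": (0, 10), "my column": (1, 2)})
```

Unpacking works here because Python only requires keyword names to be strings, not valid identifiers.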
However, we are currently in the process of creating a new data interface for anndata, and because of its complex data model we have to create special dimension objects, which are not easily mapped onto simple string names. This is where the current select approach completely breaks down: since keyword arguments must be strings, we cannot perform a select operation on elements backed by an anndata dataset. For example, a selection keyed by a dimension object x will error because x is not a string. Therefore I propose we overload the selection_expr argument of .select, making it possible to instead write select operations by passing a selection dictionary. This is fully backward compatible since it would previously just error. Before I do any more work on this I'd love to hear feedback.
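The proposal can be sketched roughly as follows. This is a toy model under stated assumptions, not the actual HoloViews or hv-anndata code: the Dim class, its fields, and the way the mapping is merged with keyword selections are all hypothetical and chosen only to show the failure and the proposed workaround.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    """Hypothetical non-string dimension object (frozen, so hashable as a dict key)."""
    axis: str
    name: str


def select(selection_expr=None, **selection):
    """Toy select: treat a mapping passed as selection_expr as extra selections."""
    if isinstance(selection_expr, Mapping):
        # Keys may be arbitrary hashable dimension objects, not only strings.
        selection = {**selection_expr, **selection}
        selection_expr = None
    return selection_expr, selection


x = Dim("obs", "cell_type")

# Keyword unpacking fails because the key is not a string.
try:
    select(**{x: "T cell"})
    keyword_call_failed = False
except TypeError:
    keyword_call_failed = True

# Passing the dictionary as the first argument works for any hashable key.
expr, merged = select({x: "T cell"}, y=(0, 10))
```

In this sketch keyword selections take precedence over entries in the mapping when both name the same dimension; that precedence is an assumption of the example rather than something the PR text specifies.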