Describe the workflow you want to enable
I just watched @thomasjpfan's SciPy lightning talk where he showed using .set_output(transform='pandas') to return results wrapped in pandas dataframes. I want the same but for xarray.Dataset.
Describe your proposed solution
Whatever mechanism is used for pandas, but for xarray. (Which PR was it implemented in?) There is even an xarray.Dataset.from_dataframe method.
Describe alternatives you've considered, if relevant
No response
Additional context
I think a fully N-dimensional object would make a lot of sense to wrap sklearn output in. I recently found myself wishing this existed whilst doing PCA for example.
Thanks for opening this issue. I'm going to cross-link it with #25896 which is related as it is about supporting dataframes other than pandas. I think it makes sense to try and have an overview of "all the demands" when trying to come up with a solution. At least I hope we can come up with a general abstraction that makes it simpler to add new in-/outputs.
I'll leave it to you to decide if you want to keep this issue open or close it and post a short comment about xarray in the linked issue.
I think this in particular is also very much related to the discussion we had during the sprint in Paris, where we talked about a heterogeneous container.
I don't remember all the details anymore, but I think @thomasjpfan has a good overview.
At SciPy 2023, I spoke with Tom about supporting xarray directly in scikit-learn as another "output" option. Compared to pandas DataFrames, xarray has a bit more metadata that is "sample aligned" and needs to be passed to the output. I'm planning to open a PR proposing a general abstraction to easily support other output containers.