
Transform output to xarray objects #26835


Open
TomNicholas opened this issue Jul 13, 2023 · 3 comments

@TomNicholas

Describe the workflow you want to enable

I just watched @thomasjpfan's SciPy lightning talk, where he showed using .set_output(transform='pandas') to return results wrapped in pandas DataFrames. I want the same but for xarray.Dataset.

Describe your proposed solution

It would work the same way as it does for pandas, but for xarray. (Which PR was it implemented in?) There is even an xarray.Dataset.from_dataframe method.

Describe alternatives you've considered, if relevant

No response

Additional context

I think a fully N-dimensional object would make a lot of sense to wrap sklearn output in. I recently found myself wishing this existed whilst doing PCA for example.

TomNicholas added the Needs Triage and New Feature labels on Jul 13, 2023
@betatim
Member

betatim commented Jul 14, 2023

Thanks for opening this issue. I'm going to cross-link it with #25896 which is related as it is about supporting dataframes other than pandas. I think it makes sense to try and have an overview of "all the demands" when trying to come up with a solution. At least I hope we can come up with a general abstraction that makes it simpler to add new in-/outputs.

I'll leave it to you to decide whether you want to keep this issue open, or close it and post a short comment about xarray in the linked issue.

@adrinjalali
Member

I think this in particular is also very much related to the discussion we had during the sprint in Paris, where we talked about a heterogeneous container.

I don't remember all the details anymore, but I think @thomasjpfan has a good overview.

@thomasjpfan
Member

At SciPy 2023, I spoke with Tom about supporting xarray directly in scikit-learn as another "output" option. Compared to pandas DataFrames, xarray objects carry a bit more "sample aligned" metadata that would need to be passed through to the output. I'm planning to open a PR proposing a general abstraction to easily support other output containers.
