DOC Rework plot_document_clustering.py example #23528

Merged
merged 57 commits into from
Jun 20, 2022

ArturoAmorQ
Member

Reference Issues/PRs

Related to #22928 and #23266

What does this implement/fix? Explain your changes.

This is the third release of the revamped examples to serve as a tutorial series on text analysis.

Any other comments?

Side effect: Implements notebook style as intended in #22406

@ogrisel
Member

ogrisel commented Jun 3, 2022

Can you please merge the main branch back to this PR to see if #23508 fixes the doc build?

@lesteve
Member

lesteve commented Jun 3, 2022

Can you please merge the main branch back to this PR to see if #23508 fixes the doc build?

The CircleCI hosting job is triggered but there is no "check the rendered doc" direct link to the artifact yet, see #23534 (comment) for more details.

@ogrisel
Member

ogrisel commented Jun 7, 2022

The doc CI is broken because the k-means++ init on very sparse data can select initial centroids that never get updated, and this causes problems in the silhouette clustering evaluation.
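The failure mode described above can be illustrated with a toy, pure-Python sketch of the k-means assign/update loop. This is not scikit-learn's implementation: the helper names (`assign`, `update`, `cluster_sizes`) are hypothetical, and the trigger used here, two initial centroids that coincide because two samples have identical vectors, is only an assumed example of how a centroid can end up with no assigned points.

```python
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid.

    Ties are broken in favour of the lowest centroid index.
    """
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its assigned points.

    A centroid with no assigned points has no mean to move to, so this toy
    version (an assumption, not scikit-learn's behaviour) leaves it in place.
    """
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def cluster_sizes(labels, n_clusters):
    """Count how many points were assigned to each cluster index."""
    return [labels.count(k) for k in range(n_clusters)]


# Two pairs of identical samples, as can happen with very sparse vectors.
points = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
# Hypothetical init that picked both duplicates as separate centroids.
centroids = [points[0], points[1], points[2]]

for _ in range(5):
    labels = assign(points, centroids)
    centroids = update(points, labels, centroids)

sizes = cluster_sizes(labels, len(centroids))
empty_clusters = [k for k, size in enumerate(sizes) if size == 0]
```

Checking for empty clusters before computing an evaluation metric is one way to detect this situation early; whether a given metric tolerates fewer populated clusters than requested should be verified against its documentation.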

The example build log is very verbose and it takes time to render in Firefox when scrolling. Here is a PR to make the log size much more manageable: #23557.

@glemaitre glemaitre self-requested a review June 8, 2022 09:49
Member

@glemaitre glemaitre left a comment

A couple of comments.

Member

@jjerphan jjerphan left a comment

Thank you for this contribution, @ArturoAmorQ! 🙂

Here are some comments and suggestions.

In a nutshell, I suggest that:

  • sentences can sometimes be shorter and their meaning more accurate

  • Sphinx references can be adapted:

    :func:`~*`

    should be changed to:

    :class:`~*`

    when referencing classes.

  • cross-references can be made when content already exists in the documentation
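As an illustration of the second suggestion, the sketch below scans text for `:func:` roles whose target looks like a class. The helper `suspicious_func_roles` is hypothetical and is not part of Sphinx or scikit-learn; it relies on the assumption that class names are CamelCase and function names are lowercase, which is a naming convention rather than a guarantee.

```python
import re

# Matches :func:`~pkg.mod.name` or :class:`~pkg.mod.name` (the "~" is optional).
ROLE_PATTERN = re.compile(r":(func|class):`~?([\w.]+)`")


def suspicious_func_roles(text):
    """Return targets referenced with :func: whose last component is capitalized.

    Heuristic only: a capitalized final name is assumed to denote a class.
    """
    hits = []
    for role, target in ROLE_PATTERN.findall(text):
        name = target.rsplit(".", 1)[-1]
        if role == "func" and name[:1].isupper():
            hits.append(target)
    return hits


before = "We cluster with :func:`~sklearn.cluster.KMeans`."
after = (
    "We cluster with :class:`~sklearn.cluster.KMeans` and score with "
    ":func:`~sklearn.metrics.silhouette_score`."
)

flagged_before = suspicious_func_roles(before)
flagged_after = suspicious_func_roles(after)
```

Here `KMeans` is a class and `silhouette_score` is a function, so only the `before` string is flagged.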

Member

@ogrisel ogrisel left a comment

This looks great! Just a final batch of suggestions.

Thank you very much @ArturoAmorQ!

@ogrisel
Member

ogrisel commented Jun 20, 2022

There seems to be a problem with doc-min-dependencies: the last two commits had runs that lasted for hours and I could not see the output in the GitHub Actions web interface. So I cancelled them. I will push a new commit to see if it still happens.

@ArturoAmorQ
Member Author

I will push a new commit to see if it still happens.

I merged main in case that helps. Thanks for your time and valuable feedback, @ogrisel, @jjerphan and @glemaitre!

@ogrisel
Member

ogrisel commented Jun 20, 2022

It seems to work indeed.

@ogrisel ogrisel merged commit 0943f52 into scikit-learn:main Jun 20, 2022
@ogrisel
Member

ogrisel commented Jun 20, 2022

Merged! Thank you very much @ArturoAmorQ!

I think the linked example for adjustment for chance could also benefit from a tutorialization:

https://scikit-learn.org/dev/auto_examples/cluster/plot_adjusted_for_chance_measures.html#sphx-glr-auto-examples-cluster-plot-adjusted-for-chance-measures-py

@jjerphan
Member

Thank you, @ArturoAmorQ!

@ArturoAmorQ ArturoAmorQ deleted the doc_clustering branch June 23, 2022 15:20
ogrisel added a commit to ogrisel/scikit-learn that referenced this pull request Jul 11, 2022
Co-authored-by: Olivier Grisel
Co-authored-by: Guillaume Lemaitre
Co-authored-by: Julien Jerphanion
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Aug 4, 2022
Co-authored-by: Olivier Grisel
Co-authored-by: Guillaume Lemaitre
Co-authored-by: Julien Jerphanion
glemaitre added a commit that referenced this pull request Aug 5, 2022
Co-authored-by: Olivier Grisel
Co-authored-by: Guillaume Lemaitre
Co-authored-by: Julien Jerphanion
5 participants