matthewfeickert
Member

@matthewfeickert matthewfeickert commented May 31, 2023

Resolves #4

Remove all but the last N_LATEST_UPLOADS (default: 5) package version uploads from the package index to free up space. To do this, rely on the output format of anaconda show, which marks each currently uploaded version with a delimiter character sequence that can be filtered on.

As an explicit example:

$ anaconda show scientific-python-nightly-wheels/xarray
Using Anaconda API: https://api.anaconda.org
Name:    xarray
Summary: 
Access:  public
Package Types:  Standard Python
Versions:
   + 2023.5.1.dev2+gd8ec3a3f
   + 2023.5.1.dev5+gf96aca45
   + 2023.5.1.dev6+g69445c62
   + 2023.5.1.dev9+g609a9016
   + 2023.5.1.dev10+gf45eb733
   + 2023.5.1.dev11+gfcb81756
   + 2023.5.1.dev12+gb319d867
   + 2023.5.1.dev14+ge3db6164

To install this package with pypi run:
     pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple xarray

shows that by filtering on +

$ anaconda show scientific-python-nightly-wheels/xarray &> >(grep '+')
   + 2023.5.1.dev2+gd8ec3a3f
   + 2023.5.1.dev5+gf96aca45
   + 2023.5.1.dev6+g69445c62
   + 2023.5.1.dev9+g609a9016
   + 2023.5.1.dev10+gf45eb733
   + 2023.5.1.dev11+gfcb81756
   + 2023.5.1.dev12+gb319d867
   + 2023.5.1.dev14+ge3db6164

and then stripping ' + '

$ anaconda show scientific-python-nightly-wheels/xarray &> >(grep '+') | \
    awk '{print $2}'
2023.5.1.dev2+gd8ec3a3f
2023.5.1.dev5+gf96aca45
2023.5.1.dev6+g69445c62
2023.5.1.dev9+g609a9016
2023.5.1.dev10+gf45eb733
2023.5.1.dev11+gfcb81756
2023.5.1.dev12+gb319d867
2023.5.1.dev14+ge3db6164

one can obtain a newline-separated list of all package uploads, where the most recent uploads are listed last. After stripping off the last N_LATEST_UPLOADS lines, which correspond to the package versions to keep for testing,

$ anaconda show scientific-python-nightly-wheels/xarray &> >(grep '+') | \
    awk '{print $2}' | \
    head --lines -5 > remove-package-versions.txt
$ cat remove-package-versions.txt 
2023.5.1.dev2+gd8ec3a3f
2023.5.1.dev5+gf96aca45
2023.5.1.dev6+g69445c62

the remaining (older) package uploads can be removed with anaconda remove.
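
The filtering steps above can be collected into one small function. This is only a sketch: the function name `versions_to_remove` and the abbreviated `sample_output` are illustrative and not part of the PR, and the stricter `grep '^ *+ '` pattern is a substitution for the plain `grep '+'` used above. It also assumes GNU `head`, since a negative line count is a GNU extension.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): read `anaconda show <user>/<package>`
# style output on stdin and print every version except the newest N.
versions_to_remove() {
    local n_latest="${1:-5}"
    # Keep only the version lines ("   + <version>"), print the version field,
    # then drop the last N lines, which are the most recent uploads.
    # NOTE: a negative count for `head --lines` requires GNU coreutils.
    grep '^ *+ ' | awk '{print $2}' | head --lines "-${n_latest}"
}

# Abbreviated copy of the xarray listing shown above.
sample_output='Using Anaconda API: https://api.anaconda.org
Name:    xarray
Versions:
   + 2023.5.1.dev2+gd8ec3a3f
   + 2023.5.1.dev5+gf96aca45
   + 2023.5.1.dev6+g69445c62
   + 2023.5.1.dev9+g609a9016
   + 2023.5.1.dev10+gf45eb733
   + 2023.5.1.dev11+gfcb81756
   + 2023.5.1.dev12+gb319d867
   + 2023.5.1.dev14+ge3db6164

To install this package with pypi run:
     pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple xarray'

# Prints the three oldest versions (dev2, dev5, dev6), one per line.
printf '%s\n' "${sample_output}" | versions_to_remove 5
```

Because the newest uploads are listed last, dropping the tail of the list keeps them. If fewer than N versions exist, the function prints nothing and no removals would follow.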

Most of this is taken from matplotlib/matplotlib#23349 (c.f. #4 (comment)).

To reproduce the behavior locally, curl the attached shell script into a continuumio/miniconda3:latest container with anaconda-client installed:

example-pr-12.sh:
#!/bin/bash
# example-pr-12.sh

# Fail on undefined variables
set -u
# Prevent pipe errors from being silenced
set -o pipefail
# Exit if any command exits with a non-zero status
set -e

# Input variables
ANACONDA_USER="scientific-python-nightly-wheels"
INPUT_ARTIFACTS_PATH="dist"
N_LATEST_UPLOADS=5

anaconda show "${ANACONDA_USER}" &> >(grep "${ANACONDA_USER}/") | \
    awk '{print $1}' | \
    sed 's|.*/||g' > package-names.txt

if [ -s package-names.txt ]; then
    while LANG=C IFS= read -r package_name ; do

        printf "\n# package: %s\n" "${package_name}"

        anaconda show "${ANACONDA_USER}/${package_name}" &> >(grep '+') | \
            awk '{print $2}' | \
            head --lines "-${N_LATEST_UPLOADS}" > remove-package-versions.txt

        if [ -s remove-package-versions.txt ]; then
            while LANG=C IFS= read -r package_version ; do
                echo "# Removing ${ANACONDA_USER}/${package_name}/${package_version}"
                # anaconda --token ${{ secrets.ANACONDA_TOKEN }} remove \
                #   --force \
                #   "${ANACONDA_USER}/${package_name}/${package_version}"
            done <remove-package-versions.txt
        fi

    done <package-names.txt
fi
$ docker run --rm -ti continuumio/miniconda3:latest /bin/bash
(base) root@5d1ea993cbe9:/# conda install -y anaconda-client -c conda-forge
(base) root@5d1ea993cbe9:/# apt update && apt install -y curl
(base) root@5d1ea993cbe9:/# curl -sL https://github.com/scientific-python/upload-nightly-action/files/11618429/example-pr-12.sh.txt -o example-pr-12.sh
(base) root@5d1ea993cbe9:/# bash example-pr-12.sh 

# package: ipython

# package: matplotlib

# package: networkx

# package: numpy

# package: scikit-image

# package: scikit-learn

# package: scipy

# package: xarray

(base) root@5d1ea993cbe9:/# sed -i 's/N_LATEST_UPLOADS=5/N_LATEST_UPLOADS=2/g' example-pr-12.sh  # demo purposes
(base) root@5d1ea993cbe9:/# bash example-pr-12.sh

# package: ipython

# package: matplotlib
# Removing scientific-python-nightly-wheels/matplotlib/3.6.0.dev3181+g9f17f3f851
# Removing scientific-python-nightly-wheels/matplotlib/3.6.0.dev3182+gd6bad5f684

# package: networkx

# package: numpy
# Removing scientific-python-nightly-wheels/numpy/1.25.0.dev0+1532.g5d6c744db

# package: scikit-image

# package: scikit-learn

# package: scipy

# package: xarray
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev9+g609a9016
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev10+gf45eb733
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev11+gfcb81756
(base) root@5d1ea993cbe9:/# 

example-pr-12.sh.txt
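
The removal call is commented out in the example script so that it can be run safely. One way to keep that safety while making the call real is a dry-run switch, sketched below. The function name `remove_versions` and the `DRY_RUN` and `ANACONDA_TOKEN` environment variables are assumptions made for illustration; only the `anaconda --token ... remove --force` invocation comes from the script above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): remove every version listed in a file,
# one version per line. Defaults to a dry run that only prints what would happen.
remove_versions() {
    local user="$1" package="$2" versions_file="$3"
    local version
    # Read the list on file descriptor 3 so that the `anaconda` call
    # cannot accidentally consume the remaining lines from stdin.
    while IFS= read -r version <&3; do
        [ -n "${version}" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "# Would remove ${user}/${package}/${version}"
        else
            echo "# Removing ${user}/${package}/${version}"
            # ANACONDA_TOKEN is assumed to be exported by the caller.
            anaconda --token "${ANACONDA_TOKEN}" remove \
                --force \
                "${user}/${package}/${version}"
        fi
    done 3< "${versions_file}"
}
```

For example, `remove_versions scientific-python-nightly-wheels xarray remove-package-versions.txt` prints the planned removals, and the same call with `DRY_RUN=0` set would perform them.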

I'm marking this as closing Issue #4 even though it doesn't include a time cutoff as described in #4 (comment), given that I do not know of a pip or anaconda-client API that gets upload date information from the index (one probably exists somewhere, though?) and this PR resolves all other matters as far as I understand it. We can revise or extend this if desired.

@matthewfeickert matthewfeickert added the enhancement New feature or request label May 31, 2023
@matthewfeickert matthewfeickert self-assigned this May 31, 2023
Member

@tupui tupui left a comment

Thank you @matthewfeickert. If that's the same approach that Matplotlib has been using successfully, then all good to me.

The only thing to me is that, IIUC this is to be done per project right? On SciPy and NumPy at least, we are not using actions to do the upload but we rely on CirrusCI. So we would need to add another action that would not upload and just check for removal? (In practice we overwrite our upload and only have a single version. But other packages might do differently.)

Should we have a nightly action on this repo that would run this for all packages?

@matthewfeickert
Member Author

matthewfeickert commented May 31, 2023

> The only thing to me is that, IIUC this is to be done per project right? On SciPy and NumPy at least, we are not using actions to do the upload but we rely on CirrusCI. So we would need to add another action that would not upload and just check for removal? (In practice we overwrite our upload and only have a single version. But other packages might do differently.)
>
> Should we have a nightly action on this repo that would run this for all packages?

Ah I see that I have misunderstood what "centralise" meant in #4 (comment).

@tupui so for SciPy and NumPy you use CirrusCI for all the uploads to https://anaconda.org/scientific-python-nightly-wheels? If so, then yes, this would all need to be a separate action and this PR should be closed.

I think that it would be better to do a 1-action-per-repo policy, so in that case this PR should get closed and I could port the relevant logic to the new repo (I'm not very good at naming things so remove-nightly-wheels or something of that sort).

@tupui
Member

tupui commented May 31, 2023

Let's continue the discussion on the issue before we make any decisions about closing etc. 😃

@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch from 2577b71 to 56cd8fe Compare May 31, 2023 17:44
@matthewfeickert matthewfeickert changed the title ENH: Add removal of old uploads CI: Add removal of old uploads May 31, 2023
Member Author

@matthewfeickert matthewfeickert left a comment

Given #4 (comment) please see these updated changes. I have tested this already on my fork with matplotlib (c.f. the aside in #8 (comment) if wondering why I'm doing so) and things work as expected (which is good as I'm copying an existing workflow I wrote :P).

@matthewfeickert matthewfeickert requested a review from tupui May 31, 2023 17:56
@stefanv
Member

stefanv commented May 31, 2023

Agreed that querying the index for the list of packages would make this a bit more robust against future package additions.

Aside, an alternative to sed magic:

| awk '{print $2}'
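
As a further alternative for the package-name step, the `owner/package` prefix can be stripped with Bash parameter expansion instead of `sed`. This is a sketch under assumptions: the function name `package_names` is hypothetical, and the input lines in the usage note are made up for illustration, since the exact layout of `anaconda show <user>` output is not shown in this thread. Unlike the `grep` in the script, it matches only lines whose first field starts with `<user>/`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): read lines on stdin and print the
# package name for every line whose first field looks like "<user>/<package>".
package_names() {
    local user="$1" spec _rest
    # `read` splits on whitespace, so `spec` is the first field of the line
    # (the equivalent of awk's $1) and leading indentation is ignored.
    while read -r spec _rest; do
        case "${spec}" in
            "${user}"/*)
                # Strip everything up to and including the last "/".
                printf '%s\n' "${spec##*/}"
                ;;
        esac
    done
}
```

Piping the two illustrative lines `scientific-python-nightly-wheels/numpy | some summary` and `Name: something else` into `package_names scientific-python-nightly-wheels` would print only `numpy`.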

Member

@tupui tupui left a comment

Thank you @matthewfeickert 🚀 That's good on my side. Approving as the suggestion is optional to me for now.

@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch from 56cd8fe to 5fd637f Compare May 31, 2023 19:51
@matthewfeickert matthewfeickert requested a review from tupui May 31, 2023 20:04
Member

@tupui tupui left a comment

Good on my side 🚀 thanks @matthewfeickert

Member

@stefanv stefanv left a comment

I left some minor bash styling feedback, but feel free to take it or leave it.

matthewfeickert added a commit to matthewfeickert/matplotlib that referenced this pull request Jun 1, 2023
* Remove the workflow step for querying and removing all but the last 5
  nightly wheel uploads to the scientific-python-nightly-wheels Anaconda
  Cloud package index as this is now centrally done by the Scientific
  Python org for all projects that upload to
  https://anaconda.org/scientific-python-nightly-wheels.
   - c.f. scientific-python/upload-nightly-action#12
@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch from 5fd637f to d28e1de Compare June 1, 2023 05:45
@matthewfeickert matthewfeickert requested a review from stefanv June 1, 2023 05:45
@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch 2 times, most recently from 1d24930 to c121a1e Compare June 5, 2023 21:58
@matthewfeickert
Member Author

matthewfeickert commented Jun 8, 2023

Thanks for the review @jarrodmillman! Does your approval mean that you were able to set up the remove-old-wheels GitHub Actions environment with the Anaconda cloud API token as an environment-restricted secret? If so, then I think we can merge.

@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch from c121a1e to 85b99a0 Compare June 8, 2023 16:00
@jarrodmillman jarrodmillman added this to the 0.2.0 milestone Jun 8, 2023
* Add GitHub Actions workflow to remove old wheels from the package
  index each night at 1:23 UTC. Also allow for manual runs via workflow
  dispatch.
* Remove all but the last N_LATEST_UPLOADS package version uploads to
  the package index to ensure space. To do this, rely on the output
  format of `anaconda show` to be able to filter on the item delimiter
  character sequence for each version currently uploaded.

  As an explicit example:

  ```
  $ anaconda show scientific-python-nightly-wheels/xarray
  Using Anaconda API: https://api.anaconda.org
  Name:    xarray
  Summary:
  Access:  public
  Package Types:  Standard Python
  Versions:
     + 2023.5.1.dev9+g609a9016
     + 2023.5.1.dev10+gf45eb733
     + 2023.5.1.dev11+gfcb81756
     + 2023.5.1.dev12+gb319d867
     + 2023.5.1.dev14+ge3db6164

  To install this package with pypi run:
       pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple xarray
  ```

  shows that by filtering on '+' and then stripping ' + ' one can obtain a
  newline separated list of all package uploads, where the _most recent_
  uploads are listed last. After stripping off the N_LATEST_UPLOADS
  lines that correspond to the package versions to keep for testing, the
  remaining (older) package uploads can be removed with `anaconda remove`.
* Use Dependabot to keep the GitHub Actions used in workflows up to
  date.
@matthewfeickert matthewfeickert force-pushed the feat/add-removal-of-last-n-uploads branch from 85b99a0 to e964540 Compare June 8, 2023 16:23
@jarrodmillman jarrodmillman marked this pull request as draft June 8, 2023 16:26
@jarrodmillman
Member

See #12 (comment)

@tupui tupui marked this pull request as ready for review June 9, 2023 14:51
@jarrodmillman
Member

Thanks @matthewfeickert !!!

@jarrodmillman jarrodmillman merged commit c049309 into scientific-python:main Jun 9, 2023
@matthewfeickert matthewfeickert deleted the feat/add-removal-of-last-n-uploads branch June 9, 2023 16:42
@matthewfeickert
Member Author

Just did a manual run with workflow dispatch and all looks good!

# package: ipython

# package: matplotlib
# Removing scientific-python-nightly-wheels/matplotlib/3.6.0.dev3181+g9f17f3f851
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/matplotlib/3.6.0.dev3182+gd6bad5f684
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/matplotlib/3.8.0.dev1157+g2f778fda79
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/matplotlib/3.8.0.dev1150+g3d3a1afd6d
Using Anaconda API: https://api.anaconda.org
# Removing scientific-python-nightly-wheels/matplotlib/3.8.0.dev1192+gbb335a115c
Using Anaconda API: https://api.anaconda.org
# Removing scientific-python-nightly-wheels/matplotlib/3.8.0.dev1211+gdcb8180edc
Using Anaconda API: https://api.anaconda.org
# Removing scientific-python-nightly-wheels/matplotlib/3.8.0.dev1228+g19d93b7876
Using Anaconda API: https://api.anaconda.org

# package: networkx

# package: numpy
# Removing scientific-python-nightly-wheels/numpy/1.25.0.dev0+1532.g5d6c744db
Using Anaconda API: https://api.anaconda.org

# package: pandas

# package: scikit-image

# package: scikit-learn

# package: scipy

# package: statsmodels

# package: xarray
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev9+g609a9016
Using Anaconda API: https://api.anaconda.org
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev10+gf45eb733
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev11+gfcb81756
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev12+gb319d867
Using Anaconda API: https://api.anaconda.org/
# Removing scientific-python-nightly-wheels/xarray/2023.5.1.dev14+ge3db6164
Using Anaconda API: https://api.anaconda.org/

Should be good to go for nightly cron jobs now and then just need to make sure that the workflow doesn't get shut off due to lack of repo activity.

@stefanv
Member

stefanv commented Jun 9, 2023

Awesome, thanks @matthewfeickert! I didn't realize inactive repos have their actions switched off. What do we need to do; a monthly empty commit?

@matthewfeickert
Member Author

matthewfeickert commented Jun 9, 2023

> I didn't realize inactive repos have their actions switched off. What do we need to do; a monthly empty commit?

Yeah, from https://docs.github.com/en/actions/managing-workflow-runs/disabling-and-enabling-a-workflow

> Warning: To prevent unnecessary workflow runs, scheduled workflows may be disabled automatically. When a public repository is forked, scheduled workflows are disabled by default. In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days.

Though you will also get an email as a heads up and you can click a button in the email to keep the workflow alive. I'm also not 100% sure whether "repository activity" means a Git operation or just some kind of interaction with the repository on GitHub.

My assumption is that as we have Dependabot turned on this won't be an issue in general, but if it becomes one then we can also just have a GHA workflow that once a month checks out a keep-alive branch, does a git commit --amend --no-edit, and --force-with-lease pushes back to the keep-alive branch. There's also various GitHub Action workflows on the GitHub Actions marketplace that do this sort of thing for you.
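
The keep-alive idea described above could look roughly like the following. This is a sketch only: the function name `keep_alive`, the default branch name `keep-alive`, and the default remote `origin` are assumptions, the branch must already exist on the remote, and a committer identity must be configured. Whether GitHub counts such a push as "repository activity" is not confirmed in this thread.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): refresh the tip commit of a dedicated
# branch without growing its history, then push it back to the remote.
keep_alive() {
    local repo_dir="$1" branch="${2:-keep-alive}" remote="${3:-origin}"
    # Update the remote-tracking ref so --force-with-lease compares against
    # the current state of the remote branch.
    git -C "${repo_dir}" fetch --quiet "${remote}" "${branch}"
    # Check out (or reset) the local branch to match the remote branch.
    git -C "${repo_dir}" checkout --quiet -B "${branch}" "${remote}/${branch}"
    # Rewrite the tip commit in place; the committer date is refreshed,
    # so no new commits accumulate on the branch.
    git -C "${repo_dir}" commit --quiet --amend --no-edit --allow-empty
    # Push only if nobody else has updated the remote branch in the meantime.
    git -C "${repo_dir}" push --quiet --force-with-lease "${remote}" "${branch}"
}
```

A scheduled job would call `keep_alive /path/to/checkout` once a month. Amending instead of adding commits addresses the concern about accumulating history on the branch.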

@bsipocz
Member

bsipocz commented Jun 9, 2023

It could also push a timestamp of its last run, maybe in a separate, empty branch, or gh-pages?

@matthewfeickert
Member Author

matthewfeickert commented Jun 9, 2023

> It could also push a timestamp of its last run, maybe in a separate, empty branch, or gh-pages?

Yup, but then you're accumulating a daily commit history on that other branch that still lives in the repo for no real reason. Is it easy to ignore? Sure, but seems kinda unnecessary. As you point out though, you could use gh-pages as a way to throw this stuff into the void in some sense, but if for any reason in the future you want to use the gh-pages for an actual website then you need to shift things around.

Personally, I would put this problem in the "kick it down the road until it becomes a problem" bin, as for the multiple projects that I maintain that are at this stage repos for LTS Docker images this hasn't come up much. Though if anyone has been nerdsniped by this conversation please go for it! 😉

@bsipocz
Member

bsipocz commented Jun 9, 2023

I had repos where we ran into this in practice, and wondered why the actions stopped working. So I would just like to avoid being on that road again :)

@stefanv
Member

stefanv commented Jun 9, 2023

One commit every 50 days on a keepalive branch isn't too bad :)

https://github.com/marketplace/actions/keepalive-workflow

@bsipocz
Member

bsipocz commented Jun 9, 2023

Ahh, nice, thanks for the link. I was starting to hack something together, but it's probably much easier to just do the action. I'll open a PR for this.

melissawm pushed a commit to melissawm/matplotlib that referenced this pull request Jun 15, 2023
* Remove the workflow step for querying and removing all but the last 5
  nightly wheel uploads to the scientific-python-nightly-wheels Anaconda
  Cloud package index as this is now centrally done by the Scientific
  Python org for all projects that upload to
  https://anaconda.org/scientific-python-nightly-wheels.
   - c.f. scientific-python/upload-nightly-action#12

Successfully merging this pull request may close these issues.

Old wheels culling.
5 participants