SEAB-7226: Address DOI creation failures #6174
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## hotfix/1.18.1 #6174 +/- ##
===================================================
- Coverage 74.07% 74.02% -0.05%
- Complexity 5724 5731 +7
===================================================
Files 397 397
Lines 20571 20611 +40
Branches 2116 2117 +1
===================================================
+ Hits 15238 15258 +20
- Misses 4326 4345 +19
- Partials 1007 1008 +1
Idle thought while reading in sequence: how do we think this will behave, particularly in the Broad case, where they may be deleting and re-creating multiple tags at the same time? Will one webservice delete the in-progress draft deposits being created by another one?
On the other hand, maybe this is slow enough?
denis-yuen
left a comment
Tried relying on the index before and had poor results, see below
// Create a Lucene query that finds drafts corresponding to the specified concept DOI.
// Apparently, this endpoint pulls information from ElasticSearch, so the view may be stale.
// Drafts may take a while to appear, or seem to persist after they are deleted.
String query = "(conceptrecid:\"%d\") AND (submitted:\"false\")".formatted(conceptDoiId);
Oh, I tried using this approach before. See comments from https://ucsc-cgl.atlassian.net/browse/SEAB-7226?focusedCommentId=49380 through https://ucsc-cgl.atlassian.net/browse/SEAB-7226?focusedCommentId=49387
The index is definitely incomplete. I had better luck using the new endpoints documented in https://github.com/dockstore/swagger-java-zenodo-client/pull/28/files#diff-3d6e9eaeeda7aac0f94cadbe92f2b969e2aee88dff371d7622d25a46b9a36b5aR553
(could also try both)
private static void deleteDeposit(DepositsApi depositsApi, int depositId) {
    try {
        depositsApi.deleteDeposit(depositId);
There's a specific endpoint for deleting draft deposits that may be safer
https://github.com/dockstore/swagger-java-zenodo-client/pull/28/files#diff-3d6e9eaeeda7aac0f94cadbe92f2b969e2aee88dff371d7622d25a46b9a36b5aR536
I think it's ok, the reasoning is something like:
Ok, so, I ran some experiments and dabbled with some test code. I made some improvements to the Here's my conclusion: I strongly recommend we go with the current solution. Would like to get this into the hotfix, deploy, and assess how it works after a couple of weeks. Can we do that? My reasoning:
To be clear, this is not an argument in favour of the current approach. The "new" endpoints are documented by Zenodo in OpenAPI, but incompletely, without return objects. The "old" endpoints are documented in OpenAPI purely by us, based on inspecting Zenodo's textual documentation and the endpoints' behaviour. I then extended the OpenAPI description that we use for the "old" endpoints (owned by us, in our repository) to cover those two "new" endpoints.
I'm not sure about the downside of just using the new endpoint here.
denis-yuen
left a comment
I'm ok with splitting the difference, how about using the old search endpoint but the new endpoint for deleting drafts?
It is indeed an argument in favor of the current approach; both of the "old" endpoints are documented here:
I changed the code to use the "new"
// to mix a few published records into the response, or doesn't list the draft first.
final int maxResults = 10;
// In the Zenodo API, page numbers start at 1 (!)
return previewApi.listUserRecords(query, "newest", maxResults, 1, true, false).getHits().getHits().stream()
created https://ucsc-cgl.atlassian.net/browse/SEAB-7341
probably need to work on tags/namespace
Description
This PR makes some improvements to the Zenodo DOI generation code that should improve the chances of it successfully generating new DOIs.
Some time in June, as far as we can tell, the DOI creation process began to fail because the Zenodo createFile and deleteFile endpoints were responding with spurious 403 errors, more often than not, and with occasional 503s. The calls would also sometimes succeed, and it does not appear that we were using the endpoints incorrectly. Rather, something is going sideways on the Zenodo side.
When the DOI generation process fails, a DOI is not generated for the tagged version, which is a bummer. However, something more insidious was happening...
The failed DOI generations left draft deposits in the Zenodo system, causing all future DOI generation attempts to fail for the associated workflow.
This PR addresses the above problems by:
Retrying the createFile and deleteFile calls on failure, to increase the probability that we will succeed. The code currently makes 5 attempts, each separated by a 1 second sleep. I'm tempted to increase the number of attempts, but also concerned about triggering a rate limit.
In tandem, the above changes should allow DOI generation to succeed much more frequently. However, it'll still fail on occasion.
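For readers who want a picture of the retry behaviour (5 attempts, 1 second apart), here is a minimal sketch. It is not the code in this PR: `RetrySketch`, `retry`, and `RetriesExhaustedException` are hypothetical names, and the sketch retries on any exception rather than only on the 403/503 responses seen from Zenodo.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not the implementation in this PR.
final class RetrySketch {

    /** Thrown once every attempt has failed; the cause is the last failure seen. */
    static final class RetriesExhaustedException extends RuntimeException {
        RetriesExhaustedException(int attempts, Throwable cause) {
            super("gave up after " + attempts + " attempts", cause);
        }
    }

    /**
     * Runs the call up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. Assumption: every exception is treated as retryable.
     */
    static <T> T retry(Callable<T> call, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                // No point sleeping after the final attempt.
                if (attempt < maxAttempts) {
                    pause(sleepMillis);
                }
            }
        }
        throw new RetriesExhaustedException(maxAttempts, last);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            // Restore the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }
}
```

With the numbers from the description, a call would look like `RetrySketch.retry(() -> uploadFile(), 5, 1000L)`, where `uploadFile` stands in for whatever wraps the Zenodo request. A fixed sleep keeps the worst case at about four seconds of waiting per call, which is one way to stay cautious about rate limits.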
It's very difficult to test how this code responds to various Zenodo failures, especially via automatic tests. So, instead, I user-tested locally, tweaking the code in various spots to simulate failures (including leaving a draft in the Zenodo sandbox) and submitting requests to confirm that the code was working properly.
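To make the leftover-draft scenario concrete, here is a minimal sketch of the cleanup idea discussed in the review thread: find the unsubmitted drafts under a concept record and delete them before trying again. `Deposit`, `DepositStore`, and `deleteStaleDrafts` are hypothetical stand-ins invented for this illustration; they are not types or methods from the Zenodo client, and the real code has to cope with a possibly stale search index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types, for illustration only; not the Zenodo client API.
final class DraftCleanupSketch {

    /** Minimal view of a deposit: its ID, its concept record ID, and whether it was published. */
    record Deposit(int id, int conceptRecId, boolean submitted) {}

    /** Stand-in for the calls that list a user's deposits and delete a draft. */
    interface DepositStore {
        List<Deposit> listDeposits();
        void deleteDraft(int depositId);
    }

    /**
     * Deletes every unsubmitted draft belonging to the given concept record
     * and returns the IDs that were deleted. Published deposits are never touched.
     */
    static List<Integer> deleteStaleDrafts(DepositStore store, int conceptRecId) {
        List<Integer> deleted = new ArrayList<>();
        for (Deposit deposit : store.listDeposits()) {
            // Filter defensively: the listing may include published records.
            if (deposit.conceptRecId() != conceptRecId || deposit.submitted()) {
                continue;
            }
            try {
                store.deleteDraft(deposit.id());
                deleted.add(deposit.id());
            } catch (RuntimeException e) {
                // A failed delete should not block cleanup of the remaining drafts.
                System.err.println("could not delete draft " + deposit.id() + ": " + e.getMessage());
            }
        }
        return deleted;
    }
}
```

Filtering on both the concept record and the submitted flag on the client side is deliberate here, since the thread notes that the listing can mix published records into the response.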
Review Instructions
On staging, push some tagged versions on an entry, and confirm that most of the DOIs have been generated correctly. Try the same thing on prod, after we deploy. After a few weeks on prod, analyze the logs and see if we need to take any more action.
Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-7226
Security and Privacy
If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.
e.g. Does this change...
Please make sure that you've checked the following before submitting your pull request. Thanks!
mvn clean install
@RolesAllowed annotation