@svonworl svonworl commented Oct 9, 2025

Description
Per the linked ticket, the Zenodo DOI process was failing, so some tagged versions that should have received automatic DOIs did not.

This PR adds a new admin/curator-only endpoint that we can use to identify these "missing" DOIs, so that we can generate them. My experiments on some db dumps suggest that about 50% of the versions that should have an automatic DOI do not (because DOI generation failed). So, this endpoint will return lots of versions.

A DOI is "missing" if the version and parent entry meet the automatic DOI criteria, and the version was created after February 19, 2025 (the day we launched the automatic DOI feature).
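
As a rough illustration of that definition (not the code in this PR), the check can be written as a single predicate. The `VersionInfo` record and its field names are hypothetical stand-ins for the real entities, the automatic DOI criteria are collapsed into one boolean because they are not spelled out here, and "after" is assumed to mean strictly after the launch date.

```java
import java.time.LocalDate;

final class MissingDoiCheck {
    // Launch date of the automatic DOI feature, as stated in the description.
    static final LocalDate AUTOMATIC_DOI_LAUNCH = LocalDate.of(2025, 2, 19);

    // Hypothetical, simplified view of a version and its parent entry.
    // meetsAutomaticDoiCriteria stands in for the real version and entry checks.
    record VersionInfo(boolean meetsAutomaticDoiCriteria, boolean hasAutomaticDoi, LocalDate created) {}

    // A DOI is "missing" when the version qualifies for an automatic DOI,
    // does not have one, and was created after the feature launched.
    static boolean isMissingAutomaticDoi(VersionInfo version) {
        return version.meetsAutomaticDoiCriteria()
            && !version.hasAutomaticDoi()
            && version.created().isAfter(AUTOMATIC_DOI_LAUNCH); // assumption: strictly after
    }
}
```

Under these assumptions, a qualifying version created on the launch date itself is not reported, while one created the next day is.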

The new getVersionsMissingAutomaticDoi endpoint is different from the getVersionsNeedingRetroactiveDoi endpoint. getVersionsMissingAutomaticDoi identifies all versions that should have been assigned a DOI by the webservice, and returns them in most-recent-first order. getVersionsNeedingRetroactiveDoi is intended to be used to distribute retroactive DOIs evenly to all eligible workflows, even old ones.

The new endpoint is backed by a db query, which I originally tried to adapt from one of the queries used by getVersionsNeedingRetroactiveDoi. The resulting query ran forever (longer than my patience for it), so I reordered some parts of the join, improving response time to about 60 seconds. However, this was still too slow, so I came up with a new query that essentially intersects two sets of version IDs: one set that meets the version-related criteria, and another set corresponding to workflows that meet the workflow-related criteria. The resulting query runs in a second or so.
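
To make the shape of the faster query concrete, here is a minimal in-memory sketch of the same idea. It is not the actual query, which runs in the database. The class and method names are hypothetical, and the sketch assumes that a larger version ID means a more recent version, which may not match how the real query orders its results.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class VersionIdIntersection {
    // Keeps only the IDs present in both sets, most recent first, capped at limit.
    // Assumption: a larger ID means a more recently created version.
    static List<Long> intersectMostRecentFirst(
            Set<Long> idsMeetingVersionCriteria,
            Set<Long> idsOfVersionsInEligibleWorkflows,
            int limit) {
        return idsMeetingVersionCriteria.stream()
            .filter(idsOfVersionsInEligibleWorkflows::contains)
            .sorted(Comparator.reverseOrder())
            .limit(limit)
            .collect(Collectors.toList());
    }
}
```

Each set is computed independently from its own criteria, so neither side needs the expensive join; the intersection is then a membership test per ID.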

Review Instructions
On staging, hit the getVersionsMissingAutomaticDoi endpoint, and confirm that the first few results correspond to recent tagged versions that should have an automatic DOI, but do not.

Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-7226

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@svonworl svonworl self-assigned this Oct 9, 2025
@svonworl svonworl requested a review from denis-yuen October 9, 2025 06:10
codecov bot commented Oct 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.14%. Comparing base (cc46e6f) to head (8094dbc).
⚠️ Report is 4 commits behind head on hotfix/1.18.1.

Additional details and impacted files
@@                 Coverage Diff                 @@
##             hotfix/1.18.1    #6175      +/-   ##
===================================================
+ Coverage            74.07%   74.14%   +0.07%     
- Complexity            5724     5730       +6     
===================================================
  Files                  397      397              
  Lines                20571    20581      +10     
  Branches              2116     2116              
===================================================
+ Hits                 15238    15260      +22     
+ Misses                4326     4312      -14     
- Partials              1007     1009       +2     
Flag Coverage Δ
bitbuckettests 25.81% <0.00%> (-0.02%) ⬇️
hoverflytests 27.54% <100.00%> (+0.10%) ⬆️
integrationtests 55.99% <0.00%> (-0.03%) ⬇️
languageparsingtests 10.78% <0.00%> (-0.01%) ⬇️
localstacktests 21.17% <0.00%> (-0.02%) ⬇️
toolintegrationtests 29.74% <0.00%> (-0.02%) ⬇️
unit-tests_and_non-confidential-tests 26.09% <0.00%> (-0.13%) ⬇️
workflowintegrationtests 39.49% <0.00%> (-0.02%) ⬇️


@denis-yuen denis-yuen left a comment

Swagger editor validator breakage is known from develop (can probably grab the one line change).

Looks good in general.
The bigger issue is a lack of testing, think we have the time to add something simple.

@svonworl
Contributor Author

> Swagger editor validator breakage is known from develop (can probably grab the one line change).
>
> Looks good in general. The bigger issue is a lack of testing, think we have the time to add something simple.

I added some simple tests. Alas, ZenodoIT is failing to run for me locally, so I had to lean on CircleCI, and it took a lot longer than expected. We should try to figure out what's wrong (it looks like a cert issue somewhere in the hoverfly area), but some other time; there are bigger fish to fry at the moment.

@svonworl svonworl requested a review from denis-yuen October 10, 2025 04:23
hoverfly.simulate(ZENODO_SIMULATION_SOURCE);
WorkflowsApi workflowsApi = new WorkflowsApi(getOpenAPIWebClient(true, USER_2_USERNAME, testingPostgres));
handleGitHubRelease(workflowsApi, DockstoreTesting.WORKFLOW_DOCKSTORE_YML, "refs/tags/0.8", USER_2_USERNAME);
assertEquals(0, workflowsApi.getVersionsMissingAutomaticDoi(1000).size());
Member

Looks good, could add some comments to make this more readable

@svonworl svonworl requested a review from denis-yuen October 14, 2025 16:36
@denis-yuen denis-yuen left a comment

sonar has a couple comments that are worth a quick clean-up

@svonworl
Contributor Author

> sonar has a couple comments that are worth a quick clean-up

I changed the code in question to throw a better exception, although in practice the error condition (no Workflow for a given WorkflowVersion) should happen very rarely, if at all. I'm on the fence as to whether it's actually possible.

The other Sonarcloud feedback recommended that we not use the deprecated method getNamedQuery. I did not address this: we use getNamedQuery everywhere, and we should change all of the invocations at once (or at least all within the same file). That is not a change for a hotfix.

@svonworl svonworl merged commit 4815a2e into hotfix/1.18.1 Oct 14, 2025
21 of 23 checks passed
@svonworl svonworl deleted the feature/seab-7226/add-endpoint-to-identify-missing-dois branch October 14, 2025 23:27