UFAL/Refbox upgrade #1015
Caution: Review failed. The pull request is closed.
Walkthrough
This update introduces a new REST endpoint in
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant ClarinRefBoxController
participant HandleService
participant ItemService
participant Utils
participant ConfigurationService
Client->>ClarinRefBoxController: GET /api/core/refbox?handle=...
ClarinRefBoxController->>HandleService: resolveHandle(handle)
HandleService-->>ClarinRefBoxController: Item or error
alt Item found
ClarinRefBoxController->>ItemService: getMetadata(item, ...)
ClarinRefBoxController->>Utils: getCanonicalHandleUrlNoProtocol(item)
ClarinRefBoxController->>ConfigurationService: getProperty(...)
ClarinRefBoxController-->>Client: 200 OK (RefBoxDTO JSON)
else Error
ClarinRefBoxController-->>Client: 4xx/422 error
end
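For readers who want to exercise the flow in the diagram from the client side, the following is a minimal sketch of building the GET request with the JDK HTTP client types. Only the `/api/core/refbox` path and the `handle` query parameter come from the diagram; the base URL `http://localhost:8080/server`, the handle `123456789/42`, and the class and method names are placeholders chosen for illustration.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class RefBoxRequestSketch {

    // Builds (but does not send) a GET request for the refbox endpoint.
    // baseUrl is a hypothetical server root; handle is the item handle to resolve.
    static HttpRequest buildRefBoxRequest(String baseUrl, String handle) {
        // Percent-encode the handle so the "/" inside it is safe in a query string.
        String encodedHandle = URLEncoder.encode(handle, StandardCharsets.UTF_8);
        URI uri = URI.create(baseUrl + "/api/core/refbox?handle=" + encodedHandle);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRefBoxRequest("http://localhost:8080/server", "123456789/42");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `HttpClient.send` is left out on purpose, since it needs a running server; per the diagram, a successful call should return the RefBoxDTO as JSON and a failed handle lookup a 4xx/422 error.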
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dspace-server-webapp/src/test/java/org/dspace/app/rest/ClarinRefBoxControllerIT.java(2 hunks)
🔇 Additional comments (1)
dspace-server-webapp/src/test/java/org/dspace/app/rest/ClarinRefBoxControllerIT.java (1)
11-11: LGTM: Import addition is appropriate. The `jsonPath` import is correctly added to support the new test assertions.
Pull Request Overview
This pull request implements upgrades to the CLARIN/UFAL RefBox functionality by adding comprehensive integration tests and refactoring existing code. The changes focus on improving test coverage for the /api/core/refbox endpoint and updating the controller implementation.
- Added extensive integration tests to verify RefBox API responses for items with and without featured services
- Refactored the controller to support a new consolidated RefBox endpoint with proper DTO structure
- Added utility method for canonical handle URL formatting
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ClarinRefBoxControllerIT.java | Added 11 new integration tests covering various RefBox endpoint scenarios including error cases and edge conditions |
| Utils.java | Added utility method to format canonical handle URLs without protocol prefix |
| RefBoxDTO.java | New DTO class for RefBox response structure |
| FeaturedServiceLinkDTO.java | New DTO for featured service link data |
| FeaturedServiceDTO.java | New DTO for featured service data |
| ExportFormatDTO.java | New DTO for export format data |
| ClarinRefBoxController.java | Added new endpoint method and refactored OaiMetadataWrapper class |
Comments suppressed due to low confidence (1)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java:595
- The field name 'value' is inconsistent with the getter method name 'getMetadata()'. Consider renaming the field to 'metadata' to maintain consistency.
private String value;
Actionable comments posted: 3
♻️ Duplicate comments (1)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (1)
378-394: Hard-coded HTML formatting reduces maintainability. The method contains hard-coded HTML tags that make the code less maintainable, as noted in previous reviews.
Consider using constants or a template approach:

    +    private static final String DISPLAY_TEXT_TEMPLATE = "%s%s, \n <i>%s</i>%s, \n <a href=\"%s\">%s</a>.";
    +
         private String buildDisplayText(Item item) {
             // ... existing logic for gathering data ...
    -        StringBuilder sb = new StringBuilder();
    -        if (authorText != null && !authorText.isEmpty()) {
    -            sb.append(authorText);
    -        }
    -        if (year != null && !year.isEmpty()) {
    -            if (sb.length() > 0) {
    -                sb.append(", ");
    -            }
    -            sb.append(year);
    -        }
    -        sb.append(", \n <i>").append(title != null ? title : "").append("</i>");
    -        if (repository != null && !repository.isEmpty()) {
    -            sb.append(", ").append(repository);
    -        }
    -        sb.append(", \n <a href=\"").append(identifier != null ? identifier : "").append("\">")
    -                .append(identifier != null ? identifier : "").append("</a>.");
    -        return sb.toString();
    +
    +        String authorPart = (authorText != null && !authorText.isEmpty()) ? authorText : "";
    +        String yearPart = (year != null && !year.isEmpty()) ? ", " + year : "";
    +        String titlePart = title != null ? title : "";
    +        String repositoryPart = (repository != null && !repository.isEmpty()) ? ", " + repository : "";
    +        String identifierPart = identifier != null ? identifier : "";
    +
    +        return String.format(DISPLAY_TEXT_TEMPLATE, authorPart, yearPart, titlePart,
    +                repositoryPart, identifierPart, identifierPart);
         }
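One detail worth checking before adopting a single format string: the removed `StringBuilder` code only emits the ", " before the year when an author string precedes it, whereas the suggested `yearPart` always starts with ", ". The following is a minimal sketch of a middle ground that moves the HTML fragments into constants but keeps the conditional separator. It takes plain strings instead of an `Item`, and the class, constant, and helper names are hypothetical.

```java
class DisplayTextSketch {

    // Hypothetical constants; the HTML fragments themselves are taken from the diff above.
    private static final String TITLE_TEMPLATE = ", \n <i>%s</i>";
    private static final String LINK_TEMPLATE = ", \n <a href=\"%s\">%s</a>.";

    private static boolean hasText(String s) {
        return s != null && !s.isEmpty();
    }

    static String buildDisplayText(String authorText, String year, String title,
                                   String repository, String identifier) {
        StringBuilder sb = new StringBuilder();
        if (hasText(authorText)) {
            sb.append(authorText);
        }
        if (hasText(year)) {
            // Separator only when something already precedes the year.
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(year);
        }
        sb.append(String.format(TITLE_TEMPLATE, title != null ? title : ""));
        if (hasText(repository)) {
            sb.append(", ").append(repository);
        }
        String id = identifier != null ? identifier : "";
        sb.append(String.format(LINK_TEMPLATE, id, id));
        return sb.toString();
    }
}
```

With no author and a year of "2020", this sketch starts the text with "2020" rather than ", 2020", which matches the removed code path.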
🧹 Nitpick comments (1)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (1)
400-413: LGTM! Consider making the author threshold configurable. The method handles various author scenarios well. The magic number 5 for the author threshold could potentially be made configurable for flexibility.

    +    private static final int MAX_AUTHORS_BEFORE_ET_AL = 5;
    +
         private String formatAuthors(Item item, List<String> authors) {
             String authorText = "";
             if (authors.isEmpty()) {
                 authorText = itemService.getMetadataFirstValue(item, "dc", "publisher", null, Item.ANY);
             } else if (authors.size() == 1) {
                 authorText = authors.get(0);
    -        } else if (authors.size() <= 5) {
    +        } else if (authors.size() <= MAX_AUTHORS_BEFORE_ET_AL) {
                 authorText = String.join("; ", authors);
                 authorText = authorText.replaceAll(";([^;]*)$", " and$1");
             } else {
                 authorText = authors.get(0) + "; et al.";
             }
             return authorText;
         }
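To make the formatting rules easier to try in isolation, here is a small self-contained sketch of the same logic. It is not the controller code: the `publisherFallback` parameter is an assumption that stands in for the `dc.publisher` lookup done through `itemService`, and the class name is hypothetical.

```java
import java.util.List;

class AuthorFormatSketch {

    // Named threshold, as the review suggests; 5 is the value used in the diff above.
    private static final int MAX_AUTHORS_BEFORE_ET_AL = 5;

    // publisherFallback is a stand-in for the dc.publisher metadata lookup in the PR.
    static String formatAuthors(List<String> authors, String publisherFallback) {
        if (authors.isEmpty()) {
            return publisherFallback;
        }
        if (authors.size() == 1) {
            return authors.get(0);
        }
        if (authors.size() <= MAX_AUTHORS_BEFORE_ET_AL) {
            String joined = String.join("; ", authors);
            // Replace only the last ";" with " and", keeping the text after it.
            return joined.replaceAll(";([^;]*)$", " and$1");
        }
        return authors.get(0) + "; et al.";
    }

    public static void main(String[] args) {
        System.out.println(formatAuthors(List.of("A", "B", "C"), "Publisher"));
    }
}
```

Note that the regular expression relies on author names not containing a semicolon themselves; a name with an embedded ";" would shift where " and" is inserted.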
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java(6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (5)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/ExportFormatDTO.java (1)
ExportFormatDTO(12-52)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceDTO.java (1)
FeaturedServiceDTO(14-59)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceLinkDTO.java (1)
FeaturedServiceLinkDTO(12-34)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/RefBoxDTO.java (1)
RefBoxDTO(14-61)
dspace-api/src/main/java/org/dspace/core/Utils.java (1)
Utils(54-581)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: dspace-dependencies / docker-build (linux/amd64, ubuntu-latest, true)
- GitHub Check: Run Integration Tests
- GitHub Check: Run Unit Tests
🔇 Additional comments (3)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (3)
28-28: LGTM! The new imports are appropriate for the added refbox functionality and follow standard conventions.
Also applies to: 51-51, 53-56, 59-59, 65-65, 67-67
110-111: LGTM! The service injections follow standard Spring autowiring patterns and are necessary for the new functionality.
Also applies to: 134-135
603-603: LGTM! The refactoring of the internal field name from `metadata` to `value` is clean and maintains backward compatibility through the getter method name.
Also applies to: 605-606, 610-610, 613-614
Actionable comments posted: 1
♻️ Duplicate comments (4)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (4)
304-339: Add explicit handle parameter validation as previously recommended. The method lacks the explicit null/blank validation for the handle parameter that was recommended in previous reviews. While the current code works, explicit validation provides clearer error messages and intent.
Consider adding the validation before handle resolution:

         @RequestMapping(method = RequestMethod.GET, produces = "application/json")
         public ResponseEntity<RefBoxDTO> getRefboxInfo(
                 @RequestParam(name = "handle") String handle,
                 HttpServletRequest request) throws SQLException {
             Context context = ContextUtil.obtainContext(request);
             if (context == null) {
                 throw new RuntimeException("Cannot obtain the context from the request.");
             }
    +
    +        if (StringUtils.isBlank(handle)) {
    +            throw new UnprocessableEntityException("The handle parameter is required.");
    +        }

             DSpaceObject dSpaceObject = handleService.resolveToObject(context, handle);
376-393: Consider extracting HTML formatting to constants or template. The hardcoded HTML tags in the display text formatting reduce maintainability, as noted in previous reviews. Consider using constants or a template approach for better maintainability.
The current implementation works but could be improved for long-term maintenance.
419-435: Address configuration null safety and format type issues from previous reviews. The method has issues that were previously identified:
- Missing null safety for `configurationService.getProperty("dspace.server.url")`
- Hard-coded export format specifications that should be configurable
- Incorrect dataType "json" for both bibtex and cmdi formats
Consider implementing the previously suggested improvements for configuration null checks and making format types configurable rather than hard-coded.
454-462: Improve metadata parsing robustness as previously recommended. The metadata parsing could be more robust, similar to the error handling in the existing `mapFeaturedServiceLinks` method. The current implementation only checks for exactly 2 parts but could handle edge cases better.
Consider implementing the previously suggested improvements for handling malformed metadata values with proper logging and error handling.
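As an illustration of what more defensive parsing of a `key|value` metadata string could look like, here is a minimal sketch. It is not the PR's implementation: the `Link` class is a hypothetical stand-in for `FeaturedServiceLinkDTO`, and splitting with a limit of 2 (so that any later "|" stays inside the value) is a design assumption rather than behavior confirmed by the review. Logging is omitted to keep the example dependency-free.

```java
import java.util.Optional;

class FeaturedServiceLinkSketch {

    // Simple immutable pair; a hypothetical stand-in for FeaturedServiceLinkDTO.
    static final class Link {
        final String key;
        final String value;

        Link(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Returns an empty Optional for null, blank, or malformed input instead of throwing.
    static Optional<Link> parseLink(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        // split takes a regex, so "|" must be escaped; limit 2 keeps later "|" in the value.
        String[] parts = raw.split("\\|", 2);
        if (parts.length != 2) {
            return Optional.empty();
        }
        String key = parts[0].trim();
        String value = parts[1].trim();
        if (key.isEmpty() || value.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new Link(key, value));
    }
}
```

A caller can then skip malformed entries with something like `parseLink(raw).ifPresent(links::add)` and log the raw value in the empty case.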
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (5 hunks)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/ExportFormatDTO.java (1 hunks)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceDTO.java (1 hunks)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceLinkDTO.java (1 hunks)
dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/RefBoxDTO.java (1 hunks)
dspace-server-webapp/src/test/java/org/dspace/app/rest/ClarinRefBoxControllerIT.java (2 hunks)
🧬 Code Graph Analysis (1)
dspace-server-webapp/src/test/java/org/dspace/app/rest/ClarinRefBoxControllerIT.java (4)
dspace-api/src/main/java/org/dspace/core/Utils.java (1)
Utils(54-581)
dspace-api/src/test/java/org/dspace/builder/CollectionBuilder.java (1)
CollectionBuilder(37-378)
dspace-api/src/test/java/org/dspace/builder/CommunityBuilder.java (1)
CommunityBuilder(31-182)
dspace-api/src/test/java/org/dspace/builder/ItemBuilder.java (1)
ItemBuilder(39-458)
✅ Files skipped from review due to trivial changes (2)
- dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/RefBoxDTO.java
- dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceDTO.java
🚧 Files skipped from review as they are similar to previous changes (2)
- dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/ExportFormatDTO.java
- dspace-server-webapp/src/main/java/org/dspace/app/rest/model/refbox/FeaturedServiceLinkDTO.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: dspace-dependencies / docker-build (linux/amd64, ubuntu-latest, true)
- GitHub Check: Run Integration Tests
- GitHub Check: Run Unit Tests
🔇 Additional comments (10)
dspace-server-webapp/src/test/java/org/dspace/app/rest/ClarinRefBoxControllerIT.java (7)
10-34: LGTM - Appropriate imports for enhanced test functionality. The new imports support the comprehensive testing of the refbox endpoint, including collection matching with `hasItem`, configuration-based URL construction with `ConfigurationService`, and utility methods with `Utils`.
77-110: LGTM - Comprehensive test coverage for items with featured services. The test method effectively validates the complete refbox response structure, including dynamic URL generation for export formats and proper featured service metadata parsing. The use of JSONPath expressions with filtering is particularly good for testing complex nested structures.
112-140: LGTM - Correctly tests empty featured services scenario. The test properly validates the behavior when an item has no featured service metadata. The fix to use the correct `item` variable (instead of `itemWithFS`) addresses the previous review feedback, and the assertion that `featuredServices.featuredService` is empty is appropriate.
142-154: LGTM - Proper error handling test for missing handle parameter. The test correctly validates that a missing handle parameter results in a 4xx client error, which is appropriate for missing required parameters.
156-226: LGTM - Comprehensive metadata variation testing. These test methods provide excellent coverage of edge cases for display text generation, including scenarios with only publisher, year, title, DOI, and whitespace metadata. This ensures the display text building logic handles various real-world metadata combinations gracefully.
228-255: LGTM - Good edge case coverage for featured service metadata. These tests appropriately handle edge cases in featured service processing:
- The duplicate entries test correctly expects 2 entries (as clarified by the developer)
- The malformed link test ensures graceful handling when metadata doesn't follow the expected `key|value` format
Both tests contribute to robust error handling and data processing validation.
257-324: LGTM - Thorough testing of author formatting business logic. These test methods comprehensively validate the complex author formatting rules:
- Single author: direct display
- Two authors: joined with "and"
- 2-5 authors: semicolon-separated with final "and"
- 6+ authors: first author plus "et al."
The test assertions correctly verify each formatting scenario matches the expected business requirements.
dspace-server-webapp/src/main/java/org/dspace/app/rest/ClarinRefBoxController.java (3)
28-66: LGTM - Appropriate imports and dependencies for refbox functionality. The new imports support the comprehensive refbox endpoint including stream processing for metadata, DTO classes for structured responses, exception handling, and handle service integration.
Also applies to: 130-131
402-413: LGTM - Correct implementation of author formatting business logic. The method properly implements the complex author formatting requirements:
- Single author: direct display
- 2-5 authors: semicolon separation with "and" for the last author
- 6+ authors: first author plus "et al."
The implementation matches the comprehensive test coverage and business requirements.
620-632: LGTM - Good refactoring that maintains API compatibility. The internal field rename from `metadata` to `value` improves naming clarity while maintaining external API compatibility. The `getMetadata()` method continues to work as expected for existing consumers.
* UFAL/DOI - Added type of resource to data cite (#975)
* UFAL/The process output is not displayed because of S3 direct download (#971)
* The S3 direct download is provided only for the files located in the ORIGINAL bundle
* Use constant for the ORIGINAL string value
* Check if type is html (#983)
* check if type is html
* added test for html mime type
* used static string for text/html, added check
* Ufal dtq sync062025 (#985)
* we should identify as clarin-dspace Fix test (cherry picked from commit 6cdf2d1)
* update email templates to use dspace.shortname dspace.name can be a long string not fit for Email subjects nor signatures (cherry picked from commit 98d60dd)
* match v5 submission (cherry picked from commit 4a2b65f)
* get rid of lr.help.phone Phone is now conditional in the templates. Use `mail.message.helpdesk.telephone` if you want it. The change in the *.java files is to preserve the params counts. The relevant templates are getting the phone directly from config (cherry picked from commit cba5695)
* Add option to configure oai sample identifier some validators use this value, should be a real id in prod deployments (cherry picked from commit 912f13f)
* NRP deposit license (cherry picked from commit ba23878)
* Fix ufal#1219 Get rid of setting the jsse.enableSNIExtension property which causes issues with handle minting (cherry picked from commit 7d03173)
* UFAL/Improve file preview generating (#972)
* get name and size from metadata and header of file, avoid input stream using
* remove temp file, checkstyle, do not load full file
* add { } after if
* added check for max preview file
* used ZipFile and TarArchived for filepreview generating
* added removed lines
* used 7z for zip and tar files
* removed 7z and used zip and tar entry
* improved file previrew generating speed, used string builder, xml builder, authorization only if is required
* checkstyle, return boolean from haspreview and previrews from getPreview, replaced return with continue
* fix problem with hibernate session
* fix .tar.gz generating
* skip fully entry for tar
* added indexes for speed up queries
* added license header
* named constant by upper case
* inicialized fileInfo, refactorization of code based on copilot review
---------
Co-authored-by: milanmajchrak <[email protected]>
* Fix the file preview integration test (#989)
* The hasPreview method has been changed, but the IT wasn't updated correctly
* Use the correct checkbox for the input field - use repeatable (#991)
* UFAL/EU Sponsor openaire id should not be required (#1001)
* EU Sponsor openaire id should not be required
* Not required also in the czech submission forms
* Logging error message while emailing users (#1000)
* Logging error message
---------
Co-authored-by: Matus Kasak <[email protected]>
Co-authored-by: milanmajchrak <[email protected]>
* UFAL/Teaching and clariah submissions does not have clarin-license (#1005)
* UFAL/Fix logging in LogoImportController (#1003)
* fix logging
* used formatter for msg
* UFAL/Update the resource policy rights when changing submitter (#1002)
* removed res policies for submitter and created newones when item is shared
* avoid magic number, use constant
* set submitter in existing res policies
* removed not used shared link
* UFAL/Added date to title when creating new version (#984)
* added date to versioned item title
* used more modern approach for getting current time
* renamed test
* used var for reusing
* UFAL/Item handle info in email after download request (#1006)
* Added item handle to email
* Exception when item not found
* Checked grammar
* Handled multiple items found by bitstream
* Using PID instead of handle
---------
Co-authored-by: Matus Kasak <[email protected]>
* UFAL/Incorrect password hash funct used during migration (#999)
* password in request is already hashed, used different password hash funct
* renamed password param in eperson endpoint
* [devOps] labelling reviewing process
* [devOps] labelling reviewing process
* UFAL/New version keeps the old identifier
* UFAL/Send email to editor after submitting item (#1016)
Co-authored-by: Matus Kasak <[email protected]>
* UFAL/Local file size is 0 for file with no zero size (#1017)
* update item metadata after the bitstream size has changed
* issue 1241: ItemFilesMetadataRepair script implementation (DSpace#1243) (#1021)
* issue 1241: ItemFilesMetadataRepair script implementation
* extend script to be applicabble for all items, and for items with files metadata that have missing bitstreams (files)
* implement dry-run option
* option description fix
* Improve error message
* Use "0" instead of "" + 0
* Improve error message (cherry picked from commit 706f6f6)
Co-authored-by: kuchtiak-ufal <[email protected]>
* UFAL/Refbox upgrade (#1015)
* Created integration test
* Created an endpoint for complete ref box information like in the v5
* Added integration tests for formatting authors
* Removed double semicolon
* Fetch the metadata value following the current locale
* Updated firstMetadataValue because it did return empty string instead of null
* Use DEFAULT_LANGUAGE instead of current locale
* UFAL/Added doc - issue link (#1023)
---------
Co-authored-by: Paurikova2 <[email protected]>
Co-authored-by: Ondřej Košarko <[email protected]>
Co-authored-by: Kasinhou <[email protected]>
Co-authored-by: Matus Kasak <[email protected]>
Co-authored-by: jurinecko <[email protected]>
Co-authored-by: jm <jm@maz>
Co-authored-by: kuchtiak-ufal <[email protected]>
Problem description
Reported issues
Not-reported issues
Analysis
Problems
Summary by CodeRabbit
New Features
Bug Fixes
Tests