Issue/3608 alt 1#3643

Open
chensuihui wants to merge 28 commits into
magda-io:nextfrom
chensuihui:issue/3608-alt-1

@chensuihui

What this PR does

Fixes #3608

This PR implements the new semantic index query API access control flow.

Specifically, it:

  • adds a "filter records by access" endpoint to the Registry API, so that a list of record IDs can be filtered by the current user's read access
  • adds the Phase 1 logic to semantic search, which performs semantic retrieval first and then filters the returned record IDs via the new access filter endpoint
  • adds the Phase 2 fallback logic, which narrows semantic retrieval to accessible records when Phase 1 returns no authorised results
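
The two phases listed above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: the `SearchDeps` interface and all four of its method names are hypothetical stand-ins for the vector search, the Registry access-filter call, and the lookup of accessible records used by the fallback.

```typescript
type RecordId = string;

// Hypothetical dependency shapes; the real clients in the PR may differ.
interface SearchDeps {
    // Unrestricted semantic retrieval: candidate record IDs, best match first.
    vectorSearch(query: string, limit: number): Promise<RecordId[]>;
    // Access filter: the subset of the given IDs the current user may read.
    filterByAccess(ids: RecordId[]): Promise<RecordId[]>;
    // IDs of records the current user may read (used to scope the fallback).
    listAccessibleIds(query: string): Promise<RecordId[]>;
    // Semantic retrieval restricted to a known set of record IDs.
    vectorSearchWithin(
        query: string,
        ids: RecordId[],
        limit: number
    ): Promise<RecordId[]>;
}

async function twoPhaseSearch(
    deps: SearchDeps,
    query: string,
    limit: number
): Promise<RecordId[]> {
    // Phase 1: retrieve first, then drop anything the user cannot read.
    const candidates = await deps.vectorSearch(query, limit);
    if (candidates.length > 0) {
        const allowed = new Set(await deps.filterByAccess(candidates));
        // Filter the original list so the semantic ranking order is preserved.
        const authorised = candidates.filter((id) => allowed.has(id));
        if (authorised.length > 0) {
            return authorised;
        }
    }

    // Phase 2: nothing survived filtering, so narrow retrieval to accessible records.
    const accessible = await deps.listAccessibleIds(query);
    if (accessible.length === 0) {
        return [];
    }
    return deps.vectorSearchWithin(query, accessible, limit);
}
```

In this sketch the fallback runs only when Phase 1 yields no authorised results at all; a partially filtered Phase 1 result is returned as is.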

Checklist

  • There are unit tests to verify my changes are correct or unit tests aren't applicable
  • I've updated CHANGES.md with what I changed.

Copilot AI review requested due to automatic review settings March 30, 2026 03:11

Copilot AI left a comment

Pull request overview

Implements the new semantic index query API access-control flow (#3608) by adding a Registry access-filter endpoint and updating semantic search to use a two-phase search (broad vector search + access filtering, then a fallback search constrained to accessible record IDs).

Changes:

  • Added Registry /records/access-filter endpoint and supporting request/response models + auth/tenant isolation tests.
  • Added Phase 1/Phase 2 access-control logic to semantic search service, plus query-builder support for recordId-scoped vector queries.
  • Added TypeScript API clients and updated semantic-search API wiring/tests to pass JWT + tenant context.
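
As a rough illustration of what a recordId-scoped vector query might look like, the sketch below builds an OpenSearch-style k-NN query body with an optional filter on record IDs. The function name `buildKnnQuery`, the option names, and the `recordId` field are assumptions made for illustration; the query builder in this PR may be shaped differently.

```typescript
// Hypothetical options; none of these names are taken from the PR.
interface KnnQueryOptions {
    field: string; // name of the vector field in the index
    vector: number[]; // query embedding
    k: number; // number of nearest neighbours to retrieve
    recordIds?: string[]; // when set, restrict the search to these records
}

function buildKnnQuery(opts: KnnQueryOptions): Record<string, unknown> {
    const knnBody: Record<string, unknown> = {
        vector: opts.vector,
        k: opts.k
    };
    if (opts.recordIds !== undefined) {
        // Filter applied as part of the k-NN clause, so only the listed
        // records are eligible as neighbours. The field name is an assumption.
        knnBody.filter = { terms: { recordId: opts.recordIds } };
    }
    return {
        size: opts.k,
        query: { knn: { [opts.field]: knnBody } }
    };
}
```

Passing an empty `recordIds` array still adds the filter, which matches no documents; omitting the option leaves the query unscoped.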

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| magda-typescript-common/src/SearchApiClient.ts | New client for Search API dataset search (used for Phase 2 fallback). |
| magda-typescript-common/src/RegistryApiClient.ts | New client for Registry access-filter endpoint (used for Phase 1 filtering). |
| magda-semantic-search-api/src/service/SemanticSearchService.ts | Implements two-phase access-aware semantic search flow. |
| magda-semantic-search-api/src/service/queryBuilder.ts | Adds recordId-scoped KNN query builder. |
| magda-semantic-search-api/src/api/createApiRouter.ts | Plumbs session + tenant headers into SearchParams. |
| magda-semantic-search-api/src/model.ts | Extends search params to carry jwt + tenantId. |
| magda-semantic-search-api/src/index.ts | Wires new clients into service + adds Search API URL CLI option. |
| magda-semantic-search-api/src/test/service/semanticSearchService.spec.ts | Adds unit tests for phase 1 filtering + phase 2 fallback behavior. |
| magda-semantic-search-api/src/test/service/queryBuilder.spec.ts | Adds coverage for new query-builder branches and recordId-scoped queries. |
| magda-semantic-search-api/src/test/searchRoute.spec.ts | Ensures headers are forwarded into service params + adds /retrieve route test. |
| magda-scala-common/src/main/scala/au/csiro/data61/magda/model/Registry.scala | Adds shared request/response case classes for access-filter endpoint. |
| magda-registry-api/src/main/scala/au/csiro/data61/magda/registry/RecordsServiceRO.scala | Adds the access-filter route to Registry API (read-only service). |
| magda-registry-api/src/test/scala/au/csiro/data61/magda/registry/RecordServiceAuthSpec.scala | Adds auth + tenant isolation + input sanitization tests for access-filter. |
| CHANGES.md | Notes the new semantic index query API/access-control changes. |

Comment thread magda-typescript-common/src/SearchApiClient.ts
Comment thread magda-typescript-common/src/RegistryApiClient.ts Outdated
Comment thread magda-typescript-common/src/RegistryApiClient.ts Outdated
Comment thread magda-typescript-common/src/RegistryApiClient.ts Outdated
Comment thread magda-typescript-common/src/SearchApiClient.ts Outdated
Comment thread magda-semantic-search-api/src/service/queryBuilder.ts Outdated
Comment thread magda-typescript-common/src/SearchApiClient.ts
Contributor

@t83714 t83714 left a comment

Well done 👍 Thanks for pulling this together & congrats on your first PR 🎉 - I've left a few comments. Please let me know your thoughts.

Comment thread magda-typescript-common/src/RegistryApiClient.ts Outdated
Comment thread magda-semantic-search-api/src/service/SemanticSearchService.ts
Comment thread magda-semantic-search-api/src/service/queryBuilder.ts Outdated
Comment thread CHANGES.md Outdated
Comment thread magda-typescript-common/src/SearchApiClient.ts Outdated
Contributor

@t83714 t83714 left a comment

@chensuihui thanks for the updates - I've left a few comments. I think we're very close 😄
A few extra points:

  • I noticed that the latest commit was force pushed - can we avoid force pushes in future? Force pushing is generally considered dangerous because it rewrites shared history. It can also make diffs and PR discussions confusing, since the commits they refer to suddenly change. Instead, when your local branch is out of sync with the remote, merge the remote branch, e.g. git merge --no-commit --no-ff origin/issue/3608-alt-1
  • the PR now has merge conflicts that need fixing
  • I sent your latest commit to CI for checking and got some errors. Can you have a look? https://gitlab.com/magda-data/magda/-/pipelines/2467160390

Comment thread magda-typescript-common/src/registry/RegistryClient.ts Outdated
Comment thread magda-typescript-common/src/SearchApiClient.ts Outdated
Comment thread magda-semantic-search-api/src/api/createApiRouter.ts Outdated
Comment thread magda-semantic-search-api/src/api/createApiRouter.ts Outdated
Contributor

@t83714 t83714 left a comment

Great! I can see all pipelines passed 🚀
Left one comment (re: response status code) for a last-minute tweak~
After that I guess we're ready to merge the PR 🎉

    }
    const parsed = Number(raw);
    if (!Number.isInteger(parsed) || parsed < 0) {
        throw new Error("Invalid X-Magda-Tenant-Id");
Contributor

Sorry - I guess my last review missed this - can we throw a BadRequestError here instead of Error? Right now, the Error is forwarded to the generic error middleware, which returns 500 (Internal Server Error).
We can probably throw a BadRequestError, catch it in the error middleware (in this file), and respond with a proper 400 (Bad Request) status code.
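
The suggestion above could look roughly like the sketch below. It is only an illustration under stated assumptions: `BadRequestError` is defined locally as a stand-in rather than imported from any Magda package, `parseTenantId` and `toErrorResponse` are hypothetical names, and the handling of a missing header is assumed because the snippet under review does not show that branch.

```typescript
// Local stand-in error type carrying the HTTP status to respond with.
class BadRequestError extends Error {
    readonly statusCode = 400;
    constructor(message: string) {
        super(message);
        this.name = "BadRequestError";
        // Keeps `instanceof` working when compiled to older JS targets.
        Object.setPrototypeOf(this, BadRequestError.prototype);
    }
}

// Hypothetical helper: parse the X-Magda-Tenant-Id header value.
function parseTenantId(raw: string | undefined): number {
    // Assumption: a missing or blank header is rejected. Note that
    // Number("") is 0, so blank input must be checked explicitly.
    if (raw === undefined || raw.trim() === "") {
        throw new BadRequestError("Missing X-Magda-Tenant-Id");
    }
    const parsed = Number(raw);
    if (!Number.isInteger(parsed) || parsed < 0) {
        throw new BadRequestError("Invalid X-Magda-Tenant-Id");
    }
    return parsed;
}

// Hypothetical mapping that an error middleware could use to choose the
// response: client errors become 400, everything else stays a generic 500.
function toErrorResponse(err: unknown): { status: number; body: { error: string } } {
    if (err instanceof BadRequestError) {
        return { status: err.statusCode, body: { error: err.message } };
    }
    return { status: 500, body: { error: "Internal Server Error" } };
}
```

An Express-style error handler with the `(err, req, res, next)` signature could then call `toErrorResponse(err)` and pass the result to `res.status(...).json(...)`.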

Author

fixed

@t83714
Contributor

t83714 commented May 7, 2026

@chensuihui I created a test release: https://github.com/magda-io/magda/releases/tag/v6.0.0-alpha.15 from your branch. Once the release job (https://gitlab.com/magda-data/magda/-/pipelines/2507504434) is done, you can use the release version number v6.0.0-alpha.15 to do a full local deployment test. By the way, let me know if you have time tomorrow afternoon - I can quickly show you the release process so you can create test releases yourself for future testing.

@t83714
Contributor

t83714 commented May 15, 2026

@chensuihui when you have time, please let me know how you went with local deployment testing of https://github.com/magda-io/magda/releases/tag/v6.0.0-alpha.16
If your test turns out OK, I can merge this PR~
