Background
MAGDA has been exploring richer assistant-driven workflows around dataset discovery, semantic search, and local analysis. Running the full workflow on the MAGDA server side is difficult for several practical reasons:
- agent workloads are hard to sandbox properly inside a cluster,
- scaling long-running assistant workloads can become expensive for deployments with large user demand,
- browser-based workflows are awkward for long-running tasks and rich outputs such as charts, generated files, and analysis notebooks,
- large source files are easier and safer to process on the user's machine than through the web UI or a central service,
- building and maintaining a high-quality hosted agent harness is a large product and operations commitment.
MAGDA already has the foundations for a different approach: broad REST API coverage, generated API documentation, API key authentication, search APIs, registry APIs, storage APIs, and the newer hybrid/semantic search work. Instead of hosting arbitrary assistant runs inside MAGDA, MAGDA can provide a local command-line harness that lets users and local developer-assistant tools access MAGDA resources from their own machine.
Proposal
Provide a local MAGDA command-line interface named mgd. The CLI should become the supported automation boundary for local data discovery and analysis workflows.
The CLI should:
- expose curated workflow commands for common MAGDA tasks,
- provide a raw API fallback for documented REST endpoints,
- use MAGDA's existing API-key authentication model,
- stream large downloads and uploads,
- work cleanly with common Unix tools such as jq, grep, sed, awk, xargs, cat, and ls,
- support local developer-assistant workflows through separate skill/instruction packages.
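As a rough illustration of the streaming and Unix-tool points above, the sketch below shows one way the CLI could emit search results and move large payloads. It is not the planned implementation: the DatasetSummary shape and the function names toNdjson and streamTo are hypothetical, and only the Node.js stream APIs are real.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Hypothetical record shape for illustration; real MAGDA search results
// carry many more fields.
interface DatasetSummary {
  id: string;
  title: string;
}

// Serialise records as newline-delimited JSON: every line is one complete
// JSON document, so jq, grep, or xargs can process lines independently.
function toNdjson(records: DatasetSummary[]): string {
  return records.map((r) => JSON.stringify(r) + "\n").join("");
}

// Copy a byte stream to a sink chunk by chunk. pipeline() applies
// backpressure and propagates errors, so the payload is never held in
// memory as a whole.
async function streamTo(source: Readable, sink: Writable): Promise<void> {
  await pipeline(source, sink);
}
```

A download command could pass an HTTP response body as the source and a file write stream as the sink, while list-style commands could print toNdjson output to stdout and keep diagnostics on stderr so that pipelines stay clean.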
Initial Scope
Phase 1:
- Build the first npm-installable version of the mgd CLI.
- Support profile-based and API-key-based authentication.
- Support dataset search, dataset inspection, distribution listing, file download, and raw API requests.
- Publish the initial package through npm as @magda/mgd without requiring users to clone the MAGDA repository.
- Add assistant workflow skills that document safe and effective CLI usage.
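The profile, API-key, and raw API request items above could fit together roughly as sketched below. This is a minimal sketch under stated assumptions: the Profile shape and the helper names are hypothetical, the header names X-Magda-API-Key-Id and X-Magda-API-Key are assumed from MAGDA's API-key authentication and should be checked against the API docs, and the global fetch assumes Node.js 18 or later.

```typescript
// Hypothetical profile shape; the field names are illustrative only.
interface Profile {
  baseUrl: string; // e.g. "https://magda.example.com"
  apiKeyId: string;
  apiKey: string;
}

// Build the authentication headers for a request.
// ASSUMPTION: header names follow MAGDA's API-key scheme; verify before use.
function authHeaders(p: Profile): Record<string, string> {
  return {
    "X-Magda-API-Key-Id": p.apiKeyId,
    "X-Magda-API-Key": p.apiKey,
  };
}

// Resolve a raw API path and query parameters against the profile's base URL,
// tolerating a trailing slash on the base and a leading slash on the path.
function apiUrl(
  p: Profile,
  path: string,
  query: Record<string, string> = {}
): string {
  const base = p.baseUrl.replace(/\/+$/, "");
  const url = new URL(base + "/" + path.replace(/^\/+/, ""));
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Raw API fallback: send an authenticated GET and return the parsed JSON body.
async function rawGet(
  p: Profile,
  path: string,
  query: Record<string, string> = {}
): Promise<unknown> {
  const res = await fetch(apiUrl(p, path, query), { headers: authHeaders(p) });
  if (!res.ok) {
    throw new Error(`MAGDA API request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

Keeping URL construction and header construction as pure functions would let curated workflow commands and the raw API fallback share one tested code path.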
Phase 2:
- Add standalone binary distribution for users who do not have Node.js or npm installed.
Child Issues
Acceptance Criteria
- There is an agreed local CLI direction for MAGDA-assisted local workflows.
- There are initial implementation issues for the CLI, assistant workflow skills, and standalone binary packaging.
- The CLI design remains compatible with normal shell scripting and Unix pipelines.
- The first release scope does not require running arbitrary assistant workloads inside the MAGDA cluster.
Out of Scope
- MCP support.
- Hosting arbitrary assistant workflows inside MAGDA.
- Replacing the existing web UI, chatbot, SQL console, or API docs.
- Full high-level command coverage for every MAGDA REST endpoint in the first release.