Open
Labels: 1-new-project-wg (New Project or Working Group application)
Description
Project description
ai-dataset-health-zos is a small, open-source Python tool that:
- Lists repository files (future: z/OS datasets via z/OSMF Jobs & Files APIs).
- Computes a simple dataset health score (initial rule: zero-byte file detection).
- Exposes a CLI (list_files.py --health) with human-readable output.
- Ships with modern CI gates: Ruff (lint), Black (format), MyPy (types), Pytest (coverage).
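The zero-byte rule described above could be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names (list_files, find_empty, health_score) and the scoring formula (percentage of non-empty files) are assumptions made for this example.

```python
from pathlib import Path


def list_files(root: Path) -> list[Path]:
    """Return every regular file under root, sorted for stable output."""
    return sorted(p for p in root.rglob("*") if p.is_file())


def find_empty(files: list[Path]) -> list[Path]:
    """Return the files whose size is zero bytes."""
    return [p for p in files if p.stat().st_size == 0]


def health_score(files: list[Path]) -> float:
    """Hypothetical score: percentage of files that are not empty."""
    if not files:
        return 100.0  # assumption: nothing to check counts as healthy
    empty = find_empty(files)
    return 100.0 * (len(files) - len(empty)) / len(files)
```

A directory holding one populated file and one empty file would score 50.0 under this assumed formula; the real tool may weight or report findings differently.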
Why valuable: Mainframe teams often lack lightweight, open, automatable checks for dataset health (empties, size/naming anomalies, staleness). This project provides an approachable baseline that fits CI/CD and sets a path to add AI/ONNX scoring and z/OSMF integration on Z.
Origin/history: Built as a focused MVP to demonstrate CI-first mainframe tooling with a clear evolution to z/OSMF and AI-assisted rules.
Statement on alignment with Open Mainframe Project Mission and Vision statements
- Promotes open source innovation on mainframe by offering a minimal, extensible dataset health checker.
- Lowers the barrier for new contributors (pure Python + modern CI).
- Encourages interoperability with existing Z ecosystems via planned z/OSMF adapters, without vendor lock-in.
- Creates community building blocks to apply AI techniques (e.g., ONNX) to mainframe operations data.
Are there similar/related projects out there?
- Zowe CLI provides dataset operations, but it offers neither a focused health-scoring workflow nor an AI-ready scoring path.
- Proprietary tools exist for checks/audits, but they are not open, CI-first, or designed for community rule extensions.
Differentiators:
- Explicit health scoring model (start simple, grow rules).
- CI-first repo (ruff/black/mypy/pytest/coverage, smoke artifact).
- Clear roadmap to z/OSMF adapters and ONNX inference.
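One way to read "start simple, grow rules" is a list of small rule functions applied to a source-agnostic record, so the same rules could later run against data fetched through a z/OSMF adapter. The sketch below is hypothetical: DatasetRecord, the rule names, and evaluate are invented for illustration and are not taken from the repository.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DatasetRecord:
    """Source-agnostic view of a file or dataset (hypothetical shape)."""
    name: str
    size_bytes: int


# A rule returns a finding message, or None when the record passes.
Rule = Callable[[DatasetRecord], Optional[str]]


def zero_byte_rule(rec: DatasetRecord) -> Optional[str]:
    """The initial rule from the proposal: flag empty entries."""
    return "empty" if rec.size_bytes == 0 else None


def long_name_rule(rec: DatasetRecord) -> Optional[str]:
    """Illustrative naming rule: z/OS data set names are at most 44 characters."""
    return "name too long" if len(rec.name) > 44 else None


def evaluate(records: list[DatasetRecord], rules: list[Rule]) -> dict[str, list[str]]:
    """Map each failing record name to the findings raised against it."""
    findings: dict[str, list[str]] = {}
    for rec in records:
        msgs = [m for rule in rules if (m := rule(rec)) is not None]
        if msgs:
            findings[rec.name] = msgs
    return findings
```

Adding a rule in this design means appending one function to the list passed to evaluate, which keeps community rule contributions small and independently testable.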
Sponsor from TAC
To be appointed
Proposed Project Stage
Sandbox
License and contribution guidelines
- Current license: MIT (OSI-approved).
- If accepted into OMP: willing to relicense to Apache-2.0.
- Contribution flow: GitHub PRs, DCO sign-off (Signed-off-by), code style via Ruff/Black, type hints via MyPy, tests via Pytest with coverage gate.
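For reference, git adds the Signed-off-by trailer when a commit is made with `git commit -s`. A CI step could verify the trailer with a check along these lines; this is a minimal sketch, and the function name has_sign_off and the exact pattern are assumptions rather than anything the repository is known to ship.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
SIGN_OFF = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_sign_off(commit_message: str) -> bool:
    """Return True when the message carries at least one sign-off trailer."""
    return SIGN_OFF.search(commit_message) is not None
```

In practice many projects rely on an existing DCO bot or GitHub App instead of a hand-written check.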
Current or desired source control repository
GitHub (current): https://github.com/marbatis/ai-dataset-health-zos
External dependencies (including licenses)
Runtime: Python 3.11+ (stdlib only for MVP).
Dev/CI: ruff (MIT), black (MIT), mypy (MIT), pytest/pytest-cov (MIT).
Planned (optional): onnxruntime (MIT) for AI scoring.
No commercial or non-redistributable software required.
Initial committers
- Marcelo Silveira (@marbatis) — creator/maintainer; all initial commits.
- Current community size: 1 maintainer; goal is to add 2–3 co-maintainers within 3–6 months via OMP community.
Infrastructure requests
- CI: GitHub Actions (already in place: ruff, black, mypy, pytest, coverage ≥80%, “health smoke” artifact).
- Request access to an OMP open z/OS environment with z/OSMF Jobs & Files to validate dataset APIs.
- (Optional later) A project Slack channel and a simple GitHub Pages site (if recommended).
Communication channels
- Request a project Slack channel in the OMP workspace (e.g., #ai-dataset-health-zos).
- GitHub Discussions in the repo for Q&A and design notes.
- (Optional) Mailing list if OMP prefers; otherwise Discussions is sufficient at MVP stage.
- GitHub Issues: https://github.com/marbatis/ai-dataset-health-zos/issues
Website
Use the repository README as the landing page for now. (Optionally add GitHub Pages after acceptance.)
Release methodology and mechanics
- SemVer. Start with 0.x pre-releases while MVP evolves.
- GitHub Releases + tags; changelog in RELEASE_NOTES.md.
- Every release built from passing CI (ruff/black/mypy/pytest/coverage).
Social media accounts
None yet. Will coordinate with OMP social channels after acceptance.
Community size and any existing sponsorship
- Size: single-maintainer MVP (1).
- No commercial sponsorship; seeking community contributions and TAC mentorship.
Status
On Hold