New Project Proposal - [Sandbox Proposal] AI Dataset Health for z/OS (ai-dataset-health-zos) #895

@marbatis

Project description

ai-dataset-health-zos is a small, open-source Python tool that:

  • Lists repository files (future: z/OS datasets via z/OSMF Jobs & Files APIs).
  • Computes a simple dataset health score (initial rule: zero-byte file detection).
  • Exposes a CLI (list_files.py --health) with human-readable output.
  • Ships with modern CI gates: Ruff (lint), Black (format), MyPy (types), Pytest (coverage).

Why valuable: Mainframe teams often lack lightweight, open, automatable checks for dataset health (empty datasets, size or naming anomalies, staleness). This project provides an approachable baseline that fits into CI/CD and leaves a clear path to adding AI/ONNX scoring and z/OSMF integration on Z.

Origin/history: Built as a focused MVP to demonstrate CI-first mainframe tooling, with a planned evolution toward z/OSMF and AI-assisted rules.
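
The listing step and the zero-byte rule described above can be illustrated with a minimal sketch. This is not the code in the repository: the names `FileHealth`, `scan`, and `health_score` are hypothetical, and scoring as the fraction of non-empty files is an assumption made for illustration, since the proposal only states that zero-byte detection is the initial rule.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FileHealth:
    """Result for one file (hypothetical structure, not the project's own)."""
    path: Path
    size: int
    healthy: bool  # False when the file is zero bytes


def scan(root: Path) -> list[FileHealth]:
    """Walk root recursively and flag zero-byte files."""
    results = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            results.append(FileHealth(path, size, healthy=size > 0))
    return results


def health_score(results: list[FileHealth]) -> float:
    """Fraction of healthy files (assumed formula); an empty scan scores 1.0."""
    if not results:
        return 1.0
    return sum(r.healthy for r in results) / len(results)
```

Keeping each rule as a per-file boolean would let later rules (naming, staleness) be added as extra fields without changing how the aggregate score is computed.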

Statement on alignment with Open Mainframe Project Mission and Vision statements

  • Promotes open source innovation on mainframe by offering a minimal, extensible dataset health checker.
  • Lowers the barrier for new contributors (pure Python + modern CI).
  • Encourages interoperability with existing Z ecosystems via planned z/OSMF adapters, without vendor lock-in.
  • Creates community building blocks to apply AI techniques (e.g., ONNX) to mainframe operations data.

Are there similar/related projects out there?

  • Zowe CLI provides dataset operations, but not a focused health scoring workflow nor an AI-ready scoring path.
  • Proprietary tools exist for checks/audits, but they are not open, CI-first, or designed for community rule extensions.

Differentiators:

  • Explicit health scoring model (start simple, grow rules).
  • CI-first repo (ruff/black/mypy/pytest/coverage, smoke artifact).
  • Clear roadmap to z/OSMF adapters and ONNX inference.

Sponsor from TAC

To be appointed

Proposed Project Stage

Sandbox

License and contribution guidelines

  • Current license: MIT (OSI-approved).
  • If accepted into OMP: willing to relicense to Apache-2.0.
  • Contribution flow: GitHub PRs, DCO sign-off (Signed-off-by), code style via Ruff/Black, type hints via MyPy, tests via Pytest with coverage gate.
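
As an illustration of the DCO requirement, the check for a `Signed-off-by` trailer (the line that `git commit -s` appends) could look roughly like the sketch below. The function name `has_dco_signoff` is hypothetical, and the pattern is a simplified assumption about the trailer format rather than the rule enforced by any particular DCO bot.

```python
import re

# Simplified trailer pattern: "Signed-off-by: Name <user@host>" on its own line.
_SIGNOFF = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_dco_signoff(commit_message: str) -> bool:
    """Return True if the commit message contains a sign-off trailer."""
    return _SIGNOFF.search(commit_message) is not None
```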

Current or desired source control repository

GitHub (current): https://github.com/marbatis/ai-dataset-health-zos

External dependencies (including licenses)

Runtime: Python 3.11+ (stdlib only for MVP).
Dev/CI: ruff (MIT), black (MIT), mypy (MIT), pytest/pytest-cov (MIT).
Planned (optional): onnxruntime (MIT) for AI scoring.
No commercial or non-redistributable software required.

Initial committers

  • Marcelo Silveira (@marbatis) — creator/maintainer; all initial commits.
  • Current community size: 1 maintainer; the goal is to add 2–3 co-maintainers within 3–6 months through the OMP community.

Infrastructure requests

  • CI: GitHub Actions (already in place: ruff, black, mypy, pytest, coverage ≥80%, “health smoke” artifact).
  • Request access to an OMP open z/OS environment with z/OSMF Jobs & Files to validate dataset APIs.
  • (Optional later) A project Slack channel and a simple GitHub Pages site (if recommended).
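
The z/OSMF validation requested above would likely begin with a dataset-list call. The sketch below only builds the request and does not send it. It assumes the z/OSMF REST files endpoint `/zosmf/restfiles/ds` with a `dslevel` query parameter and the `X-CSRF-ZOSMF-HEADER` request header; the function name, the host in the usage line, and the omission of authentication are all illustrative choices, not part of the project.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_list_datasets_request(base_url: str, dslevel: str) -> Request:
    """Build (but do not send) a GET request listing datasets under dslevel.

    Assumption: the target is a z/OSMF REST files service. Authentication
    (for example basic auth or a token) is deliberately left out.
    """
    query = urlencode({"dslevel": dslevel})
    url = f"{base_url.rstrip('/')}/zosmf/restfiles/ds?{query}"
    headers = {
        "X-CSRF-ZOSMF-HEADER": "true",  # CSRF-protection header; value assumed arbitrary
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")


# Hypothetical host, for illustration only.
request = build_list_datasets_request("https://zosmf.example.com", "IBMUSER.*")
```

Separating request construction from transport keeps the adapter testable in CI without access to a live z/OS system.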

Communication channels

  • Request a project Slack channel in the OMP workspace (e.g., #ai-dataset-health-zos).
  • GitHub Discussions in the repo for Q&A and design notes.
  • (Optional) Mailing list if OMP prefers; otherwise Discussions is sufficient at MVP stage.

Issue tracker

GitHub Issues: https://github.com/marbatis/ai-dataset-health-zos/issues

Website

Use the repository README as the landing page for now. (Optionally add GitHub Pages after acceptance.)

Release methodology and mechanics

  • SemVer. Start with 0.x pre-releases while MVP evolves.
  • GitHub Releases + tags; changelog in RELEASE_NOTES.md.
  • Every release is built from a commit that passes CI (ruff/black/mypy/pytest/coverage).

Social media accounts

None yet. Will coordinate with OMP social channels after acceptance.

Community size and any existing sponsorship

  • Size: single-maintainer MVP (1).
  • No commercial sponsorship; seeking community contributions and TAC mentorship.

Metadata

Labels: 1-new-project-wg (New Project or Working Group application)
Status: On Hold