Thanks to visit codestin.com
Credit goes to github.com

Skip to content

GareBear99/gh-ai-operator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

18 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ARC GitHub AI Operator

CI License: MIT Python 3.10+ Built for ARC-Core

Evidence-first GitHub scout + single-target code reviewer for the ARC ecosystem. Discovers related repositories, clones them one at a time, collects a structural snapshot, runs an AI (or heuristic) review, and either writes a local report or posts a verdict comment back to the Portfolio.

The operator ships two entry points:

Entry point What it does
scout.py Seed-driven discovery of related public repos + batch review + draft/approval-queue issue output.
review_target.py Single-URL review. Called by the Portfolio's code-review dispatch workflow. Takes one GitHub URL, posts a pre-review verdict back to the originating issue.
cloudflare/worker.js HTTPS front-end deployed to Cloudflare Workers. Anyone can POST /review with a JSON body and the Worker fires repository_dispatch at this repo. See cloudflare/README.md.
training_export.py Emits every live review as an LLMBuilder-compatible seed-examples JSONL record. Runs automatically at the end of each review; opt out with EMIT_TRAINING_DATA=0.

🧠 Continuous-learning worker for ARC-Neuron-LLMBuilder

This operator is not just an AI helper β€” it's the live-deployment training signal generator for ARC-Neuron-LLMBuilder. Every production review is emitted as a supervised training example in LLMBuilder's seed-examples schema and uploaded as the llmbuilder-training-export artifact (90-day retention). LLMBuilder's nightly workflow pulls those artifacts and grows its critique corpus automatically.

flowchart LR
    P["Portfolio<br/>code-review issue"] --> OP["this repo<br/>review_target.py"]
    OP -- "verdict comment" --> P
    OP -- "training_export.py<br/>seed-examples JSONL" --> A["llmbuilder-training-export<br/>artifact"]
    A --> LB["ARC-Neuron-LLMBuilder<br/>ingest-operator-reviews.yml (daily)"]
    LB --> C["data/critique/operator_reviews.jsonl"]
    C --> GATE["next Gate v2 candidate"]
    P -. follow-up .-> COR["export_correction<br/>tag: correction"]
    COR --> A
    style OP fill:#0366d6,stroke:#fff,color:#fff
    style LB fill:#7057ff,stroke:#fff,color:#fff
Loading
  • Schema: id / capability=critique / domain=code / difficulty / input.task / target.{analysis,confidence,verdict,findings} / tags / provenance β€” matches LLMBuilder's existing data/critique/seed_examples.jsonl exactly, no glue code.
  • Correction lane: a Portfolio Follow-up issue emits a second record tagged correction + human-follow-up; LLMBuilder's ingest bumps its confidence by +0.05 so Gate v2 training weights human overrides higher than the baseline stream.
  • Provenance: every record carries provenance.source, provenance.emitted_at, provenance.target_url. The critique corpus is fully auditable.
  • Separate shard: ingested data lands in data/critique/operator_reviews.jsonl, not the curated seed_examples.jsonl. Promotion to the canonical seed file stays a human decision.

Full pipeline: ARC-Neuron-LLMBuilder/docs/LIVE_DEPLOYMENT_LEARNING.md.

Live-run evidence: the first real run against a non-toy codebase (FreeEQ8) is captured in docs/FIRST_LIVE_RUN.md with the full snapshot, verdict, training record, and ingest manifest. Phase status, activation checklist, and red-team model in docs/QUALITY_PROOF_PLAN.md.

☁️ Runs fully on Cloudflare's free tier

  • Cloudflare Workers AI is the top-priority LLM backend in free_llm_client.py. Set CLOUDFLARE_ACCOUNT_ID + CLOUDFLARE_API_TOKEN and reviews are generated by @cf/meta/llama-3.3-70b-instruct-fp8-fast on Cloudflare's edge. Free tier: ~10k neurons/day, no credit card.
  • Cloudflare Worker proxy (cloudflare/worker.js) gives the operator a permanent HTTPS endpoint so the Portfolio, VALLIS_Liquidity, ADMENSION, a CLI, or a plain curl can enqueue a review over HTTP. Free tier: ~100k requests/day.
  • GitHub Actions orchestrates the run and posts the verdict back to the originating Portfolio issue. Free for public repos.

Set up docs: cloudflare/README.md.

Safety posture

Default posture is safe:

  • posting.enabled = false
  • posting.draft_only = true
  • confidence threshold + evidence gate + allowlist / denylist + duplicate-title / body checks + cooldown history + per-run and per-day caps.

The single-target review_target.py never auto-creates new issues anywhere; it only comments on the specific Portfolio issue you pass to it.

How the Portfolio dispatch flow works

flowchart LR
    A["User opens Code-Review<br/>issue on Portfolio"] --> B["Portfolio workflow<br/>.github/workflows/ai-pre-review.yml"]
    B -- "repository_dispatch<br/>code-review-request" --> C["gh-ai-operator workflow<br/>.github/workflows/ai-review-dispatch.yml"]
    C --> D["review_target.py<br/>clone + snapshot + heuristic + optional AI"]
    D --> E["gh issue comment<br/>β†’ Portfolio issue"]
    E --> F["User reads verdict Β· optionally opens Follow-up"]
Loading

The workflow also runs standalone via workflow_dispatch β€” trigger it manually from the Actions tab with a target_url + optional portfolio_issue.

Install

python3 -m venv .venv
source .venv/bin/activate
pip install -e ./github_ai_operator[dev]
export GITHUB_TOKEN="$(gh auth token)"

This exposes two console scripts:

  • arc-operator-scout β€” seed-driven discovery.
  • arc-operator-review β€” single-target review.

Quickstart: single-target review

arc-operator-review \
  --target https://github.com/octocat/Hello-World \
  --review-type "Full repo review" \
  --depth standard \
  --focus "Check README, license, and tests"

Writes github_ai_operator/output/target_review/<slug>.md.

Pass --portfolio-issue GareBear99/Portfolio#123 to also comment the review on that issue (requires a token with issues:write on the Portfolio repo).

Quickstart: scout mode

cd github_ai_operator
cp config.example.json config.json
python scout.py --config config.json --print-queries --dry-run
python scout.py --config config.json    # draft mode

Configuration

See github_ai_operator/config.example.json. Key sections:

  • seed_repos β€” which repos to learn keywords from.
  • search β€” mode: related | custom | hybrid, min_stars, languages, required_topics, pushed_after.
  • limits β€” max_repos_per_run, max_issue_posts_per_day, min_similarity_score, min_issue_confidence, repost_cooldown_days, duplicate_title_overlap_threshold.
  • posting β€” enabled, draft_only, require_manual_approval, allowlist, denylist, labels.
  • ai β€” optional LLM backend (api_url, api_key_env, model). Falls back to heuristic review if no AI is configured.

Tests

cd github_ai_operator
pip install pytest
python -m pytest tests/ -q

Tests cover the parts the Portfolio dispatch flow depends on: config loading, URL parsing, verdict derivation, markdown rendering.

GitHub Actions workflows

  • .github/workflows/ci.yml β€” runs pytest on every push / PR across Python 3.10 / 3.11 / 3.12, plus a smoke dry-run of the scout.
  • .github/workflows/ai-review-dispatch.yml β€” listens for repository_dispatch of type code-review-request from the Portfolio, runs the single-target review, and comments the verdict back on the originating Portfolio issue. Also exposed as a workflow_dispatch so it can be triggered manually from the Actions tab.

Required secret for cross-repo commenting

To post reviews back to Portfolio issues, set a PAT with issues:write on GareBear99/Portfolio as the PORTFOLIO_WRITE_TOKEN secret in this repo (Settings β†’ Secrets and variables β†’ Actions). The workflow prefers it over the built-in GITHUB_TOKEN, which is scoped to this repo only.

Project layout

gh-ai-operator/
β”œβ”€β”€ .github/
β”‚   β”œβ”€β”€ FUNDING.yml
β”‚   └── workflows/
β”‚       β”œβ”€β”€ ci.yml
β”‚       └── ai-review-dispatch.yml
└── github_ai_operator/
    β”œβ”€β”€ github_ai_operator/
    β”‚   β”œβ”€β”€ __init__.py
    β”‚   β”œβ”€β”€ ai_client.py
    β”‚   β”œβ”€β”€ anthropic_client.py
    β”‚   β”œβ”€β”€ config.py
    β”‚   β”œβ”€β”€ delay.py
    β”‚   β”œβ”€β”€ engine.py
    β”‚   β”œβ”€β”€ free_llm_client.py
    β”‚   β”œβ”€β”€ github_api.py
    β”‚   β”œβ”€β”€ issue_writer.py
    β”‚   β”œβ”€β”€ models.py
    β”‚   β”œβ”€β”€ review.py
    β”‚   └── similarity.py
    β”œβ”€β”€ tests/
    β”‚   └── test_basic.py
    β”œβ”€β”€ config.example.json
    β”œβ”€β”€ pyproject.toml
    β”œβ”€β”€ requirements.txt
    β”œβ”€β”€ review_target.py
    └── scout.py

Current limits

This is not a full semantic code auditor. It works best as:

  • a scout,
  • a triage assistant,
  • a first-pass reviewer on a URL,
  • a manual-review accelerator.

It does not run CI, execute tests inside target repos, or guarantee correctness of AI-generated review text. Use it as a pre-review gate, not a human replacement.

πŸ’– Support

Related ARC repos

  • ARC-Core β€” event + receipt spine. Every operator run is an ARC-Core-shaped event; every verdict is a receipt.
  • ARC-Neuron-LLMBuilder β€” the learner this operator feeds. Live reviews become supervised training data for the next Gate v2 candidate (see the continuous-learning section above).
  • Portfolio β€” the intake surface. Four issue templates dispatch code-review requests to this operator via repository_dispatch.
  • omnibinary-runtime + Arc-RAR β€” any-OS portability.

License

MIT β€” see the project LICENSE file.

About

ARC GitHub Operator is a lightweight autonomous system that discovers repositories related to your projects, reviews them, catalogs insights, and optionally contributes feedback.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors