v4.2.1 - Wheel sync + worker bug fixes
v4.2.0 was tagged but the publish workflow was cancelled before any
PyPI upload because orca/engine.c (the wheel-build source) was stale
relative to the runtime-fallback engine.c at repo root. The wheels
would have shipped without board_encode_state_full and a few other
C entry points. v4.2.1 is the first 4.2 release that actually
publishes to PyPI.
Fixes
- orca/engine.c synced with root engine.c. Wheel-build now includes board_encode_state_full, mcts_tree_new, and the rest of the May 14 engine additions.
- _dr undefined in orca/data.py:950. Reference to a leftover variable name after a refactor; replaced with DISTANT_RANGE. Self-play workers no longer crash with NameError when distant-exploration triggers.
- PR smoke CI now also skips tests/test_c_mcts.py (stale tests against an older C engine ABI; cleanup tracked separately).
Everything from v4.2.0 below is included in v4.2.1.
v4.2.0 - Training Workflow & Observability
Training Workflow Polish
- Hardware profiles (--profile=mps-laptop|cuda-single|cuda-multi|cpu-only|colab-t4) - pick sensible defaults for batch size, workers, MCTS sims, and games-per-iter in one flag. Explicit CLI args still win.
- python -m orca init <name> - scaffolder that drops a templated project with config, train.sh, play.sh, plugins.py, and README. Lowers first-run friction from "read the wiki" to "run two commands."
- Atomic checkpoint and replay-buffer writes - torch.save and pickle.dump now go via .tmp then os.replace(). SIGKILL mid-write no longer corrupts the canonical file.
- Checkpoint metadata - every .pt carries a _hexbot_meta dict (schema_version, arch, iter, elo, git_sha, hexbot_version, timestamp). Loaders no longer need to infer architecture from filename.
- ETA + moving-average iteration timer - rolling 8-iter window prints projected completion at the start of every iteration.
- --auto-tuner-dry-run - preview AutoTuner decisions without applying them.
- Plateau detection wiring - the PLATEAU_* config values and --plateau-* CLI flags were stored but never read; they now actually trigger an MCTS sim boost when ELO stalls.
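The write-to-.tmp-then-os.replace() pattern behind the atomic checkpoint writes can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name atomic_pickle_dump is hypothetical, and the fsync call is an added assumption that the release notes do not mention.

```python
import os
import pickle


def atomic_pickle_dump(obj, path):
    """Hypothetical helper: pickle obj to path without exposing a partial file."""
    tmp_path = f"{path}.tmp"  # sibling file, so the rename stays on one filesystem
    with open(tmp_path, "wb") as f:
        pickle.dump(obj, f)
        f.flush()
        os.fsync(f.fileno())  # assumption: flush to disk before the rename
    # os.replace swaps the file in atomically; a kill before this line
    # leaves the previous canonical file untouched.
    os.replace(tmp_path, path)
```

If the process dies mid-write, only the .tmp file is incomplete, so a loader that reads the canonical path sees either the old contents or the new ones.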
Observability
- TensorBoard writer (--tensorboard, opt-in) - logs loss/total, loss/policy, loss/value, elo/current, lr, time/iter_seconds, buffer/size, games/completed to runs/<id>/.
- Weights & Biases integration (--wandb, opt-in via pip install 'hexbot[wandb]') - same metric stream, same step indexing.
- Run manifest (runs/<id>/manifest.json) - CLI args, config snapshot, git sha, hostname, GPU info, hexbot/PyTorch versions, written at run start.
- Worker error log (runs/<id>/workers.log) - process pool failures now log full tracebacks with iteration, timestamp, and source site instead of being swallowed or inlined.
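For a sense of what a run manifest like this involves, here is a small standard-library sketch. The function name write_manifest and the JSON field names are assumptions made for illustration; the real manifest also records the git sha, GPU info, and hexbot/PyTorch versions, which are left out here.

```python
import json
import os
import platform
import socket
import time


def write_manifest(run_dir, cli_args, config):
    """Hypothetical helper: write manifest.json into run_dir and return its path."""
    os.makedirs(run_dir, exist_ok=True)
    manifest = {
        # Field names below are illustrative, not the project's schema.
        "cli_args": list(cli_args),
        "config": dict(config),
        "hostname": socket.gethostname(),
        "python_version": platform.python_version(),
        "started_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    path = os.path.join(run_dir, "manifest.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)
    return path
```

Writing the file once at run start means the record of how a run was launched survives even if training crashes later.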
Community & Discoverability
- Featured Community Bots table in README, auto-regenerated from leaderboard.json by a scheduled GitHub Action (or push to leaderboard.json).
- Colab quickstart notebook (notebooks/colab_quickstart.ipynb) - one-click train + TensorBoard view on a free T4 GPU.
- PR smoke CI (.github/workflows/pr-smoke.yml) - tests + 2-iteration training + scaffolder check on every PR and push to main.
Cleanup
- orca/distributed.py - MultiGPUTrainer and RayTrainer stubs now emit a UserWarning and carry STUB labels in docstrings. They previously advertised DDP / Ray scaling but silently fell back to single-GPU OrcaTrainer.
- SealBot expert demo samples auto-enabled once _last_policy_loss < 3.0, the same threshold used for the soft MCTS target switch.
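The stub behaviour described for orca/distributed.py might look something like the sketch below. The class bodies are hypothetical stand-ins: the inheritance from OrcaTrainer, the constructor signature, and the warning text are all assumptions, not the project's code.

```python
import warnings


class OrcaTrainer:
    """Hypothetical stand-in for the single-GPU trainer."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class MultiGPUTrainer(OrcaTrainer):
    """STUB: no multi-GPU support yet; behaves like single-GPU OrcaTrainer."""

    def __init__(self, **kwargs):
        # Warn up front so the single-GPU fallback is no longer silent.
        warnings.warn(
            "MultiGPUTrainer is a stub and runs as single-GPU OrcaTrainer.",
            UserWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super().__init__(**kwargs)
```

Using UserWarning rather than raising keeps existing scripts running while making the fallback visible.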
Stretch
- Optuna sweep adapter (python -m orca.sweep, requires pip install 'hexbot[sweep]') - hyperparameter sweep over lr, batch_size, mcts_sims, train_steps with final ELO as the objective.
Optional extras added
hexbot[tensorboard], hexbot[wandb], hexbot[sweep]. hexbot[all] now pulls all three.
Backwards Compatible
All v4.1.4 API unchanged. New features are opt-in via flags; existing scripts run unchanged.