Run GitHub Actions jobs on Android (Termux self-hosted runner) #1
Open
kwuite wants to merge 27 commits into
Adds contrib/android/ with the device-specific glue needed to run the
GitHub Actions self-hosted runner on aarch64 Android phones under Termux,
plus two minimal source patches required for the runner to start under
bionic libc.
contrib/android/install.sh - one-shot installer (pkgs, build, patch, register --disableupdate, boot symlink)
contrib/android/patch-layout.sh - re-publishes runner exes framework-dependent, drops dotnet shell shims, swaps glibc node for Termux node, and neuters the glibc ldd probe in config.sh
contrib/android/start-runner.sh - boot/restart launcher (nohup + setsid + wake lock, no runit per local stability issues with the existing sv stack)
contrib/android/runner-android-ctl - start/stop/restart/status/logs/boot ctl
contrib/android/README.md - rationale, setup, caveats
Source patches:
src/global.json - relax SDK pin from 8.0.419 to 8.0.100 + latestFeature
rollForward, so the Termux-native dotnet-sdk-8.0 (8.0.125) is accepted
instead of forcing dotnet-install.sh to download a glibc SDK.
src/Runner.Sdk/Util/IOUtil.cs - ValidateExecutePermission now treats
/data, /data/data and / as readable-enough; on Android these dirs are
mode 0711 by design and tripping on them is a false positive.
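The relaxed SDK pin described above can be pictured with the following sketch. The JSON layout is an assumption based on the standard `global.json` schema (`sdk.version` plus `sdk.rollForward`); the actual file in the branch may carry additional keys, and `write_global_json` is a hypothetical helper used only for illustration.

```shell
# Hypothetical helper: write a global.json with the relaxed pin to path $1.
# With rollForward=latestFeature, any installed 8.0.x SDK at or above
# 8.0.100 (such as the 8.0.125 mentioned above) satisfies the requirement.
write_global_json() {
  cat > "$1" <<'EOF'
{
  "sdk": {
    "version": "8.0.100",
    "rollForward": "latestFeature"
  }
}
EOF
}
```

Running `dotnet --version` from a directory containing such a file is one way to see which installed SDK actually gets selected.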
Self-update is disabled at registration time because GitHub's update flow
re-deploys the upstream linux-arm64 tarball over the patched layout, which
restores the bionic-incompatible apphost binaries (TLS alignment 8, bionic
needs 64) and the glibc-linked libcoreclr/libhostpolicy.
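As a rough illustration of the "dotnet shell shims" that replace the apphost binaries, a shim generator might look like the sketch below. `write_shim` and the `#!/bin/sh` interpreter line are assumptions made for this example; `patch-layout.sh` may generate its shims differently, and on Termux the interpreter normally lives under `$PREFIX/bin`.

```shell
# Hypothetical helper: create an executable shim named $2 in directory $1
# that hands off to the framework-dependent assembly next to it.
write_shim() {
  bin_dir=$1
  name=$2
  # Unquoted heredoc: $name is expanded now, the \$-escaped parts are
  # written literally and expanded when the shim itself runs.
  cat > "$bin_dir/$name" <<EOF
#!/bin/sh
# Generated shim: run $name.dll with whatever dotnet is on PATH.
exec dotnet "\$(dirname "\$0")/$name.dll" "\$@"
EOF
  chmod +x "$bin_dir/$name"
}
```

For example, `write_shim _layout/bin Runner.Listener` would produce a script that forwards all of its arguments to `dotnet .../Runner.Listener.dll`.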
The original installer was a happy-path-only flow. Make it the single entry point for fresh install, reinstall, update, and uninstall:

    ./contrib/android/install.sh              # install or rebuild
    ./contrib/android/install.sh --update     # git pull + rebuild
    ./contrib/android/install.sh --uninstall  # stop, deregister, clean up

Key changes:
- always stop any running runner before touching _layout
- wipe _layout on every (re)install: stale self-update artifacts (bin.X.Y.Z, broken symlinks, restored upstream config.sh) silently break the bionic patches if they're left behind
- best-effort GitHub-side unregister of the prior instance with the same name before re-registering, so we don't accumulate stale offline runners
- poll the GitHub API after start to confirm 'online' status
- factor common helpers (token fetch, URL resolution, stop_runner)
- colored log/warn/die helpers for clearer output
- README documents all three modes
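The "poll the GitHub API after start" step could be sketched roughly as follows. Both function names are hypothetical, and the use of `curl` and `jq` against the organization runners endpoint is an assumption about how the check might be done, not a description of what `install.sh` actually contains. The query only inspects the first page of results.

```shell
# poll_until TRIES INTERVAL CMD...: run CMD until it succeeds, at most
# TRIES times, sleeping INTERVAL seconds after each failed attempt.
poll_until() {
  tries=$1
  interval=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Assumed check: succeeds if the org-level runner list contains a runner
# named $1 whose status is "online". Needs GITHUB_PAT and GITHUB_ORG set.
runner_is_online() {
  curl -fsS -H "Authorization: Bearer $GITHUB_PAT" \
       -H "Accept: application/vnd.github+json" \
       "https://api.github.com/orgs/$GITHUB_ORG/actions/runners" |
    jq -e --arg name "$1" \
       '.runners[] | select(.name == $name and .status == "online")' >/dev/null
}
```

A call such as `poll_until 30 5 runner_is_online asd-phone` would then wait up to about two and a half minutes before giving up.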
start-runner.sh now re-execs itself with RUNNER_ANDROID_WATCHDOG=1 and runs run.sh in a `while true` loop with exponential backoff (1s → 60s, reset to 1s after >5 min uptime). Without this, run.sh's `exit 0` on unknown error codes (including SIGKILL → 137) leaves nothing running: boot autostart works, but a crash mid-job kills the runner permanently.

runner-android-ctl now:
- reports both watchdog and listener pids in `status`
- kills the watchdog FIRST in `stop` so it can't respawn the listener under our feet

Verified by SIGKILLing the listener: with the old script the runner stayed dead; with the watchdog it respawns within seconds.
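A minimal sketch of such a restart loop, assuming the backoff doubles on each restart and is capped at 60 seconds, is shown below. The function names are illustrative and the real `start-runner.sh` may structure this differently (for example, it also handles the wake lock and logging).

```shell
# next_delay CURRENT: double the delay, capped at 60 seconds.
next_delay() {
  d=$(( $1 * 2 ))
  if [ "$d" -gt 60 ]; then
    d=60
  fi
  echo "$d"
}

# watchdog CMD...: restart CMD forever, whatever its exit status.
watchdog() {
  delay=1
  while true; do
    started=$(date +%s)
    "$@" || true                      # a crash must not end the loop
    uptime=$(( $(date +%s) - started ))
    if [ "$uptime" -gt 300 ]; then
      delay=1                         # ran >5 min: treat as healthy, reset
    fi
    sleep "$delay"
    delay=$(next_delay "$delay")
  done
}
```

With these rules, a command that crashes immediately every time is retried after 1, 2, 4, 8, 16, 32, and then 60 seconds. An invocation in the spirit of the commit would be `watchdog ./run.sh`.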
pgrep -f matches /proc/PID/cmdline, which only contains argv. The RUNNER_ANDROID_WATCHDOG=1 env var prefix on the nohup line never made it into the watchdog process's cmdline, so the ctl status check and the double-start guard couldn't find the watchdog. Switch to a real --watchdog argv flag. Verified by SIGKILLing the listener: respawned by the watchdog, and `ctl status` now reports both pids correctly.
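The difference between an environment-variable prefix and a real argument can be reproduced on any Linux system with `/proc`. In this sketch each child shell prints its own `/proc/$$/cmdline`; the names `start-runner.sh` and `--watchdog` are used purely as labels and no runner script is involved.

```shell
# Script run by each child shell: print its own argv as recorded by the
# kernel (NUL-separated in /proc, converted to spaces here). The trailing
# ":" keeps the shell itself alive while tr reads the file.
show_cmdline='tr "\000" " " < /proc/$$/cmdline; :'

# Marker supplied as an environment prefix: lands in the environment only.
env_style=$(RUNNER_ANDROID_WATCHDOG=1 sh -c "$show_cmdline")

# Marker supplied as an argument: becomes part of argv.
flag_style=$(sh -c "$show_cmdline" start-runner.sh --watchdog)

echo "env prefix : $env_style"
echo "argv flag  : $flag_style"
```

Only the second line of output contains the marker, which is why a `pgrep -f` pattern can find the `--watchdog` flag but never the environment variable.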
…peline

The runner is now consumed by the asd-engineering/.asd termux-build-runner.yml workflow, which on the package job runs scripts/termux/build-termux-release.sh natively on the phone. That script needs golang/clang/make/binutils/caddy/ttyd/python3 to build code-server + asd-tunnel + caddy + ttyd, plus jq for the workflow's own summary step. Without these the workflow fails on a fresh runner.

Also adding strace to make on-device debugging less painful when the next bun-on-Termux quirk surfaces.

These were installed by hand on asd-phone during the bring-up; this commit makes the dependency list reproducible so the next phone joining the fleet via install.sh comes up ready to take an asd build job.
b95404a to
d360c64
… broken

asd-phone shipped without the Termux:Boot Android addon installed, so the ~/.termux/boot/runner-android.sh symlink we drop during install was inert: the runner would not have come back after a reboot. Caught it only by manually checking `pm path com.termux.boot`. Surface this and similar misconfigurations automatically and document the fix.

Three additions:

contrib/android/TERMUX_SETUP.md
End-to-end checklist for prepping a fresh Android phone: which Android packages are required (com.termux, com.termux.boot, com.termux.api; none installable via pkg, all separate APKs from the same signing key), three install methods ranked by speed (wired adb / wireless adb / curl + termux-open), the required one-time taps (Android refuses to fire BOOT_COMPLETED for an app that's never been launched), Samsung battery optimization, SSH setup, and the verify-on-reboot procedure.

contrib/android/install-termux-boot.sh
One-shot helper for the from-inside-Termux install path. Downloads the F-Droid Termux:Boot APK and calls termux-open on it, which fires Android's package installer with a single user tap. Skips cleanly if already installed. Used when no laptop / cable is available.

contrib/android/runner-android-ctl
status now warns when the boot symlink is missing (run enable-boot) AND when the Termux:Boot Android addon is not installed (run install-termux-boot.sh). Both warnings go to stderr so scripted callers parsing stdout still get the same one-line status.

contrib/android/README.md
Points new contributors at TERMUX_SETUP.md before the implementation reference.
…alse positive)

Android 11+ package visibility blocks `pm path com.termux.boot` from non-system uids unless the caller declares <queries> in its AndroidManifest. We can't add manifest entries to upstream Termux, so the check returned "not installed" even when Termux:Boot was actually installed and working (verified on asd-phone after a successful adb install).

Drop the warning to avoid the false positive. The boot-symlink check stays (that one is purely a filesystem stat, no visibility issue).

Verifying that the Termux:Boot Android addon is installed must be done from the laptop side via `adb shell pm path com.termux.boot`, which is not subject to the same restriction. Documented in TERMUX_SETUP.md §5.
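The boot-symlink check that remains is a plain filesystem test, so it might look something like the sketch below. `boot_link_warning` is a hypothetical name, the extra dangling-link branch is an addition for illustration, and the default path is the one mentioned in the earlier commit message. Warnings go to stderr so that stdout stays reserved for the one-line status.

```shell
# Hypothetical check: warn on stderr if the Termux:Boot symlink is
# missing or points at a target that no longer exists.
boot_link_warning() {
  link=${1:-$HOME/.termux/boot/runner-android.sh}
  if [ ! -L "$link" ]; then
    echo "warning: boot symlink missing: $link (run enable-boot)" >&2
  elif [ ! -e "$link" ]; then
    echo "warning: boot symlink is dangling: $link" >&2
  fi
}
```

Because nothing is written to stdout, a caller can run the check next to the status line without changing what scripted consumers parse.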
…ux/boot symlink

When the watchdog is started by runner-android-ctl restart, /proc/PID/cmdline shows .../contrib/android/start-runner.sh --watchdog. When it's started by Termux:Boot at device boot, the receiver invokes the script via the ~/.termux/boot/runner-android.sh symlink, and /proc/PID/cmdline reflects the SYMLINK path, not the resolved target. So the same watchdog process appears under two different basenames depending on how it was launched.

The old pattern only matched start-runner.sh, so a perfectly healthy boot-launched watchdog (the only kind that matters in production) was reported as "watchdog is gone". Verified on asd-phone after a real device reboot: cmdline was "runner-android.sh --watchdog" and ctl status falsely reported the watchdog missing while the listener was correctly running underneath it.

Match either basename.
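One way to match either basename is a single extended regular expression with an alternation, sketched below. The variable and function names are illustrative, and the exact pattern used in `runner-android-ctl` may differ.

```shell
# Extended regex accepting both launch paths for the watchdog.
WATCHDOG_RE='(start-runner|runner-android)\.sh --watchdog'

# is_watchdog_cmdline STRING: succeed if STRING looks like a watchdog
# command line under either basename.
is_watchdog_cmdline() {
  printf '%s\n' "$1" | grep -Eq "$WATCHDOG_RE"
}
```

Since `pgrep -f` also takes an extended regular expression, the same pattern could be passed to it directly, as in `pgrep -f "$WATCHDOG_RE"`.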
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…iles (actions#4329) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
…ntroller, CodePages, Threading.Channels, @actions/glob, @typescript-eslint/parser, lint-staged, picomatch (actions#4333) Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: Salman Chishti <[email protected]>
Co-authored-by: Tingluo Huang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: luketomlinson <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…ressionFunc/hashFiles (actions#4360) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ns#4362) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
…isc/expressionFunc/hashFiles (actions#4359) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
…ctions#4358) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
…iles (actions#4353) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
…4339) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Salman Chishti <[email protected]>
# Conflicts: # src/global.json
Summary

Adds support for running the upstream `actions/runner` on aarch64 Android phones under Termux, plus the ops glue (installer, watchdog supervisor, control wrapper, Termux:Boot integration) needed to run it as a service. Used in production by `asd-phone` (Galaxy Note 20 Ultra), which now serves the `self-hosted-android` label for the asd-engineering org and is the build host for the new asd-engineering/.asd Termux release pipeline (PR actions#213).

Why a fork is needed

Five things break the upstream `linux-arm64` build under Termux's bionic libc. All five are fixed in `contrib/android/patch-layout.sh` plus two minimal source patches:

- `Runner.{Listener,Worker,PluginHost}` apphosts have 8-byte TLS alignment; bionic ARM64 requires 64. Fix: re-publish framework-dependent (`-p:UseAppHost=false`) and replace the apphost binaries with shell shims that `exec dotnet *.dll "$@"`.
- `libcoreclr.so`, `libhostpolicy.so` etc. are linked against `libdl.so.2`, which doesn't exist on bionic. Fix: delete them and rely on Termux's bionic-native dotnet runtime.
- `_layout/externals/node{20,24}/bin/node` are glibc binaries. Fix: symlink to Termux's `$PREFIX/bin/node`.
- `IOUtil.ValidateExecutePermission` walks parent dirs and trips on `/data/data` (mode 0711 by Android design). Fix: patch in `src/Runner.Sdk/Util/IOUtil.cs` to treat `/`, `/data`, `/data/data` as readable-enough.
- `config.sh` runs `ldd` against the bundled libs. Fix: `patch-layout.sh` neuters the check after layout is built.

Plus one papercut: `src/global.json` pinned SDK 8.0.419 but Termux ships 8.0.125. Relaxed to `8.0.100` + `latestFeature` rollForward.

What's in `contrib/android/`

- `install.sh`: `--install` (default) / `--update` / `--uninstall`. Installs Termux pkgs, builds layout, applies bionic patches, registers with `--disableupdate`, installs Termux:Boot symlink, starts the runner, polls GitHub for `online`.
- `patch-layout.sh`: `dev.sh layout` patcher; re-publishes runner exes framework-dependent, drops dotnet shell shims, swaps node, neuters config.sh's ldd probe.
- `start-runner.sh`: `--watchdog`, runs `run.sh` in `while true` with exponential backoff (1s → 60s, resets on >5 min uptime).
- `runner-android-ctl`
- `README.md`

No runit / `sv` / `termux-services` dependency. The existing `asd-build` runit setup on the dev S22U has been unstable, so this stack is deliberately flat: nohup + setsid + a small bash watchdog loop. Verified by SIGKILL'ing the listener: the watchdog respawns it within seconds.

Self-update is disabled at registration time. GitHub's runner update flow re-deploys the upstream `linux-arm64` tarball over `_layout/`, restoring the bionic-incompatible apphost binaries. Without `--disableupdate` the runner self-destructs on first use. This means there's no in-place upgrade path; `install.sh --update` does git pull + full rebuild + re-register.

Commits

Verification

Production proof: PR actions#213 in asd-engineering/.asd merged using this runner. Latest end-to-end build: bundle (ubuntu) ✓ 18s, package (asd-phone) ✓ 2m47s, 8 smoke tests pass on the produced tarball, post-cleanup verified to wipe asd state from `$HOME` between runs.

Per-device verification on a fresh phone:

    git clone https://github.com/asd-engineering/runner-android.git
    cd runner-android
    GITHUB_PAT=ghp_xxx GITHUB_ORG=asd-engineering ./contrib/android/install.sh

That installs the package list, builds the runner from source (~2 min), patches the layout, registers with `--disableupdate`, installs the Termux:Boot symlink, starts the watchdog, and polls GitHub until the runner reports `online`. Subsequent runs of the same command are safe via `--replace`.

Test plan

- `install.sh`. Expect: runner online in <5 minutes.
- `~/runner-android/contrib/android/runner-android-ctl status` reports both watchdog and listener pids.
- `kill -9 $(pgrep -f "dotnet .*Runner\.Listener\.dll")`. Expect: watchdog respawns it within ~10 seconds.
- `gh workflow run termux-build-runner.yml --repo asd-engineering/.asd`. Expect: green run in ~3 minutes.