Add example for Bazel #575
| ## * - Bazel | ||
| The Bazel cache already decides whether cached content should be rebuilt based on its inputs, e.g. a hash of the command and files. So using the latest cache of the branch is enough; there is no need to suffix the key with `hashFiles('**/...')`. |
While this is technically correct, the GitHub Actions cache is not updated on a cache hit, so without `hashFiles(...)` in the key, Bazel will always restore a stale cache (the one saved on the first run).
@PiotrSikora Interesting! Thanks for sharing, this is very useful... could this be a better example:
key: ${{ runner.os }}-bazel-${{ steps.cache-key.outputs.uniq }}-${{ hashFiles('WORKSPACE', '.bazelrc', '.bazelversion', 'BUILD', 'BUILD.bazel', '**/BUILD', '**/BUILD.bazel', 'MODULE', 'MODULE.bazel', 'maven_install.json', 'bazel/dependencies.bzl', 'bazel/repositories.bzl') }}
restore-keys: ${{ runner.os }}-bazel-${{ steps.cache-key.outputs.uniq }}-
BTW I don't fully understand `steps.cache-key.outputs.uniq` (which comes from `bazel query --output build //external:${{ matrix.repo }} | grep -E 'sha256|commit' | cut -d\" -f2` in the linked example). That's probably not useful or required for a generic example? So probably just:
key: ${{ runner.os }}-bazel-${{ hashFiles('WORKSPACE', '.bazelrc', '.bazelversion', 'BUILD', 'BUILD.bazel', '**/BUILD', '**/BUILD.bazel', 'MODULE', 'MODULE.bazel', 'maven_install.json', 'bazel/dependencies.bzl', 'bazel/repositories.bzl') }}
restore-keys: ${{ runner.os }}-bazel-
Generic example should probably be as simple as:
key: ${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
We only care about dependencies and not about build rules, so including BUILD / BUILD.bazel in the hash probably doesn't make sense, unless you want to add multiple restore-keys.
Also, bazel/dependencies.bzl and bazel/repositories.bzl are used in Proxy-Wasm C++ Host, but they are not standard. maven_install.json is also not a very generic example.
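Put together, the suggestions in this thread might look like the step below. This is only a sketch: the step name and the `actions/cache@v4` version are assumptions (the PR itself used `@v2`), the path is the Linux default discussed elsewhere in this review, and the `runner.os` prefix and `restore-keys` entry are carried over from the earlier proposals rather than from the one-line key above.

```yaml
# Hypothetical generic step: cache Bazel outputs, keyed on dependency-defining files only.
- name: Cache Bazel   # step name is illustrative
  uses: actions/cache@v4   # assumed version; the PR under review used @v2
  with:
    path: ~/.cache/bazel   # Linux default; other platforms use different paths
    # BUILD files are deliberately left out of the hash, as suggested above.
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
    # Fall back to the most recent cache for this OS when the exact key misses.
    restore-keys: |
      ${{ runner.os }}-bazel-
```

With this shape, a change to any listed file produces a new key, so a fresh cache is saved instead of the old one being reused indefinitely.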
There may be individual things within a Bazel cache (checksum'd HTTP artifacts) that could be stored.
The choice of Bazel vs Bazelisk-installed Bazel is opinionated. If users don't want the quota hit and are willing to forgo the features of Bazelisk, they could bake Bazel into their Docker image. I imagine the layer would be cached on runners. I say this knowing very little about GitHub Actions, so take that with a grain of salt. The Bazelisk cache is very small (~60 MB for v8) in comparison to other cached artifacts.
With caching already being so opinionated, I think having a simple example here that works for 95% of the user base has value. The example should broadly guide the user on how caching could be approached and prompt them to think more deeply about what caching problem they are trying to solve. For the 5% of "power users", I think it would be futile to try to write a perfect example since, as I said before, caching is opinionated, and their organization's stack, compliance rules, etc. are perhaps even more so.
uses: actions/cache@v2
with:
  path: |
    ~/.cache/bazelisk
Caching Bazelisk artifacts counts against the quota, so it's not a great choice for larger repos.
@PiotrSikora I've explored this a bit further. The size increase is negligible by comparison: even for a relatively simple Java Bazel project (but one that uses https://rules-proto-grpc.com/en/latest/example.html#step-3-write-a-build-file, which pulls in a lot of C++ building), one GitHub Actions cache is 920 MB for me, and ~/.cache/bazelisk/ (filled with bazel-6.0.0-linux-x86_64) only adds another 51 MB on top of that. With GitHub now providing 10 GB total for all caches in a repository, adding this seems like a worthwhile trade-off (to me). Even if it doesn't noticeably decrease build time (at least for me), it still seems "nicer" and perhaps also more reliable not to see a Downloading https://releases.bazel.build/6.0.0/release/bazel-6.0.0-linux-x86_64... in EVERY build.
True, but it's 50MB per hash, not in total, so if your cache is otherwise 500MB, you're losing two slots in the cache (10 GB fits 20 entries of 500 MB but only 18 of 550 MB).
with:
  path: |
    ~/.cache/bazelisk
    ~/.cache/bazel
This works for Linux, but you're missing macOS (/private/var/tmp/_bazel_runner/) and Windows (???) paths.
Re. Windows, https://bazel.build/remote/output-directories says "on Windows it defaults to %HOME% if set, else %USERPROFILE% if set, else the result of calling SHGetKnownFolderPath() with the FOLDERID_Profile flag set." - I'm not entirely sure what to make of that...
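One way to cover the two platforms whose paths are named in this thread is a separate step per OS, as sketched below. The step names, the `actions/cache@v4` version, and the exact key are assumptions; the paths are the ones quoted in the comments above. Windows is left out on purpose, since its output directory is still unresolved here.

```yaml
# Hypothetical per-OS cache steps; only Linux and macOS paths are known from this review.
- name: Cache Bazel (Linux)   # illustrative name
  if: runner.os == 'Linux'
  uses: actions/cache@v4   # assumed version
  with:
    path: ~/.cache/bazel
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}

- name: Cache Bazel (macOS)   # illustrative name
  if: runner.os == 'macOS'
  uses: actions/cache@v4   # assumed version
  with:
    path: /private/var/tmp/_bazel_runner/
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
```

Because `runner.os` is part of each key, caches from different platforms do not overwrite or restore into each other.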
I second @PiotrSikora's comments: this example will not be very useful if we are not using a hash to indicate outdated dependencies. You'll need to use the commit ID in the key if you want that. Closing for now. If the comments are addressed, feel free to reopen. Thank you.
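For reference, a key that includes the commit ID could look roughly like the sketch below. The step name and action version are assumptions, and `github.sha` is used here as the commit identifier. Since every commit yields a new exact key, each run saves a fresh cache, while `restore-keys` restores the most recent earlier one as a starting point.

```yaml
# Hypothetical step: per-commit key so the cache is re-saved on every commit.
- name: Cache Bazel   # illustrative name
  uses: actions/cache@v4   # assumed version
  with:
    path: ~/.cache/bazel   # Linux default
    key: ${{ runner.os }}-bazel-${{ github.sha }}
    # Restore the newest cache with this prefix when the exact commit key is absent.
    restore-keys: |
      ${{ runner.os }}-bazel-
```

The trade-off is quota: a new cache entry per commit fills the repository's cache allowance faster than a dependency-hash key would.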
I would be willing to raise a new PR (or you could reopen this one if you like, but I'm not sure I'll be able to push to it?), pending confirmation on my new comments above.
I have doubt