

@davidB davidB commented Apr 29, 2021

I have doubts about:

  • the correct title for the section, since Bazel does not target a single programming language ecosystem.
  • the generated anchor href.

@davidB davidB requested a review from a team as a code owner April 29, 2021 18:01

## * - Bazel

Bazel's cache reliably determines whether cached content needs to be rebuilt based on its inputs, e.g. a hash of the command and its files. So restoring the latest cache of the branch is enough; there is no need to suffix the key with `hashFiles('**/...')`.


While this is technically correct, the GitHub Actions cache is not updated on a cache hit, so without hashFiles(...) in the key, Bazel will always restore a stale cache (from the first cache hit).

See: Proxy-Wasm C++ Host's configuration.
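
To illustrate the point, here is a minimal sketch of the failure mode, assuming an `actions/cache` step with a fixed key. The `@v4` tag, the step name, and the key string are assumptions for illustration, not taken from this PR. The action only saves in its post step when the primary key was not an exact hit, so a key that never changes is written once and then restored unchanged on every later run:

```yaml
# Hypothetical step: with a fixed key, the cache is saved on the first run only.
- name: Cache Bazel (fixed key, goes stale)
  uses: actions/cache@v4
  with:
    path: ~/.cache/bazel
    # No hashFiles(...) suffix: every later run is an exact hit on this key,
    # and an exact hit skips the save, so newer outputs are never uploaded.
    key: ${{ runner.os }}-bazel
```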


@PiotrSikora Interesting! Thanks for sharing, this is very useful... could this be a better example:

key: ${{ runner.os }}-bazel-${{ steps.cache-key.outputs.uniq }}-${{ hashFiles('WORKSPACE', '.bazelrc', '.bazelversion', 'BUILD', 'BUILD.bazel', '**/BUILD', '**/BUILD.bazel', 'MODULE', 'MODULE.bazel', 'maven_install.json', 'bazel/dependencies.bzl', 'bazel/repositories.bzl') }}
restore-keys: ${{ runner.os }}-bazel-${{ steps.cache-key.outputs.uniq }}-


BTW I don't really fully understand the steps.cache-key.outputs.uniq (from the bazel query --output build //external:${{ matrix.repo }} | grep -E 'sha256|commit' | cut -d\" -f2 in the linked example) - that's probably not useful / required, for a generic example? So probably just:

key: ${{ runner.os }}-bazel-${{ hashFiles('WORKSPACE', '.bazelrc', '.bazelversion', 'BUILD', 'BUILD.bazel', '**/BUILD', '**/BUILD.bazel', 'MODULE', 'MODULE.bazel', 'maven_install.json', 'bazel/dependencies.bzl', 'bazel/repositories.bzl') }}
restore-keys: ${{ runner.os }}-bazel-


Generic example should probably be as simple as:

key: ${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}

We only care about dependencies and not about build rules, so including BUILD / BUILD.bazel in the hash probably doesn't make sense, unless you want to add multiple restore-keys.

Also, bazel/dependencies.bzl and bazel/repositories.bzl are used in Proxy-Wasm C++ Host, but they are not standard. maven_install.json is also not a very generic example.
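
Put together, the suggestion above might look like the following step. This is a sketch under assumptions: the `@v4` tag, the step name, the Linux-only `~/.cache/bazel` path, and the `runner.os`-prefixed `restore-keys` fallback are carried over from other comments in this thread or added for illustration. Only the file list comes from the comment above.

```yaml
- name: Cache Bazel
  uses: actions/cache@v4
  with:
    # Linux location only; other operating systems use different directories.
    path: ~/.cache/bazel
    # The key changes whenever one of the dependency-defining files changes,
    # which forces a fresh save instead of reusing the old entry forever.
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
    # On a miss, start from the newest cache with this prefix.
    restore-keys: |
      ${{ runner.os }}-bazel-
```

With the fallback in place, a changed dependency file still restores the previous cache as a starting point, and a new entry is saved under the new key at the end of the job.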


There may be individual things within a Bazel cache (checksum'd HTTP artifacts) that could be stored.

The choice of Bazel vs Bazelisk-installed Bazel is opinionated. If users don't want the quota hit and are willing to forgo the features of Bazelisk, they could bake Bazel into their Docker image. I imagine the layer would be cached on runners. I say this knowing very little about GitHub Actions, so take that with a grain of salt. The Bazelisk cache is very small (~60 MB for v8) in comparison to other cached artifacts.


With caching already being so opinionated, I think having a simple example here that works for 95% of the user base has value. The example should broadly guide the user to how caching could be approached and prompt them to think deeper about what caching problem they are trying to solve. For the 5% of "power users", I think it would be futile to try to write a perfect example since, as I said before, caching is opinionated, and their organization's stack, compliance rules, etc. are perhaps even more so.

uses: actions/cache@v2
with:
  path: |
    ~/.cache/bazelisk


Caching Bazelisk artifacts counts against the quota, so it's not a great choice for larger repos.


@PiotrSikora I've explored this a bit further. The size increase is negligible by comparison: even for a relatively simple Java Bazel project (but one using https://rules-proto-grpc.com/en/latest/example.html#step-3-write-a-build-file, which pulls in a whole lot of C++ building), one GitHub Actions cache is 920 MB for me, and ~/.cache/bazelisk/ (filled with bazel-6.0.0-linux-x86_64) only adds another 51 MB on top of that. With GitHub now providing 10 GB total for all caches in a repository, adding this seems like a worthwhile trade-off (to me). Even if it doesn't noticeably speed up the build (at least for me), it still seems "nicer", and perhaps also more reliable, not to see a Downloading https://releases.bazel.build/6.0.0/release/bazel-6.0.0-linux-x86_64... in EVERY build.


True, but it's 50MB per hash, not total, so if your cache is otherwise 500MB, then it means that you're losing two slots in the cache.
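
As a rough check of that estimate against the 10 GB repository limit mentioned earlier: 10 GB / 500 MB is 20 entries, while 10 GB / 550 MB is about 18. One way to avoid paying the ~50 MB per key would be to store Bazelisk in its own cache entry. The sketch below is an assumption, not something proposed in this PR: it keys the Bazelisk download on `.bazelversion` alone (assuming the repository has that file), so the binary is stored once per pinned Bazel version rather than once per dependency hash. The `@v4` tag, step names, and key strings are illustrative.

```yaml
# Bazelisk's downloaded Bazel binary, keyed only on the pinned Bazel version.
- name: Cache Bazelisk
  uses: actions/cache@v4
  with:
    path: ~/.cache/bazelisk
    key: ${{ runner.os }}-bazelisk-${{ hashFiles('.bazelversion') }}

# Bazel's own cache, keyed on the dependency-defining files.
- name: Cache Bazel
  uses: actions/cache@v4
  with:
    path: ~/.cache/bazel
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
    restore-keys: |
      ${{ runner.os }}-bazel-
```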

with:
  path: |
    ~/.cache/bazelisk
    ~/.cache/bazel


This works for Linux, but you're missing macOS (/private/var/tmp/_bazel_runner/) and Windows (???) paths.


Re. Windows, https://bazel.build/remote/output-directories says "on Windows it defaults to %HOME% if set, else %USERPROFILE% if set, else the result of calling SHGetKnownFolderPath() with the FOLDERID_Profile flag set." - I'm not entirely sure what to make of that...
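
A sketch of how the Linux and macOS locations named in this thread could be listed in one step. It assumes that listing a path that does not exist on the current runner is harmless, which is worth verifying before relying on it. The Windows location is left out because the thread does not settle it. The `@v4` tag, step name, and key are illustrative.

```yaml
- name: Cache Bazel
  uses: actions/cache@v4
  with:
    # First entry: Linux location from the diff under review.
    # Second entry: macOS location from the review comment.
    # Windows is intentionally omitted (unresolved in this thread).
    path: |
      ~/.cache/bazel
      /private/var/tmp/_bazel_runner/
    # runner.os in the key keeps Linux and macOS caches separate.
    key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelrc', '.bazelversion', 'WORKSPACE', 'WORKSPACE.bazel', 'MODULE.bazel') }}
    restore-keys: |
      ${{ runner.os }}-bazel-
```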


vsvipul commented Feb 23, 2022

I second @PiotrSikora's comments: this example will not be very useful if we are not using a hash to indicate outdated dependencies. You'll need to use the commit ID in the key if you want that. Closing for now. If the comments are addressed, feel free to reopen. Thank you.
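
If the goal is a cache that is refreshed on every commit, one possible shape is to put the commit SHA in the key and fall back to a prefix match. This is a minimal sketch, assuming the `@v4` tag, the step name, and the Linux-only path; `github.sha` is the commit that triggered the workflow run.

```yaml
- name: Cache Bazel
  uses: actions/cache@v4
  with:
    path: ~/.cache/bazel
    # The key is unique per commit, so each new commit saves a new entry.
    key: ${{ runner.os }}-bazel-${{ github.sha }}
    # Fall back to the most recently created cache whose key has this prefix.
    restore-keys: |
      ${{ runner.os }}-bazel-
```

Note that each commit then stores a full copy of the cache, so this uses the repository's cache quota faster than a hashFiles-based key.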

@vsvipul vsvipul closed this Feb 23, 2022
@vorburger

> I second @PiotrSikora's comments: this example will not be very useful if we are not using a hash to indicate outdated dependencies. You'll need to use the commit ID in the key if you want that. Closing for now. If the comments are addressed, feel free to reopen. Thank you.

I would be willing to raise a new PR (or you could reopen this one if you like, but I'm not sure I'd be able to push to it), pending confirmation on my new comments above.

@vorburger vorburger mentioned this pull request Mar 11, 2023
5 participants