
Fix: Don't bust cache so aggressively when compiling test sources #1110


Closed

Conversation

@regenvanwalbeek-wf commented Mar 22, 2022

When using the latest gopherjs master, we noticed that the time to run gopherjs test in CI for our package increased from ~3m13s (on 1.17.1+go1.17.3) to ~14m36s. This appears to have started after #1105.

After turning on info-level logs, we noticed that we were consistently getting cache misses for dependencies shared between the packages under test.

Example:

2022-03-22T16:46:07.273977954Z time="2022-03-22T16:46:07Z" level=info msg="No cached package archive for \"testing\"."
2022-03-22T16:46:07.312493246Z time="2022-03-22T16:46:07Z" level=info msg="Successfully stored build archive \"compiler.Archive{testing}\" as \"/root/.cache/gopherjs/build_cache/f9/f9ec20f1c4472f467363b634f5f3e8ba88680c83c5545804c8159daa58a51c7d\"."
...
2022-03-22T16:46:14.025498059Z PASS
2022-03-22T16:46:14.025502425Z ok  	github.com/Workiva/library_name/pkg_1	1.156s
...
2022-03-22T16:46:17.948303901Z time="2022-03-22T16:46:17Z" level=info msg="No cached package archive for \"testing\"."
2022-03-22T16:46:17.982488945Z time="2022-03-22T16:46:17Z" level=info msg="Successfully stored build archive \"compiler.Archive{testing}\" as \"/root/.cache/gopherjs/build_cache/78/787f945b5c7a4c7b698fd4325452682aab49c613813d9fbbea17f4b8a317587a\"."
...
2022-03-22T16:46:24.437850143Z PASS
2022-03-22T16:46:24.437852433Z ok  	github.com/Workiva/library_name/pkg_2	1.910s

Further investigation showed that the build cache generates a cache key based on the package being tested.

This means that if two packages share a dependency, they will not share the cached archive for that dependency.

I updated the cache key to:

  1. Stop including the package under test.
  2. Start including a boolean that represents whether *_test.go sources were compiled into that archive.

By doing this, our gopherjs test CI time is back down to ~3m11s
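
A minimal, self-contained sketch of the idea, assuming a key derived by hashing the cache parameters (the struct fields and key derivation below are illustrative, not the actual GopherJS build cache code): the tested package's import path no longer goes into every archive's key; instead, a per-archive boolean records whether *_test.go sources were included, so two test runs can share archives for common dependencies like testing.

    // Illustrative sketch only: names and key layout are simplified and do not
    // necessarily match the real GopherJS implementation.
    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    // buildParams stands in for the cache-wide parameters (GOOS, GOPATH, Minify, ...).
    type buildParams struct {
        GOOS, GOPATH string
        Minify       bool
    }

    // archiveKey derives a cache key for one compiled package archive. The last
    // argument replaces the old "tested package" component of the key.
    func archiveKey(p buildParams, importPath string, compiledWithTests bool) string {
        h := sha256.New()
        fmt.Fprintf(h, "%#v\n", p)
        fmt.Fprintf(h, "import path: %s\n", importPath)
        fmt.Fprintf(h, "compiled with tests: %v\n", compiledWithTests)
        return fmt.Sprintf("%x", h.Sum(nil))
    }

    func main() {
        p := buildParams{GOOS: "js", GOPATH: "/go"}
        // The "testing" archive built while testing two different packages now
        // hashes to the same key, so the second test run gets a cache hit.
        fmt.Println(archiveKey(p, "testing", false))
        fmt.Println(archiveKey(p, "testing", false))
    }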

@@ -464,7 +464,6 @@ func NewSession(options *Options) (*Session, error) {
 GOPATH: options.GOPATH,
 BuildTags: options.BuildTags,
 Minify: options.Minify,
-TestedPackage: options.TestedPackage,

@regenvanwalbeek-wf (Author) commented Mar 22, 2022

One thing that's not clear to me is whether we should share a cache between gopherjs test and gopherjs build.

For example, let's say foo depends on shared, and you run gopherjs test foo/.... This will create a cache for shared. If you run a subsequent gopherjs build, is it okay to reuse the cache for shared?

I expect this would be fine as long as the rest of the BuildCache args are the same.

@nevkontakte (Member)

Thanks for sending this PR. I agree that the current cache busting is more conservative than it has to be, although in this case it's better to be slow than produce invalid results. The fix looks good overall, but I'll need a bit of time to think all implications through.

@regenvanwalbeek-wf (Author)

👍 Sounds great, thank you!

I definitely agree it's better to be slow and correct than fast and incorrect. I'm definitely not an expert on the go/gopherjs build system by any means -- I appreciate the consideration.

@nevkontakte (Member)

Unfortunately, I found a case where this change would lead to incorrect results 😟 Consider package sort in the standard library. When we test it, we build a test variant of the sort package itself (all *_test.go sources that declare package sort) and an "external test" package (all *_test.go sources that declare package sort_test). It turns out the following dependency chain exists: sort_test → testing → regexp → sort [for sort.test]. This means the regexp package built here can differ, because it depends on a different variant of the sort package than it would under a simple gopherjs build regexp.

In practical terms this means that the compiledWithTests := testedPackage == importPath cache key condition is not sufficient. Instead, it should be compiledWithTests := importPath depends on testedPackage. Given the lazy approach gopherjs currently takes to loading imports, checking that condition is quite a bit more difficult.
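
A toy way to see the difference between the two conditions (the hard-coded import graph and the dependsOn helper below are made up for illustration; they are not the GopherJS API):

    package main

    import "fmt"

    // A tiny, hard-coded slice of the stdlib import graph from the example above.
    var imports = map[string][]string{
        "sort_test": {"testing"},
        "testing":   {"regexp"},
        "regexp":    {"sort"},
    }

    // dependsOn reports whether pkg transitively imports target.
    func dependsOn(pkg, target string) bool {
        for _, dep := range imports[pkg] {
            if dep == target || dependsOn(dep, target) {
                return true
            }
        }
        return false
    }

    func main() {
        testedPackage := "sort"
        for _, importPath := range []string{"regexp", "fmt"} {
            // The PR's condition vs. what correctness would require.
            naive := importPath == testedPackage
            needed := importPath == testedPackage || dependsOn(importPath, testedPackage)
            fmt.Printf("%-6s naive=%v needed=%v\n", importPath, naive, needed)
        }
    }

For regexp the naive condition says false while the needed one says true, which is exactly the sort_test case above.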

I do have a long-term goal of making the build process consider the build graph more explicitly (which would unlock smarter build caching strategies, among other things), but there are quite a few steps to get there. #693 (comment) is the first of them, so I'll take this opportunity to advertise the https://github.com/nevkontakte/gopherjs/tree/syscall2 branch and encourage early testing 🙃

@regenvanwalbeek-wf (Author)

🤯 That's a good catch. That answers my uncertainty here.

In addition to this change, would it make sense to also add an IsForTest flag to BuildCache which is true when we have a TestedPackage? That would guarantee that gopherjs test and gopherjs build always use different caches. Then I think the compiledWithTests flag will always hold up.

(That said, I don't want to propose a change if you're going to take things in a completely different direction, since it would just create more work for you to rip out. Perhaps my solution is just patching holes and opening us up to other issues rather than fixing the core problem.)

I'll definitely check out your syscall2 branch tomorrow (we use darwin locally, so I'm excited to drop GOOS=linux from our test targets 😄 )

@nevkontakte (Member)

Unfortunately, a boolean IsForTest flag won't be sufficient either. Consider my example with sort_test above: when we are building the regexp package for sort_test, it would be built with sort [for sort.test] as an input and cached with BuildCache{IsForTest: true}, compiledWithTests = false. Then if we try testing the net package, regexp will still be built (since testing depends on it), this time with the normal sort as an input, and cached under the same BuildCache{IsForTest: true}, compiledWithTests = false key despite having different inputs than in the first example.

And so we really do have to partition the build cache by TestedPackage to ensure correctness, or get significantly smarter about analyzing build dependencies. The former is what we have now; the latter requires some significant refactoring to replace our use of go/build with golang.org/x/tools/go/packages. We'll get there eventually, but it will take some time. If you are interested in contributing towards that goal, let me know 😉
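
Spelled out with an illustrative key layout (the field names here are assumptions, not the actual cache code), the collision looks like this:

    package main

    import "fmt"

    // key approximates what would identify a cached archive under the proposal.
    type key struct {
        ImportPath        string
        IsForTest         bool
        CompiledWithTests bool
        TestedPackage     string // empty when only the boolean flag is used
    }

    func main() {
        // regexp built while testing sort: its input is sort [for sort.test].
        a := key{ImportPath: "regexp", IsForTest: true}
        // regexp built while testing net: its input is the ordinary sort package.
        b := key{ImportPath: "regexp", IsForTest: true}
        fmt.Println("boolean flag only, keys collide:", a == b) // true despite different inputs

        // Partitioning by the tested package (what GopherJS does today) separates them.
        a.TestedPackage, b.TestedPackage = "sort", "net"
        fmt.Println("partitioned by TestedPackage:  ", a == b) // false
    }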

That said, I'm really surprised that the compilation itself increases your CI time by a whole 10 minutes. In my experience, test execution dominates build time by a good margin. Is your code base open source, by any chance? It could be good material to profile the compiler against and maybe find the root cause of the slowness.

@regenvanwalbeek-wf (Author)

👍 Gotcha, that makes sense 😞

I don't think we have the resources to contribute in a big way right now, but I can bring it up. I'm certainly interested in the health of gopherjs.

Unfortunately, our code base is not open source. We have a pretty large codebase (about 250k lines of production and test code each, not counting dependencies). If you think it would help, I can look into scrubbing "sensitive" output from the info logs during a run of gopherjs test and providing that.

FWIW, we do run the gopherjs tests in parallel with the rest of our build/tests, and this is not the bottleneck in our total CI time. I was just hoping this was a relatively easy fix, but clearly it's not 😆

@nevkontakte (Member)

Thanks for the offer, but the info logs gopherjs currently produces are not very helpful for performance analysis. The best option would be to capture a pprof CPU profile of the gopherjs tool. I thought I had added a flag for that, but apparently I didn't ¯\_(ツ)_/¯. I guess we could park this until I have more time to look at performance, and I might request your assistance then 🙂
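
For reference, a minimal sketch of the kind of hook that could be patched in temporarily to capture such a profile. This is plain runtime/pprof usage; gopherjs has no such flag today, and the CPUPROFILE environment variable is just an example name:

    package main

    import (
        "log"
        "os"
        "runtime/pprof"
    )

    func main() {
        // Write a CPU profile if CPUPROFILE points at a file path.
        if path := os.Getenv("CPUPROFILE"); path != "" {
            f, err := os.Create(path)
            if err != nil {
                log.Fatal(err)
            }
            defer f.Close()
            if err := pprof.StartCPUProfile(f); err != nil {
                log.Fatal(err)
            }
            defer pprof.StopCPUProfile()
        }
        // ... the tool's normal work would go here ...
    }

The resulting file can then be inspected with go tool pprof.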

As for this pull request, I think it is clear that it can't be merged without compromising output correctness, so I am going to close it. I do appreciate the effort and we will be happy to accept future contributions, big or small!

@regenvanwalbeek-wf (Author)

👍 Sounds great, I appreciate the guidance on this.
