Minor: Move depcheck out of datafusion crate (200 less crates to compile) #9865
@andygrove I wonder if you have any thoughts / reactions to running the circular dependency check as its own CI job?
The Windows machine is still slow as hell, and it is damned slow there because of compilation latency as well.
    [dev-dependencies]
    async-trait = { workspace = true }
    bigdecimal = { workspace = true }
    cargo = "0.78.1"
By removing this line, cargo test now requires 200 fewer crates to compile
Sorry, I should have made it clear: this line is what helps (see https://github.com/apache/arrow-datafusion/pull/9865/files#r1545339002). Note that the circular dependency check was already present; I just moved it out of the main workspace because the dependency requirements were massive.
    - name: Check dependencies
      run: |
        cd dev/depcheck
        cargo run
Note this check was previously covered by cargo test -p datafusion
    /// checking the dependency graph.
    ///
    /// See https://github.com/apache/arrow-datafusion/issues/9278 for more details
    fn main() -> CargoResult<()> {
Note this check is not new; it just now runs in a separate binary, so the main workspace requires many fewer crates.
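For anyone wondering what "checking the dependency graph" for circular dependencies amounts to, here is a minimal sketch. It is not the code in dev/depcheck (that tool reads the real workspace through the cargo crate). The graph type, the function names `find_cycle` and `visit`, and the crate names in the usage note are all hypothetical, and the graph is assumed to be a hand-built map from crate name to its dependencies. It only illustrates the general idea: a depth-first search that reports a path returning to a crate already on the current path.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical dependency graph: crate name -> names of the crates it depends on.
/// (Assumption for illustration; the real tool gets this from Cargo metadata.)
type DepGraph = HashMap<&'static str, Vec<&'static str>>;

/// Returns the first dependency cycle found, as a path whose first and last
/// entries are the same crate, or `None` if the graph has no cycle.
fn find_cycle(graph: &DepGraph) -> Option<Vec<&'static str>> {
    // Crates whose dependencies have been fully explored without finding a cycle.
    let mut done = HashSet::new();
    // Visit roots in sorted order so the reported cycle is deterministic.
    let mut roots: Vec<&'static str> = graph.keys().copied().collect();
    roots.sort();
    for root in roots {
        let mut path = Vec::new();
        if let Some(cycle) = visit(root, graph, &mut done, &mut path) {
            return Some(cycle);
        }
    }
    None
}

/// Depth-first search from `node`; `path` holds the crates on the current chain.
fn visit(
    node: &'static str,
    graph: &DepGraph,
    done: &mut HashSet<&'static str>,
    path: &mut Vec<&'static str>,
) -> Option<Vec<&'static str>> {
    if done.contains(node) {
        return None;
    }
    // Seeing a crate that is already on the current chain means we found a cycle.
    if let Some(pos) = path.iter().position(|n| *n == node) {
        let mut cycle = path[pos..].to_vec();
        cycle.push(node);
        return Some(cycle);
    }
    path.push(node);
    if let Some(deps) = graph.get(node) {
        for &dep in deps {
            if let Some(cycle) = visit(dep, graph, done, path) {
                return Some(cycle);
            }
        }
    }
    path.pop();
    done.insert(node);
    None
}
```

With a hypothetical graph where crate-a depends on crate-b and crate-b depends on crate-c, `find_cycle` returns `None`; adding a dependency from crate-c back to crate-a makes it return the path crate-a, crate-b, crate-c, crate-a. A check like this only needs the graph, which is why it can live in a small standalone binary rather than in the main test suite.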
Co-authored-by: Andrew Lamb <[email protected]>
I agree the windows test takes too long. On the other hand I think this PR actually improves the situation:
Merging this in to improve the dev experience. Thanks @comphead for the review
…ile) (apache#9865) * Minor: Move depcheck out of main datafusion test * Update dev/depcheck/README.md Co-authored-by: Andrew Lamb <[email protected]> --------- Co-authored-by: comphead <[email protected]>
Thanks for catching this @alamb
Which issue does this PR close?
Related to #9278, follow on to #9292
Rationale for this change
I want DataFusion to compile quickly and be a nice development experience, so the faster the compilation / fewer dependencies the better in general.
I have been wondering why compiling datafusion takes so long and requires compiling crates like gix (which is an implementation of git). #9844 (review) inspired me to look into this:
It turns out the dependencies were needed by the depcheck test, added in #9292. There is no need to compile 200 extra crates every time someone runs cargo test; I think it is enough if this tool runs as part of CI.
What changes are included in this PR?
Move the depcheck test into its own binary (not part of the main workspace) and only run that as part of CI.
Are these changes tested?
There is a new CI job and I manually tested that it catches issues
Manual Testing Details
Negative Test
I locally introduced a circular dependency:
The test now fails
When I remove the local dependency, the test passes:
Are there any user-facing changes?
Faster compilation times (about 200 fewer crates to compile!)
Previously, running cargo test required compiling 838 crates! With this change we are down to 647 crates (still a crazy number).
It also seems to make the CI jobs significantly faster: