[SQL] Pass to compute static circuit statistics; new interfaces to categorize operators #5513
Conversation
@mihaibudiu you're on a roll. I would love for us to track these in our CI and benchmarks for regressions.
It's not always obvious what a "regression" is; a bigger circuit may be faster...
maxWidth is the widest tuple computed; that's a useful one.
Pull request overview
This PR introduces new interfaces to categorize operators and adds a pass to compute static circuit statistics. The changes refactor the operator type system by replacing boolean flags (like containsIntegrator) with explicit interface implementations, enabling better compile-time type checking and more accurate circuit analysis.
Changes:
- Introduced interface-based operator categorization (ILinear, IStateful, IContainsIntegrator, IJoin, etc.)
- Added CircuitStatistics visitor to compute and report circuit metrics
- Refactored operator constructors to remove containsIntegrator boolean parameter
- Moved IMultiOutput and IInputMapOperator to operator package
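To illustrate the shift from a boolean flag to interface-based categorization, here is a minimal Java sketch. It is not the PR's code: `SketchOperator`, `SketchMap`, `SketchJoin`, and `CategorySketch` are hypothetical names, and the two interfaces are empty stand-ins that only borrow the names `ILinear` and `IContainsIntegrator` from the summary above. The real interfaces may declare methods or extend other types.

```java
import java.util.List;

// Empty stand-ins named after interfaces in the PR summary (assumption:
// the real declarations may have members or supertypes).
interface ILinear {}
interface IContainsIntegrator {}

// Hypothetical operator base class, not the compiler's real one.
abstract class SketchOperator {
    final String name;
    SketchOperator(String name) { this.name = name; }
}

// The category is part of the type instead of a constructor flag.
final class SketchMap extends SketchOperator implements ILinear {
    SketchMap() { super("map"); }
}

final class SketchJoin extends SketchOperator implements IContainsIntegrator {
    SketchJoin() { super("join"); }
}

final class CategorySketch {
    // Analysis code can select a category with instanceof.
    static long countIntegrators(List<? extends SketchOperator> ops) {
        return ops.stream().filter(op -> op instanceof IContainsIntegrator).count();
    }

    // The intersection bound means only linear operators are accepted;
    // describeLinear(new SketchJoin()) would be rejected by the compiler.
    static <T extends SketchOperator & ILinear> String describeLinear(T op) {
        return op.name + " is linear";
    }

    public static void main(String[] args) {
        List<SketchOperator> ops = List.of(new SketchMap(), new SketchJoin(), new SketchMap());
        System.out.println(countIntegrators(ops));
        System.out.println(describeLinear(new SketchMap()));
    }
}
```

With a boolean field, a mismatch between an operator and its category can only be detected at run time; with the interface bound, a misuse such as passing a non-linear operator to `describeLinear` is a compile error.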
Reviewed changes
Copilot reviewed 79 out of 79 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| CircuitStatistics.java | New visitor class that computes circuit statistics including operator counts, width, and depth metrics |
| IStateful.java, ILinear.java, ILinearAggregate.java, INonLinearAggregate.java, IJoin.java, IContainsIntegrator.java, INonIncremental.java | New interfaces for categorizing operator types |
| GCOperator.java | Renamed interface from GCOperator to IGCOperator |
| DBSPSimpleOperator.java, DBSPUnaryOperator.java, DBSPBinaryOperator.java | Removed containsIntegrator boolean field and constructor parameter |
| Multiple operator classes | Implemented appropriate categorization interfaces (ILinear, IContainsIntegrator, etc.) |
| IMultiOutput.java, IInputMapOperator.java | Moved to operator package |
| Various visitor classes | Updated to use IGCOperator interface instead of GCOperator class |
| CircuitOptimizer.java | Added CircuitStatistics pass to optimization pipeline |
| ToDotNodesVisitor.java | Updated to conditionally display table/view names based on verbosity level |
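As a rough idea of what a statistics pass over a circuit might collect, the following Java sketch counts operators per category and tracks the widest output tuple. Everything here is hypothetical (`SketchNode`, `SketchStats`, and the category labels are invented for illustration); the actual `CircuitStatistics` visitor works on the compiler's own operator classes and also reports depth metrics, which are left out here because the conversation does not define them.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a circuit operator: a category label and the
// number of fields in its output tuple.
final class SketchNode {
    final String category;
    final int outputWidth;

    SketchNode(String category, int outputWidth) {
        this.category = category;
        this.outputWidth = outputWidth;
    }
}

// Hypothetical accumulator; field names are illustrative only.
final class SketchStats {
    final Map<String, Integer> countByCategory = new TreeMap<>();
    int maxWidth = 0;

    // Called once per operator, in the spirit of a visitor callback.
    void visit(SketchNode node) {
        countByCategory.merge(node.category, 1, Integer::sum);
        maxWidth = Math.max(maxWidth, node.outputWidth);
    }

    static SketchStats of(List<SketchNode> nodes) {
        SketchStats stats = new SketchStats();
        for (SketchNode node : nodes) {
            stats.visit(node);
        }
        return stats;
    }

    public static void main(String[] args) {
        SketchStats stats = SketchStats.of(List.of(
                new SketchNode("linear", 3),
                new SketchNode("join", 7),
                new SketchNode("linear", 5)));
        System.out.println(stats.countByCategory + " maxWidth=" + stats.maxWidth);
    }
}
```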
Comments suppressed due to low confidence (2)
sql-to-dbsp-compiler/SQL-compiler/src/main/java/org/dbsp/sqlCompiler/circuit/operator/INonIncremental.java:1
- Corrected spelling of 'shold' to 'should'.
sql-to-dbsp-compiler/SQL-compiler/src/main/java/org/dbsp/sqlCompiler/compiler/backend/dot/ToDotNodesVisitor.java:1
- Corrected spelling of 'ommitted' to 'omitted'.
What is nested? And I assume maxDepth is the pipeline depth? (I thought maxWidth was the pipeline width.)
maxWidth is the width (of the tuple output) of the widest operator.
Makes sense. I can imagine that tuple widths getting wider, or the number of linear operators or GC operators going down, might not be good. At least if we get flagged on changes we can take a second look.
We just need to find a way to persist results from one test run to the next. I will convert the output to JSON so it is easier to manipulate.
@mihaibudiu yes, it'll be included in the benchmarking infra.
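One way the "flag changes, then take a second look" idea could work is sketched below in Java. It assumes the statistics from two runs have already been loaded into maps from metric name to value; the persistence and JSON parsing steps are left out. `StatsDiffSketch` and the metric names used in `main` are hypothetical and not part of the PR. The sketch reports every changed metric without judging whether the change is a regression, since a bigger circuit may still be faster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

final class StatsDiffSketch {
    // Returns one line per metric whose value differs between the two runs.
    // A metric missing from one run shows up as null on that side.
    static List<String> changedMetrics(Map<String, Integer> before, Map<String, Integer> after) {
        TreeSet<String> names = new TreeSet<>(before.keySet());
        names.addAll(after.keySet());
        List<String> changes = new ArrayList<>();
        for (String name : names) {
            Integer oldValue = before.get(name);
            Integer newValue = after.get(name);
            if (!Objects.equals(oldValue, newValue)) {
                changes.add(name + ": " + oldValue + " -> " + newValue);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Made-up values purely for illustration.
        Map<String, Integer> before = Map.of("maxWidth", 12, "linear", 5);
        Map<String, Integer> after = Map.of("maxWidth", 14, "linear", 5, "gc", 1);
        changedMetrics(before, after).forEach(System.out::println);
    }
}
```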
…tegorize operators Signed-off-by: Mihai Budiu <[email protected]>
219a4dc to 4ce2e7f
Here are some example circuit statistics that can be obtained from the compiler: