Conversation

@mihaibudiu
Contributor

Here are some example circuit statistics that can be obtained from the compiler:

Statistics[
  totalOperators=725
  gcOperators=0
  tables=52
  views=69
  joins=217
  linear=343
  linearAggregates=9
  nonLinearAggregates=18
  nonIncremental=0
  windows=1
  nested=0
  stateful=260
  maxWidth=255
  totalWidth=23587
  maxDepth=30
]

Copilot AI review requested due to automatic review settings January 27, 2026 01:10
@lalithsuresh
Contributor

@mihaibudiu you're on a roll. I would love for us to track these in our CI and benchmarks for regressions.

@mihaibudiu
Contributor Author

It's not always obvious what a "regression" is; a bigger circuit may be faster...

@mihaibudiu
Contributor Author

maxWidth is the width of the widest tuple computed; that's a useful one.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces new interfaces to categorize operators and adds a pass to compute static circuit statistics. The changes refactor the operator type system by replacing boolean flags (like containsIntegrator) with explicit interface implementations, enabling better compile-time type checking and more accurate circuit analysis.

Changes:

  • Introduced interface-based operator categorization (ILinear, IStateful, IContainsIntegrator, IJoin, etc.)
  • Added CircuitStatistics visitor to compute and report circuit metrics
  • Refactored operator constructors to remove containsIntegrator boolean parameter
  • Moved IMultiOutput and IInputMapOperator to operator package
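
The move from boolean flags to marker interfaces described in the list above can be illustrated with a minimal sketch. The interface names `ILinear`, `IStateful`, and `IJoin` come from the PR summary; everything else (`SketchOperator`, `SketchMap`, `SketchIntegrate`, `SketchJoin`, `CategorySketch`, and which categories each toy operator implements) is hypothetical and is not taken from the compiler's actual code.

```java
import java.util.List;

// Marker interfaces named after those in the PR summary; the real
// declarations in the compiler may carry methods or extend other types.
interface ILinear {}
interface IStateful {}
interface IJoin {}

// Hypothetical stand-in for the compiler's operator base class.
abstract class SketchOperator {}

// Toy operators: the category assignments here are illustrative only.
final class SketchMap extends SketchOperator implements ILinear {}
final class SketchIntegrate extends SketchOperator implements ILinear, IStateful {}
final class SketchJoin extends SketchOperator implements IJoin, IStateful {}

final class CategorySketch {
    // Count the operators in a category by testing for the marker interface,
    // instead of reading a boolean field passed through every constructor.
    static long count(List<SketchOperator> ops, Class<?> category) {
        return ops.stream().filter(category::isInstance).count();
    }

    public static void main(String[] args) {
        List<SketchOperator> circuit =
            List.of(new SketchMap(), new SketchIntegrate(), new SketchJoin());
        System.out.println("linear=" + count(circuit, ILinear.class));
        System.out.println("stateful=" + count(circuit, IStateful.class));
        System.out.println("joins=" + count(circuit, IJoin.class));
    }
}
```

With this shape, an operator declares its category once in its `implements` clause, and a statistics pass can ask about any category without the base class knowing about it.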

Reviewed changes

Copilot reviewed 79 out of 79 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| CircuitStatistics.java | New visitor class that computes circuit statistics including operator counts, width, and depth metrics |
| IStateful.java, ILinear.java, ILinearAggregate.java, INonLinearAggregate.java, IJoin.java, IContainsIntegrator.java, INonIncremental.java | New interfaces for categorizing operator types |
| GCOperator.java | Renamed interface from GCOperator to IGCOperator |
| DBSPSimpleOperator.java, DBSPUnaryOperator.java, DBSPBinaryOperator.java | Removed containsIntegrator boolean field and constructor parameter |
| Multiple operator classes | Implemented appropriate categorization interfaces (ILinear, IContainsIntegrator, etc.) |
| IMultiOutput.java, IInputMapOperator.java | Moved to operator package |
| Various visitor classes | Updated to use IGCOperator interface instead of GCOperator class |
| CircuitOptimizer.java | Added CircuitStatistics pass to optimization pipeline |
| ToDotNodesVisitor.java | Updated to conditionally display table/view names based on verbosity level |

Comments suppressed due to low confidence (2)

sql-to-dbsp-compiler/SQL-compiler/src/main/java/org/dbsp/sqlCompiler/circuit/operator/INonIncremental.java:1

  • Corrected spelling of 'shold' to 'should'.

sql-to-dbsp-compiler/SQL-compiler/src/main/java/org/dbsp/sqlCompiler/compiler/backend/dot/ToDotNodesVisitor.java:1

  • Corrected spelling of 'ommitted' to 'omitted'.

@lalithsuresh
Contributor

What is nested? And I assume maxDepth is the pipeline depth? (I thought maxWidth was the pipeline width.)

@mihaibudiu
Contributor Author

maxWidth is the width (of the tuple output) of the widest operator.
maxDepth is the depth in operators: the longest chain from source to sink.
totalWidth is not very meaningful: it adds up the widths of all outputs, so it approximates the internal bandwidth of the circuit.
nested is the number of recursive components.
I don't have a computation of the widest "cut" of the pipeline.
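
As a rough illustration of the definitions above, the following sketch computes the three width and depth metrics over a toy operator graph. It is not the `CircuitStatistics` visitor from the PR. The `Node` class, the `WidthDepthSketch` name, and the example circuit are hypothetical, and the sketch assumes that depth counts the operators on the chain (a lone source has depth 1), which may differ from the compiler's convention.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WidthDepthSketch {
    // Hypothetical operator node: output tuple width plus its input operators.
    static final class Node {
        final String name;
        final int width;
        final List<Node> inputs = new ArrayList<>();

        Node(String name, int width, Node... inputs) {
            this.name = name;
            this.width = width;
            this.inputs.addAll(List.of(inputs));
        }
    }

    // Width of the widest operator output.
    static int maxWidth(List<Node> nodes) {
        return nodes.stream().mapToInt(n -> n.width).max().orElse(0);
    }

    // Sum of all operator output widths.
    static int totalWidth(List<Node> nodes) {
        return nodes.stream().mapToInt(n -> n.width).sum();
    }

    // Depth of a node is 1 plus the depth of its deepest input.
    // Results are memoized so shared inputs are visited only once.
    static int depth(Node n, Map<Node, Integer> memo) {
        Integer cached = memo.get(n);
        if (cached != null) return cached;
        int deepest = 0;
        for (Node in : n.inputs) deepest = Math.max(deepest, depth(in, memo));
        memo.put(n, deepest + 1);
        return deepest + 1;
    }

    // Longest source-to-sink chain, counted in operators (assumes an acyclic graph).
    static int maxDepth(List<Node> nodes) {
        Map<Node, Integer> memo = new HashMap<>();
        int best = 0;
        for (Node n : nodes) best = Math.max(best, depth(n, memo));
        return best;
    }

    // Toy circuit: t1 -> map -> join <- t2, then join -> view.
    static List<Node> example() {
        Node t1 = new Node("t1", 3);
        Node t2 = new Node("t2", 2);
        Node map = new Node("map", 5, t1);
        Node join = new Node("join", 7, map, t2);
        Node view = new Node("view", 7, join);
        return List.of(t1, t2, map, join, view);
    }

    public static void main(String[] args) {
        List<Node> c = example();
        System.out.println("maxWidth=" + maxWidth(c));     // 7
        System.out.println("totalWidth=" + totalWidth(c)); // 3+2+5+7+7 = 24
        System.out.println("maxDepth=" + maxDepth(c));     // t1, map, join, view = 4
    }
}
```

The widest "cut" mentioned above is deliberately left out: it would need a different computation than these per-operator aggregates.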

@lalithsuresh
Contributor

lalithsuresh commented Jan 27, 2026

> It's not always obvious what a "regression" is; a bigger circuit may be faster...

Makes sense. I can imagine that tuple widths getting wider, the number of linear operators going down, or the number of GC operators going down might not be good. At least if we get flagged on changes, we can take a second look.

@mihaibudiu
Contributor Author

We just need to find a way to persist results from one test run to the next. I will convert the output to JSON so that it is easier to manipulate.
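
One possible shape for this, sketched under assumptions: the statistics are held as a flat map of metric name to integer, serialized to JSON by hand, and compared against the map from a previous run so that any change gets flagged for a second look. The `StatsDiffSketch` class and its methods are hypothetical, no particular JSON library or CI storage mechanism is implied, and the "after" value in the example is made up.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.StringJoiner;

final class StatsDiffSketch {
    // Serialize a flat metric map as a JSON object.
    // Assumes keys are plain identifiers that need no escaping.
    static String toJson(Map<String, Integer> stats) {
        StringJoiner j = new StringJoiner(",", "{", "}");
        stats.forEach((k, v) -> j.add("\"" + k + "\":" + v));
        return j.toString();
    }

    // Report every metric whose value differs between two runs.
    // A metric missing from one side shows up as "null" on that side.
    static Map<String, String> diff(Map<String, Integer> before, Map<String, Integer> after) {
        Set<String> keys = new LinkedHashSet<>(before.keySet());
        keys.addAll(after.keySet());
        Map<String, String> changes = new LinkedHashMap<>();
        for (String k : keys) {
            Integer b = before.get(k);
            Integer a = after.get(k);
            if (!Objects.equals(b, a)) changes.put(k, b + " -> " + a);
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Integer> before = new LinkedHashMap<>();
        before.put("linear", 343);
        before.put("maxWidth", 255);

        Map<String, Integer> after = new LinkedHashMap<>(before);
        after.put("maxWidth", 300); // hypothetical value from a later run

        System.out.println(toJson(before));
        System.out.println(diff(before, after));
    }
}
```

The diff only flags changes; it does not decide whether a change is a regression, which matches the point made earlier in the thread that a bigger circuit may be faster.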

@lalithsuresh
Contributor

@mihaibudiu yes, it'll be included in the benchmarking infra.

@mihaibudiu mihaibudiu added this pull request to the merge queue Jan 27, 2026
Merged via the queue into main with commit d951a3c Jan 27, 2026
1 check passed
@mihaibudiu mihaibudiu deleted the compiler-stats branch January 27, 2026 19:29