Dedup identical subtrees in Dcp2Cone canonicalization #3353

Closed
PTNobel wants to merge 5 commits into master from ptn/dcp2cone-cse

PTNobel (Collaborator) commented May 27, 2026

Description

Currently, cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm1(x) <= 1]) canonicalizes the two separately-constructed cp.norm1(x) expressions into independent epigraph variables and constraint pairs. This PR adds a per-apply() common-subexpression cache to Dcp2Cone so structurally identical Expression subtrees share one canonicalized expression and one set of auxiliary constraints within a single reduction pass.

For the example above, the canonicalized data the solver sees shrinks from 3 scalar variables / 5 constraint rows down to 2 scalar variables / 3 constraint rows.
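The counts above can be checked with a toy model. This is a minimal sketch, assuming a scalar `x` and the standard epigraph rewrite of `|x|` as a variable `t` with `x <= t` and `-x <= t`. `ToyCanon` and `count` are hypothetical names for illustration and are not part of cvxpy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCanon:
    """Toy stand-in for an epigraph canonicalizer; not cvxpy's Dcp2Cone."""
    use_cache: bool
    variables: set = field(default_factory=lambda: {"x"})
    rows: list = field(default_factory=list)
    cache: dict = field(default_factory=dict)

    def canon_abs(self, arg: str) -> str:
        """Replace |arg| by an epigraph variable t with arg <= t and -arg <= t."""
        key = ("abs", arg)
        if self.use_cache and key in self.cache:
            return self.cache[key]  # reuse t; no new variable, no new rows
        t = f"t{len(self.variables)}"
        self.variables.add(t)
        self.rows += [f"{arg} <= {t}", f"-{arg} <= {t}"]
        self.cache[key] = t
        return t


def count(use_cache: bool) -> tuple:
    """Return (number of scalar variables, number of constraint rows)."""
    canon = ToyCanon(use_cache)
    canon.canon_abs("x")         # objective: minimize |x|
    t = canon.canon_abs("x")     # constraint: |x| <= 1
    canon.rows.append(f"{t} <= 1")
    return len(canon.variables), len(canon.rows)


print(count(use_cache=False))  # (3, 5)
print(count(use_cache=True))   # (2, 3)
```

Without the cache, each `|x|` gets its own epigraph variable and its own pair of inequalities; with it, the second occurrence reuses the first.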

How it works

  • Cache lives on the Dcp2Cone instance and is reset at the top of every apply() call.
  • Keyed on a structural digest of each Expression subtree: atom type, shape, hashable get_data(), child keys, leaf cvxpy ids for Variable/Parameter, and shape+dtype+value bytes for Constant.
  • affine_above is folded into the key only when a quad-canon-eligible descendant could make canonicalization depend on it; otherwise identical subtrees deduplicate across objective and constraint contexts (which start with different affine_above values).
  • Only Expression subtrees are cached — objectives, user constraints, and partial_problem subtrees are excluded so their IDs flow into inverse_data unchanged.
  • If a structural key cannot safely be built for an unusual get_data() entry, that subtree is silently skipped rather than risking incorrect reuse.
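The keying scheme in the list above could look roughly like the following. This is a sketch under stated assumptions: `Node` and `structural_key` are hypothetical stand-ins, not cvxpy classes or functions, and the sketch leaves out `Constant` handling and the `affine_above` component.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical expression node; not a cvxpy class."""
    kind: str                       # atom type, e.g. "norm1", "neg", "variable"
    shape: tuple = ()
    data: object = None             # stand-in for get_data()
    args: tuple = ()
    leaf_id: Optional[int] = None   # set for Variable/Parameter-like leaves


def structural_key(node: Node) -> Optional[tuple]:
    """Return a hashable digest of the subtree, or None if one cannot be built."""
    if node.leaf_id is not None:
        return ("leaf", node.kind, node.leaf_id)
    try:
        hash(node.data)
    except TypeError:
        return None                 # unusual data: skip caching this subtree
    child_keys = []
    for arg in node.args:
        key = structural_key(arg)
        if key is None:
            return None             # an uncacheable child makes the parent uncacheable
        child_keys.append(key)
    return (node.kind, node.shape, node.data, tuple(child_keys))


x = Node("variable", shape=(3,), leaf_id=1)
a = Node("norm1", args=(x,))
b = Node("norm1", args=(x,))        # built separately, same structure
c = Node("norm1", args=(Node("neg", shape=(3,), args=(x,)),))
d = Node("norm1", data=[1, 2], args=(x,))   # unhashable data entry

print(structural_key(a) == structural_key(b))   # True: candidates for sharing
print(structural_key(a) == structural_key(c))   # False: norm1(x) vs norm1(-x)
print(structural_key(d))                        # None: not cached
```

Returning `None` rather than a best-effort key mirrors the stated preference for skipping a subtree over risking incorrect reuse.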

Type of change

  • New feature (backwards compatible)
  • New feature (breaking API changes)
  • Bug fix
  • Other (Documentation, CI, ...)

Contribution checklist

  • Add our license to new files.
  • Check that your code adheres to our coding style.
  • Write unittests.
  • Run the unittests and check that they're passing.
  • Run the benchmarks to make sure your change doesn't introduce a regression.

Test plan

  • New cvxpy/tests/test_dcp2cone_cse.py covers scalar dedup, vector dedup, distinct-subtree non-merge (norm1(x) vs norm1(-x)), end-to-end solve equivalence, and dedup across a shared Parameter subtree.
  • pytest cvxpy/tests/test_problem.py cvxpy/tests/test_qp_solvers.py cvxpy/tests/test_atoms.py cvxpy/tests/test_dgp2dcp.py cvxpy/tests/test_dqcp.py all green.

🤖 Generated with Claude Code

Add a per-apply common-subexpression cache to Dcp2Cone so that
structurally identical Expression subtrees share one canonicalized
expression and one set of auxiliary constraints within a reduction
pass. For cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm1(x) <= 1])
this collapses two epigraph variables and two pairs of abs-epigraph
inequalities down to one of each.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Comment thread cvxpy/reductions/dcp2cone/dcp2cone.py Outdated
Comment thread cvxpy/reductions/dcp2cone/dcp2cone.py Outdated
Comment thread cvxpy/tests/test_dcp2cone_cse.py
Previously the cache embedded `Constant.value.tobytes()` in keys, which
made the cache footprint scale with the problem's constant data (~8 MB
for an LP 1000x500). Switch Constant keying to object identity, which
still deduplicates the common case of a shared Constant reference and
brings cache size below ~16 KB across representative problems.

Also skip caching any subtree whose canonicalization went through the
quad branch. Those canonicalizers emit SymbolicQuadForm markers that
downstream code (replace_quad_forms, coeff_extractor) identifies by
Python id and assumes are distinct per occurrence; sharing one across
sites silently halves quadratic coefficients. Track this with a counter
incremented on quad branches; don't cache if it advances under a node.

Release the cache at the end of apply() so it does not outlive the
reduction.

Add a regression test using the QuadForm 0.5*qf + 0.5*qf pattern from
test_qp_solvers.py::rep_quad_form.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
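The trade-off between value-based and identity-based keys for constants can be illustrated with the standard library alone. This is a hedged sketch: `key_by_value` and `key_by_identity` are hypothetical helpers, an `array` stands in for a NumPy-backed `Constant`, and the sizes here do not attempt to reproduce the figures quoted in the commit message.

```python
from array import array


def key_by_value(values: array) -> tuple:
    """Content-based key: the key itself holds a copy of the data bytes."""
    return ("const", values.typecode, len(values), values.tobytes())


def key_by_identity(values: array) -> tuple:
    """Identity-based key: constant size, meaningful only while `values` is alive."""
    return ("const", id(values))


data = array("d", [0.0]) * 1000    # stand-in for a constant's value buffer
copy = array("d", data)            # equal contents, different object

# The value-based key grows with the data it digests.
print(len(key_by_value(data)[3]) == data.itemsize * len(data))   # True

# Both schemes deduplicate a shared reference.
print(key_by_identity(data) == key_by_identity(data))            # True

# Only the value-based scheme deduplicates equal but distinct objects.
print(key_by_value(data) == key_by_value(copy))                  # True
print(key_by_identity(data) == key_by_identity(copy))            # False
```

Because Python may reuse an `id()` once the object is garbage collected, identity keys are only safe while the keyed objects stay alive, which fits a cache that is dropped at the end of each `apply()`.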
github-actions (bot) commented May 27, 2026

Benchmarks that have improved:

   before           after         ratio
 [6b637368]       [c6d7d7fa]
     18.1±0ms         15.9±0ms     0.88  simple_LP_benchmarks.SimpleFullyParametrizedLPBenchmark.time_compile_problem
     28.2±0ms         24.8±0ms     0.88  high_dim_convex_plasticity.ConvexPlasticity.time_compile_problem
     20.9±0ms         17.4±0ms     0.83  simple_QP_benchmarks.ParametrizedQPBenchmark.time_compile_problem
      1.17±0s          966±0ms     0.82  gini_portfolio.Cajas.time_compile_problem

Benchmarks that have stayed the same:

   before           after         ratio
 [6b637368]       [c6d7d7fa]
      1.28±0s          1.37±0s     1.07  simple_QP_benchmarks.LeastSquares.time_compile_problem
      1.26±0s          1.30±0s     1.03  simple_LP_benchmarks.SimpleScalarParametrizedLPBenchmark.time_compile_problem
      7.28±0s          7.49±0s     1.03  svm_l1_regularization.SVMWithL1Regularization.time_compile_problem
      460±0ms          467±0ms     1.02  slow_pruning_1668_benchmark.SlowPruningBenchmark.time_compile_problem
      1.99±0s          2.01±0s     1.01  tv_inpainting.TvInpainting.time_compile_problem
      2.28±0s          2.30±0s     1.01  quantum_hilbert_matrix.QuantumHilbertMatrix.time_compile_problem
      21.4±0s          21.5±0s     1.00  finance.CVaRBenchmark.time_compile_problem
      5.89±0s          5.90±0s     1.00  optimal_advertising.OptimalAdvertising.time_compile_problem
      340±0ms          338±0ms     0.99  gini_portfolio.Murray.time_compile_problem
      6.31±0s          6.26±0s     0.99  huber_regression.HuberRegression.time_compile_problem
      1.79±0s          1.77±0s     0.99  finance.FactorCovarianceModel.time_compile_problem
      15.5±0s          15.3±0s     0.99  simple_LP_benchmarks.SimpleLPBenchmark.time_compile_problem
      619±0ms          610±0ms     0.99  semidefinite_programming.SemidefiniteProgramming.time_compile_problem
     48.0±0ms         47.3±0ms     0.98  matrix_stuffing.SmallMatrixStuffing.time_compile_problem
      1.42±0s          1.39±0s     0.98  matrix_stuffing.ParamConeMatrixStuffing.time_compile_problem
      33.0±0s          32.4±0s     0.98  sdp_segfault_1132_benchmark.SDPSegfault1132Benchmark.time_compile_problem
      294±0ms          287±0ms     0.98  matrix_stuffing.ParamSmallMatrixStuffing.time_compile_problem
      468±0ms          455±0ms     0.97  gini_portfolio.Yitzhaki.time_compile_problem
      916±0ms          888±0ms     0.97  matrix_stuffing.ConeMatrixStuffingBench.time_compile_problem
      321±0ms          311±0ms     0.97  simple_QP_benchmarks.SimpleQPBenchmark.time_compile_problem
      3.91±0s          3.67±0s     0.94  simple_QP_benchmarks.UnconstrainedQP.time_compile_problem

PTNobel and others added 3 commits May 27, 2026 14:16
replace_quad_form previously gave the placeholder Variable the same id
as the quad form it replaced. When the same SymbolicQuadForm appeared
at multiple positions in the objective expression (e.g. via CSE, or via
0.5*qf + 0.5*qf with a shared qf), the placeholders collapsed onto a
single row in get_var_offsets and the quadratic coefficient was halved.

Mint a fresh placeholder id per replacement. quad_forms is keyed by
placeholder id, so all downstream lookups continue to work; only the
incidental identity quad_form.id == placeholder.id is dropped.

With this in place, the CSE cache in Dcp2Cone can also dedup subtrees
that go through the quad branch, so the _quad_canon_count guard is
removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
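The halving described in this commit message can be reproduced in a toy model. This is a simplified sketch, not the cvxpy code path: `collect_coefficients` is a hypothetical function, and a dict keyed by placeholder id stands in for the per-variable offset table, so two placeholders with the same id land in one slot.

```python
from itertools import count

_fresh_ids = count(1000)   # hypothetical id source for placeholders


def collect_coefficients(terms, fresh_ids: bool) -> dict:
    """Sum quadratic coefficients per quad form.

    terms: list of (coefficient, quad_form_id) pairs from the objective.
    Each occurrence is swapped for a placeholder before coefficients are read.
    """
    quad_forms = {}             # placeholder id -> quad form id
    coeff_by_placeholder = {}   # one slot per placeholder id
    for coeff, qf_id in terms:
        placeholder_id = next(_fresh_ids) if fresh_ids else qf_id
        quad_forms[placeholder_id] = qf_id
        coeff_by_placeholder[placeholder_id] = coeff
    total = {}
    for pid, coeff in coeff_by_placeholder.items():
        qf_id = quad_forms[pid]
        total[qf_id] = total.get(qf_id, 0.0) + coeff
    return total


shared = [(0.5, 7), (0.5, 7)]   # 0.5*qf + 0.5*qf with one shared qf
print(collect_coefficients(shared, fresh_ids=False))  # {7: 0.5}
print(collect_coefficients(shared, fresh_ids=True))   # {7: 1.0}
```

With reused ids the second occurrence overwrites the first and the total is 0.5; with a fresh id per replacement both occurrences survive and the lookup from placeholder id back to the quad form still works.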
By the time coeff_extractor invokes this, Dcp2Cone has already rewritten
every QuadForm into either a SymbolicQuadForm or sum_squares, so the
isinstance check on QuadForm is defensive rather than load-bearing.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- Cache hits discard the stored constraints anyway, so store just the
  canonical expression. Keeps the per-apply working set smaller.
- Add two tests that exercise Dcp2Cone(quad_obj=True): one for shared
  quad_over_lin subtrees within the objective (dedup to a single
  SymbolicQuadForm), one for the same subtree appearing in objective and
  constraint (different canonicalizations kept distinct by the
  affine_above component of the cache key).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
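How the `affine_above` component keeps objective and constraint canonicalizations apart can be sketched as below. `cache_key` and its parameters are hypothetical names, the strings stand in for structural digests, and the boolean values chosen for the two contexts are illustrative only.

```python
from typing import Hashable, Optional


def cache_key(structural: Hashable, affine_above: bool,
              has_quad_eligible_descendant: bool) -> tuple:
    """Fold affine_above into the key only when it can change the result."""
    context: Optional[bool] = (
        affine_above if has_quad_eligible_descendant else None
    )
    return (structural, context)


# No quad-eligible descendant: the two contexts share one cache entry.
print(cache_key("norm1(x)", True, False) == cache_key("norm1(x)", False, False))  # True

# Quad-eligible descendant: the two contexts get distinct entries.
print(cache_key("quad_over_lin(x, 1)", True, True)
      == cache_key("quad_over_lin(x, 1)", False, True))                           # False
```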
PTNobel (Collaborator, Author) commented May 27, 2026

Combined into #3355, which now targets master directly.

PTNobel closed this May 27, 2026