
PTNobel · 2026-05-27T22:40:17Z

Description

When the same Expression subtree appears in two places in a problem — e.g. cp.norm1(x) in both the objective and a constraint — the recursive canonicalizers in Dcp2Cone and Dnlp2Smooth previously emitted a fresh set of auxiliary variables and epigraph constraints per occurrence. This PR adds a per-apply() structural-key cache so each canonicalizer fires once per structurally identical subtree.

Worked example for the reported case:

cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm1(x) <= 1])

The canonicalized problem now has one epigraph variable t shared across the objective and the constraint, with one pair of t >= x, t >= -x inequalities.
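
Written out for a vector x, and assuming norm1 is canonicalized as the sum of an elementwise abs epigraph (which is what the t >= x, t >= -x pair above suggests), the change can be pictured as follows. The names t_1 and t_2 are illustrative, and the exact form CVXPY emits may differ.

```latex
\begin{aligned}
\text{before:}\quad & \min_{x,\,t_1,\,t_2}\ \mathbf{1}^\top t_1
  \quad \text{s.t.}\quad t_1 \ge x,\ \ t_1 \ge -x,\ \ t_2 \ge x,\ \ t_2 \ge -x,\ \ \mathbf{1}^\top t_2 \le 1 \\
\text{after:}\quad  & \min_{x,\,t}\ \mathbf{1}^\top t
  \quad \text{s.t.}\quad t \ge x,\ \ t \ge -x,\ \ \mathbf{1}^\top t \le 1
\end{aligned}
```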

What's in the PR

- cvxpy/reductions/subexpr_cache.py (new): shared structural-key helpers (expr_key, _constant_key, _hashable_value, UncacheableError). Keys treat two subtrees as equal exactly when they match on atom types, shapes, get_data() payloads, and Variable/Parameter ids at the leaves. Small Constants (≤ 64 elements) key by value so that the implicit Constant(0.5) minted for cp.huber(x)'s default M argument doesn't defeat the merge; larger arrays key by id() to avoid copying problem data into cache keys.
- Dcp2Cone.canonicalize_tree: per-apply() cache keyed on (expr_key, affine_above-if-relevant). The affine_above component is included only when the subtree could reach the quad-canonicalization branch (which depends on affine_above); for purely cone-mode subtrees the result is independent of context and the merge is unconditional.
- Dnlp2Smooth.canonicalize_tree: per-apply() cache keyed purely on expr_key. Dnlp2Smooth.canonicalize_expr doesn't branch on affine_above, so there's nothing else to include in the key.
- cvxpy/utilities/replace_quad_forms.py: fixes a latent bug exposed by the new CSE. When two occurrences of the same SymbolicQuadForm share an object after CSE, the QP coefficient extractor's placeholder Variable mechanism, keyed by quad_form.id, collapsed them onto a single row and halved the quadratic coefficient. Fixed by minting a fresh placeholder id per replace_quad_form call. The QuadForm branch in replace_quad_forms is now documented as defensive: Dcp2Cone rewrites every QuadForm into SymbolicQuadForm or sum_squares before coeff_extractor calls this.
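
The per-apply() structural-key cache described in the list above can be sketched on a toy expression tree. This is a minimal illustration, not the code in subexpr_cache.py: the `Node` class, the string-valued "canonical expressions", and the abs-only canonicalizer are all hypothetical stand-ins, and the toy `expr_key` omits shapes and get_data() payloads.

```python
class Node:
    """Toy expression node; `ident` stands in for a Variable/Parameter id."""

    def __init__(self, op, args=(), ident=None):
        self.op, self.args, self.ident = op, tuple(args), ident


def expr_key(node):
    """Toy structural key: equal iff ops and leaf ids match recursively."""
    return (node.op, node.ident, tuple(expr_key(a) for a in node.args))


def apply(roots):
    """Canonicalize every root with one cache that lives for this call only."""
    cache = {}        # structural key -> canonical expression (here: a name)
    constraints = []  # epigraph constraints emitted during this pass

    def canon(node):
        if node.op == "var":
            return f"x{node.ident}"
        # This toy only knows the abs atom; anything else is out of scope.
        assert node.op == "abs"
        key = expr_key(node)
        if key in cache:
            return cache[key]  # hit: reuse the aux variable, emit nothing new
        arg = canon(node.args[0])
        t = f"t{len(cache)}"
        constraints.extend([f"{t} >= {arg}", f"{t} >= -{arg}"])
        cache[key] = t
        return t

    return [canon(r) for r in roots], constraints


x = Node("var", ident=0)
objective = Node("abs", [x])
constraint_lhs = Node("abs", [x])  # a distinct object with the same structure
canon_exprs, cons = apply([objective, constraint_lhs])
print(canon_exprs, cons)
```

Both roots map to the same auxiliary name and only one pair of inequalities is emitted. A subtree over a different leaf id gets its own key and is not merged.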

Why this is one PR

The Dcp2Cone change came first; the Dnlp2Smooth change followed once we'd refactored the structural-key helpers into a shared module. They share enough surface (the keying helpers, the _constant_key behavior for default-parameter Constants, the audit pattern for downstream placeholder-id assumptions) that splitting them would just create review churn.

Tests

New unit tests:

cvxpy/tests/test_dcp2cone_cse.py — 8 tests: scalar/vector norm1 dedup, distinct subtrees not merged, solve-matches-unduplicated, parameter subtree dedup, shared QuadForm solves correctly, quad-objective shared-subtree dedup, quad-objective cross-context (objective vs constraint) not merged.
cvxpy/tests/test_dnlp2smooth_cse.py — 7 tests: shared huber across obj/constraint, shared pnorm across two constraints, distinct subtrees not merged, parameter subtree dedup, constraint id preservation under dedup, per-apply cache isolation, dangling-aux sanity.

Existing tests run: full cvxpy/tests/nlp_tests/ (30 passed, 222 solver-skipped), test_qp_solvers, test_quad_form, test_quad_dpp, test_problem, test_atoms — all passing.

Downstream audits (no analog of the `replace_quad_forms` bug found)

NLP chain (cvxpy/reductions/solvers/nlp_solvers/diff_engine/): pure recursive tree walker, var/param dicts key on .id for lookup — that's the normal lookup pattern, unaffected by structural sharing.

Type of change

New feature (backwards compatible)
New feature (breaking API changes)
Bug fix
Other (Documentation, CI, ...)

Contribution checklist

Add our license to new files.
Check that your code adheres to our coding style.
Write unittests.
Run the unittests and check that they're passing.
Run the benchmarks to make sure your change doesn't introduce a regression.

Add a per-apply common-subexpression cache to Dcp2Cone so that structurally identical Expression subtrees share one canonicalized expression and one set of auxiliary constraints within a reduction pass. For cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm1(x) <= 1]) this collapses two epigraph variables and two pairs of abs-epigraph inequalities down to one of each. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Previously the cache embedded `Constant.value.tobytes()` in keys, which made the cache footprint scale with the problem's constant data (~8 MB for a 1000x500 LP). Switch Constant keying to object identity, which still deduplicates the common case of a shared Constant reference and brings cache size below ~16 KB across representative problems.

Also skip caching any subtree whose canonicalization went through the quad branch. Those canonicalizers emit SymbolicQuadForm markers that downstream code (replace_quad_forms, coeff_extractor) identifies by Python id and assumes are distinct per occurrence; sharing one across sites silently halves quadratic coefficients. Track this with a counter incremented on quad branches; don't cache if it advances under a node.

Release the cache at the end of apply() so it does not outlive the reduction.

Add a regression test using the QuadForm 0.5*qf + 0.5*qf pattern from test_qp_solvers.py::rep_quad_form.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

replace_quad_form previously gave the placeholder Variable the same id as the quad form it replaced. When the same SymbolicQuadForm appeared at multiple positions in the objective expression (e.g. via CSE, or via 0.5*qf + 0.5*qf with a shared qf), the placeholders collapsed onto a single row in get_var_offsets and the quadratic coefficient was halved. Mint a fresh placeholder id per replacement. quad_forms is keyed by placeholder id, so all downstream lookups continue to work; only the incidental identity quad_form.id == placeholder.id is dropped. With this in place, the CSE cache in Dcp2Cone can also dedup subtrees that go through the quad branch, so the _quad_canon_count guard is removed. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
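
The id collision described in this commit message can be reproduced in a deliberately simplified model. Everything here is hypothetical: `build_rows` and `quad_coefficient` are not CVXPY functions, and the "one row per id, last write wins" rule is an assumption of the toy. The source only states that the placeholders collapsed onto a single row and the coefficient was halved.

```python
import itertools

_fresh_ids = itertools.count(1000)  # stand-in for a fresh-id generator


def build_rows(terms, fresh_placeholder_ids):
    """Map placeholder id -> (coefficient, original quad form id).

    `terms` lists (coefficient, quad_form_id) pairs as they appear in the
    objective, e.g. 0.5*qf + 0.5*qf with a shared qf.
    """
    rows = {}
    for coeff, qf_id in terms:
        # Old behavior: the placeholder reuses the quad form's id.
        # New behavior: every replacement mints its own id.
        pid = next(_fresh_ids) if fresh_placeholder_ids else qf_id
        rows[pid] = (coeff, qf_id)  # toy assumption: one row per id
    return rows


def quad_coefficient(rows, qf_id):
    """Total coefficient recovered for one original quad form."""
    return sum(c for c, q in rows.values() if q == qf_id)


terms = [(0.5, 7), (0.5, 7)]  # the same quad form (id 7) at two positions
old = quad_coefficient(build_rows(terms, fresh_placeholder_ids=False), 7)
new = quad_coefficient(build_rows(terms, fresh_placeholder_ids=True), 7)
print(old, new)
```

Because the table is keyed by placeholder id and each entry still records the original quad form, lookups keep working after the ids diverge, which matches the commit's remark that only the incidental identity quad_form.id == placeholder.id is dropped.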

By the time coeff_extractor invokes this, Dcp2Cone has already rewritten every QuadForm into either a SymbolicQuadForm or sum_squares, so the isinstance check on QuadForm is defensive rather than load-bearing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

- Cache hits discard the stored constraints anyway, so store just the canonical expression. Keeps the per-apply working set smaller.
- Add two tests that exercise Dcp2Cone(quad_obj=True): one for shared quad_over_lin subtrees within the objective (dedup to a single SymbolicQuadForm), one for the same subtree appearing in objective and constraint (different canonicalizations kept distinct by the affine_above component of the cache key).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Moves the structural-key helpers from dcp2cone.py into a shared private module cvxpy/reductions/_cse.py so Dnlp2Smooth can reuse them, and wires an analogous per-apply cache into Dnlp2Smooth.canonicalize_tree. The NLP chain (Dnlp2Smooth -> NLP solver, via nlp_solving_chain.py) is not downstream of Dcp2Cone, so duplicate subtrees in DNLP problems were producing duplicate aux Variables and constraints; this PR fixes that the same way #3353 did for Dcp2Cone.

Also tightens _constant_key: small Constants (<= 64 elements) are now keyed by value rather than id. This catches the case where two structurally identical user expressions embed distinct Constant objects for default scalar parameters (e.g. each cp.huber(x) call mints a fresh Constant(0.5) for the default M), which would otherwise defeat the merge. Large arrays stay id-keyed to avoid copying problem data into cache keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

cvxpy uses bare acronyms as module names (psd.py, soc_canon.py, scs_conif.py, dcp2cone/, dgp2dcp/, dnlp2smooth/, etc.) and does not prefix internal modules with an underscore. cse (Common Subexpression Elimination) is descriptive enough on its own. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

github-actions · 2026-05-27T22:45:55Z

Benchmarks that have improved:

   before           after         ratio
 [6b637368]       [ca2037b6]

    904±0ms          784±0ms     0.87  matrix_stuffing.ConeMatrixStuffingBench.time_compile_problem

Benchmarks that have stayed the same:

   before           after         ratio
 [6b637368]       [ca2037b6]
     14.9±0ms         15.4±0ms     1.04  simple_QP_benchmarks.ParametrizedQPBenchmark.time_compile_problem
      964±0ms          994±0ms     1.03  gini_portfolio.Cajas.time_compile_problem
      2.59±0s          2.66±0s     1.03  quantum_hilbert_matrix.QuantumHilbertMatrix.time_compile_problem
      705±0ms          719±0ms     1.02  simple_QP_benchmarks.LeastSquares.time_compile_problem
      245±0ms          249±0ms     1.02  simple_QP_benchmarks.SimpleQPBenchmark.time_compile_problem
      274±0ms          278±0ms     1.01  slow_pruning_1668_benchmark.SlowPruningBenchmark.time_compile_problem
      1.79±0s          1.80±0s     1.01  simple_QP_benchmarks.UnconstrainedQP.time_compile_problem
      324±0ms          326±0ms     1.00  gini_portfolio.Yitzhaki.time_compile_problem
      12.9±0s          13.0±0s     1.00  finance.CVaRBenchmark.time_compile_problem
     14.8±0ms         14.9±0ms     1.00  simple_LP_benchmarks.SimpleFullyParametrizedLPBenchmark.time_compile_problem
      887±0ms          889±0ms     1.00  simple_LP_benchmarks.SimpleScalarParametrizedLPBenchmark.time_compile_problem
      9.92±0s          9.93±0s     1.00  simple_LP_benchmarks.SimpleLPBenchmark.time_compile_problem
      518±0ms          518±0ms     1.00  semidefinite_programming.SemidefiniteProgramming.time_compile_problem
      293±0ms          292±0ms     1.00  matrix_stuffing.ParamSmallMatrixStuffing.time_compile_problem
      21.2±0s          21.1±0s     1.00  sdp_segfault_1132_benchmark.SDPSegfault1132Benchmark.time_compile_problem
      982±0ms          979±0ms     1.00  finance.FactorCovarianceModel.time_compile_problem
      235±0ms          233±0ms     0.99  gini_portfolio.Murray.time_compile_problem
      1.48±0s          1.47±0s     0.99  matrix_stuffing.ParamConeMatrixStuffing.time_compile_problem
      5.59±0s          5.54±0s     0.99  optimal_advertising.OptimalAdvertising.time_compile_problem
      1.56±0s          1.55±0s     0.99  tv_inpainting.TvInpainting.time_compile_problem
      4.49±0s          4.44±0s     0.99  svm_l1_regularization.SVMWithL1Regularization.time_compile_problem
     49.6±0ms         48.7±0ms     0.98  matrix_stuffing.SmallMatrixStuffing.time_compile_problem
     23.7±0ms         22.8±0ms     0.96  high_dim_convex_plasticity.ConvexPlasticity.time_compile_problem
      4.31±0s          3.93±0s     0.91  huber_regression.HuberRegression.time_compile_problem

'CSE' is compiler-optimization jargon and not widely understood outside that niche. 'subexpression cache' describes what the module supports in terms that don't require CS expertise. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Two small fixes for PR #3355:

1. PR review nit: expr_key's docstring still claimed "Constants key by object identity," but small Constants (<= 64 elements) now key by value; updated to reflect both branches.
2. CI: test_copt_mi_socp_1 fails by ~7e-5 on the CSE-deduplicated formulation. The continuous SOCP relaxation (verified at high precision via CLARABEL) sits at x[0] = -0.78510, while COPT lands at -0.78503 with CSE -- both within typical MI-SOCP precision, but the test's hardcoded -0.78510265 expects 4 decimals. MOSEK, CPLEX, and SCIP already use places=3 on this same test for the same tolerance reason; align COPT.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Matches the existing MOSEK/CPLEX/SCIP places=3 lines, which carry no comment. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Constant.__init__ stores float64 ndarrays by reference (no copy in ndarray_interface.const_to_matrix), so two cp.Constant(arr) wrappers around the same source ndarray share _value. Keying on id(expr.value) in that case catches the dedup without copying bytes into the cache key. Restricted to float64 ndarrays because other dtypes go through astype(float64) (which copies) or scipy sparse's csc_array constructor (which builds a fresh wrapper), so the id-of-underlying branch only fires when sharing is real. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

SteveDiamond · 2026-06-04T21:59:39Z

This generally looks really good. @Transurgeon can you review the changes to DNLP?

SteveDiamond

Reviewed with differential testing of shared-vs-unshared canonicalization across many atoms/solvers — the CSE correctness looks solid (the DAG-mutation risk is correctly defended by the fresh-placeholder-id fix, and the _affine_above_relevant key logic mirrors canonicalize_tree). A few inline notes below; the main one is the O(N²) canonicalization regression on deep expression trees.

SteveDiamond · 2026-06-04T22:00:01Z

+    """
+
+
+def expr_key(expr):


Performance: O(N²) canonicalization regression on deep trees.

expr_key recursively walks the entire subtree to build a key, and canonicalize_tree calls it at every node with no memoization → O(N²) where canonicalization was O(N). The key tuples are themselves O(subtree)-sized, so per-node hashing/storage is also O(subtree) → O(N²) memory.

Measured on a depth-N cp.abs(-(-(…-x))) chain (toggling only the cache-key construction):

| depth | master | this PR | slowdown |
|------:|-------:|--------:|---------:|
| 100 | 0.72 ms | 3.79 ms | 5× |
| 400 | 3.11 ms | 68.7 ms | 22× |
| 800 | 7.36 ms | 311 ms | 42× |

Master scales ×2 per doubling (linear); this PR scales ×4 (quadratic), and the factor widens with N. Flat n-ary sums are unaffected (CVXPY flattens associative ops) — it bites genuinely deep trees: nested abs/reshape/neg/index chains.

Suggested fix: thread a per-apply {id(expr): key} memo through expr_key so each node's key is built once and child keys are reused (one bottom-up O(N) pass). Same change applies to the Dnlp2Smooth call site.
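
A minimal sketch of that memo on a toy tree, to make the visit counts concrete. `Node`, `chain`, and `KeyMemo` are hypothetical names, not the PR's code. Interning each node's signature to a small integer is an extra choice made here so that every key stays O(1) in size; the review comment itself only asks for the {id(expr): key} memo.

```python
class Node:
    """Toy expression node; `ident` stands in for a leaf Variable id."""

    def __init__(self, op, args=(), ident=None):
        self.op, self.args, self.ident = op, tuple(args), ident


def chain(depth):
    """Build neg(neg(...neg(x))) with `depth` unary nodes above one leaf."""
    node = Node("var", ident=0)
    for _ in range(depth):
        node = Node("neg", [node])
    return node


def all_nodes(root):
    """List every node of a tree (iteratively, to spare the call stack)."""
    stack, seen = [root], []
    while stack:
        node = stack.pop()
        seen.append(node)
        stack.extend(node.args)
    return seen


def naive_visits(root):
    """Rebuild the full nested-tuple key at every node and count node visits."""
    visits = 0

    def key(node):
        nonlocal visits
        visits += 1
        return (node.op, node.ident, tuple(key(a) for a in node.args))

    for node in all_nodes(root):
        key(node)
    return visits


class KeyMemo:
    """Per-pass memo: each node is keyed once and child keys are reused."""

    def __init__(self):
        self._by_id = {}     # id(node) -> int key; nodes must outlive the memo
        self._interned = {}  # (op, ident, child int keys) -> int key
        self.visits = 0

    def key(self, node):
        k = self._by_id.get(id(node))
        if k is None:
            self.visits += 1
            sig = (node.op, node.ident, tuple(self.key(a) for a in node.args))
            k = self._interned.setdefault(sig, len(self._interned))
            self._by_id[id(node)] = k
        return k


root = chain(100)
memo = KeyMemo()
for node in all_nodes(root):
    memo.key(node)
print(naive_visits(root), memo.visits)  # 5151 101
```

On the depth-100 chain the naive scheme touches 5,151 nodes (1 + 2 + ... + 101) while the memoized pass touches each of the 101 nodes once. Structurally identical trees built from different objects still receive the same integer key, because the interning table is keyed on structure rather than identity.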

SteveDiamond · 2026-06-04T22:00:01Z

+            return (structural, bool(affine_above))
+        return (structural, None)
+
+    def _affine_above_relevant(self, expr) -> bool:


This is a second, independent O(N²) whole-subtree walk: _make_cache_key calls _affine_above_relevant at every node (when quad_obj=True), and it recurses over the full subtree each time. Instrumented call count tracks ~N²/2 (5,155 calls at depth 100; 80,605 at depth 400).

It can fold into the same bottom-up pass suggested for expr_key: a node is relevant iff it is quad-eligible or any child is relevant — O(1) amortized per node.
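
One way to picture that bottom-up rule is a single post-order pass that stores one flag per node. This is a sketch under assumptions: `Node`, `relevance_flags`, and the one-element `QUAD_ELIGIBLE` set are hypothetical, and the real eligibility test lives in Dcp2Cone. `make_cache_key` mirrors the two return statements in the diff hunk above.

```python
QUAD_ELIGIBLE = {"quad_over_lin"}  # illustrative placeholder set only


class Node:
    """Toy expression node."""

    def __init__(self, op, args=(), ident=None):
        self.op, self.args, self.ident = op, tuple(args), ident


def relevance_flags(root):
    """flags[id(n)] is True iff n or any descendant of n is quad-eligible.

    Iterative post-order: each node is evaluated once, after its children.
    """
    flags = {}
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        if id(node) in flags:
            continue  # shared subtree that was already evaluated
        if children_done:
            flags[id(node)] = node.op in QUAD_ELIGIBLE or any(
                flags[id(a)] for a in node.args
            )
        else:
            stack.append((node, True))
            stack.extend((a, False) for a in node.args)
    return flags


def make_cache_key(structural, node, affine_above, flags):
    """Include affine_above only where it can change the result."""
    if flags[id(node)]:
        return (structural, bool(affine_above))
    return (structural, None)


x = Node("var", ident=0)
plain = Node("abs", [Node("neg", [x])])
quad = Node("quad_over_lin", [x])
root = Node("add", [plain, quad])
flags = relevance_flags(root)
print(flags[id(plain)], flags[id(quad)], flags[id(root)])  # False True True
```

With the flags precomputed, the cache key for a cone-only subtree is the same regardless of affine_above, while a subtree that can reach the quad branch gets one key per context.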

SteveDiamond · 2026-06-04T22:00:01Z

+        cache_key = None
+        if isinstance(expr, Expression):
+            try:
+                cache_key = expr_key(expr)


Same O(N²) pattern as Dcp2Cone: expr_key(expr) is recomputed from scratch at every node with no reuse of child keys already built. The memoization fix (per-apply {id(expr): key} memo) should be applied here too.

SteveDiamond · 2026-06-04T22:00:01Z

+    except (TypeError, ValueError):
+        return ("const", id(expr))
+    if arr.size <= _CONSTANT_VALUE_HASH_MAX_SIZE:
+        return ("const-val", arr.shape, str(arr.dtype), arr.tobytes())


Sparse Constants are silently mis-keyed here. CVXPY Constants frequently wrap SciPy sparse matrices, and np.asarray(sparse) returns a 0-dim object array of size 1 — which passes the arr.size <= 64 guard. arr.tobytes() then returns the 8 bytes of id(value) (verified), not the contents, under the 'const-val' label, with the real shape recorded as ().

This is not a correctness bug — within one apply() all Constants are live, so ids are unique and no false merge can occur — but it (a) defeats CSE entirely for every sparse-constant subtree and (b) is a fragile footgun: a "by-value" path that actually encodes a transient pointer and discards shape.

Suggest guarding arr.dtype != object before the tobytes() value branch, and handling sparse explicitly (e.g. key on (data.tobytes(), indices.tobytes(), indptr.tobytes(), shape)), falling back to id(expr) (the wrapper, kept alive by the problem tree) rather than id(value).
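
The object-dtype guard might look like the sketch below. It is a stand-in, not the PR's `_constant_key`: the `constant_key(const, value)` signature and the `Opaque` class are hypothetical, the 64-element threshold is taken from the PR description, and the explicit sparse branch suggested above is left out so that the example depends only on NumPy.

```python
import numpy as np

MAX_VALUE_KEY_SIZE = 64  # threshold quoted in the PR description


def constant_key(const, value):
    """Key a constant by value when that is cheap and meaningful.

    `const` is the wrapper object (kept alive by the problem tree) and
    `value` is its payload. Falls back to the wrapper's identity otherwise.
    """
    try:
        arr = np.asarray(value)
    except (TypeError, ValueError):
        return ("const", id(const))
    if arr.dtype == object:
        # An opaque payload lands in a 0-dim object array; its tobytes()
        # would encode a pointer, not contents, so never key it by value.
        return ("const", id(const))
    if arr.size <= MAX_VALUE_KEY_SIZE:
        return ("const-val", arr.shape, str(arr.dtype), arr.tobytes())
    return ("const", id(const))


class Opaque:
    """Stands in for a payload NumPy cannot convert to a numeric array."""


a, b = object(), object()  # two distinct wrapper objects
print(constant_key(a, 0.5) == constant_key(b, 0.5))  # True
print(constant_key(a, Opaque())[0])                  # const
```

Small numeric payloads merge across wrappers, while opaque and large payloads stay keyed on the wrapper, so a by-value label is never attached to pointer bytes.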

SteveDiamond · 2026-06-04T22:00:01Z

+        # and user constraints are intentionally excluded so their IDs flow
+        # through to inverse_data unchanged.
+        cache_key = None
+        if isinstance(expr, Expression):


Minor / latent: Dcp2Cone explicitly excludes partial_problem from the cache (so embedded constraint ids flow through unchanged), but here the eligibility check is a bare isinstance(expr, Expression) — and PartialProblem is an Expression. It's currently latent (the NLP path doesn't support partial_optimize and would fail earlier), but it's an inconsistency that becomes a real id-collapse trap if NLP partial_optimize support is ever added. Worth mirroring the Dcp2Cone guard now.

SteveDiamond · 2026-06-04T22:00:01Z

    quad_form = expr.args[idx]
-    placeholder = Variable(quad_form.shape,
-                           var_id=quad_form.id)
+    placeholder = Variable(quad_form.shape)


The fresh-id change is correct. One trivial follow-up: the comment at coeff_extractor.py:131 ("var_id is the placeholder's ID (= the SymbolicQuadForm's ID)") is now stale — the placeholder no longer shares the quad form's id. The logic is unaffected (orig_id is recomputed from quad_forms[var_id][2].args[0].id), but updating that comment in this PR would prevent someone from re-introducing the var_id=quad_form.id assumption.

PTNobel and others added 7 commits May 27, 2026 13:45

Rename cse.py to subexpr_cache.py

ea92426


claude Bot reviewed May 27, 2026


Comment thread cvxpy/reductions/subexpr_cache.py Outdated

PTNobel changed the base branch from ptn/dcp2cone-cse to master May 27, 2026 22:48

PTNobel changed the title ~~Dedup identical subtrees in Dnlp2Smooth canonicalization~~ Dedup identical subtrees in Dcp2Cone and Dnlp2Smooth canonicalization May 27, 2026

PTNobel mentioned this pull request May 27, 2026

Dedup identical subtrees in Dcp2Cone canonicalization #3353

Closed

11 tasks

PTNobel and others added 3 commits May 27, 2026 15:56

Drop explanatory comment on COPT MI-SOCP tolerance

5fafa4d


SteveDiamond added the PR backport needed label Jun 4, 2026

SteveDiamond reviewed Jun 4, 2026


Dedup identical subtrees in Dcp2Cone and Dnlp2Smooth canonicalization #3355

PTNobel wants to merge 11 commits into master from ptn/dnlp2smooth-cse
