`PERF: Optimize VECM memory/speed by avoiding O(T^2) projection matrix` #9720

lianghx123 · 2025-12-25T03:43:08Z

Title:
PERF: Optimize VECM memory/speed by avoiding O(T^2) projection matrix

Description:

Problem
The current implementation of _r_matrices in VECM explicitly constructs the projection/annihilator matrix $M$:
$$ M = I_T - X' (X X')^{-1} X $$
This matrix has dimensions $(T \times T)$.

Memory: For $T=30,000$, $M$ requires ~7.2 GB of RAM (float64), often leading to MemoryError or severe swapping.
Performance: The subsequent matrix multiplication is dominated by $O(T^2)$ operations.
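
To make the memory figure concrete, the footprint of a dense $(T \times T)$ float64 matrix can be estimated directly. This is a back-of-the-envelope sketch; the helper name `dense_square_bytes` is illustrative and not part of statsmodels.

```python
import numpy as np


def dense_square_bytes(nobs, dtype=np.float64):
    """Bytes needed to hold a dense (nobs x nobs) array of the given dtype."""
    return nobs * nobs * np.dtype(dtype).itemsize


# Sample sizes taken from the benchmark scenarios in this PR.
for nobs in (600, 5_000, 12_000, 30_000):
    gigabytes = dense_square_bytes(nobs) / 1e9
    print(f"T={nobs:>6}: {gigabytes:.3f} GB for M alone")
```

This counts only the storage of $M$ itself; temporaries created while building it would add to the peak.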

Solution
Refactored _r_matrices to calculate residuals directly using the associativity of matrix multiplication, without forming $M$:
$$ R = Y - (Y X') (X X')^{-1} X $$
With $Y$ of shape $(K \times T)$ and $X$ of shape $(N \times T)$, this reduces the memory complexity to $O(T)$ (for storing inputs/outputs) and makes the computational cost linear in $T$, since the intermediate matrices are now $(K \times N)$ or $(N \times N)$, where $N, K \ll T$.
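
The idea can be sketched in NumPy as follows. This is a minimal illustration, not the code in this PR: the function names `residuals_naive` and `residuals_direct` are hypothetical, the inputs are random, and the direct version uses `np.linalg.solve` in place of an explicit inverse (an assumption made here for numerical hygiene, not something the PR states).

```python
import numpy as np


def residuals_naive(y, x):
    """Form the (T x T) annihilator M explicitly, then compute Y M."""
    nobs = x.shape[1]
    m = np.identity(nobs) - x.T @ np.linalg.inv(x @ x.T) @ x
    return y @ m


def residuals_direct(y, x):
    """Compute Y - (Y X') (X X')^{-1} X without ever forming M."""
    yxt = y @ x.T  # (K x N)
    xxt = x @ x.T  # (N x N), symmetric
    # Solve (X X') C' = (Y X')' instead of inverting X X' explicitly;
    # C = (Y X') (X X')^{-1} has shape (K x N).
    coef = np.linalg.solve(xxt, yxt.T).T
    return y - coef @ x


# Small random example: K variables, N regressors, T observations.
rng = np.random.default_rng(0)
K, N, T = 3, 4, 500
y = rng.standard_normal((K, T))
x = rng.standard_normal((N, T))

r_naive = residuals_naive(y, x)
r_direct = residuals_direct(y, x)
max_diff = np.max(np.abs(r_naive - r_direct))
print("max abs difference:", max_diff)
```

In the direct version the largest temporary has shape $(K \times T)$, the same size as the output.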

Benchmark Results
Comparison against the current main branch across various scenarios (using time.time()).
Significant speedups are observed as $T$ increases, with ~113x speedup for $T=30,000$.

| Scenario | Nobs | Variables | Lags | Time Original (s) | Time Optimized (s) | Speedup | Max Coef Diff (gamma) |
|---|---|---|---|---|---|---|---|
| short_nc | 600 | 3 | 1 | 0.0050 | 0.0010 | 5.0x | 2.8e-17 |
| large_nc | 5,000 | 6 | 2 | 0.2923 | 0.0100 | 29.2x | 2.4e-17 |
| high_lag | 12,000 | 5 | 3 | 1.0203 | 0.0190 | 53.7x | 1.3e-17 |
| dim_rank4 | 30,000 | 6 | 4 | 6.0690 | 0.0535 | 113.4x | 1.6e-17 |

Click to see full comprehensive benchmark table

The following table compares the original implementation vs. this PR.
Columns prefixed with z_ (z_alpha, z_beta, z_gamma) are differences in the corresponding t-stats. All differences are at the level of float64 rounding noise caused by the changed order of operations.

| scenario | nobs | vars | k_ar_diff | rank | det | seasons | exog_cols | t_orig_s | t_new_s | speedup_x | gamma_max_abs_diff | alpha_max_abs_diff | beta_max_abs_diff | llf_abs_diff | sigma_u_max_abs_diff | z_alpha_max_abs_diff | z_beta_max_abs_diff | z_gamma_max_abs_diff | aic_abs_diff | bic_abs_diff | hqic_abs_diff | fpe_abs_diff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| short_nc | 600 | 3 | 1 | 1 | nc | 0 | 0 | 0.005 | 0.001 | 5 | 2.77556e-17 | 1.04083e-17 | 4.44089e-15 | 0 | 1.11022e-16 | 3.9968e-15 | 1.15463e-14 | 6.66134e-16 | 0 | 0 | 0 | 0 |
| seasonal_rank_max | 2400 | 4 | 2 | 3 | co | 4 | 0 | 0.0468 | 0.0035 | 13.34 | 1.40946e-17 | 7.04731e-18 | 4.44089e-16 | 0 | 1.73472e-17 | 2.9976e-15 | 1.9984e-15 | 8.88178e-16 | 0 | 0 | 0 | 0 |
| trend_exog | 6000 | 3 | 2 | 1 | ci | 0 | 2 | 0.2622 | 0.0095 | 27.56 | 8.67362e-18 | 2.60209e-18 | 1.33227e-15 | 3.63798e-12 | 2.22045e-16 | 8.88178e-15 | 7.99361e-15 | 6.66134e-16 | 1.77636e-15 | 1.77636e-15 | 1.77636e-15 | 8.18545e-12 |
| medium_ci | 8000 | 4 | 1 | 1 | ci | 0 | 0 | 0.5024 | 0.0105 | 47.82 | 6.93889e-18 | 3.71339e-18 | 8.43769e-14 | 0 | 2.22045e-16 | 8.88178e-15 | 9.32587e-15 | 6.66134e-16 | 0 | 0 | 0 | 0 |
| high_lag | 12000 | 5 | 3 | 2 | co | 0 | 1 | 1.0203 | 0.019 | 53.7 | 1.30104e-17 | 5.3668e-18 | 6.21725e-15 | 1.45519e-11 | 4.44089e-16 | 1.5099e-14 | 1.33227e-14 | 1.44329e-15 | 1.77636e-15 | 1.77636e-15 | 1.77636e-15 | 2.32831e-09 |
| broad_vars10 | 25000 | 10 | 2 | 5 | co | 0 | 2 | 3.9081 | 0.0576 | 67.83 | 1.9082e-17 | 9.05309e-18 | 2.17604e-14 | 0 | 2.22045e-16 | 6.68354e-14 | 1.1191e-13 | 3.10862e-15 | 0 | 0 | 0 | 0 |
| large_nc | 5000 | 6 | 2 | 2 | nc | 0 | 0 | 0.2923 | 0.01 | 29.23 | 2.42861e-17 | 3.14419e-17 | 6.12843e-14 | 0 | 3.33067e-16 | 3.01981e-14 | 3.09086e-13 | 1.9984e-15 | 0 | 0 | 0 | 0 |
| dim_rank4 | 30000 | 6 | 4 | 4 | co | 0 | 0 | 6.069 | 0.0535 | 113.42 | 1.56125e-17 | 1.31188e-17 | 5.68434e-13 | 0 | 1.11022e-16 | 1.6942e-12 | 1.01696e-13 | 3.10862e-15 | 0 | 0 | 0 | 0 |

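Numerical Check: Differences in gamma and alpha are $< 10^{-16}$, and differences in beta are $< 10^{-12}$ (largest: 5.7e-13 for dim_rank4). LLF differences are at most about $1.5 \times 10^{-11}$ (acceptable for large cumulative sums).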

Mathematical Equivalence
The change relies on the identity $Y M = Y (I_T - P) = Y - YP$, where $P = X' (X X')^{-1} X$ is the projection matrix onto the row space of $X$ (the regressor matrix, written $\Delta X$ in the VECM notation).
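
Written out step by step, assuming $Y$ has shape $(K \times T)$ and $X$ has shape $(N \times T)$ as in the Solution section, only the bracketing of the product changes:

$$ Y M = Y \left( I_T - X' (X X')^{-1} X \right) = Y - Y X' (X X')^{-1} X = Y - \left[ (Y X') (X X')^{-1} \right] X $$

The bracketed factor is the $(K \times N)$ least-squares coefficient matrix from regressing $Y$ on $X$, so $R$ is the ordinary OLS residual. In exact arithmetic the two sides are equal; in floating point they differ only by rounding, because the operations are carried out in a different order.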

PERF: Optimize VECM memory/speed by avoiding O(T^2) projection matrix

80f8c10
