Integration test for simple example #45


Draft - wants to merge 11 commits into main
Conversation

willGraham01 (Collaborator)

No description provided.

@willGraham01 (Collaborator, Author)

Writing myself a comment here so I don't forget (may transfer to an issue).

A problem that we're now hitting in the "optimisation" step is that the randomness we introduce when drawing samples / computing expectations upsets gradient-based methods. This is coupled with the fact that jax apparently cannot do constrained optimisation (they recommend jaxopt, which is a pre-v1 package).

Our problem - at least analytically - is deterministic: find parameters $\Theta$ that maximise some causal estimand $\sigma(\Theta)$ subject to some constraints $\phi(\Theta)$ being within $\epsilon$-tolerance of some observed data $\hat{\phi}$. Note that this is NOT Maximum Likelihood Estimation (we don't want to find the $\Theta$ that maximises the chances of observing $\hat{\phi}$; we want to maximise $\sigma$ whilst keeping $\hat{\phi}$ an $\epsilon$-viable outcome). Note also that $\sigma$ and $\phi$ have well-defined $\Theta$-gradients (Jacobians) and Hessians, again at least analytically.
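Written out explicitly (my notation, assuming a norm-based tolerance on the constraints), the problem reads:

```latex
\max_{\Theta} \; \sigma(\Theta)
\quad \text{subject to} \quad
\left\| \phi(\Theta) - \hat{\phi} \right\| \leq \epsilon .
```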

The difficulty is that we currently have no way to evaluate $\sigma$ or $\phi$ analytically. Right now we rely on methods that draw samples from the underlying distributions and estimate quantities like the expectation and variance. This means that evaluating these functions at the same $\Theta$ can return different values, which then corrupts anything inferred from them, such as the gradient. As such, attempting to solve one of the minimisation problems more often than not results in non-convergence / nonsensical outputs, even when the solver is given the analytic answer as the starting point.

Not sure what's out there to help us combat this. If we had access to the CDFs of the distributions, I think our issues would be solved (or even the PDFs - we'd have to numerically integrate them, but that wouldn't be too bad). Basically, using random samples as the sole basis for evaluating expectations and the like is coming back to bite us.
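To make the failure mode concrete, here's a minimal JAX sketch (a toy estimator, not the package's own code) showing that a sample-based expectation is not a deterministic function of $\Theta$ when the draws differ between calls:

```python
import jax
import jax.numpy as jnp

def mc_expectation(theta, key, n=1000):
    # Monte Carlo estimate of E[X] for X ~ Normal(theta, 1).
    samples = theta + jax.random.normal(key, (n,))
    return jnp.mean(samples)

theta = 2.0
k1, k2 = jax.random.split(jax.random.key(0))
e1 = mc_expectation(theta, k1)
e2 = mc_expectation(theta, k2)
# Same theta, different draws: the two estimates differ by O(1/sqrt(n)),
# so any gradient inferred from separate evaluations is noise-dominated.
```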

@willGraham01 willGraham01 changed the base branch from main to wgraham/causalproblem-constraints-fn May 2, 2025 09:42

willGraham01 commented May 6, 2025

Further to the above point, swapping the un-commented lines below for their commented equivalents causes the optimisation to break, which at least confirms that it is the randomness (even fixed randomness) that is upsetting the optimiser.

```python
cp = CausalProblem(graph, label="CP")
cp.set_causal_estimand(
    expectation_with_n_samples(),
    # rvs_to_nodes={"rv": "y"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
cp.set_constraints(
    expectation_with_n_samples(),
    # rvs_to_nodes={"rv": "x"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
```

Edit 2: We can in fact have `rvs_to_nodes={"rv": "y"}` in the first call, but not `rvs_to_nodes={"rv": "x"}` in the second; having `rvs_to_nodes={"rv": "x"}` in the second always causes non-convergence.
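For what it's worth, a minimal sketch of the fix implied by the "fixed randomness" observation above: draw the noise once and re-use it across evaluations (common random numbers / the reparametrisation trick), so the estimator becomes a deterministic, smooth function of $\Theta$. Names here are illustrative, not the package's API:

```python
import jax
import jax.numpy as jnp

def expectation_fixed_noise(theta, eps):
    # Reparametrisation: X = theta + eps, with eps ~ N(0, 1) drawn once up front.
    # Re-using the same eps makes this deterministic and differentiable in theta.
    return jnp.mean(theta + eps)

eps = jax.random.normal(jax.random.key(0), (10_000,))
grad = jax.grad(expectation_fixed_noise)(1.5, eps)
# d/dtheta of mean(theta + eps) is exactly 1, regardless of the drawn eps.
```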

Base automatically changed from wgraham/causalproblem-constraints-fn to main May 7, 2025 10:03

willGraham01 commented May 7, 2025

Further to the above, some hacked-together testing of the effect of providing:

  • More samples
  • The Jacobian of the causal estimand, the constraints function, both, or neither

[Figure: 2025-05-07-11:02_results]

All experiments done at a fixed RNG key (jax.random.key(0)) for the 2-normal-distribution problem.

It should be noted that in all cases in this experiment, providing the Jacobian of only the objective function resulted in non-convergence, which I'm chalking up to the analytic Jacobian being at odds with the randomness introduced in the function evaluations and/or the constraint evaluations. The "no Jacobians" and "constraints Jacobian only" cases also always returned the initial guess, as can be seen from the left-hand plot. This means that only the "provide both Jacobians" method was actually doing anything useful.

Otherwise, beyond some beneficial RNG at certain sample sizes, it looks like we can expect the error to decay roughly as the square root of the number of samples, all other factors being equal. Computation time appears approximately linear, but this is likely only because the re-parametrisation trick with normal distributions relies solely on element-wise multiplication. For sample sizes above $10^8$ my laptop runs out of memory.
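The square-root decay is just the standard Monte Carlo rate; a quick sanity check (toy standard-normal distribution, not the 2-normal problem itself):

```python
import jax
import jax.numpy as jnp

# Absolute error of the Monte Carlo mean of N(0, 1) at increasing sample sizes.
key = jax.random.key(0)
errs = []
for n in (10**2, 10**4, 10**6):
    key, sub = jax.random.split(key)
    errs.append(float(jnp.abs(jnp.mean(jax.random.normal(sub, (n,))))))
# Each 100x increase in samples shrinks the error by roughly 10x, i.e. O(1/sqrt(n)).
```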

Most important takeaway: a reliable Jacobian evaluation for both the constraints and the objective function is pretty much a requirement.

Some Further Thoughts

  • Wondering what the effect of providing the Hessian would be.
  • Wondering if it is possible to vectorise the causal_estimand and constraint evaluations? (Beyond the scope of this PR, and has internal difficulties given the current handling of parameter values.)
  • Wondering if it is possible to encode the Jacobian and/or Hessian into the Distribution class, so they can potentially be "pre-constructed" by a CausalProblem instance when the causal_estimand is set? Related to Sampling via Parametrisations #50
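On the Jacobian/Hessian point above: once the estimand is written in reparametrised (fixed-noise) form, jax.jacobian and jax.hessian can already produce both objects, which is the sort of thing a CausalProblem instance could pre-construct when the causal_estimand is set. A toy sketch (my names and toy estimand, not the package's API):

```python
import jax
import jax.numpy as jnp

def estimand(theta, eps):
    # Toy estimand: Monte Carlo mean of a nonlinear transform,
    # reparametrised so the noise eps is fixed across evaluations.
    return jnp.mean((theta[0] + theta[1] * eps) ** 2)

eps = jax.random.normal(jax.random.key(0), (10_000,))
theta = jnp.array([1.0, 2.0])
jac = jax.jacobian(estimand)(theta, eps)   # shape (2,)
hess = jax.hessian(estimand)(theta, eps)   # shape (2, 2)
```

Since both are built from the same traced function, they stay consistent with whatever fixed noise the estimator uses.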
