Integration test for simple example #45


Draft
wants to merge 2 commits into base: wgraham/causalproblem-constraints-fn

Conversation

willGraham01
Collaborator

No description provided.

@willGraham01
Collaborator Author

Writing myself a comment here so I don't forget (may transfer to an issue).

A problem that we're now hitting in the "optimisation" step is that the randomness we introduce when drawing samples / computing expectations upsets gradient-based methods. This is coupled with the fact that jax apparently cannot do constrained optimisation natively (they recommend jaxopt, which is a pre-v1 package).
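
As a throwaway illustration of the sampling issue (a standalone jax sketch with a hypothetical `mc_expectation` helper, not our actual estimand code): evaluating a Monte Carlo expectation at the same parameter value under two different PRNG keys returns two different values, and two different gradients.

```python
# Sketch only: Monte Carlo estimate of E[X**2] for X ~ Normal(mu, 1).
# Two evaluations at the same mu, under different keys, disagree - and so do
# their gradients - which is exactly what upsets a gradient-based optimiser.
import jax
import jax.numpy as jnp


def mc_expectation(mu, key, n_samples=1_000):
    x = mu + jax.random.normal(key, (n_samples,))  # reparameterised draws from N(mu, 1)
    return jnp.mean(x**2)


mu = 1.0
key1, key2 = jax.random.split(jax.random.PRNGKey(0))
print(mc_expectation(mu, key1), mc_expectation(mu, key2))  # two different values at the same mu
print(jax.grad(mc_expectation)(mu, key1), jax.grad(mc_expectation)(mu, key2))  # two different gradients
```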

Our problem - at least analytically - is deterministic: find parameters $\Theta$ that maximise (or minimise) some causal estimand $\sigma(\Theta)$ subject to some constraints $\phi(\Theta)$ being within $\epsilon$-tolerance of some observed data $\hat{\phi}$. Note that this is NOT Maximum Likelihood Estimation (we don't want to find the $\Theta$ that maximises the chances of observing $\hat{\phi}$; we want to optimise $\sigma$ whilst keeping $\hat{\phi}$ an $\epsilon$-viable outcome). Note also that $\sigma$ and $\phi$ have well-defined $\Theta$-gradients (Jacobians) and Hessians, again at least analytically.
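
Writing the above out as a single program (reading the $\epsilon$-tolerance as a norm constraint, which is just one possible formalisation):

$$
\max_{\Theta} \ (\text{or } \min_{\Theta}) \ \sigma(\Theta)
\quad \text{subject to} \quad
\lVert \phi(\Theta) - \hat{\phi} \rVert \le \epsilon .
$$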

The difficulty is that we currently do not have a way to evaluate $\sigma$ or $\phi$ analytically. Right now we rely on methods that involve drawing samples from the underlying distributions and estimating quantities like the expectation and variance. This means that evaluating these functions twice at the same $\Theta$ can return different values, which in turn messes up things like the gradient (if it can be inferred at all). As such, attempting to actually solve one of the minimisation problems more often than not results in non-convergence / nonsensical outputs, even when the solver is given the analytic answer as the starting point.

I'm not sure what's out there to help us combat this. If we had access to the CDFs of the underlying distributions, I think our issues would be solved (or even just the PDFs - we'd have to numerically integrate them, but it wouldn't be too bad). Basically, using random samples as the sole basis for evaluating expectations and the like is coming back to bite us.
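
For concreteness, here is what the PDF route could look like in the same toy setting as above (again a hedged sketch with a hypothetical `quad_expectation` helper, assuming a Normal(mu, 1) node and a fixed quadrature grid - not our actual API). The point is that the expectation becomes a deterministic, differentiable function of the parameter:

```python
# Sketch only: E[X**2] for X ~ Normal(mu, 1) via trapezoidal quadrature against
# the PDF. Repeated calls at the same mu return identical values, and jax.grad
# gives a stable gradient, with no sampling involved.
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm


def quad_expectation(mu, n_points=2_001, width=8.0):
    x = jnp.linspace(mu - width, mu + width, n_points)  # grid covering the bulk of the mass
    integrand = (x**2) * norm.pdf(x, loc=mu, scale=1.0)
    dx = x[1] - x[0]
    return jnp.sum((integrand[:-1] + integrand[1:]) * dx / 2.0)  # trapezoidal rule


mu = 1.0
print(quad_expectation(mu))            # ~2.0 (= mu**2 + 1), identical on every call
print(jax.grad(quad_expectation)(mu))  # ~2.0 (= 2*mu), stable across calls
```

The obvious catch is that this needs the PDFs (or CDFs) of the relevant nodes in closed form.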

Reinstate the in-development integration test.

This reverts commit 3114896.
willGraham01 changed the base branch from main to wgraham/causalproblem-constraints-fn on May 2, 2025, 09:42
@willGraham01
Collaborator Author

willGraham01 commented May 6, 2025

Further to the above point, swapping the uncommented lines below for their commented-out equivalents causes the optimisation to break, which at least confirms that it is something to do with the randomness (even fixed randomness) that is upsetting the optimiser.

```python
cp = CausalProblem(graph, label="CP")
cp.set_causal_estimand(
    expectation_with_n_samples(),
    # rvs_to_nodes={"rv": "y"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
cp.set_constraints(
    expectation_with_n_samples(),
    # rvs_to_nodes={"rv": "x"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
```

Edit 2: We can in fact have `rvs_to_nodes={"rv": "y"}` in the first call, but not `rvs_to_nodes={"rv": "x"}` in the second. Having `rvs_to_nodes={"rv": "x"}` in the second call always causes non-convergence.
