Integration test for simple example #45
Conversation
Writing myself a comment here so I don't forget (may transfer to an issue). A problem that we're now hitting in the "optimisation" step is that the randomness we introduce when drawing samples / computing expectations causes gradient methods to be upset. This is coupled with the fact that our problem, at least analytically, is deterministic: find parameters for which the (deterministic) estimand and constraints take their target values. The difficulty is that we currently do not have a way to evaluate these expectations deterministically.

Not sure what's out there to help us combat this. If we had access to the CDFs of the distributions, I think our issues would be solved (or even the PDFs, maybe; we'd have to numerically integrate them, sure, but it wouldn't be too bad). Basically, using random samples as the sole basis for evaluating expectations and the like is coming back to bite us.
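As a rough illustration of the PDF idea, here is a minimal sketch (hypothetical names, assuming a JAX stack and a scalar normal distribution, not the library's actual API) contrasting a Monte Carlo estimate of an expectation with a deterministic estimate obtained by numerically integrating the PDF:

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm

def f(x):
    return x**2  # stand-in for whatever appears inside the expectation

mu, sigma = 1.0, 0.5

def mc_expectation(key, n_samples=1_000):
    # Stochastic: a fresh key gives a different value each call, which is
    # exactly what upsets gradient-based optimisers.
    samples = mu + sigma * jax.random.normal(key, (n_samples,))
    return jnp.mean(f(samples))

def quadrature_expectation(n_points=2_001):
    # Deterministic: integrate f(x) * pdf(x) on a fixed grid, so repeated
    # calls (and their gradients) always agree.
    xs = jnp.linspace(mu - 8.0 * sigma, mu + 8.0 * sigma, n_points)
    dx = xs[1] - xs[0]
    return jnp.sum(f(xs) * norm.pdf(xs, loc=mu, scale=sigma)) * dx
```

For E[X^2] with X ~ N(1, 0.5^2), both should return roughly 1.25, but only the quadrature version is identical on every call.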
Reinstate the in-development integration test. This reverts commit 3114896.
Further to the above point, swapping the uncommented lines below for their commented equivalents causes the optimisation to break, which at least confirms that it is something to do with the randomness (even fixed randomness) that is upsetting the optimiser.

```python
cp = CausalProblem(graph, label="CP")
cp.set_causal_estimand(
    expectation_with_n_samples(),
    # Mapping the estimand's RV to "y" (commented line) breaks the
    # optimisation; the "mu" mapping works.
    # rvs_to_nodes={"rv": "y"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
cp.set_constraints(
    expectation_with_n_samples(),
    # Likewise for the constraint: "x" breaks, "mu" works.
    # rvs_to_nodes={"rv": "x"},
    rvs_to_nodes={"rv": "mu"},
    graph_argument="g",
)
```

Edit 2: We can in fact have
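For reference, the general pattern meant by "fixed randomness" is something like the following sketch (hypothetical plain-JAX code, not the library's implementation): the base noise is drawn once from a frozen key, so the objective becomes a deterministic, differentiable function of the parameters.

```python
import jax
import jax.numpy as jnp

FIXED_KEY = jax.random.PRNGKey(0)  # one key, reused on every evaluation

def objective(params, n_samples=1_000):
    mu, sigma = params
    # Re-parametrisation trick: the noise is fixed, and the parameters
    # enter deterministically, so repeated calls at the same params agree
    # and jax.grad gives an exact gradient of this (noisy-but-frozen)
    # estimator.
    eps = jax.random.normal(FIXED_KEY, (n_samples,))
    return jnp.mean((mu + sigma * eps) ** 2)

grad_objective = jax.grad(objective)
```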
Further to the above, some hacked-together testing of the effect of providing:

- neither Jacobian,
- the Jacobian of the objective function only,
- the Jacobian of the constraints only,
- both Jacobians.

All experiments were done at a fixed RNG key.

It should be noted that in all cases in this experiment, providing the Jacobian of only the objective function always resulted in non-convergence, which I'm chalking up to the fact that the analytic Jacobian was somehow then at odds with the randomness introduced in the objective and/or constraint function evaluations. The "no Jacobians" and "constraints Jacobian only" runs also always returned the initial guess, as can be seen from the left-hand plot. This means that only the "provide both Jacobians" method was actually doing anything useful.

Otherwise, beyond some beneficial RNG for some sample sizes, it looks like we can expect the error to decay roughly as the inverse square root of the number of samples, all other factors being equal. Computation time appears approximately linear, but this is likely only because the re-parametrisation trick with normal distributions relies only on element-wise multiplication. For sample sizes higher than

Most important takeaway: a reliable Jacobian evaluation for both constraints and objective function is pretty much a requirement.
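To make the takeaway concrete, here is a minimal sketch (toy objective and constraint, assuming a SciPy SLSQP optimiser with JAX-supplied derivatives; none of these names come from the library) of providing both Jacobians:

```python
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

# Toy stand-ins for the estimand and constraint functions.
def objective(theta):
    return jnp.sum((theta - 1.0) ** 2)

def constraint(theta):
    return jnp.sum(theta) - 1.0  # equality constraint: sum(theta) == 1

obj_jac = jax.grad(objective)
con_jac = jax.grad(constraint)

result = minimize(
    fun=lambda t: float(objective(jnp.asarray(t))),
    x0=np.zeros(2),
    # Analytic Jacobians for BOTH the objective and the constraints, so the
    # optimiser never builds finite-difference approximations itself.
    jac=lambda t: np.asarray(obj_jac(jnp.asarray(t))),
    constraints=[{
        "type": "eq",
        "fun": lambda t: float(constraint(jnp.asarray(t))),
        "jac": lambda t: np.asarray(con_jac(jnp.asarray(t))),
    }],
    method="SLSQP",
)
```

With noisy function evaluations, internal finite differencing amplifies the noise, which would be consistent with the "no Jacobians" and "constraints Jacobian only" runs simply returning the initial guess.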
Some Further Thoughts