feat: Accelerate discrete inv_cdf search #51

riga · 2025-05-10T15:19:04Z

This PR improves the search for inverse cdf values, mainly in PoissonDiscrete.inv_cdf.

Background

While implementing the Poisson-modelled stat errors, I noticed that converting nuisance (parameter) values to the pdf space via inv_cdf took far too long (on the order of 30s for a single histogram with 53 bins). After digging deeper, I identified two separate reasons for this:

1. The cond_fn of jax.lax.while_loop is meant to return a global stopping decision rather than one per element of the input arrays (hence the jnp.any; considering the XLA lowering, this is the only way to go). However, this causes the body_fn to be evaluated for every array element, even if that element is already solved. In a setup with O(100) elements where the number of loop iterations is driven mainly by a single element (say, one element needs 1000 iterations while all others would be done in O(10)), this causes an overhead of about 98%.
2. The current implementation of PoissonDiscrete.inv_cdf performs a one-sided search starting at 0. However, for large $\lambda$ values, and thus wide distributions, the algorithm would profit from a sensible starting value to avoid expensive cdf evaluations in regions far from the target value.
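
To illustrate the first point, here is a minimal sketch (not the code touched by this PR) of a one-sided search from 0 with a batched jax.lax.while_loop. The name one_sided_inv_cdf and the iteration counter are made up for illustration, and jax.scipy.stats.poisson.cdf stands in for the actual cdf implementation. Because cond_fn reduces with jnp.any, the slowest element sets the iteration count for all elements.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import poisson


def one_sided_inv_cdf(x, lam):
    """Smallest integer k with cdf(k; lam) >= x, searched upward from 0.

    Hypothetical helper for illustration only. Requires 0 <= x < 1,
    otherwise the loop does not terminate.
    """
    x, lam = jnp.broadcast_arrays(
        jnp.asarray(x, dtype=jnp.float32),
        jnp.asarray(lam, dtype=jnp.float32),
    )

    def cond_fn(state):
        k, _ = state
        # global decision: keep looping while ANY element is unsolved
        return jnp.any(poisson.cdf(k, lam) < x)

    def body_fn(state):
        k, n_iter = state
        # the cdf is evaluated for every element, solved or not
        unsolved = poisson.cdf(k, lam) < x
        return jnp.where(unsolved, k + 1, k), n_iter + 1

    k0 = jnp.zeros(x.shape, dtype=jnp.int32)
    return jax.lax.while_loop(cond_fn, body_fn, (k0, jnp.int32(0)))


# three narrow distributions and one wide one
lam = jnp.array([2.0, 2.0, 2.0, 50.0])
k, n_iter = one_sided_inv_cdf(0.5, lam)

# fraction of per-element body evaluations spent on already solved elements
wasted = 1.0 - k.sum() / (n_iter * k.size)
```

In this toy setup the $\lambda = 2$ elements are solved after a few steps but are still carried through every iteration the $\lambda = 50$ element needs, which is the overhead described in point 1.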

Changes

On a high level, I added a generalized discrete_inv_cdf_search helper that is now used by PoissonDiscrete.inv_cdf, plus docstrings and test cases.

1. Issue 1 is solved by flattening the input, passing it through a vmapped version of the search function based on jax.lax.while_loop (which then performs only as many iterations as needed per element), and then reshaping the resulting k values to the desired output shape.
2. The algorithm itself now accepts starting values, which in the case of Poisson distributions are taken from a normal approximation ($\lambda + \text{normal ppf}(x) \cdot \sqrt{\lambda}$). The larger $\lambda$, the more important a good starting value becomes, but fortunately the normal approximation also gets better 🙂 The search is also capable of striding toward smaller values. With that, the number of necessary Poisson cdf evaluations should never exceed 2.
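
The two changes can be sketched roughly as follows. This is not the discrete_inv_cdf_search implementation from the PR, only an illustration under assumptions: the names _search_scalar and poisson_inv_cdf are hypothetical, jax.scipy.stats.poisson.cdf stands in for the actual cdf, and the search returns the smallest k with cdf(k) >= x (the PR also offers a rounding choice, which is ignored here).

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import norm, poisson


def _search_scalar(x, lam, k_start):
    """Two-sided search for scalar inputs (hypothetical helper).

    Returns the smallest integer k >= 0 with cdf(k; lam) >= x.
    Requires 0 < x < 1, otherwise the upward loop does not terminate.
    """

    def cdf(k):
        return poisson.cdf(k, lam)

    # stride down while the next smaller k still reaches the target
    def down_cond(k):
        return (k > 0) & (cdf(k - 1) >= x)

    k = jax.lax.while_loop(down_cond, lambda k: k - 1, k_start)

    # stride up while k is still too small
    def up_cond(k):
        return cdf(k) < x

    return jax.lax.while_loop(up_cond, lambda k: k + 1, k)


def poisson_inv_cdf(x, lam):
    """Flatten, run the scalar search through vmap, restore the shape."""
    x, lam = jnp.broadcast_arrays(
        jnp.asarray(x, dtype=jnp.float32),
        jnp.asarray(lam, dtype=jnp.float32),
    )
    x_flat, lam_flat = x.ravel(), lam.ravel()

    # starting value from the normal approximation, clipped at 0
    guess = lam_flat + norm.ppf(x_flat) * jnp.sqrt(lam_flat)
    k_start = jnp.maximum(jnp.round(guess), 0).astype(jnp.int32)

    k_flat = jax.vmap(_search_scalar)(x_flat, lam_flat, k_start)
    return k_flat.reshape(x.shape)


k = poisson_inv_cdf(jnp.array([0.5, 0.9]), 2.0)
```

Writing the search for scalars keeps the stopping logic per element and leaves the batching to jax.vmap. How many body evaluations the batched loop performs after lowering is not measured in this sketch; the saving shown here comes from starting close to the target instead of at 0.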

So far, performance looks good, but I'm planning more large-scale tests with the full stats error handling.

src/evermore/pdf.py

pfackeldey · 2025-05-10T17:13:40Z

This is amazing!
Do you have numbers on how much speedup this gives for low, medium, and high lambda? My hope initially was that in StatErrors we only use PoissonDiscrete when n_eff (lambda) is ~smallish.
Anyway, I'm excited to get this in asap!

src/evermore/pdf.py

riga · 2025-05-10T17:34:27Z

Do you have numbers on how much speedup this gives for low, medium, and high lambda?

I'll do some more tests in a larger example 👍

While doing that, I found a small detail regarding the shape handling that I think should be fixed. I'll remove draft mode and provide some performance numbers once finished.

pfackeldey · 2025-08-14T14:28:11Z

@riga I solved the shape "issue" and did some smaller polishing. If the CI is green, I'll merge this 👍

Next step is then to make use of it for the StatErrors, but this will be a different PR.

Thanks again! I could confirm locally that this drastically improves performance for many bins 👍

Add discrete_inv_cdf_search, use in PoissonDiscrete.

b7f57b9

riga requested a review from pfackeldey May 10, 2025 15:19

riga added the enhancement New feature or request label May 10, 2025

riga added 3 commits May 10, 2025 17:19

Typo.

f9b6f61

Typos.

ec147a4

Prefer | over logical_or.

0dd4174

pfackeldey reviewed May 10, 2025

src/evermore/pdf.py (outdated, resolved)

pfackeldey reviewed May 10, 2025

src/evermore/pdf.py (outdated, resolved)

Add rounding choice check.

a4ab401

riga marked this pull request as draft May 10, 2025 17:34

pfackeldey and others added 5 commits May 14, 2025 09:59

Merge branch 'main' into fix/poisson_inv_cdf

554ab4c

Merge branch 'main' into fix/poisson_inv_cdf

16d75e5

merge main; resolve conflicts

dacc72e

Merge branch 'fix/poisson_inv_cdf' of github.com:riga/evermore into f…

ec25400

…ix/poisson_inv_cdf

avoid shape manipulation & polish

0502617

pfackeldey marked this pull request as ready for review August 14, 2025 14:28

pfackeldey approved these changes Aug 14, 2025

pfackeldey merged commit e0497d5 into pfackeldey:main Aug 14, 2025
5 checks passed
