PERF: Bezier root finding speedup #31005
Can you please split the two changes? I would like to review and discuss them separately. For the coefficient matrices, I'd like to have a built-in guarantee that the matrix and the formula yield the same result. This can be achieved in one of two ways:
Please fix the stubtest.
Rebased on main.
    coeffs in ascending order: c0 + c1*x + c2*x**2 + ...
    """
    deg = len(coeffs) - 1
    n_samples = max(8, deg * 2)
I don't think this guarantees you will find all roots, does it? (your initial sieve could be too wide and leave two roots in the same bin, leading to no sign change seen.)
I would think you need something like https://en.wikipedia.org/wiki/Sturm%27s_theorem#Root_isolation
In practice you could also say that we don't care about the degree>=4 case (because Paths can only represent up to cubic beziers), error out for those, and implement the explicit formula for cubics, which is long but reasonably doable (the trig formula https://en.wikipedia.org/wiki/Cubic_equation#Trigonometric_and_hyperbolic_solutions is easier to handle in my experience).
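To make the concern about the sieve concrete, here is a minimal sketch (not the code from this PR) of a sign-change count on a uniform grid over [0, 1]. The helper name `count_sign_changes` and the example roots 0.501 and 0.519 are assumptions chosen for illustration. With 8 samples, both roots fall between two neighbouring grid points, so no sign change is seen.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def count_sign_changes(coeffs, n_samples):
    """Count sign changes of a polynomial sampled uniformly on [0, 1].

    coeffs are in ascending order: c0 + c1*x + c2*x**2 + ...
    (Hypothetical helper for illustration only.)
    """
    xs = np.linspace(0.0, 1.0, n_samples)
    ys = P.polyval(xs, coeffs)  # polyval takes ascending coefficients
    signs = np.sign(ys)
    return int(np.count_nonzero(signs[:-1] * signs[1:] < 0))


# (x - 0.501) * (x - 0.519): two real roots, both inside [0, 1]
coeffs = P.polyfromroots([0.501, 0.519])

coarse = count_sign_changes(coeffs, 8)    # grid spacing 1/7, roots share one bin
fine = count_sign_changes(coeffs, 201)    # grid spacing 0.005, roots are separated
print(coarse, fine)
```

The coarse grid reports no bracket at all, so a bisection step that relies on it would return no roots for this polynomial even though two exist.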
That's a really good point which I overlooked.
I didn't realize that Paths only go up to cubic, though, which really simplifies things. axis_aligned_extrema does root finding on the derivative of the curve's polynomial coefficients, so we only ever actually find roots for degree <= 2. I took out the bisection altogether and swapped in np.roots as a fallback for higher orders. This makes the code a good bit simpler, with no speed impact in practice.
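As a rough sketch of the approach described in this reply (closed-form roots for degree <= 2, with np.roots as a fallback for higher degrees), something like the following could work. The function name `roots_in_unit_interval` and the imaginary-part tolerance are assumptions for illustration; this is not the implementation merged in the PR.

```python
import numpy as np


def roots_in_unit_interval(coeffs):
    """Real roots in [0, 1] of c0 + c1*x + c2*x**2 + ... (ascending order).

    Hypothetical helper for illustration; not the function from the PR.
    """
    # Drop zero coefficients at the high-degree end so deg is the true degree.
    c = np.trim_zeros(np.asarray(coeffs, dtype=float), "b")
    deg = len(c) - 1
    if deg < 1:
        roots = np.array([])  # constant or zero polynomial: no isolated roots
    elif deg == 1:
        roots = np.array([-c[0] / c[1]])
    elif deg == 2:
        c0, c1, c2 = c
        disc = c1 * c1 - 4.0 * c2 * c0
        if disc < 0:
            roots = np.array([])  # complex pair, nothing real
        else:
            # Stable quadratic formula: avoids cancellation in -c1 +/- sqrt(disc).
            q = -0.5 * (c1 + np.copysign(np.sqrt(disc), c1))
            roots = np.array([q / c2, c0 / q]) if q != 0 else np.array([0.0])
    else:
        # Fallback: np.roots expects descending order, so reverse.
        r = np.roots(c[::-1])
        roots = r.real[np.abs(r.imag) < 1e-12]  # assumed tolerance
    roots = roots[(roots >= 0.0) & (roots <= 1.0)]
    return np.unique(roots)


# x**2 - 2.25*x + 0.5 = (x - 0.25) * (x - 2): only 0.25 lies in [0, 1]
print(roots_in_unit_interval([0.5, -2.25, 1.0]))
```

Because the quadratic branch is exact, it does not depend on how finely [0, 1] is sampled, which sidesteps the two-roots-in-one-bin problem raised in the review.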
@scottshambaugh Do you want to squash?

Squash merge works for me!

PR summary
This speeds up Path.get_extents, which calls these bezier operations.

Instead of np.roots, which does expensive eigenvalue decomposition, we can take advantage of the fact that we are only looking for roots on the limited [0, 1] interval and just do bisection for degree >= 3. Or we can calculate exactly for degree == 2. These both end up being much faster.

This code section is circled below in the profiler runs.
Before:

After:

Profiling test script:
PR checklist