Conversation

cakedev0
Contributor

In the user guide, remove the sentence:

Note that it fits much slower than the MSE criterion.

From:

Setting criterion="poisson" might be a good choice if your target is a count or a frequency (count per some unit). In any case,
y >= 0 is a necessary condition to use this criterion. Note that it fits much slower than the MSE criterion. For performance reasons the actual implementation minimizes the half mean poisson deviance, i.e. the mean poisson deviance divided by 2.

As it's not true: the poisson criterion is only ~10% slower than the MSE criterion. I ran the experiment with the same script as for this PR #32181, for both criteria; the execution time is vastly dominated by the sort (sort_samples_and_feature_values).
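As a side note on the quoted passage: the "half mean poisson deviance" it mentions is just the mean Poisson deviance divided by 2, which can be checked numerically (a quick sketch using scipy.special.xlogy and sklearn.metrics.mean_poisson_deviance):

```python
import numpy as np
from scipy.special import xlogy
from sklearn.metrics import mean_poisson_deviance

rng = np.random.default_rng(0)
y_true = rng.poisson(3.0, size=1_000).astype(float)  # count-like target, y >= 0
y_pred = np.full_like(y_true, y_true.mean())         # constant, strictly positive prediction

# Half mean Poisson deviance: mean(y * log(y / y_hat) - y + y_hat),
# with the convention y * log(y / y_hat) = 0 when y = 0 (handled by xlogy).
half_dev = np.mean(xlogy(y_true, y_true / y_pred) - y_true + y_pred)

# It is exactly half of sklearn's mean Poisson deviance.
assert np.isclose(2 * half_dev, mean_poisson_deviance(y_true, y_pred))
```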


@cakedev0 cakedev0 changed the title DOC: Poisson criterion is not slower than MSE DOC: Poisson criterion is not slower than MSE in decision trees Sep 17, 2025
@adam2392
Member

For completeness and self containment of the PR, can you link/copy the code and relevant results comparing poisson and MSE?

Thanks!

@cakedev0
Contributor Author

Benchmark script:

from time import perf_counter
import numpy
from sklearn.tree import DecisionTreeRegressor

if __name__ == "__main__":
    d = 20
    n = 3_000_000 // d
    n_fit = 10
    for criterion in ["squared_error", "poisson"]:
        dt = 0
        for _ in range(n_fit):
            X = numpy.random.rand(n, d)
            y = numpy.random.rand(n) + X.sum(axis=1)
            t = perf_counter()
            tree = DecisionTreeRegressor(max_depth=4, max_features=d,
                                        criterion=criterion).fit(X, y)
            dt += perf_counter() - t
        print(f"{criterion}: {dt / n_fit:.3f}s")

Results:

squared_error: 0.990s
poisson: 1.139s

Also a flame graph for a run with just criterion="poisson", showing that the sort dominates (and hence that the Poisson loss computation accounts for little of the execution time):

[flame graph image]

@adam2392
Member

@thomasjpfan do you have any historical context on why the docs say poisson is much slower than MSE?

@cakedev0
Contributor Author

historical context why the docs say poisson is much slower

The PR that added poisson loss: #17386

The original code was basically the same as today.

I quickly browsed the reviews and I think this statement was just not challenged (which seems fair, I tend to challenge people when they say something is fast, but much less when they say it's slow 😂).

My hypothesis: it's true that the criterion-related computations are much slower for the Poisson loss than for MSE, which is why the PR author added this comment. But because tree building is sort-dominated, it doesn't change the total execution time much.

@thomasjpfan
Member

From memory it was because the Poisson's node_impurity needs to recompute the loss by going through the data again:

cdef float64_t node_impurity(self) noexcept nogil:
    """Evaluate the impurity of the current node.

    Evaluate the Poisson criterion as impurity of the current node,
    i.e. the impurity of sample_indices[start:end]. The smaller the
    impurity the better.
    """
    return self.poisson_loss(self.start, self.end, self.sum_total,
                             self.weighted_n_node_samples)

where MSE can compute the node_impurity without going through the data:

impurity = self.sq_sum_total / self.weighted_n_node_samples
for k in range(self.n_outputs):
    impurity -= (self.sum_total[k] / self.weighted_n_node_samples)**2.0
return impurity / self.n_outputs
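The asymmetry can be illustrated in plain NumPy (a sketch of the two computations, not the actual Cython code; scipy.special.xlogy handles the y = 0 convention):

```python
import numpy as np
from scipy.special import xlogy

rng = np.random.default_rng(0)
y = rng.poisson(3.0, size=1_000).astype(float)
w = np.ones_like(y)  # sample weights
weighted_n = w.sum()

# MSE: the node impurity follows from two running sums kept during the scan,
# with no extra pass over the node's samples.
sum_total = np.sum(w * y)
sq_sum_total = np.sum(w * y**2)
mse_impurity = sq_sum_total / weighted_n - (sum_total / weighted_n) ** 2
assert np.isclose(mse_impurity, np.var(y))  # equals the variance

# Poisson: the y_i * log(y_i / y_bar) term depends on every individual sample,
# so evaluating the half deviance needs another full pass over the node's data.
y_bar = sum_total / weighted_n
poisson_impurity = np.sum(w * (xlogy(y, y / y_bar) - y + y_bar)) / weighted_n
```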

Although, "much slower" could be an overstatement. @lorentzenchr Do you recall why poisson was marked as "much slower" in the docs?

@cakedev0
Contributor Author

Bump here.

While it remains unclear why this statement was added to the docs, I feel we have enough evidence to remove it:

  • theoretically: compared to squared error, the Poisson loss only changes an O(n) part of the O(n log n) algorithm, so even if the constant in that O(n) part gets quite a bit bigger, we don't expect a significant impact on execution time.
  • the flame graph from py-spy confirms that the O(n log n) part dominates the execution time.
  • the benchmarks confirm that it is not much slower (at most ~25% slower).
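To make the first bullet concrete, here is a sketch of the per-feature split search under squared error: the O(n log n) sort, followed by an O(n) prefix-sum scan. Swapping in the Poisson loss only changes the constant of this linear scan.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
x = rng.random(n)      # one feature
y = rng.random(n) + x  # target correlated with the feature

# O(n log n): sort the samples by feature value (the dominant cost in practice).
order = np.argsort(x)
y_sorted = y[order]

# O(n): evaluate every split position with prefix sums. This linear scan is
# the part whose per-sample cost differs between squared error and Poisson.
csum = np.cumsum(y_sorted)
csum_sq = np.cumsum(y_sorted ** 2)
k = np.arange(1, n)  # left-child sizes
sse_left = csum_sq[:-1] - csum[:-1] ** 2 / k
sse_right = (csum_sq[-1] - csum_sq[:-1]) - (csum[-1] - csum[:-1]) ** 2 / (n - k)
best = int(np.argmin(sse_left + sse_right))  # split after k[best] samples
```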

Here is a new, more extensive benchmark, trying to explore cases where the sort might be less dominant in the execution time (duplicates, no max_depth). In all cases, poisson is less than 25% slower than MSE.

from time import perf_counter
import numpy
from sklearn.tree import DecisionTreeRegressor

if __name__ == "__main__":
    n_fit = 15
    n_skip = 5
    for d in [2, 20]:
        for with_duplicates in [False, True]:
            for max_depth in [4, None]:
                for criterion in ["squared_error", "poisson"]:
                    n = 2_000_000 // d
                    dts = []    
                    for _ in range(n_fit):
                        X = numpy.random.rand(n, d)
                        if with_duplicates:
                            X = X.round(2)
                        y = numpy.random.rand(n) + X.sum(axis=1)
                        t = perf_counter()
                        tree = DecisionTreeRegressor(
                            criterion=criterion,
                            max_features=d, max_depth=max_depth,
                        )
                        tree.fit(X, y)
                        dts.append(perf_counter() - t)
                    avg = numpy.mean(dts[n_skip:])
                    std = numpy.std(dts[n_skip:])
                    print(
                        f"d={d}; with_duplicates={with_duplicates}; "
                        f"max_depth={max_depth}; criterion={criterion}:"
                        f" {avg:.2f} ± {std:.3f}s"
                    )
                print()

Results:

d=2; with_duplicates=False; max_depth=4; criterion=squared_error: 0.97 ± 0.025s
d=2; with_duplicates=False; max_depth=4; criterion=poisson: 1.13 ± 0.017s

d=2; with_duplicates=False; max_depth=None; criterion=squared_error: 4.57 ± 0.331s
d=2; with_duplicates=False; max_depth=None; criterion=poisson: 5.43 ± 0.083s

d=2; with_duplicates=True; max_depth=4; criterion=squared_error: 0.33 ± 0.014s
d=2; with_duplicates=True; max_depth=4; criterion=poisson: 0.35 ± 0.007s

d=2; with_duplicates=True; max_depth=None; criterion=squared_error: 0.56 ± 0.007s
d=2; with_duplicates=True; max_depth=None; criterion=poisson: 0.69 ± 0.006s

d=20; with_duplicates=False; max_depth=4; criterion=squared_error: 0.68 ± 0.002s
d=20; with_duplicates=False; max_depth=4; criterion=poisson: 0.77 ± 0.008s

d=20; with_duplicates=False; max_depth=None; criterion=squared_error: 2.01 ± 0.026s
d=20; with_duplicates=False; max_depth=None; criterion=poisson: 2.42 ± 0.058s

d=20; with_duplicates=True; max_depth=4; criterion=squared_error: 0.26 ± 0.006s
d=20; with_duplicates=True; max_depth=4; criterion=poisson: 0.26 ± 0.002s

d=20; with_duplicates=True; max_depth=None; criterion=squared_error: 1.23 ± 0.007s
d=20; with_duplicates=True; max_depth=None; criterion=poisson: 1.38 ± 0.006s
