BUG: implement fit_regularized for HurdleCountModel #9746

Open
wcwatson wants to merge 4 commits into statsmodels:main from wcwatson:fix__hurdle_fit_reg

Conversation

@wcwatson wcwatson commented Jan 19, 2026

This PR implements HurdleCountModel.fit_regularized(...) so that it handles the two "component models" of the mixture; the implementation inherited from the superclasses raises a ValueError. See the original bug report for additional details. Changes made in support of that implementation include:

  • Implementations of auxiliary methods in HurdleCountModel: score_obs, score, and hessian.
    • In the future these might be folded into HurdleCountModel.fit(...) for greater parallelism with other classes, but I have left that as out of scope for this PR.
  • Changes to L1HurdleCountResults to avoid a diamond inheritance problem.
    • The existing multiple inheritance implementation of L1HurdleCountResults(L1CountResults, HurdleCountResults) causes either df_model or df_resid to be incorrectly defined, depending on the order of the superclasses. Adding passthrough **kwargs in relevant places does not solve the issue.
    • My solution was to remove L1CountResults as a superclass and duplicate the post-super().__init__() portion of the L1 results class in L1HurdleCountResults. This doesn't feel ideal and there might be a clever mixin solution, but that's out of scope for this PR.
  • Three new test classes, all of which will fail in the main branch:
    • TestRegularizedHurdleSimulated runs the CheckHurdlePredict suite of tests to verify that the results object has attributes correctly populated and that predicted values are close to expected values. In order to match expected values, regularization in the "zero model" must be very weak.
    • TestHurdleL1 runs the CheckLikelihoodModelL1 suite of tests to verify that results match "external" benchmarks. Since the implementation in R's pscl library does not fit regularized hurdle models, I simply ran a regularized model on the docvis dataset and recorded the results.
    • TestHurdleL1Compatibility runs the CheckL1Compatibility suite of tests to verify that (i) weak/zero regularization yields coefficients equal to an unregularized model, and (ii) extremely strong regularization zeroes out coefficients.
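
The inheritance problem described above can be reduced to a minimal sketch. The classes below are stand-ins with simplified bodies; the real statsmodels `__init__` methods take model/results arguments and compute actual degrees of freedom:

```python
# Minimal sketch of the diamond-inheritance problem described above.
# CountResults, L1CountResults, and HurdleCountResults are stand-ins;
# the real statsmodels classes have different signatures and bodies.

class CountResults:
    def __init__(self):
        self.df_model = "plain"
        self.df_resid = "plain"

class L1CountResults(CountResults):
    def __init__(self):
        # non-cooperative base call rather than super(), as in much of
        # the existing results code
        CountResults.__init__(self)
        self.df_model = "l1-adjusted"   # count only nonzero coefficients

class HurdleCountResults(CountResults):
    def __init__(self):
        CountResults.__init__(self)
        self.df_resid = "hurdle-adjusted"  # combine the two components

class L1HurdleCountResults(L1CountResults, HurdleCountResults):
    # Inherits L1CountResults.__init__; HurdleCountResults.__init__
    # never runs, so df_resid keeps its plain (wrong) value. Swapping
    # the base-class order leaves df_model wrong instead.
    pass

res = L1HurdleCountResults()
print(res.df_model, res.df_resid)  # -> l1-adjusted plain
```

The fix in this PR sidesteps the diamond entirely: L1CountResults is dropped as a base, and its post-super().__init__() logic is duplicated directly in L1HurdleCountResults.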

In addition to these changes, I've left some inline comments documenting odd behavior I observed in right-censored negative binomial models (used in the HurdleCountModel when zerodist="negbin"). Specifically, start_params needs to be nonzero for regularized fitting to converge. This seems possibly related to #9156; resolving it is out of scope for this PR.
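
One generic mechanism consistent with this observation (a toy sketch only, not the actual censored negbin likelihood): if the all-zero parameter vector happens to sit at a stationary point of the objective, a gradient-based optimizer started there reports immediate convergence without ever moving.

```python
# Toy illustration of a zero start value stalling a gradient-based fit.
# f(b) = b**4 - b**2 has a stationary point (a local maximum) at b = 0,
# so plain gradient descent started at 0.0 never leaves it.

def grad(b):
    return 4 * b**3 - 2 * b

def descend(b, lr=0.1, steps=200):
    for _ in range(steps):
        b -= lr * grad(b)
    return b

print(descend(0.0))  # stuck at 0.0: grad(0) == 0, so no step is taken
print(descend(0.5))  # converges to the minimum at 1/sqrt(2) ~= 0.707
```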

Since this PR only implements functionality that the existing documentation already covers and claims to exist, I have not made any documentation changes.

The Azure pipeline is failing because of an error that arises only in the python_314t_parallel instance when testing functionality in statsmodels.nonparametric.... The new tests included in this PR pass.

@wcwatson wcwatson marked this pull request as ready for review January 21, 2026 00:09


Development

Successfully merging this pull request may close these issues.

BUG: fit_regularized in HurdleCountModel raises a ValueError
