ENH: Add no cross terms option to White's test for heteroscedasticity#9691

Open
IntegralIndefinida wants to merge 7 commits into statsmodels:main from IntegralIndefinida:het_white

@IntegralIndefinida IntegralIndefinida commented Nov 16, 2025

This PR adds an interaction_terms parameter to the het_white function so that users can choose whether White's heteroskedasticity test includes interaction (cross-product) terms. This is useful because the cross terms consume degrees of freedom. In addition, White's heteroscedasticity test with cross terms also acts as a specification test; according to Richard Harris (as cited in Gujarati's chapter on heteroscedasticity), dropping the cross terms makes it a pure heteroscedasticity test.

Changes

  • Added interaction_terms parameter (default True) to statsmodels.stats.diagnostic.het_white
  • When interaction_terms=False, the test uses only squared terms (x1², x2², ...) without interaction terms (x1x2, x1x3, ...)
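
A minimal sketch of the two auxiliary designs described in the list above, not the code from this PR. The helper name white_aux_design is hypothetical, and the squares-only branch assumes the design is simply exog ** 2, which is what the het_breuschpagan(resid, exog**2) equivalence mentioned later in the thread suggests. The cross-product branch reuses the np.triu_indices construction that appears in the diff.

```python
import numpy as np


def white_aux_design(exog, interaction_terms=True):
    """Build auxiliary regressors for a White-type test (illustrative helper)."""
    x = np.asarray(exog, dtype=float)
    if interaction_terms:
        # All pairwise products x_i * x_j with i <= j. Because one column of
        # exog is a constant, this yields the constant, the levels, the
        # squares and the cross products.
        i0, i1 = np.triu_indices(x.shape[1])
        return x[:, i0] * x[:, i1]
    # Squares only (assumption); the squared constant column stays a constant.
    return x ** 2


rng = np.random.default_rng(0)
exog = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])

full = white_aux_design(exog)
squares = white_aux_design(exog, interaction_terms=False)
print(full.shape, squares.shape)  # (50, 10) (50, 4)
```

With a constant and three regressors, the full design has 10 columns while the squares-only design has 4, which is the degrees-of-freedom saving the description refers to.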

Tests

  • Added test_het_white_no_interaction_terms to verify the interaction_terms=False option

  • Reference values verified against EViews (I also verified the base variant results)

  • [ x ] tests added / passed.

  • [ x ] code/documentation is well formatted.

  • [ x ] properly formatted commit message. See
    NumPy's guide.

Details

Notes:

  • It is essential that you add a test when making code changes. Tests are not
    needed for doc changes.
  • When adding a new function, test values should usually be verified in another package (e.g., R/SAS/Stata).
  • When fixing a bug, you must add a test that would produce the bug in main and
    then show that it is fixed with the new code.
  • New code additions must be well formatted. Changes should pass flake8. If on Linux or OSX, you can
    verify your changes are well formatted by running
    git diff upstream/main -u -- "*.py" | flake8 --diff --isolated
    
    assuming flake8 is installed. This command is also available on Windows
    using the Windows Subsystem for Linux once flake8 is installed in the
    local Linux environment. While passing this test is not required, it is good practice and it helps
    improve code quality in statsmodels.
  • Docstring additions must render correctly, including escapes and LaTeX.


hw = smdia.het_white(res.resid, res.model.exog,cross_terms=False)
hw_values = (
13.25091965953952
Member

commas are missing at end of lines



def het_white(resid, exog):
def het_white(resid, exog,cross_terms=True):
Member

space after comma

exog : array_like
The explanatory variables for the variance. Squares and interaction
terms are automatically included in the auxiliary regression.
The explanatory variables for the variance. Squares terms are automatically
Member

keep closer to original just add "by default":

Squares and, by default, interaction terms are automatically included in the auxiliary regression.

@josef-pkt
Member

Looks good overall

I guess this will then be equivalent to
het_breuschpagan(resid, exog**2)

So, not really needed but addition is fine with me.
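
To make the suggested equivalence concrete, here is a rough NumPy-only sketch of the LM statistic, assuming it takes the n·R² form from regressing the squared residuals on the auxiliary regressors. The function name lm_statistic and the simulated data are illustrative; this does not call or reproduce the statsmodels implementation.

```python
import numpy as np


def lm_statistic(resid, aux_exog):
    """Return n * R^2 from regressing squared residuals on aux_exog."""
    y = np.asarray(resid, dtype=float) ** 2
    x = np.asarray(aux_exog, dtype=float)
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    fitted = x @ beta
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return y.shape[0] * (1.0 - ss_res / ss_tot)


rng = np.random.default_rng(12345)
nobs = 200
exog = np.column_stack([np.ones(nobs), rng.normal(size=(nobs, 2))])
# The error scale grows with the first regressor, so the data are heteroscedastic.
resid = rng.normal(size=nobs) * np.exp(0.5 * exog[:, 1])

# Squares-only design: constant, x1**2, x2**2 (2 degrees of freedom).
lm_squares_only = lm_statistic(resid, exog ** 2)

# Full design with cross products (5 degrees of freedom).
i0, i1 = np.triu_indices(exog.shape[1])
lm_full = lm_statistic(resid, exog[:, i0] * exog[:, i1])

print(lm_squares_only, lm_full)
```

Because the squares-only columns are a subset of the full design, the full statistic is never smaller, but it is compared against a chi-square distribution with more degrees of freedom.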

@IntegralIndefinida
Author

Thank you for your comments. I also changed the flag to interaction_terms to maintain the same terminology.

@josef-pkt
Member

Thanks,
PR looks good.
Waiting for the CI to finish, but it looks to me it's ready for merging.

@josef-pkt
Member

ci fails

>       hw = smdia.het_white(res.resid, res.model.exog, interaction_terms=False)
             ^^^^^
E       NameError: name 'smdia' is not defined

and one style failure with white space

Member

@bashtage left a comment

Some changes please.

i0, i1 = np.triu_indices(nvars0)
exog = x[:, i0] * x[:, i1]
nobs, nvars = exog.shape
assert nvars == nvars0 * (nvars0 - 1) / 2. + nvars0
Member

No asserts please.

Author

It was in the original function, should I remove it anyway?

Member

Yes, please remove.
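
One way to see that nothing is lost by dropping the assertion: the column count it checks follows directly from how np.triu_indices enumerates pairs. The short check below is only an illustration of that identity, not part of the PR.

```python
import numpy as np

# np.triu_indices(n) lists every pair (i, j) with i <= j, so the number of
# product columns is always n squares plus n * (n - 1) / 2 cross products.
counts = {n: len(np.triu_indices(n)[0]) for n in range(1, 8)}
formula = {n: n * (n - 1) // 2 + n for n in range(1, 8)}
print(counts == formula)  # True
```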

terms are automatically included in the auxiliary regression.
The explanatory variables for the variance. Squares and, by default,
interaction terms are automatically included in the auxiliary regression.
interaction_terms : bool, default True
Member

Should probably be False for now since this would change the output of tests without a deprecation.

Member

interaction_terms=True
is the current behavior; it is needed for backwards compatibility and is the "proper" White test

Member

Sorry - read title backward. Yes, True by default.

@IntegralIndefinida
Author

ci fails

>       hw = smdia.het_white(res.resid, res.model.exog, interaction_terms=False)
             ^^^^^
E       NameError: name 'smdia' is not defined

and one style failure with white space

sorry, I'll fix that

@bashtage
Member

Really should add a check that x has a constant, which is required for White's test (even if the original model doesn't).

@IntegralIndefinida
Author

Really should add a check that x has a constant, which is required for White's test (even if the original model doesn't).

That's done by _check_het_test:

def _check_het_test(x: np.ndarray, test_name: str) -> None:
    """
    Check validity of the exogenous regressors in a heteroskedasticity test

    Parameters
    ----------
    x : ndarray
        The exogenous regressor array
    test_name : str
        The test name for the exception
    """
    x_max = x.max(axis=0)
    if (
        not np.any(((x_max - x.min(axis=0)) == 0) & (x_max != 0))
        or x.shape[1] < 2
    ):
        raise ValueError(
            f"{test_name} test requires exog to have at least "
            "two columns where one is a constant."
        )
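
As a quick illustration of what this check accepts and rejects, the sketch below copies the function body quoted above (docstring omitted) and runs it on a few small arrays. The helper passes and the example arrays are made up for this illustration.

```python
import numpy as np


def _check_het_test(x: np.ndarray, test_name: str) -> None:
    # Copied from the comment above so the example is self-contained.
    x_max = x.max(axis=0)
    if (
        not np.any(((x_max - x.min(axis=0)) == 0) & (x_max != 0))
        or x.shape[1] < 2
    ):
        raise ValueError(
            f"{test_name} test requires exog to have at least "
            "two columns where one is a constant."
        )


def passes(x):
    """Return True if the check does not raise (illustrative helper)."""
    try:
        _check_het_test(np.asarray(x, dtype=float), "White's heteroskedasticity")
    except ValueError:
        return False
    return True


z = np.array([0.5, -1.2, 2.0, 0.3])
with_const = np.column_stack([np.ones(4), z])
no_const = np.column_stack([z, z ** 2])
only_const = np.ones((4, 1))
zero_col = np.column_stack([np.zeros(4), z])

print(passes(with_const))  # True
print(passes(no_const))    # False
print(passes(only_const))  # False
print(passes(zero_col))    # False: an all-zero column is not a constant
```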

@IntegralIndefinida
Author

I also removed the line

question: does f-statistic make sense? constant ?

because, as @bashtage says, a constant term is required, and the F-statistic does make sense, since the White test statistic is only asymptotically $\chi^2$ distributed.
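
For reference, the two statistics that can be formed from the auxiliary regression are usually written as below. This is the textbook form, stated as an assumption rather than quoted from the statsmodels source: $R^2$ is the coefficient of determination from regressing the squared residuals on the auxiliary regressors, $q$ is the number of non-constant auxiliary regressors, and $n$ is the number of observations.

$$
LM = n R^2 \;\overset{a}{\sim}\; \chi^2_q,
\qquad
F = \frac{R^2 / q}{(1 - R^2) / (n - q - 1)}
$$

The LM form relies on the asymptotic $\chi^2_q$ approximation, while the F form is compared with an $F(q,\, n - q - 1)$ reference distribution.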

Member

@bashtage left a comment

One small change to improve doc generation.


References
----------
Greene section 11.4.1 5th edition p. 222. Test statistic reproduces
Member

Could we change the references to use the proper format? They should look like this:

.. [1] Greene, William H. Econometric analysis. 5th Edition. Pearson Education, 2002.
.. [2] Damodar N. Gujarati, Basic Econometrics, section 11.5. Pg 387.

Author

Sure, that'd be better

@bashtage
Member

Close and reopen to force CI run

@bashtage bashtage closed this Nov 26, 2025
@bashtage bashtage reopened this Nov 26, 2025
@bashtage
Member

statsmodels/stats/diagnostic.py:827:1: E302 expected 2 blank lines, found 1

Lint failure

@IntegralIndefinida
Author

I fixed the missing line, but the test still fails with the following linting errors:

Running flake8 linting
Linting all files with limited rules
statsmodels/discrete/discrete_model.py:522:13: B043 Do not call delattr with a constant attribute value, it is not any safer than normal property access.
statsmodels/discrete/discrete_model.py:1054:13: B043 Do not call delattr with a constant attribute value, it is not any safer than normal property access.
statsmodels/discrete/discrete_model.py:1056:13: B043 Do not call delattr with a constant attribute value, it is not any safer than normal property access.
statsmodels/genmod/generalized_linear_model.py:375:13: B043 Do not call delattr with a constant attribute value, it is not any safer than normal property access.
statsmodels/genmod/generalized_linear_model.py:377:13: B043 Do not call delattr with a constant attribute value, it is not any safer than normal property access.
Changed files failed linting using the required set of rules.
Additions and changes must conform to Python code style rules.
No new files to lint
Running isort
Skipped 1 files

##[error]Bash exited with code '1'.
Finishing: Check style

Those are unrelated to my commits

@bashtage
Member

bashtage commented Jan 8, 2026

Closing and reopening to see CI run

@bashtage bashtage closed this Jan 8, 2026
@bashtage bashtage reopened this Jan 8, 2026